This package provides a novel clustering algorithm and toolkit RCSL (Rank Constrained Similarity Learning) to accurately identify various cell types using scRNA-seq data from a complex tissue. RCSL considers both local similarity and global similarity among the cells to discern the subtle differences among cells of the same type as well as larger differences among cells of different types. RCSL uses Spearman’s rank correlations of a cell’s expression vector with those of other cells to measure its global similarity, and adaptively learns neighbour representation of a cell as its local similarity. The overall similarity of a cell to other cells is a linear combination of its global similarity and local similarity.
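The two ingredients named above, a Spearman rank correlation as the global similarity and a linear combination of global and local similarity, can be sketched as follows. This is only an illustration in Python, not the RCSL implementation: the function names and the `weight` parameter are hypothetical, and the local similarity is taken as a given number because the adaptive neighbour learning is not specified here.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def combined_similarity(global_sim, local_sim, weight=0.5):
    """Linear combination of the two similarities (weight is an assumption)."""
    return weight * global_sim + (1 - weight) * local_sim
```

For two cells with expression vectors `a` and `b`, `spearman(a, b)` would play the role of the global similarity, and `combined_similarity` mixes it with whatever local similarity has been learned separately.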
Interface for multiple data sources, such as the `EVDS` API <https://evds2.tcmb.gov.tr/index.php?/evds/userDocs> of the Central Bank of the Republic of Türkiye and the `FRED` API <https://fred.stlouisfed.org/docs/api/fred/> of the Federal Reserve Bank. Both data providers require API keys for access, which users can easily obtain by creating accounts on their respective websites. The package provides caching ability with the selection of periods to increase the speed and efficiency of requests. It combines datasets requested from different sources when the series share a common frequency. While combining data frames whenever possible, it also keeps all requested data available as separate data frames to increase efficiency.
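The caching and combining behaviour described above could look roughly like the sketch below. It is written in Python for illustration and does not reflect the package's actual interface: `TimedCache`, `combine`, and the `fetch` callback are hypothetical names, and reading "selection of periods" as a user-chosen lifetime for cached responses is an assumption.

```python
import time


class TimedCache:
    """Cache fetch results and reuse them until they exceed a chosen age."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch            # hypothetical callable doing the request
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._store = {}               # key -> (timestamp, value)

    def get(self, *key):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._max_age:
            return hit[1]              # still fresh: no new request
        value = self._fetch(*key)
        self._store[key] = (now, value)
        return value


def combine(frames):
    """Outer-join {date: value} mappings on their dates.

    `frames` maps a series name to a {date: value} dict. Dates missing
    from a series are filled with None; the inputs are left untouched,
    so each series also remains available separately.
    """
    dates = sorted(set().union(*(f.keys() for f in frames.values())))
    return [
        {"date": d, **{name: f.get(d) for name, f in frames.items()}}
        for d in dates
    ]
```

Keying the cache on the full request tuple means that two requests differing only in their date range are stored independently, which keeps the sketch simple at the cost of some duplicated data.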
This package provides a fast dimensionality reduction method scalable to large numbers of samples. Landmark Multi-Dimensional Scaling (LMDS) is an extension of classical Torgerson MDS, but rather than calculating a complete distance matrix between all pairs of samples, only the distances between a set of landmarks and the samples are calculated.
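A minimal sketch of the landmark idea, assuming Euclidean distances and landmarks chosen by the caller, is given below. It uses Python with NumPy for illustration and is not the package's code: `landmark_mds` and its parameters are hypothetical names. Only the landmark-by-sample distances are ever computed; classical MDS is run on the landmarks alone and the remaining samples are placed by triangulation against them.

```python
import numpy as np


def landmark_mds(points, landmark_idx, ndim=2):
    """Embed `points` (n x p array) in `ndim` dimensions via landmarks."""
    landmarks = points[landmark_idx]
    # Squared distances from each landmark to every sample: shape (k, n).
    sq = ((landmarks[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    sq_ll = sq[:, landmark_idx]                  # landmark-to-landmark, (k, k)

    # Classical (Torgerson) MDS on the landmarks: double-centre, eigendecompose.
    k = len(landmark_idx)
    centering = np.eye(k) - np.ones((k, k)) / k
    gram = -0.5 * centering @ sq_ll @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:ndim]       # largest eigenvalues first
    lam, vec = eigvals[top], eigvecs[:, top]     # assumes these are positive

    # Triangulate every sample from its squared distances to the landmarks.
    pseudo_inv = vec / np.sqrt(lam)              # columns v_i / sqrt(lambda_i)
    mean_sq = sq_ll.mean(axis=1, keepdims=True)  # mean squared distance per landmark
    return (-0.5 * pseudo_inv.T @ (sq - mean_sq)).T
```

With k landmarks and n samples the distance computation costs O(kn) rather than O(n²), which is where the scalability comes from.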
This package provides tools to perform model selection alongside estimation under Linear, Logistic, Negative binomial, Quantile, and Skew-Normal regression. Under the spike-and-slab method, a probability is estimated for each possible model, along with the posterior mean, credibility interval, and standard deviation of coefficients and parameters under the most probable model.
An implementation of a variety of escalation with overdose control designs introduced by Babb, Rogatko and Zacks (1998) <doi:10.1002/(SICI)1097-0258(19980530)17:10%3C1103::AID-SIM793%3E3.0.CO;2-9>. It calculates the next dose as a clinical trial proceeds and performs simulations to obtain operating characteristics.
This package contains the core functions associated with Fast Regularized Canonical Correlation Analysis. Please see the following for details: Raul Cruz-Cano, Mei-Ling Ting Lee, Fast regularized canonical correlation analysis, Computational Statistics & Data Analysis, Volume 70, 2014, Pages 88-100, ISSN 0167-9473 <doi:10.1016/j.csda.2013.09.020>.
Lints are code patterns that are not optimal because they are inefficient, forget corner cases, or are less readable. flir provides a small set of functions to detect those lints and automatically fix them. It builds on astgrepr, which itself uses the Rust crate ast-grep to parse and navigate R code.
Wrapper for Stan that offers a number of in-built models to implement a hierarchical Bayesian longitudinal model for repeat observation data. Model choice selects the differential equation that is fit to the observations. Single and multi-individual models are available. O'Brien et al. (2024) <doi:10.1111/2041-210X.14463>.
Estimates and tests high-dimensional mediation effects based on advanced mediator screening and penalized regression techniques. The methods used in the package are described in Zhang H, Zheng Y, Hou L, Liu L, HIMA: An R Package for High-Dimensional Mediation Analysis. Journal of Data Science. (2025). <doi:10.6339/25-JDS1192>.
Focuses on data processing and visualization in hydrology and climate forecasting. The main functions include data extraction, data downscaling, data resampling, gap filling of precipitation, bias correction of forecasting data, flexible time series plots, and spatial map generation. It is a good pre-processing and post-processing tool for hydrological and hydraulic modellers.
This package implements non-parametric smooth tests for simultaneously comparing K (K > 1) copulas, and non-parametric clustering of multivariate populations with arbitrary sizes. See Yves I. Ngounou Bakam and Denys Pommeret (2022) <arXiv:2112.05623> and Yves I. Ngounou Bakam and Denys Pommeret (2022) <arXiv:2211.06338>.
This package provides mailmerge methods for reading spreadsheets of addresses and other relevant information to create standardized but customizable letters. It also provides a method for mapping US ZIP codes, including those of letter recipients, and a method for parsing and processing HTML code from online job postings of the American Political Science Association.
An implementation of Multi-Task Logistic Regression (MTLR) for R. This package is based on the method proposed by Yu et al. (2011) which utilized MTLR for generating individual survival curves by learning feature weights which vary across time. This model was further extended to account for left and interval censored data.
This package provides a complete toolkit to process the Munich ChronoType Questionnaire (MCTQ) for its three versions (standard, micro, and shift). MCTQ is a quantitative and validated tool to assess chronotypes using people's sleep behavior, originally presented by Till Roenneberg, Anna Wirz-Justice, and Martha Merrow (2003, <doi:10.1177/0748730402239679>).
Uses a kernel smoothing approach to calculate Mutual Information for comparisons between all types of variables including continuous vs continuous, continuous vs discrete and discrete vs discrete. Uses a nonparametric bias correction giving Bias Corrected Mutual Information (BCMI). Implemented efficiently in Fortran 95 with OpenMP and suited to large genomic datasets.
Estimates ordered probit switching regression models - a Heckman type selection model with an ordinal selection and continuous outcomes. Different model specifications are allowed for each treatment/regime. For more details on the method, see Wang & Mokhtarian (2024) <doi:10.1016/j.tra.2024.104072> or Chiburis & Lokshin (2007) <doi:10.1177/1536867X0700700202>.
This work extends the state space model for Poisson count data (the Poisson-Gamma model) towards a semiparametric specification. As in generalized additive models (GAM), cubic splines are used for covariate smoothing. The semiparametric models are fitted by an iterative process that combines likelihood maximization with the backfitting algorithm.
This package provides a multiway method to decompose a tensor (array) of any order, as a generalisation of SVD that also supports non-identity metrics and penalisations. 2-way SVD with these extensions is also available. The package also includes some other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
This package provides functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
Sparse redundancy analysis for high dimensional (biomedical) data. Directional multivariate analysis to express the maximum variance in the predicted data set by a linear combination of variables of the predictive data set. Implemented in a partial least squares framework, for more details see Csala et al. (2017) <doi:10.1093/bioinformatics/btx374>.
This package implements a novel topology-based pathway enrichment analysis that integrates the global position of the nodes and the topological properties of the pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. It also provides functions to retrieve the latest pathway information for use in pathway enrichment analysis with this method.
Calculate point estimates of and valid confidence intervals for nonparametric, algorithm-agnostic variable importance measures in high and low dimensions, using flexible estimators of the underlying regression functions. For more information about the methods, please see Williamson et al. (Biometrics, 2020), Williamson et al. (JASA, 2021), and Williamson and Feng (ICML, 2020).
Frequentist sequential meta-analysis based on Trial Sequential Analysis (TSA) as programmed in Java by the Copenhagen Trial Unit (CTU). The primary function is the calculation of group sequential designs for meta-analysis, to be used for planning and analysis of both prospective and retrospective sequential meta-analyses while preserving type-I-error control under sequential testing. RTSA includes tools for sample size and trial size calculation for meta-analysis and core meta-analysis methods such as fixed-effect and random-effects models and forest plots. TSA is described in Wetterslev et al. (2008) <doi:10.1016/j.jclinepi.2007.03.013>. The methods for deriving the group sequential designs are based on Jennison and Turnbull (1999, ISBN:9780849303166).
Low-rank matrix decompositions are fundamental tools widely used for data analysis, dimension reduction, and data compression. Classically, highly accurate deterministic matrix algorithms are used for this task. However, the emergence of large-scale data has severely challenged our computational ability to analyze big data. The concept of randomness has been demonstrated as an effective strategy to quickly produce approximate answers to familiar problems such as the singular value decomposition (SVD). This package provides several randomized matrix algorithms such as the randomized singular value decomposition (rsvd), randomized principal component analysis (rpca), randomized robust principal component analysis (rrpca), randomized interpolative decomposition (rid), and the randomized CUR decomposition (rcur). In addition, several plot functions are provided.
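To make the idea of a randomized decomposition concrete, the sketch below shows one common textbook scheme for an approximate truncated SVD: project the matrix onto a small random subspace, orthonormalize, and run an exact SVD on the much smaller projected matrix. It is written in Python with NumPy purely as an illustration and is not the package's rsvd code; the function name, `oversample`, `n_iter`, and `seed` are all assumptions.

```python
import numpy as np


def randomized_svd_sketch(a, rank, oversample=10, n_iter=2, seed=0):
    """Approximate the top `rank` singular triplets of the matrix `a`."""
    rng = np.random.default_rng(seed)
    n_cols = a.shape[1]
    sketch_size = min(n_cols, rank + oversample)

    # Random test matrix: a @ omega samples the column space of a.
    omega = rng.standard_normal((n_cols, sketch_size))
    y = a @ omega

    # A few power iterations sharpen the captured subspace when the
    # singular values decay slowly; re-orthonormalize for stability.
    for _ in range(n_iter):
        q, _r = np.linalg.qr(y)
        y = a @ (a.T @ q)

    q, _r = np.linalg.qr(y)                 # orthonormal basis, m x sketch_size
    b = q.T @ a                             # small matrix, sketch_size x n
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_small                         # lift left vectors back to m rows
    return u[:, :rank], s[:rank], vt[:rank]
```

The expensive exact SVD is applied only to a matrix with `rank + oversample` rows, so the cost is dominated by a handful of matrix products with `a`.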