This package contains the core functions associated with Fast Regularized Canonical Correlation Analysis. Please see the following for details: Raul Cruz-Cano, Mei-Ling Ting Lee, Fast regularized canonical correlation analysis, Computational Statistics & Data Analysis, Volume 70, 2014, Pages 88-100, ISSN 0167-9473 <doi:10.1016/j.csda.2013.09.020>.
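A minimal usage sketch, assuming this description corresponds to the FRCC package and that its main entry point is a frcc(X, Y) function taking the two data matrices; the output structure is not assumed beyond what str() reveals, so consult the package documentation.

    # Minimal sketch: fast regularized CCA on two random data matrices.
    # Assumes the FRCC package exposes frcc(X, Y); check ?frcc for details.
    library(FRCC)

    set.seed(1)
    X <- matrix(rnorm(50 * 10), nrow = 50)   # first set of variables
    Y <- matrix(rnorm(50 * 8),  nrow = 50)   # second set of variables

    res <- frcc(X, Y)   # fast regularized canonical correlation analysis
    str(res)            # inspect canonical correlations and loadings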
Wrapper for Stan that offers a number of built-in models implementing a hierarchical Bayesian longitudinal model for repeated-observation data. The choice of model selects the differential equation that is fitted to the observations. Single-individual and multi-individual models are available. See O'Brien et al. (2024) <doi:10.1111/2041-210X.14463>.
Allows estimation and testing of high-dimensional mediation effects based on advanced mediator screening and penalized regression techniques. The methods used in the package are described in Zhang H, Zheng Y, Hou L, Liu L (2025), "HIMA: An R Package for High-Dimensional Mediation Analysis", Journal of Data Science, <doi:10.6339/25-JDS1192>.
Focuses on data processing and visualization in hydrology and climate forecasting. Main functions include data extraction, data downscaling, data resampling, gap filling of precipitation data, bias correction of forecast data, flexible time series plotting, and spatial map generation. It is a good pre-processing and post-processing tool for hydrological and hydraulic modellers.
Efficient sampling from high-dimensional truncated Gaussian distributions, or multivariate truncated normal (MTN). Techniques include zigzag Hamiltonian Monte Carlo as in Akihiko Nishimura, Zhenyu Zhang and Marc A. Suchard (2024) <doi:10.1080/01621459.2024.2395587>, and harmonic Hamiltonian Monte Carlo as in Ari Pakman and Liam Paninski (2014) <doi:10.1080/10618600.2013.788448>.
This package implements non-parametric smooth tests to simultaneously compare K (K > 1) copulas, as well as non-parametric clustering of multivariate populations with arbitrary sizes. See Yves I. Ngounou Bakam and Denys Pommeret (2022) <arXiv:2112.05623> and Yves I. Ngounou Bakam and Denys Pommeret (2022) <arXiv:2211.06338>.
Uses a kernel smoothing approach to calculate Mutual Information for comparisons between all types of variables including continuous vs continuous, continuous vs discrete and discrete vs discrete. Uses a nonparametric bias correction giving Bias Corrected Mutual Information (BCMI). Implemented efficiently in Fortran 95 with OpenMP and suited to large genomic datasets.
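A minimal sketch, assuming this description corresponds to the mpmi package and its cmi() function for all-pairs comparisons of continuous variables; the exact return structure is an assumption, so check the package manual.

    # Minimal sketch: bias-corrected mutual information between continuous variables.
    # Assumes the mpmi package provides cmi() for continuous data; see ?cmi.
    library(mpmi)

    set.seed(1)
    cts <- matrix(rnorm(200 * 5), ncol = 5)   # 200 samples of 5 continuous variables
    res <- cmi(cts)                           # all pairwise MI estimates
    str(res)                                  # includes the bias-corrected (BCMI) matrix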
This package provides mail-merge methods for reading spreadsheets of addresses and other relevant information to create standardized but customizable letters. Provides a method for mapping US ZIP codes, including those of letter recipients. Provides a method for parsing and processing HTML code from online job postings of the American Political Science Association.
This package provides a complete toolkit to process the Munich ChronoType Questionnaire (MCTQ) for its three versions (standard, micro, and shift). MCTQ is a quantitative and validated tool to assess chronotypes using people's sleep behavior, originally presented by Till Roenneberg, Anna Wirz-Justice, and Martha Merrow (2003, <doi:10.1177/0748730402239679>).
An implementation of Multi-Task Logistic Regression (MTLR) for R. This package is based on the method proposed by Yu et al. (2011), which used MTLR to generate individual survival curves by learning feature weights that vary across time. The model was further extended to account for left-censored and interval-censored data.
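A minimal sketch, assuming the package's mtlr() formula interface together with survival::Surv(); the prediction type name is an assumption, so see the package documentation for the available options.

    # Minimal sketch: fit MTLR and predict individual survival curves.
    # Assumes mtlr() accepts a Surv() formula; the predict() type is an assumption.
    library(MTLR)
    library(survival)

    fit <- mtlr(Surv(time, status) ~ age + sex, data = lung)

    # Predicted survival curves for the first three patients.
    pred <- predict(fit, newdata = lung[1:3, ], type = "survivalcurve")
    head(pred)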
Estimates ordered probit switching regression models, i.e., Heckman-type selection models with an ordinal selection equation and continuous outcomes. Different model specifications are allowed for each treatment/regime. For more details on the method, see Wang & Mokhtarian (2024) <doi:10.1016/j.tra.2024.104072> or Chiburis & Lokshin (2007) <doi:10.1177/1536867X0700700202>.
This package provides a multiway method to decompose a tensor (array) of any order, as a generalisation of the SVD that also supports non-identity metrics and penalisations. Two-way SVD with these extensions is also available. The package also includes other multiway methods: PCAn (Tucker-n) and PARAFAC/CANDECOMP with these extensions.
This package extends the state space model for Poisson count data, the Poisson-Gamma model, towards a semiparametric specification. As in generalized additive models (GAMs), cubic splines are used for covariate smoothing. The semiparametric models are fitted by an iterative process that combines likelihood maximization with a backfitting algorithm.
Sparse redundancy analysis for high-dimensional (biomedical) data. Directional multivariate analysis to express the maximum variance in the predicted data set by a linear combination of variables of the predictive data set. Implemented in a partial least squares framework; for more details see Csala et al. (2017) <doi:10.1093/bioinformatics/btx374>.
This package provides functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
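A minimal sketch, assuming this description corresponds to the secr package, with its bundled captdata capture histories and the secr.fit() maximum-likelihood fitting function.

    # Minimal sketch: fit a basic spatially explicit capture-recapture model.
    # Assumes the secr package with the captdata example data and secr.fit().
    library(secr)

    fit <- secr.fit(captdata, buffer = 100, trace = FALSE)  # default half-normal detection
    predict(fit)   # estimated density and detection parameters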
The goal of tosr is to create the Tree of Science from Web of Science (WoS) and Scopus data. It can read files from both sources at the same time. More information can be found in Valencia-Hernández (2020) <https://revistas.unal.edu.co/index.php/ingeinv/article/view/77718>.
Implements a novel topology-based pathway enrichment analysis that integrates the global position of nodes and the topological properties of pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Also provides functions to retrieve the latest pathway information for use in enrichment analysis with this method.
Calculate point estimates of and valid confidence intervals for nonparametric, algorithm-agnostic variable importance measures in high and low dimensions, using flexible estimators of the underlying regression functions. For more information about the methods, please see Williamson et al. (Biometrics, 2020), Williamson et al. (JASA, 2021), and Williamson and Feng (ICML, 2020).
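A minimal sketch, assuming this description corresponds to the vimp package and its vimp_rsquared() interface built on SuperLearner; the argument names (indx, V, SL.library) reflect that interface but should be checked against the package documentation.

    # Minimal sketch: R-squared-based variable importance of one covariate.
    # Assumes the vimp package with vimp_rsquared() and SuperLearner wrappers.
    library(vimp)
    library(SuperLearner)

    set.seed(123)
    n <- 500
    x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
    y <- x$x1 + 0.5 * x$x2 + rnorm(n)

    # Importance of x1, with a small learner library and 2-fold cross-fitting.
    est <- vimp_rsquared(Y = y, X = x, indx = 1, V = 2,
                         SL.library = c("SL.glm", "SL.mean"))
    est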
Frequentist sequential meta-analysis based on Trial Sequential Analysis (TSA), as programmed in Java by the Copenhagen Trial Unit (CTU). The primary function is the calculation of group sequential designs for meta-analysis, to be used for planning and analysis of both prospective and retrospective sequential meta-analyses in order to preserve type-I-error control under sequential testing. RTSA includes tools for sample size and trial size calculation for meta-analysis, and core meta-analysis methods such as fixed-effect and random-effects models and forest plots. TSA is described in Wetterslev et al. (2008) <doi:10.1016/j.jclinepi.2007.03.013>. The methods for deriving the group sequential designs are based on Jennison and Turnbull (1999, ISBN:9780849303166).
Low-rank matrix decompositions are fundamental tools and widely used for data analysis, dimension reduction, and data compression. Classically, highly accurate deterministic matrix algorithms are used for this task. However, the emergence of large-scale data has severely challenged our computational ability to analyze big data. The concept of randomness has been demonstrated as an effective strategy to quickly produce approximate answers to familiar problems such as the singular value decomposition (SVD). This package provides several randomized matrix algorithms such as the randomized singular value decomposition (rsvd), randomized principal component analysis (rpca), randomized robust principal component analysis (rrpca), randomized interpolative decomposition (rid), and the randomized CUR decomposition (rcur). In addition, several plot functions are provided.
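A minimal usage sketch of two of the functions named above, rsvd() and rpca(); the target rank k = 10 is chosen arbitrarily for illustration.

    # Minimal sketch: randomized SVD and randomized PCA of a tall random matrix.
    library(rsvd)

    set.seed(1)
    A <- matrix(rnorm(2000 * 100), nrow = 2000)

    s <- rsvd(A, k = 10)   # rank-10 randomized singular value decomposition
    s$d                    # approximate leading singular values

    p <- rpca(A, k = 10)   # rank-10 randomized principal component analysis
    p$sdev                 # standard deviations of the leading components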
The modified Poisson, logistic, and least-squares regression analyses for binary outcomes of Zou (2004) <doi:10.1093/aje/kwh090>, Noma (2025) <Forthcoming>, and Cheung (2007) <doi:10.1093/aje/kwm223> have become standard multivariate analysis methods for estimating risk ratios and risk differences in clinical and epidemiological studies. This package provides easy-to-use functions to implement these analyses with simple commands. Missing-data analysis tools (multiple imputation) are also included. In addition, recent studies have shown that the ordinary robust variance estimator can have serious bias in small or moderate sample sizes for these methods; the package therefore also provides computational tools to calculate alternative, accurate confidence intervals. Standard computational tools for target trial emulation are included as well.
This package provides an R API to the Open Source Geometry Engine (GEOS) library and a vector format with which to efficiently store GEOS geometries. High-performance functions to extract information from, calculate relationships between, and transform geometries are provided. Finally, facilities to import and export geometry vectors to other spatial formats are provided.
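A minimal sketch, assuming this description corresponds to the geos package, using its WKT reader/writer, a spatial predicate, and a buffer operation.

    # Minimal sketch: build geometry vectors from WKT and query/transform them.
    # Assumes the geos package functions geos_read_wkt(), geos_intersects(),
    # geos_area(), geos_buffer(), and geos_write_wkt().
    library(geos)

    pt   <- geos_read_wkt("POINT (5 5)")
    poly <- geos_read_wkt("POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0))")

    geos_area(poly)                       # 100
    geos_intersects(pt, poly)             # TRUE
    buf <- geos_buffer(pt, distance = 2)  # buffered geometry vector
    geos_write_wkt(buf)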
This package is for genomic region processing using command line tools such as BEDTools, BEDOPS, and Tabix. These tools offer scalable and efficient utilities for genome arithmetic, e.g., indexing, formatting, and merging. The bedr package's API enhances access to these tools and offers additional utilities for processing genomic regions.
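A minimal sketch of the bedr workflow; it assumes BEDTools is installed on the system PATH and uses the package's bundled example regions together with its sort and merge helpers (function names as in the bedr documentation, but worth verifying).

    # Minimal sketch: sort and merge genomic regions via the bedr wrappers.
    # Requires the BEDTools command line tool to be available on the PATH.
    library(bedr)

    if (check.binary("bedtools")) {
      regions <- get.example.regions()[[1]]   # example region strings
      sorted  <- bedr.sort.region(regions)    # sort regions (wraps the CLI tools)
      merged  <- bedr.merge.region(sorted)    # merge overlapping intervals
      print(merged)
    }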
Content-preserving transformations of PDF files, such as splitting, combining, and compressing. This package interfaces directly to the qpdf C++ API and does not require any command line utilities. Note that qpdf does not read actual content from PDF files: to extract text and data you need the pdftools package.
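A minimal sketch using the qpdf package's split/combine/compress helpers; the input file name is a placeholder.

    # Minimal sketch: split a PDF into pages, recombine them, and compress the result.
    # "report.pdf" is a placeholder path to an existing PDF file.
    library(qpdf)

    input <- "report.pdf"
    pdf_length(input)                               # number of pages

    pages  <- pdf_split(input, output = tempfile()) # one file per page
    merged <- pdf_combine(pages, output = "combined.pdf")
    pdf_compress(merged, output = "combined_small.pdf")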