Simulate multivariate data with arbitrary marginal distributions. bigsimr is a package for simulating high-dimensional multivariate data with a target correlation and arbitrary marginal distributions via Gaussian copula. It utilizes the Julia package Bigsimr.jl for its core routines.
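As a rough illustration of the Gaussian copula idea named above (not bigsimr's or Bigsimr.jl's actual API), the following Python sketch draws two correlated standard normals, maps them to uniforms with the normal CDF, and then applies the inverse CDF of each chosen marginal. The function names, the exponential and Bernoulli marginals, and the parameter values are all assumptions made for the example. The sketch also does not adjust the copula parameter so that the output hits a target correlation.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def correlated_uniforms(rng, rho):
    """Return (u1, u2): uniforms linked by a Gaussian copula with parameter rho."""
    z1 = rng.gauss(0.0, 1.0)
    # Two-dimensional Cholesky step: z2 has unit variance and correlation rho with z1.
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)


def exponential_quantile(u, rate):
    """Inverse CDF of an exponential distribution (u is clamped away from 1)."""
    return -math.log(1.0 - min(u, 1.0 - 1e-12)) / rate


def bernoulli_quantile(u, p):
    """Inverse CDF of a Bernoulli(p) distribution."""
    return 1 if u > 1.0 - p else 0


rng = random.Random(2024)  # fixed seed so the run is reproducible
pairs = [correlated_uniforms(rng, rho=0.8) for _ in range(20000)]
x = [exponential_quantile(u1, rate=2.0) for u1, _ in pairs]  # hypothetical marginal 1
y = [bernoulli_quantile(u2, p=0.3) for _, u2 in pairs]       # hypothetical marginal 2

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)
print(f"mean x = {mean_x:.3f}, mean y = {mean_y:.3f}, cov = {cov_xy:.3f}")
```

The marginals keep their intended distributions, but the correlation of the transformed variables generally differs from the copula parameter.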
Explore and normalize American campaign finance data. Created by the Investigative Reporting Workshop to facilitate work on The Accountability Project, an effort to collect public data into a central, standard database that is more easily searched: <https://publicaccountability.org/>.
Features the marginal parametric and semi-parametric proportional hazards mixture cure models for analyzing clustered survival data with a possible cure fraction. A reference is Yi Niu and Yingwei Peng (2014) <doi:10.1016/j.jmva.2013.09.003>.
Uses jackknife and bootstrap methods to quantify the sampling uncertainty in goodness-of-fit statistics. Full details are in Clark et al. (2021), "The abuse of popular performance metrics in hydrologic modeling", Water Resources Research, <doi:10.1029/2020WR029001>.
An implementation of the International Association for the Properties of Water and Steam (IAPWS) Formulation 1995 for the Thermodynamic Properties of Ordinary Water Substance for General and Scientific Use, together with the IAPWS releases on viscosity, conductivity, surface tension and melting pressure.
Data-driven approach for Exploratory Factor Analysis (EFA) that uses Model Implied Instrumental Variables (MIIVs). The method starts with a one-factor model and arrives at a suggested model with enhanced interpretability that allows cross-loadings and correlated errors.
The companion package provides all original data sets and functions that are used in the book "Model-Based Clustering and Classification for Data Science" by Charles Bouveyron, Gilles Celeux, T. Brendan Murphy and Adrian E. Raftery (2019, ISBN:9781108644181).
Imputation of incomplete continuous or categorical datasets. Missing values are imputed with a principal component analysis (PCA) model, a multiple correspondence analysis (MCA) model or a multiple factor analysis (MFA) model. Also performs multiple imputation with PCA or MCA.
This package provides modules as an organizational unit for source code. Modules enforce more rigor when defining dependencies and have a local search path. They can be used as a subunit within packages or in scripts.
This package provides methods for modeling moderator variables in cross-sectional, temporal, and multi-level networks. Includes model selection techniques and a variety of plotting functions. Implements the methods described by Swanson (2020) <https://www.proquest.com/openview/d151ab6b93ad47e3f0d5e59d7b6fd3d3>.
This package provides tools to process legacy format summary redistricting data files produced by the United States Census Bureau pursuant to P.L. 94-171. These files are generally available earlier but are difficult to work with as-is.
Data from the All the World's Primates relational SQL database and other tabular datasets are made available via drivers and connection functions. Additionally, we provide several functions and examples to facilitate the merging and aggregation of these tabular inputs.
This package provides a collection of miscellaneous functions for passive acoustics. Much of the content here is adapted to R from code written by other people. If you have any ideas of functions to add, please contact Taiki Sakai.
This package performs random-effect multiple interval mapping (REMIM) in full-sib families of autopolyploid species based on restricted maximum likelihood (REML) estimation and score statistics, as described in Pereira et al. (2020) <doi:10.1534/genetics.120.303080>.
Execute multi-step SQL workflows by leveraging specially formatted comments to define and control execution. This enables users to mix queries, commands, and metadata within a single script. Results are returned as named objects for use in downstream workflows.
This package implements the methodological developments found in Hermes (2025) <doi:10.48550/arXiv.2503.02786>, and allows for the statistical modeling of data consisting of multiple users that provide an ordinal rating for one or multiple items.
We provide functions for estimation and inference of locally stationary time series using sieve methods and a bootstrap procedure. The package also contains functions to generate Daubechies and Coiflet wavelets by the cascade algorithm and functions for data visualization.
Generate objects that simulate survival times. Random values for the distributions are generated using the method described by Bender (2003) <https://epub.ub.uni-muenchen.de/id/eprint/1716> and Leemis (1987) in Operations Research, 35(6), 892-894.
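The survival-time entry above cites an inversion method; a minimal Python sketch of that idea, assuming a Weibull baseline hazard and a proportional hazards model, is given here. A uniform draw U is converted to a survival time by inverting the cumulative hazard, T = (-log(U) / (scale * exp(linear predictor)))^(1/shape). The function name, parameter names, and numeric values are hypothetical and do not reflect the package's interface.

```python
import math
import random


def simulate_weibull_ph_time(rng, scale, shape, linear_predictor):
    """Draw one survival time by inverting the cumulative hazard.

    Assumed baseline hazard: h0(t) = scale * shape * t**(shape - 1) (Weibull),
    multiplied by exp(linear_predictor) under proportional hazards.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    return (-math.log(u) / (scale * math.exp(linear_predictor))) ** (1.0 / shape)


rng = random.Random(42)  # fixed seed so the run is reproducible
beta = 0.7               # hypothetical log hazard ratio for a binary covariate

# shape = 1.0 reduces the Weibull baseline to an exponential one.
times_control = [simulate_weibull_ph_time(rng, 0.1, 1.0, 0.0) for _ in range(20000)]
times_treated = [simulate_weibull_ph_time(rng, 0.1, 1.0, beta) for _ in range(20000)]

mean_control = sum(times_control) / len(times_control)
mean_treated = sum(times_treated) / len(times_treated)
print(f"mean control = {mean_control:.2f}, mean treated = {mean_treated:.2f}")
```

With an exponential baseline, the theoretical means are 1 / 0.1 = 10 for the control group and 10 / exp(0.7) for the group with the higher hazard, which gives a quick sanity check on the simulated values.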
Sparse-group boosting to be used in conjunction with the mboost package for modeling grouped data. Applicable to all sparse-group lasso type problems where within-group and between-group sparsity is desired. Interprets and visualizes individual variables and groups.
Core parts of the C API of R are wrapped in a C++ namespace via a set of inline functions giving a tidier representation of the underlying data structures and functionality using a header-only implementation without additional dependencies.
This package implements D-vine quantile regression models with parametric or nonparametric pair-copulas. See Kraus and Czado (2017) <doi:10.1016/j.csda.2016.12.009> and Schallhorn et al. (2017) <doi:10.48550/arXiv.1705.08310>.
This package provides an integrated web interface for doing microarray analysis using several of the Bioconductor packages. It is intended to be deployed as a centralized bioinformatics resource for use by many users. Currently only Affymetrix oligonucleotide analysis is supported.
This package provides functions to detect and correct for batch effects in DNA methylation data. The core function is based on latent factor models and can also be used to predict missing values in any other matrix containing real numbers.
This package implements transcript quantification import from Salmon and alevin with automatic attachment of transcript ranges and release information, and other associated metadata. De novo transcriptomes can be linked to the appropriate sources with linkedTxomes and shared for computational reproducibility.