Feature Ordering by Integrated R square Dependence (FORD) is a variable selection algorithm based on the new measure of dependence: Integrated R2 Dependence Coefficient (IRDC). For more information, see the paper: Azadkia and Roudaki (2025),"A New Measure Of Dependence: Integrated R2" <doi:10.48550/arXiv.2505.18146>.
This package provides a set of functions that facilitate basic data manipulation and cleaning for statistical analysis including functions for finding and fixing duplicate rows and columns, missing values, outliers, and special characters in column and row names and functions for checking data consistency, distribution, quality, reliability, and structure.
Some methods for the inference and clustering of univariate and multivariate functional data, using a generalization of Mahalanobis distance, along with some functions useful for the analysis of functional data. For further details, see Martino A., Ghiglietti, A., Ieva, F. and Paganoni A. M. (2017) <arXiv:1708.00386>.
This package provides a collection of intuitive and user-friendly functions for computing confidence intervals for common statistical tasks, including means, differences in means, proportions, and odds ratios. The package also includes tools for linear regression analysis and several real-world datasets intended for teaching and applied statistical inference.
Computes measures of multivariate kurtosis, matrices of fourth-order moments and cumulants, kurtosis-based projection pursuit. Franceschini, C. and Loperfido, N. (2018, ISBN:978-3-319-73905-2). "An Algorithm for Finding Projections with Extreme Kurtosis". Loperfido, N. (2017,ISSN:0024-3795). "A New Kurtosis Matrix, with Statistical Applications".
Spatial and spatio-temporal modelling of point patterns using the log-Gaussian Cox process. Bayesian inference for spatial, spatiotemporal, multivariate and aggregated point processes using Markov chain Monte Carlo. See Benjamin M. Taylor, Tilman M. Davies, Barry S. Rowlingson, Peter J. Diggle (2015) <doi:10.18637/jss.v063.i07>.
This package provides a simple in-memory, LRU cache that can be wrapped around any function to memoize it. The cache is keyed on a hash of the input data (using digest') or on pointer equivalence. Also includes a generic hashmap object that can key on any object type.
Multivariate Analysis methods and data sets used in John Marden's book Multivariate Statistics: Old School (2015) <ISBN:978-1456538835>. This also serves as a companion package for the STAT 571: Multivariate Analysis course offered by the Department of Statistics at the University of Illinois at Urbana-Champaign ('UIUC').
Compute and select tuning parameters for the MRCE estimator proposed by Rothman, Levina, and Zhu (2010) <doi:10.1198/jcgs.2010.09188>. This estimator fits the multiple output linear regression model with a sparse estimator of the error precision matrix and a sparse estimator of the regression coefficient matrix.
This package provides functions for MultiDimensional Feature Selection (MDFS): calculating multidimensional information gains, scoring variables, finding important variables, plotting selection results. This package includes an optional CUDA implementation that speeds up information gain calculation using NVIDIA GPGPUs. R. Piliszek et al. (2019) <doi:10.32614/RJ-2019-019>.
This package provides methods for detecting signals related to (adverse event, medical product e.g. drugs, vaccines) pairs, a data generation function for simulating pharmacovigilance datasets, and various utility functions. For more details please see Liu A., Mukhopadhyay R., and Markatou M. <doi:10.48550/arXiv.2410.01168>.
Facilitates the automatic detection of acoustic signals, providing functions to diagnose and optimize the performance of detection routines. Detections from other software can also be explored and optimized. This package has been peer-reviewed by rOpenSci. Araya-Salas et al. (2022) <doi:10.1101/2022.12.13.520253>.
Fits two-dimensional data by means of orthogonal nonlinear least-squares using Levenberg-Marquardt minimization and provides functionality for fit diagnostics and plotting. Delivers the same results as the ODRPACK Fortran implementation described in Boggs et al. (1989) <doi:10.1145/76909.76913>, but is implemented in pure R.
Principal component of explained variance (PCEV) is a statistical tool for the analysis of a multivariate response vector. It is a dimension- reduction technique, similar to Principal component analysis (PCA), that seeks to maximize the proportion of variance (in the response vector) being explained by a set of covariates.
This package provides a tool for automatic generation of sibling items from a parent item model defined by the user. It is an implementation of the process automatic item generation (AIG) focused on generating quantitative multiple-choice type of items (see Embretson, Kingston (2018) <doi:10.1111/jedm.12166>).
This package provides methods and feature set definitions for feature or gene set enrichment analysis in transcriptional and metabolic profiling data. Package includes tests for enrichment based on ranked lists of features, functions for visualisation and multivariate functional analysis. See Zyla et al (2019) <doi:10.1093/bioinformatics/btz447>.
Calculate optimal Zhong's two-/three-stage Phase II designs (see Zhong (2012) <doi:10.1016/j.cct.2012.07.006>). Generate Target Toxicity decision table for Phase I dose-finding (two-/three-stage). This package also allows users to run dose-finding simulations based on customized decision table.
This package provides function, wget_set(), to change the method (default to wget -c') using in download.file(). Using wget -c allowing continued downloading, which is especially useful for slow internet connection and for downloading large files. User can run wget_unset() to restore previous setting.
The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
This package implements the differential equations associated to different versions of Allometric Trophic Models (ATN) to estimate the temporal dynamics of species biomasses in food webs. It offers several features to generate synthetic food webs and to parametrise models as well as a wrapper to the ODE solver deSolve.
Whole-genome regression methods on Bayesian framework fitted via EM or Gibbs sampling, single step (<doi:10.1534/g3.119.400728>), univariate and multivariate (<doi:10.1186/s12711-022-00730-w>, <doi:10.1093/genetics/iyae179>), with optional kernel term and sampling techniques (<doi:10.1186/s12859-017-1582-3>).
Support for import from and export to the CSVY file format. CSVY is a file format that combines the simplicity of CSV (comma-separated values) with the metadata of other plain text and binary formats (JSON, XML, Stata, etc.) by placing a YAML header on top of a regular CSV.
This package provides methods for fitting and inspection of Bayesian Multinomial Logistic Normal Models using MAP estimation and Laplace Approximation as developed in Silverman et. Al. (2022) <https://www.jmlr.org/papers/v23/19-882.html>. Key functionality is implemented in C++ for scalability. fido replaces the previous package stray'.
Social Relations Analysis with roles ("Family SRM") are computed, using a structural equation modeling approach. Groups ranging from three members up to an unlimited number of members are supported and the mean structure can be computed. Means and variances can be compared between different groups of families and between roles.