Performs main effect matrix factor model (MEFM) estimation for a given matrix time series, as described in Lam and Cen (2024) <doi:10.48550/arXiv.2406.00128>. Estimation of traditional matrix factor models is also supported. Supplementary functions for testing MEFM over factor models are included.
Subsampling based variable selection for low dimensional generalized linear models. The methods repeatedly subsample the data, minimizing an information criterion (AIC/BIC) over a sequence of nested models for each subsample. See Marinela Capanu, Mihai Giurcanu, Colin B Begg, Mithat Gonen, "Subsampling based variable selection for generalized linear models".
In linear LS regression, calculate for a given design matrix the multiplier K of coefficient standard errors such that the confidence intervals [b - K*SE(b), b + K*SE(b)] have a guaranteed coverage probability for all coefficient estimates b in any submodel after performing arbitrary model selection.
Fit quantile regression neural network models with optional left censoring, partial monotonicity constraints, generalized additive model constraints, and the ability to fit multiple non-crossing quantile functions following Cannon (2011) <doi:10.1016/j.cageo.2010.07.005> and Cannon (2018) <doi:10.1007/s00477-018-1573-6>.
This package implements L0-constrained Multi-Task Learning and domain generalization algorithms. The algorithms are coded in Julia, allowing for fast implementations of the coordinate descent and local combinatorial search algorithms. For more details, see a preprint of the paper: Loewinger et al. (2022) <arXiv:2212.08697>.
Covers k-table control analysis using multivariate control charts for qualitative variables, based on the fundamentals of multiple correspondence analysis and multiple factor analysis. The graphs can be shown in a flat or interactive way; likewise, all the outputs can be shown in an interactive shiny panel.
This package provides methods for faster extraction (about 5x faster in a few test cases) of variance-covariance matrices and standard errors from models. Methods in the stats package tend to rely on the summary method, which may waste time computing other summary statistics which are summarily ignored.
Lossless webp images are 26% smaller in size compared to PNG. Lossy webp images are 25-34% smaller in size compared to JPEG. This package reads and writes webp images into a 3 (rgb) or 4 (rgba) channel bitmap array using conventions from the jpeg and png packages.
Interface to the World Inequality Database (WID) API <https://wid.world>. Downloads distributional national accounts data with filters for country, year, percentile, age group, and population type. Includes code validation and reference tables. Independent implementation unaffiliated with the World Inequality Lab (WIL) or the Paris School of Economics.
MIME types are shorthand descriptors for file contents and can be determined from "magic" bytes in file headers, from file contents, or intuited from file extensions. Tools are provided to perform curated "magic" tests as well as to map MIME types from a database of over 1,500 extension mappings.
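The two lookup strategies described above can be sketched as follows. This is a minimal Python illustration, not the package's own code: the signature table is a small hand-picked assumption (the package uses its own curated tests), the function names are hypothetical, and the extension lookup relies on Python's standard `mimetypes` module rather than the package's database.

```python
import mimetypes

# Hypothetical, hand-picked signature table for illustration only.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def guess_from_magic(header: bytes):
    """Return a MIME type if the header starts with a known signature."""
    for signature, mime in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return mime
    return None


def guess_from_extension(filename: str):
    """Return a MIME type based on the file extension alone."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


def guess_mime(filename: str, header: bytes = b""):
    """Prefer the magic-byte test; fall back to the extension mapping."""
    return guess_from_magic(header) or guess_from_extension(filename)
```

In this sketch the magic-byte test takes precedence because the header reflects the actual contents, while an extension can be wrong or missing.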
The tRNA package allows tRNA sequences and structures to be accessed and used for subsetting. In addition, it provides visualization tools to compare feature parameters of multiple tRNA sets and correlate them to additional data. The tRNA package uses GRanges objects as inputs, requiring only a few additional column data sets.
This package implements a method to analyze single-cell RNA-seq data utilizing flexible Dirichlet Process mixture models. Genes with differential distributions of expression are classified into several interesting patterns of differences between two conditions. The package also includes functions for simulating data with these patterns from negative binomial distributions.
This package provides a Bayesian model averaging approach to causal effect estimation based on the BCEE algorithm. Currently supports binary or continuous exposures and outcomes. For more details, see Talbot et al. (2015) <doi:10.1515/jci-2014-0035> and Talbot and Beaudoin (2022) <doi:10.1515/jci-2021-0023>.
Package for the Breed Wheat Genomic Selection (BWGS) pipeline. The R package BWGS is developed by Louis Gautier Tran <louis.gautier.tran@gmail.com> and Gilles Charmet <gilles.charmet@inra.fr>. This repository is forked from the original repository <https://forgemia.inra.fr/umr-gdec/bwgs> and modified as an R package.
Converts customer transaction data (ID, purchase date) into an R6 class called customer. The class stores various customer analytics calculations at the customer level. The package also contains functionality to convert data in the R6 class to data.frames that can serve as inputs for various customer analytics models.
This package contains a function, also called 'cchs', that calculates Estimator III of Borgan et al. (2000) <DOI:10.1023/A:1009661900674>. This estimator is for fitting a Cox proportional hazards model to data from a case-cohort study where the subcohort was selected by stratified simple random sampling.
Discover causality for bivariate observational categorical data. See Ni, Y. (2022) <arXiv:2209.08579>, "Bivariate Causal Discovery for Categorical Data via Classification with Optimal Label Permutation", Advances in Neural Information Processing Systems 35 (in press).
Diff, patch and merge for data frames. Document changes in data sets and use them to apply patches. Changes to data can be made visible by using render_diff(). The V8 package is used to wrap the daff.js JavaScript library which is included in the package.
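The diff-then-patch idea can be illustrated with a small keyed-row sketch in Python. It is an assumption-laden toy, not the daff.js algorithm or its diff format: rows are plain dictionaries, a single key column identifies each row, row order is not tracked, and the names `diff_rows` and `patch_rows` are hypothetical.

```python
def diff_rows(old, new, key):
    """Describe how to turn `old` into `new` as (operation, key, row) tuples."""
    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}
    changes = []
    for k, row in old_by_key.items():
        if k not in new_by_key:
            changes.append(("delete", k, None))
        elif row != new_by_key[k]:
            changes.append(("update", k, new_by_key[k]))
    for k, row in new_by_key.items():
        if k not in old_by_key:
            changes.append(("insert", k, row))
    return changes


def patch_rows(old, changes, key):
    """Apply the output of diff_rows to a copy of `old`."""
    by_key = {row[key]: dict(row) for row in old}
    for operation, k, row in changes:
        if operation == "delete":
            by_key.pop(k, None)
        else:  # "update" and "insert" both store the new row
            by_key[k] = dict(row)
    return list(by_key.values())
```

Storing the change list separately from the data is what makes it possible to document a revision once and replay it on another copy of the same table.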
The truncated factor model is a statistical model designed to handle specific data structures in data analysis. DTFM is designed to process and analyze distributed datasets efficiently. The philosophy of the package is described in Guo et al. (2023) <doi:10.1007/s00180-022-01270-z>.
This package provides a programmatic interface to Health Canada's Drug Product Database (DPD) REST API for querying information about drugs approved for use in Canada. More information on the DPD can be found in the API guide (<https://health-products.canada.ca/api/documentation/dpd-documentation-en.html>).
This package provides a collection of functions to perform Detrended Fluctuation Analysis (DFA) and Detrended Cross-Correlation Analysis (DCCA). This package implements the results presented in Prass, T.S. and Pumi, G. (2019). "On the behavior of the DFA and DCCA in trend-stationary processes" <arXiv:1910.10589>.
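For orientation, the textbook form of the DFA fluctuation function can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the package's implementation or the exact estimator studied in the cited paper: it uses non-overlapping windows and linear detrending, and the function names are hypothetical.

```python
import math


def linear_residual_ss(y):
    """Sum of squared residuals after fitting a least-squares line to y."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    return sum((v - y_mean - slope * (i - x_mean)) ** 2 for i, v in enumerate(y))


def dfa_fluctuation(series, window):
    """Root mean squared detrended residual of the integrated series."""
    n_windows = len(series) // window
    if window < 3 or n_windows < 1:
        raise ValueError("need window >= 3 and at least one full window")
    # Step 1: integrate the mean-centered series (the "profile").
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)
    # Step 2: detrend each non-overlapping window and pool the residuals.
    ss = sum(
        linear_residual_ss(profile[w * window:(w + 1) * window])
        for w in range(n_windows)
    )
    return math.sqrt(ss / (n_windows * window))
```

Evaluating `dfa_fluctuation` over a range of window sizes and regressing the log of the result on the log of the window size gives the usual scaling exponent estimate.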
Feature Ordering by Integrated R2 Dependence (FORD) is a variable selection algorithm based on a new measure of dependence, the Integrated R2 Dependence Coefficient (IRDC). For more information, see the paper: Azadkia and Roudaki (2025), "A New Measure Of Dependence: Integrated R2" <doi:10.48550/arXiv.2505.18146>.
This package provides a set of functions that facilitate basic data manipulation and cleaning for statistical analysis. These include functions for finding and fixing duplicate rows and columns, missing values, outliers, and special characters in column and row names, as well as functions for checking data consistency, distribution, quality, reliability, and structure.
Some methods for the inference and clustering of univariate and multivariate functional data, using a generalization of Mahalanobis distance, along with some functions useful for the analysis of functional data. For further details, see Martino, A., Ghiglietti, A., Ieva, F. and Paganoni, A. M. (2017) <arXiv:1708.00386>.