This package implements several Approximate Bayesian Computation (ABC) algorithms for performing parameter estimation, model selection, and goodness-of-fit. Cross-validation tools are also available for measuring the accuracy of ABC estimates, and to calculate the misclassification probabilities of different models.
This package provides generation and estimation of censored factor models for high-dimensional data with censored errors (normal, t, logistic). Includes Sparse Orthogonal Principal Components (SOPC), and evaluation metrics. Based on Guo G. (2023) <doi:10.1007/s00180-022-01270-z>.
For ordinal rating data, estimate and test models within the family of CUB models and their extensions (where CUB stands for Combination of a discrete Uniform and a shifted Binomial distributions); Simulation routines, plotting facilities and fitting measures are also provided.
Multivariate conditional and marginal densities, moments, cumulative distribution functions as well as binary choice and sample selection models based on the Hermite polynomial approximation which was proposed and described by A. Gallant and D. W. Nychka (1987) <doi:10.2307/1913241>.
Calculate B-spline basis functions with a given set of knots and order, or a B-spline function with a given set of knots and order and set of de Boor points (coefficients), or the integral of a B-spline function.
Assist in the estimation of the Intraclass Correlation Coefficient (ICC) from variance components of a one-way analysis of variance and also estimate the number of individuals or groups necessary to obtain an ICC estimate with a desired confidence interval width.
This package creates modules inline or from a file. Modules can contain any R object and be nested. Each module have their own scope and package "search path" that does not interfere with one another or the user's working environment.
This package implements the Bayesian online changepoint detection method by Adams and MacKay (2007) <arXiv:0710.3742> for univariate or multivariate data. Gaussian and Poisson probability models are implemented. Provides post-processing functions with alternative ways to extract changepoints.
Estimate the positron emission tomography (PET) neuroreceptor occupancies from the total volumes of distribution of a set of regions of interest. Fitting methods include the simple reference region', ordinary least squares (sometimes known as occupancy plot), and restricted maximum likelihood estimation'.
Given k populations (can be in thousands), what is the probability that a given subset of size t contains the true top t populations? This package finds this probability and offers three tuning parameters (G, d, L) to relax the definition.
Fitting and testing probabilistic knowledge structures, especially the basic local independence model (BLIM, Doignon & Flamagne, 1999) and the simple learning model (SLM), using the minimum discrepancy maximum likelihood (MDML) method (Heller & Wickelmaier, 2013 <doi:10.1016/j.endm.2013.05.145>).
This package provides permutation methods for testing in high-dimensional linear models. The tests are often robust against heteroscedasticity and non-normality and usually perform well under anti-sparsity. See Hemerik, Thoresen and Finos (2021) <doi:10.1080/00949655.2020.1836183>.
Interfaces with the Hugging Face tokenizers library to provide implementations of today's most used tokenizers such as the Byte-Pair Encoding algorithm <https://huggingface.co/docs/tokenizers/index>. It's extremely fast for both training new vocabularies and tokenizing texts.
Statistical estimation of revealed preference models from data collected on bipartite matchings. The models are for matchings within a bipartite population where individuals have utility for people based on known and unknown characteristics. People can form a partnership or remain unpartnered. The model represents both the availability of potential partners of different types and preferences of individuals for such people. The software estimates preference parameters based on sample survey data on partnerships and population composition. The simulation of matchings and goodness-of-fit are considered. See Goyal, Handcock, Jackson, Rendall and Yeung (2022) <doi:10.1093/jrsssa/qnad031>.
This package provides a modified implementation of stepwise regression that greedily searches the space of interactions among features in order to build polynomial regression models. Furthermore, the hypothesis tests conducted are valid-post model selection due to the use of a revisiting procedure that implements an alpha-investing rule. As a result, the set of rejected sequential hypotheses is proven to control the marginal false discover rate. When not searching for polynomials, the package provides a statistically valid algorithm to run and terminate stepwise regression. For more information, see Johnson, Stine, and Foster (2019) <arXiv:1510.06322>.
This package provides tools to render DOT diagram markup language in R and also provides the possibility to export the graphs in PostScript and SVG (Scalable Vector Graphics) formats. In addition, it supports literate programming packages such as knitr and rmarkdown.
This package provides a Scheme-inspired Lisp dialect embedded in R, with macros, tail-call optimization, and seamless interoperability with R functions and data structures. (The name arl is short for An R Lisp.') Implemented in pure R with no compiled code.
Causal inference for a binary treatment and continuous outcome using Bayesian Causal Forests. See Hahn, Murray and Carvalho (2020) <doi:10.1214/19-BA1195> for additional information. This implementation relies on code originally accompanying Pratola et. al. (2013) <arXiv:1309.1906>.
An algorithm of optimal subset selection, related to Covariance matrices, observation matrices and Response vectors (COR) to select the optimal subsets in distributed estimation. The philosophy of the package is described in Guo G. (2024) <doi:10.1007/s11222-024-10471-z>.
Facilitates the aggregation of species geographic ranges from vector or raster spatial data, and that enables the calculation of various morphological and phylogenetic community metrics across geography. Citation: Title, PO, DL Swiderski and ML Zelditch (2022) <doi:10.1111/2041-210X.13914>.
This package provides tools to compute the Generalized Measure of Correlation (GMC), a dependence measure accounting for nonlinearity and asymmetry in the relationship between variables. Based on the method proposed by Zheng, Shi, and Zhang (2012) <doi:10.1080/01621459.2012.710509>.
The philosophy in the package is described in Stasny (1988) <doi:10.2307/1391558> and Guti?rrez, A., Trujillo, L. & Silva, N. (2014), <ISSN:1492-0921> to estimate the gross flows under complex surveys using a Markov chain approach with non response.
This package provides functions for the estimation, plotting, predicting and cross-validation of hierarchical feature regression models as described in Pfitzinger (2024). Cluster Regularization via a Hierarchical Feature Regression. Econometrics and Statistics (in press). <doi:10.1016/j.ecosta.2024.01.003>.
We use the ISR to handle with PCA-based missing data with high correlation, and the DISR to handle with distributed PCA-based missing data. The philosophy of the package is described in Guo G. (2024) <doi:10.1080/03610918.2022.2091779>.