Efficiently estimates treatment effects in settings with randomized staggered rollouts, using tools proposed by Roth and Sant'Anna (2023) <doi:10.48550/arXiv.2102.01291>
.
Model Selection Based on Combined Penalties. This package implements a stepwise forward variable selection algorithm based on a penalized likelihood criterion that combines the L0 with L2 or L1 norms.
Tree-structured modelling of categorical predictors (Tutz and Berger (2018), <doi:10.1007/s11634-017-0298-6>) or measurement units (Berger and Tutz (2018), <doi:10.1080/10618600.2017.1371030>).
Non-parametric test, originally proposed by Stute (1997) <https://www.jstor.org/stable/2242560>, that the expectation of a dependent variable Y given an independent variable D is linear in D.
This package produces LaTeX
code, HTML/CSS code and ASCII text for well-formatted tables that hold regression analysis results from several models side-by-side, as well as summary statistics.
This package can automatically extract statistical null-hypothesis significant testing (NHST) results from articles and recompute the p-values based on the reported test statistic and degrees of freedom to detect possible inconsistencies.
Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework (Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer (2010). MOA: Massive Online Analysis, Journal of Machine Learning Research 11: 1601-1604).
Datasets for the textbook Stat2: Modeling with Regression and ANOVA (second edition). The package also includes data for the first edition, Stat2: Building Models for a World of Data and a few functions for plotting diagnostics.
This package performs simulation and inference of diffusion processes on circle. Stochastic correlation models based on circular diffusion models are provided. For details see Majumdar, S. and Laha, A.K. (2024) "Diffusion on the circle and a stochastic correlation model" <doi:10.48550/arXiv.2412.06343>
.
This package provides fitting functions and other tools for decision confidence and metacognition researchers, including meta-d'/d', often considered to be the gold standard to measure metacognitive efficiency, and information-theoretic measures of metacognition. Also allows to fit several static models of decision making and confidence.
Integration of two data sources referred to the same target population which share a number of variables. Some functions can also be used to impute missing values in data sets through hot deck imputation methods. Methods to perform statistical matching when dealing with data from complex sample surveys are available too.
This package provides indices and tools for directed acyclic graphs (DAGs), particularly DAG representations of intermittent streams. A detailed introduction to the package can be found in the publication: "Non-perennial stream networks as directed acyclic graphs: The R-package streamDAG
" (Aho et al., 2023) <doi:10.1016/j.envsoft.2023.105775>, and in the introductory package vignette.
Fits semiparametric linear and multilevel models with non-parametric additive Bayesian additive regression tree (BART; Chipman, George, and McCulloch
(2010) <doi:10.1214/09-AOAS285>) components and Stan (Stan Development Team (2021) <https://mc-stan.org/>) sampled parametric ones. Multilevel models can be expressed using lme4 syntax (Bates, Maechler, Bolker, and Walker (2015) <doi:10.18637/jss.v067.i01>).
Flexible stochastic tree ensemble software. Robust implementations of Bayesian Additive Regression Trees (BART) Chipman, George, McCulloch
(2010) <doi:10.1214/09-AOAS285> for supervised learning and Bayesian Causal Forests (BCF) Hahn, Murray, Carvalho (2020) <doi:10.1214/19-BA1195> for causal inference. Enables model serialization and parallel sampling and provides a low-level interface for custom stochastic forest samplers.
This is a collection of various kinds of data with broad uses for teaching. My students, and academics like me who teach the same topics I teach, should find this useful if their teaching workflow is also built around the R programming language. The applications are multiple but mostly cluster on topics of statistical methodology, international relations, and political economy.
Work with containers over the Docker API. Rather than using system calls to interact with a docker client, using the API directly means that we can receive richer information from docker. The interface in the package is automatically generated using the OpenAPI
(a.k.a., swagger') specification, and all return values are checked in order to make them type stable.
These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.
Import data from the STATcube REST API or from the open data portal of Statistics Austria. This package includes a client for API requests as well as parsing utilities for data which originates from STATcube'. Documentation about STATcubeR
is provided by several vignettes included in the package as well as on the public pkgdown page at <https://statistikat.github.io/STATcubeR/>
.
This package provides functions related to multivariate measures of independence and ICA: -estimate independent components by minimizing distance covariance; -conduct a test of mutual independence based on distance covariance; -estimate independent components via infomax (a popular method but generally performs poorer than mdcovica, ProDenICA
, and/or fastICA
, but is useful for comparisons); -order indepedent components by skewness; -match independent components from multiple estimates; -other functions useful in ICA.
Computes confidence intervals for variance using the Chi-Square distribution, without requiring raw data. Wikipedia (2025) <https://en.wikipedia.org/wiki/Chi-squared_distribution>. All-in-One Chi Distribution CI provides functions to calculate confidence intervals for the population variance based on a chi-squared distribution, utilizing a sample variance and sample size. It offers only a simple all-in-one method for quick calculations to find the CI for Chi Distribution.
The steepness package computes steepness as a property of dominance hierarchies. Steepness is defined as the absolute slope of the straight line fitted to the normalized David's scores. The normalized David's scores can be obtained on the basis of dyadic dominance indices corrected for chance or by means of proportions of wins. Given an observed sociomatrix, it computes hierarchy's steepness and estimates statistical significance by means of a randomization test.
This package contains statistical methods to analyze graphs, such as graph parameter estimation, model selection based on the Graph Information Criterion, statistical tests to discriminate two or more populations of graphs, correlation between graphs, and clustering of graphs. References: Takahashi et al. (2012) <doi:10.1371/journal.pone.0049949>, Fujita et al. (2017) <doi:10.3389/fnins.2017.00066>, Fujita et al. (2017) <doi:10.1016/j.csda.2016.11.016>, Fujita et al. (2019) <doi:10.1093/comnet/cnz028>.
It estimates the parameters of a censored or missing data in spatio-temporal models using the SAEM algorithm (Delyon et al., 1999). This algorithm is a stochastic approximation of the widely used EM algorithm and an important tool for models in which the E-step does not have an analytic form. Besides the expressions obtained to estimate the parameters to the proposed model, we include the calculations for the observed information matrix using the method developed by Louis (1982). To examine the performance of the fitted model, case-deletion measure are provided.
For making Trellis-type conditioning plots without strip labels. This is useful for displaying the structure of results from factorial designs and other studies when many conditioning variables would clutter the display with layers of redundant strip labels. Settings of the variables are encoded by layout and spacing in the trellis array and decoded by a separate legend. The functionality is implemented by a single S3 generic strucplot()
function that is a wrapper for the Lattice package's xyplot()
function. This allows access to all Lattice graphics capabilities in the usual way.