These are miscellaneous functions that I find useful for my research and teaching. The contents include themes for plots, functions for simulating quantities of interest from regression models, functions for simulating various forms of fake data for instructional/research purposes, and many more. All told, the functions provided here are broadly useful for data organization, data presentation, data recoding, and data simulation.
Import data from the STATcube REST API or from the open data portal of Statistics Austria. This package includes a client for API requests as well as parsing utilities for data that originates from STATcube. Documentation about STATcubeR is provided by several vignettes included in the package as well as on the public pkgdown page at <https://statistikat.github.io/STATcubeR/>.
This package provides functions related to multivariate measures of independence and ICA: estimate independent components by minimizing distance covariance; conduct a test of mutual independence based on distance covariance; estimate independent components via infomax (a popular method that generally performs worse than mdcovica, ProDenICA, and/or fastICA, but is useful for comparisons); order independent components by skewness; match independent components from multiple estimates; and other functions useful in ICA.
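As an illustration of the statistic that the first two items build on, here is a minimal sketch of the sample distance covariance for two univariate samples, using the standard double-centering (V-statistic) form. The function names are hypothetical and this is not the package's implementation; the package works with multivariate components and adds the minimization and testing steps on top.

```python
import math


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute distances for one sample."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row_means = [sum(r) / n for r in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance(x, y):
    """Sample distance covariance (V-statistic form) of two paired samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    v2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(v2, 0.0))
```

A constant sample carries no information about the other sample, so `distance_covariance([1, 2, 3, 4], [5, 5, 5, 5])` is zero, while the distance covariance of a non-constant sample with itself is strictly positive.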
Computes confidence intervals for the population variance using the chi-squared distribution, without requiring raw data. Wikipedia (2025) <https://en.wikipedia.org/wiki/Chi-squared_distribution>. All-in-One Chi Distribution CI provides functions to calculate confidence intervals for the population variance based on a chi-squared distribution, using only a sample variance and a sample size. It offers a single all-in-one method for quickly computing the chi-squared confidence interval.
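For reference, the interval described here is presumably the textbook one, which assumes a normally distributed population (the description itself does not state the formula, so treat this as an assumption). With sample variance $s^2$, sample size $n$, confidence level $1-\alpha$, and $\chi^2_{p,\,n-1}$ denoting the $p$-quantile of the chi-squared distribution with $n-1$ degrees of freedom:

```latex
\[
  \frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2,\;n-1}}
  \;\le\; \sigma^2 \;\le\;
  \frac{(n-1)\,s^2}{\chi^2_{\alpha/2,\;n-1}}
\]
```

Only $s^2$ and $n$ enter the formula, which is why no raw data are needed.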
The steepness package computes steepness as a property of dominance hierarchies. Steepness is defined as the absolute slope of the straight line fitted to the normalized David's scores. The normalized David's scores can be obtained on the basis of dyadic dominance indices corrected for chance or by means of proportions of wins. Given an observed sociomatrix, the package computes the hierarchy's steepness and estimates its statistical significance by means of a randomization test.
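The definition above can be sketched as follows, using the proportions-of-wins variant only (the chance-corrected dyadic dominance index and the randomization test are omitted). The function names are hypothetical and do not mirror the steepness package's API; the normalization `(DS + N(N-1)/2) / N` is the usual one for David's scores and is assumed here.

```python
def normalized_davids_scores(wins):
    """Normalized David's scores from a sociomatrix; wins[i][j] = wins of i over j."""
    n = len(wins)
    # p[i][j]: proportion of i-j interactions won by i (0 if the dyad never interacted).
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                p[i][j] = wins[i][j] / total
    w = [sum(p[i]) for i in range(n)]                         # summed win proportions
    l = [sum(p[j][i] for j in range(n)) for i in range(n)]    # summed loss proportions
    w2 = [sum(p[i][j] * w[j] for j in range(n)) for i in range(n)]
    l2 = [sum(p[j][i] * l[j] for j in range(n)) for i in range(n)]
    ds = [w[i] + w2[i] - l[i] - l2[i] for i in range(n)]
    return [(d + n * (n - 1) / 2) / n for d in ds]


def steepness(wins):
    """Absolute slope of the least-squares line of normalized scores on rank order."""
    scores = sorted(normalized_davids_scores(wins), reverse=True)
    n = len(scores)
    ranks = list(range(1, n + 1))
    mean_r = sum(ranks) / n
    mean_s = sum(scores) / n
    sxx = sum((r - mean_r) ** 2 for r in ranks)
    sxy = sum((r - mean_r) * (s - mean_s) for r, s in zip(ranks, scores))
    return abs(sxy / sxx)
```

A perfectly linear hierarchy such as `[[0, 5, 5], [0, 0, 5], [0, 0, 0]]` gives a steepness of 1, and a fully egalitarian matrix in which every dyad splits its interactions evenly gives 0.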
Estimates the parameters of spatio-temporal models with censored or missing data using the SAEM algorithm (Delyon et al., 1999). This algorithm is a stochastic approximation of the widely used EM algorithm and is particularly valuable for models in which the E-step lacks a closed-form expression. The package also provides a function to compute the observed information matrix using the method developed by Louis (1982). To assess the performance of the fitted model, case-deletion diagnostics are provided.
This package contains statistical methods to analyze graphs, such as graph parameter estimation, model selection based on the Graph Information Criterion, statistical tests to discriminate two or more populations of graphs, correlation between graphs, and clustering of graphs. References: Takahashi et al. (2012) <doi:10.1371/journal.pone.0049949>, Fujita et al. (2017) <doi:10.3389/fnins.2017.00066>, Fujita et al. (2017) <doi:10.1016/j.csda.2016.11.016>, Fujita et al. (2019) <doi:10.1093/comnet/cnz028>.
This package implements S-type estimators, novel robust estimators for general linear regression models that address challenges such as outlier contamination and leverage points. It offers a robust alternative to classical regression methods and includes diagnostic tools for assessing model fit and performance. The methodology is based on the study "Comparison of the Robust Methods in the General Linear Regression Model" by Sazak and Mutlu (2023). The package is designed for statisticians and applied researchers seeking advanced tools for robust regression analysis.
For making Trellis-type conditioning plots without strip labels. This is useful for displaying the structure of results from factorial designs and other studies when many conditioning variables would clutter the display with layers of redundant strip labels. Settings of the variables are encoded by layout and spacing in the trellis array and decoded by a separate legend. The functionality is implemented by a single S3 generic strucplot() function that is a wrapper for the Lattice package's xyplot() function. This allows access to all Lattice graphics capabilities in the usual way.
Detrending multivariate time-series to approximate stationarity when dealing with intensive longitudinal data, prior to Vector Autoregressive (VAR) or multilevel-VAR estimation. Classical VAR assumes weak stationarity (constant first two moments), and deterministic trends inflate spurious autocorrelation, biasing Granger-causality and impulse-response analyses. All functions operate on raw panel data and write detrended columns back to the data set, but differ in the level at which the trend is estimated. See, for instance, Wang & Maxwell (2015) <doi:10.1037/met0000030>; Burger et al. (2022) <doi:10.4324/9781003111238-13>; Epskamp et al. (2018) <doi:10.1177/2167702617744325>.
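One of the simplest variants, a separate linear trend per person, can be sketched as below. This is a minimal illustration under stated assumptions: the row layout (`"id"`, `"time"`), the function names, and the choice of a linear trend are all hypothetical and not taken from the package. Each person needs at least two distinct time points.

```python
from collections import defaultdict


def linear_residuals(times, values):
    """Residuals from an ordinary least-squares line fitted to (times, values)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return [y - (intercept + slope * t) for t, y in zip(times, values)]


def detrend_within_person(rows, var):
    """Write a '<var>_detrended' value into each row, fitting one trend per person."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["id"]].append(row)
    for person_rows in groups.values():
        times = [r["time"] for r in person_rows]
        values = [r[var] for r in person_rows]
        for r, res in zip(person_rows, linear_residuals(times, values)):
            r[var + "_detrended"] = res
    return rows
```

Estimating the trend at the sample level instead would amount to calling `linear_residuals` once on the pooled data rather than once per group.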
This package provides functionalities for performing stability analysis of genotype by environment interaction (GEI) to identify superior and stable genotypes across diverse environments. It implements Eberhart and Russell's ANOVA method (1966) (<doi:10.2135/cropsci1966.0011183X000600010011x>), Finlay and Wilkinson's Joint Linear Regression method (1963) (<doi:10.1071/AR9630742>), Wricke's Ecovalence (1962, 1964), Shukla's stability variance parameter (1972) (<doi:10.1038/hdy.1972.87>), Kang's simultaneous selection for high yield and stability (1991) (<doi:10.2134/agronj1991.00021962008300010037x>), the Additive Main Effects and Multiplicative Interaction (AMMI) method and Genotype plus Genotypes by Environment (GGE) Interaction methods.
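To make one of these measures concrete, here is a minimal sketch of Wricke's ecovalence computed from a genotype-by-environment table of means: for each genotype it sums the squared interaction residuals over environments. The function name and input layout are hypothetical and are not the package's interface.

```python
def wricke_ecovalence(y):
    """Ecovalence per genotype; y[i][j] is the mean of genotype i in environment j."""
    g, e = len(y), len(y[0])
    genotype_means = [sum(row) / e for row in y]
    env_means = [sum(y[i][j] for i in range(g)) / g for j in range(e)]
    grand_mean = sum(genotype_means) / g
    # W_i = sum over environments of (y_ij - genotype mean - environment mean + grand mean)^2
    return [
        sum((y[i][j] - genotype_means[i] - env_means[j] + grand_mean) ** 2 for j in range(e))
        for i in range(g)
    ]
```

A genotype whose ecovalence is near zero contributes little to the interaction sum of squares and is considered stable in this sense. In a purely additive table such as `[[1, 2, 3], [2, 3, 4]]`, every genotype has ecovalence 0.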
The cartogram heatmaps generated by the included methods are an alternative to choropleth maps for the United States and are based on work by the Washington Post graphics department in their report on "The states most threatened by trade" (<http://www.washingtonpost.com/wp-srv/special/business/states-most-threatened-by-trade/>). "State bins" preserve as much of the geographic placement of the states as possible but have the look and feel of a traditional heatmap. Functions are provided that allow for use of a binned, discrete scale, a continuous scale or manually specified colors depending on what is needed for the underlying data.
This package implements confidence interval and sample size methods that are especially useful in psychological research. The methods can be applied in 1-group, 2-group, paired-samples, and multiple-group designs and to a variety of parameters including means, medians, proportions, slopes, standardized mean differences, standardized linear contrasts of means, plus several measures of correlation and association. Confidence interval and sample size functions are given for single parameters as well as differences, ratios, and linear contrasts of parameters. The sample size functions can be used to approximate the sample size needed to estimate a parameter or function of parameters with desired confidence interval precision or to perform a variety of hypothesis tests (directional two-sided, equivalence, superiority, noninferiority) with desired power. For details see: Statistical Methods for Psychologists, Volumes 1-4, <https://dgbonett.sites.ucsc.edu/>.
This package provides functions that solve a classical problem in survey methodology: optimum sample allocation in stratified sampling. In this context, the optimum allocation is in the classical Tschuprow-Neyman sense and satisfies additional lower- or upper-bound restrictions imposed on sample sizes in strata. Several algorithms are available, one of which is based on a popular sample allocation method that applies Neyman allocation to a recursively reduced set of strata. The package also provides a function that computes a solution to the minimum cost allocation problem, which is a minor modification of the classical optimum sample allocation. This problem consists in determining a vector of strata sample sizes that minimizes the total cost of the survey under an assumed fixed level of the stratified estimator's variance. As in the case of the classical optimum allocation, the problem of minimum cost allocation can be complemented by imposing upper-bound constraints on sample sizes in strata.
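The idea of applying Neyman allocation to a recursively reduced set of strata can be sketched as follows for the upper-bounds case. This is an illustrative sketch, not the package's code: the function name and arguments are hypothetical, `a[h]` stands for the product of stratum size and stratum standard deviation (assumed positive), and the result is left as real numbers without rounding to integers.

```python
def recursive_neyman(n, a, upper):
    """Allocate total sample size n across strata under upper bounds.

    a[h]     -- N_h * S_h (stratum size times stratum standard deviation), assumed > 0
    upper[h] -- largest sample size allowed in stratum h
    """
    if n > sum(upper):
        raise ValueError("total sample size exceeds the sum of the upper bounds")
    alloc = [0.0] * len(a)
    active = set(range(len(a)))
    remaining = float(n)
    while active:
        total = sum(a[h] for h in active)
        # Plain Neyman allocation on the strata that are still active.
        proposal = {h: remaining * a[h] / total for h in active}
        over = [h for h in active if proposal[h] > upper[h]]
        if not over:
            for h in active:
                alloc[h] = proposal[h]
            break
        # Fix violating strata at their bounds, then re-allocate the rest.
        for h in over:
            alloc[h] = float(upper[h])
            remaining -= upper[h]
            active.remove(h)
    return alloc
```

For example, with `n = 100`, `a = [1, 2, 7]` and every upper bound equal to 50, plain Neyman allocation would give the third stratum 70 units. The sketch caps it at 50 and splits the remaining 50 between the first two strata in the ratio 1:2.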
This package provides string parsing functionalities for generating plotnames, filenames and paths.
This package provides utilities to create or suppress start-up messages.
Allows the creation and manipulation of C++ std::vector objects in R.
Support for reading and writing files in StatDataML, an XML-based data exchange format.
This package provides an extendable, performant and multithreaded alt-string implementation backed by C++ vectors and strings.
This package provides density, probability and quantile functions, and random number generation for (skew) stable distributions, using the parametrizations of Nolan.
R code and datasets for Stroup, W. W. (2012). Generalized Linear Mixed Models: Modern Concepts, Methods and Applications. CRC Press.
This package provides a graphical user interface for cross-sectional network modeling with the statnet software suite <https://github.com/statnet>.
The <http://standartox.uni-landau.de> database offers cleaned, harmonized and aggregated ecotoxicological test data, which can be used for assessing effects and risks of chemical concentrations found in the environment.
Estimation of model parameters for a marked Hawkes process. Accounts for missing data in the estimation of the parameters. Technical details can be found in Tucker et al. (2019) <DOI:10.1016/j.spasta.2018.12.004>.