This package provides methods for inference using stacked multiple imputations augmented with weights. The vignette provides example R code for implementation in general multiple imputation settings. For additional details about the estimation algorithm, we refer the reader to Beesley, Lauren J and Taylor, Jeremy M G (2020) "A stacked approach for chained equations multiple imputation incorporating the substantive model" <doi:10.1111/biom.13372>, and Beesley, Lauren J and Taylor, Jeremy M G (2021) "Accounting for not-at-random missingness through imputation stacking" <arXiv:2101.07954>.
This package provides a pilot matching design to automatically stratify and match large datasets. The manual_stratify() function allows users to manually stratify a dataset based on categorical variables of interest, while the auto_stratify() function does so automatically by allocating a held-aside (pilot) data set, fitting a prognostic score (see Hansen (2008) <doi:10.1093/biomet/asn004>) on the pilot set, and stratifying the data set based on prognostic score quantiles. The strata_match() function then performs optimal matching of the data set in parallel within strata.
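The quantile-based stratification step described above can be illustrated with a minimal Python sketch. It is not the package's implementation: the function name `stratify_by_quantiles` and the score values are hypothetical, and the pilot-set model fitting that would produce the prognostic scores is assumed to have happened already.

```python
from bisect import bisect_right
from statistics import quantiles


def stratify_by_quantiles(scores, n_strata):
    """Assign each score a stratum index in 0..n_strata-1 using quantile cut points."""
    # quantiles() returns n_strata - 1 cut points that split the scores
    # into groups of roughly equal size.
    cuts = quantiles(scores, n=n_strata)
    # bisect_right counts how many cut points lie at or below each score.
    return [bisect_right(cuts, s) for s in scores]


# Hypothetical prognostic scores, assumed to come from a model fit on the pilot set.
scores = [0.05, 0.12, 0.20, 0.33, 0.41, 0.58, 0.67, 0.80]
strata = stratify_by_quantiles(scores, n_strata=4)
```

Matching would then be run separately inside each stratum, which is what makes the within-strata step easy to parallelize.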
This package provides tools for testing, monitoring and dating structural changes in (linear) regression models. It features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
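As an illustration of the F test (Chow test) idea mentioned above, the following Python sketch compares one pooled simple linear regression against separate fits before and after a candidate break. It is a sketch under stated assumptions, not the package's code: the helper names `rss_linear` and `chow_f`, the data, and the break position are all hypothetical, and only a single known break with one regressor is handled.

```python
def rss_linear(x, y):
    """Residual sum of squares of the least-squares fit y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, split, k=2):
    """Chow F statistic for a break at index `split`; k = parameters per regime."""
    rss_pooled = rss_linear(x, y)
    rss_split = rss_linear(x[:split], y[:split]) + rss_linear(x[split:], y[split:])
    n = len(x)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Hypothetical series whose intercept and slope both change after the fifth point.
x = list(range(10))
y = [1.1, 1.4, 2.1, 2.4, 3.1, 8.9, 10.1, 10.9, 12.1, 12.9]
f_stat = chow_f(x, y, split=5)
```

A large value of `f_stat` relative to the F distribution with k and n - 2k degrees of freedom points to a structural change at the candidate break.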
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a star schema. Transformations can be carried out using professional extract, transform and load tools or tools intended for data transformation for end users. With the tools mentioned, this transformation can be carried out, but it requires a lot of work. The main objective of this package is to define transformations that allow obtaining stars from flat tables easily. In addition, it includes basic data cleaning, dimension enrichment, incremental data refresh and query operations, adapted to this context.
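The flat-table-to-star transformation can be sketched in a few lines of Python. This is only an illustration of the general idea, not the package's interface: the function `build_star`, the column names, and the data are hypothetical, and facts that share the same dimension keys are not aggregated here.

```python
def build_star(flat_rows, dimension_attrs, measure_attrs):
    """Split flat rows into dimension tables with surrogate keys and a fact table.

    dimension_attrs maps a dimension name to the flat-table columns it contains.
    """
    # For each dimension: mapping from a tuple of attribute values to a surrogate key.
    dimensions = {name: {} for name in dimension_attrs}
    facts = []
    for row in flat_rows:
        fact = {}
        for name, cols in dimension_attrs.items():
            values = tuple(row[c] for c in cols)
            table = dimensions[name]
            if values not in table:
                table[values] = len(table) + 1  # next surrogate key
            fact[name + "_key"] = table[values]
        for measure in measure_attrs:
            fact[measure] = row[measure]
        facts.append(fact)
    return dimensions, facts


# Hypothetical flat table.
flat = [
    {"city": "Granada", "country": "Spain", "year": 2020, "sales": 10},
    {"city": "Granada", "country": "Spain", "year": 2021, "sales": 12},
    {"city": "Lyon", "country": "France", "year": 2020, "sales": 7},
]
dims, facts = build_star(
    flat,
    dimension_attrs={"where": ["city", "country"], "when": ["year"]},
    measure_attrs=["sales"],
)
```

Each fact row now holds only surrogate keys and measures, while the descriptive attributes live once in their dimension table.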
This package provides a lightweight, reproducible workflow for selecting and executing appropriate statistical analyses in one-way or two-way experimental designs. The package automatically checks for data normality, conducts parametric (ANOVA) or non-parametric (Kruskal-Wallis) tests, performs post-hoc comparisons with Compact Letter Displays (CLD), and generates publication-ready boxplots, faceted plots, and heatmaps. It is designed for researchers seeking fast, automated statistical summaries and visualization. It is based on established statistical methods including Shapiro and Wilk (1965) <doi:10.2307/2333709>, Kruskal and Wallis (1952) <doi:10.1080/01621459.1952.10483441>, Tukey (1949) <doi:10.2307/3001913>, Fisher (1925) <ISBN:0050021702>, and Wickham (2016) <ISBN:978-3-319-24277-4>.
This package provides statistical performance measures used in the econometric literature to evaluate conditional covariance/correlation matrix estimates (MSE, MAE, Euclidean distance, Frobenius distance, Stein distance, asymmetric loss function, eigenvalue loss function and the loss function defined in Eq. (4.6) of Engle et al. (2016) <doi:10.2139/ssrn.2814555>). It also computes Eq. (3.1) and (4.2) of Li et al. (2016) <doi:10.1080/07350015.2015.1092975> to compare factor loading matrices. The statistical performance measures implemented have been previously used in, for instance, Laurent et al. (2012) <doi:10.1002/jae.1248>, Amendola et al. (2015) <doi:10.1002/for.2322> and Becker et al. (2015) <doi:10.1016/j.ijforecast.2013.11.007>.
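To give a feel for the simplest of these measures, here is a small Python sketch of element-wise MSE, MAE and a squared Frobenius distance between a reference covariance matrix and an estimate. The function names and matrices are hypothetical, and the definitions are the textbook ones taken over all matrix elements; the package may normalize differently or work on the half-vectorized matrix.

```python
def _diffs(sigma, h):
    """Element-wise differences between two equally sized matrices (lists of rows)."""
    return [s - e for row_s, row_h in zip(sigma, h) for s, e in zip(row_s, row_h)]


def mse(sigma, h):
    """Mean squared error over all matrix elements."""
    d = _diffs(sigma, h)
    return sum(v * v for v in d) / len(d)


def mae(sigma, h):
    """Mean absolute error over all matrix elements."""
    d = _diffs(sigma, h)
    return sum(abs(v) for v in d) / len(d)


def squared_frobenius(sigma, h):
    """Sum of squared element-wise differences (squared Frobenius norm)."""
    return sum(v * v for v in _diffs(sigma, h))


# Hypothetical 2x2 reference covariance matrix and estimate.
sigma = [[1.0, 0.5], [0.5, 2.0]]
h_est = [[1.5, 0.5], [0.5, 1.0]]
losses = (mse(sigma, h_est), mae(sigma, h_est), squared_frobenius(sigma, h_est))
```

Lower values indicate an estimate closer to the reference matrix under each loss.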
This package provides an efficient method to recover the missing block of an approximately low-rank matrix. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design [Cai T, Cai TT, Zhang A (2016) <doi:10.1080/01621459.2015.1021005>]. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. The main function in our package, smc.FUN(), recovers the missing block A22 of an approximately low-rank matrix A given the other blocks A11, A12, A21.
This package creates and fits staged event tree probability models, which are probabilistic graphical models capable of representing asymmetric conditional independence statements for categorical variables. Includes functions to create, plot and fit staged event trees from data, as well as many efficient structure learning algorithms. References: Carli F, Leonelli M, Riccomagno E, Varando G (2022) <doi:10.18637/jss.v102.i06>. Collazo R. A., Görgen C. and Smith J. Q. (2018, ISBN:9781498729604). Görgen C., Bigatti A., Riccomagno E. and Smith J. Q. (2018) <arXiv:1705.09457>. Thwaites P. A., Smith, J. Q. (2017) <arXiv:1510.00186>. Barclay L. M., Hutton J. L. and Smith J. Q. (2013) <doi:10.1016/j.ijar.2013.05.006>. Smith J. Q. and Anderson P. E. (2008) <doi:10.1016/j.artint.2007.05.004>.
The C++ header files of the Stan project are provided by this package. There is a shared object containing part of the CVODES library, but it is not accessible from R. r-stanheaders is only useful for developers who want to utilize the LinkingTo directive of their package's DESCRIPTION file to build on the Stan library without incurring unnecessary dependencies.
The Stan project develops a probabilistic programming language that implements full or approximate Bayesian statistical inference via Markov Chain Monte Carlo or variational methods and implements (optionally penalized) maximum likelihood estimation via optimization. The Stan library includes an advanced automatic differentiation scheme, templated statistical and linear algebra functions that can handle the automatically differentiable scalar types (and doubles, ints, etc.), and a parser for the Stan language. The r-rstan package provides user-facing R functions to parse, compile, test, estimate, and analyze Stan models.
This package provides a small collection of data on graduate statistics programs from the United States.
This package provides functions for creating, displaying, and evaluating stopping rules for safety monitoring in clinical studies.
An interface to explore trends in Twitter data using the Storywrangler Application Programming Interface (API), which can be found here: <https://github.com/janeadams/storywrangler>.
Explore and analyse the genealogy of textual or musical traditions, from their variants, with various stemmatological methods, mainly the disagreement-based algorithms suggested by Camps and Cafiero (2015) <doi:10.1484/M.LECTIO-EB.5.102565>.
This package provides drop-in replacements for functions from the stringr package, with the same user interface. These functions have no external dependencies and can be copied directly into your package code using the staticimports package.
Collection of stepwise procedures to conduct multiple hypothesis testing. The details of the stepwise algorithm can be found in Romano and Wolf (2007) <DOI:10.1214/009053606000001622> and Hsu, Kuan, and Yen (2014) <DOI:10.1093/jjfinec/nbu014>.
Fast multi-trait and multi-trial Genome Wide Association Studies (GWAS) following the method described in Zhou and Stephens (2014) <doi:10.1038/nmeth.2848>. One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris.
This package provides tools for Genotype by Environment Interaction (GEI) analysis, using statistical models and visualizations to assess genotype performance across environments. It helps researchers explore interaction effects, stability, and adaptability in multi-environment trials, identifying the best-performing genotypes in different conditions. Which Win Where!
This package provides a comprehensive logging framework for R applications with hierarchical logging levels, database integration, and contextual logging capabilities. The package supports SQLite storage for persistent logs, provides colour-coded console output for better readability, includes parallel processing support, and implements structured error reporting with JSON formatting.
This package provides various functions and tools to help fit models for estimating treatment effects in stepped wedge cluster randomized trials. It implements methods described in Kenny, Voldal, Xia, and Heagerty (2022) "Analysis of stepped wedge cluster randomized trials in the presence of a time-varying treatment effect", <doi:10.1002/sim.9511>.
This package provides functions for stratified sampling and assigning custom labels to data, ensuring randomness within groups. The package supports various sampling methods such as stratified, cluster, and systematic sampling. It allows users to apply transformations and customize the sampling process. This package can be useful for statistical analysis and data preparation tasks.
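The core idea of stratified sampling, drawing at random within each group, can be sketched in Python as follows. The function `stratified_sample`, its parameters and the example records are hypothetical and do not mirror the package's interface.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=None):
    """Draw the same fraction of records, at random, from every stratum."""
    rng = random.Random(seed)  # local generator so results are reproducible
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    sample = []
    for members in groups.values():
        # Take at least one record per stratum so no group disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: six records in group "A" and four in group "B".
records = [{"id": i, "group": "A"} for i in range(6)]
records += [{"id": i, "group": "B"} for i in range(6, 10)]
picked = stratified_sample(records, key=lambda r: r["group"], fraction=0.5, seed=1)
```

With a fraction of 0.5 the sample keeps the 6:4 proportion of the two groups, here three records from "A" and two from "B".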
This package provides functions to perform stepwise split regularized regression. The approach first uses a stepwise algorithm to split the variables into the models with a goodness of fit criterion, and then regularization is applied to each model. The weights of the models in the ensemble are determined based on a criterion selected by the user.
This package provides tools for power and sample size calculation as well as design diagnostics for longitudinal mixed model settings, with a focus on stepped wedge designs. All calculations are oracle estimates, i.e., they assume random effect variances to be known (or guessed) in advance. The method is introduced in Hussey and Hughes (2007) <doi:10.1016/j.cct.2006.05.007>; extensions are discussed in Li et al. (2020) <doi:10.1177/0962280220932962>.
An open source platform for validation and process control. Tools to analyze data from internal validation of forensic short tandem repeat (STR) kits are provided. The tools are developed to provide the necessary data to conform with guidelines for internal validation issued by the European Network of Forensic Science Institutes (ENFSI) DNA Working Group, and the Scientific Working Group on DNA Analysis Methods (SWGDAM). A front-end graphical user interface is provided. More information about each function can be found in the respective help documentation.
This package provides methods of fundamental analysis for equity valuation. The methods included serve as a quick reference for undergraduate courses on Stock Valuation and for the Chartered Financial Analyst Level 1 and 2 readings on Equity Valuation. References: Jerald E. Pinto ("Equity Asset Valuation (4th Edition)", 2020, ISBN: 9781119628194). Chartered Financial Analyst Institute ("Chartered Financial Analyst Program Curriculum 2020 Level I Volumes 1-6. (Vol. 4, pp. 445-491)", 2019, ISBN: 9781119593577). Chartered Financial Analyst Institute ("Chartered Financial Analyst Program Curriculum 2020 Level II Volumes 1-6. (Vol. 4, pp. 197-447)", 2019, ISBN: 9781119593614).
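As one example of the kind of calculation covered in equity valuation readings, the constant-growth (Gordon) dividend discount model values a share as V0 = D0 (1 + g) / (r - g). The Python sketch below is an illustration only: the function name `gordon_growth_value` and the input figures are hypothetical, and it is an assumption that the package covers this particular model.

```python
def gordon_growth_value(current_dividend, growth_rate, required_return):
    """Share value under the constant-growth dividend discount model.

    V0 = D0 * (1 + g) / (r - g), valid only when r > g.
    """
    if required_return <= growth_rate:
        raise ValueError("required_return must exceed growth_rate")
    next_dividend = current_dividend * (1 + growth_rate)  # D1
    return next_dividend / (required_return - growth_rate)


# Hypothetical inputs: a 2.00 dividend growing 5% a year, 10% required return.
value = gordon_growth_value(2.00, growth_rate=0.05, required_return=0.10)
```

With these inputs the value comes out at about 42 per share; the model is very sensitive to the gap between the required return and the growth rate.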