Find all hierarchical models of a specified generalized linear model whose information criterion (AIC, BIC, or AICc) is within a specified cutoff of the minimum value. Alternatively, find all such graphical models. A branch-and-bound algorithm is used so that not every model has to be fitted.
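To illustrate the selection rule described above, here is a minimal Python sketch using the standard definitions of AIC, BIC, and AICc. It enumerates every candidate, so it does not show the branch-and-bound search. The function names, model labels, and log-likelihood values are hypothetical and are not taken from the package.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, AICc) for a model with log-likelihood `loglik`,
    `k` estimated parameters, and `n` observations."""
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # requires n > k + 1
    return aic, bic, aicc

def within_cutoff(scores, cutoff):
    """Keep the models whose criterion is within `cutoff` of the minimum."""
    best = min(scores.values())
    return {name: s for name, s in scores.items() if s - best <= cutoff}

# Hypothetical fits: model label -> (log-likelihood, number of parameters)
fits = {
    "A": (-100.0, 2),
    "A+B": (-98.5, 3),
    "A+B+A:B": (-98.4, 4),
    "B": (-110.0, 2),
}
n_obs = 50  # hypothetical sample size

# Score each model by AIC (index 0 of the returned tuple)
aic_scores = {
    name: information_criteria(ll, k, n_obs)[0] for name, (ll, k) in fits.items()
}
selected = within_cutoff(aic_scores, cutoff=2.0)
print(sorted(selected))
```

With these made-up numbers, model "B" falls outside the cutoff and the other three are retained.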
Fits hidden Markov models of learning under the cognitive diagnosis framework. Estimation is provided for the hidden Markov diagnostic classification model, the first-order hidden Markov model, the reduced-reparameterized unified learning model, and the joint learning model for responses and response times.
Convert irregularly spaced longitudinal data into regular intervals for further analysis, and perform clustering using advanced machine learning techniques. The package is designed for handling complex longitudinal datasets, optimizing them for research in healthcare, demography, and other fields requiring temporal data modeling.
We provide the collection of datasets used in the book 'An Introduction to Statistical Learning with Applications in R, Second Edition'. These include many datasets that we used in the first edition (some with minor changes), and some new datasets.
Density, distribution function, quantile function and random generation for the K-distribution. Also included are a plotting function that plots data on Weibull paper and another function that draws additional lines. For results obtained with the package, see T Lamont-Smith (2018), submitted to J. R. Stat. Soc.
This package provides a new approach to detecting change points based on smoothing and multiple testing. It is designed for long data sequences modeled as piecewise constant functions plus stationary Gaussian noise; see Dan Cheng and Armin Schwartzman (2015) <arXiv:1504.06384>.
Estimate coefficient of variation percent (CV%) for any arbitrary distribution, including some built-in estimates for commonly-used transformations in pharmacometrics. Methods are described in various sources, but applied here as summarized in Prybylski (2024) <doi:10.1007/s40262-023-01343-2>.
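As a rough illustration of what CV% means, the Python sketch below computes the sample CV% (standard deviation over mean) and the well-known closed form for a log-normally distributed quantity, CV% = 100 * sqrt(exp(sd_log^2) - 1). The function names are hypothetical, and this is not necessarily how the package computes its estimates.

```python
import math
import statistics

def cv_percent(values):
    """Sample CV%: 100 * sample standard deviation / sample mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def cv_percent_lognormal(sd_log):
    """CV% of a log-normal variable whose natural log has standard
    deviation `sd_log` (standard closed-form result)."""
    return 100.0 * math.sqrt(math.exp(sd_log ** 2) - 1.0)

print(cv_percent([90.0, 100.0, 110.0]))
print(cv_percent_lognormal(0.3))
```

For small values of `sd_log`, the log-normal CV% is close to 100 * sd_log, which is why the two are sometimes used interchangeably in practice.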
Fits single- and multiple-group penalized factor analysis models via a trust-region algorithm with integrated automatic multiple tuning parameter selection (Geminiani et al., 2021 <doi:10.1007/s11336-021-09751-8>). Available penalties include lasso, adaptive lasso, SCAD, MCP, and ridge.
Handles and formats author information in scientific writing in R Markdown and Quarto. plume provides easy-to-use and flexible tools for inserting author data in YAML as well as generating author and contribution lists (among others) as strings from tabular data.
Calculates and plots the SiZer map for scatterplot data. A SiZer map is a way of examining when the p-th derivative of a scatterplot-smoother is significantly negative, possibly zero or significantly positive across a range of smoothing bandwidths.
This package provides utilities for conducting specification curve analyses (Simonsohn, Simmons & Nelson, 2020, <doi:10.1038/s41562-020-0912-z>) or multiverse analyses (Steegen, Tuerlinckx, Gelman & Vanpaemel, 2016, <doi:10.1177/1745691616658637>), including functions to set up, run, evaluate, and plot all specifications.
Implements self-normalization (SN) based algorithms for change-point estimation in time series data. This comprises nested local-window algorithms for detecting changes in both univariate and multivariate time series, developed in Zhao, Jiang and Shao (2022) <doi:10.1111/rssb.12552>.
Cluster user-supplied somatic read counts with corresponding allele-specific copy number and tumor purity to infer feasible underlying intra-tumor heterogeneity in terms of number of subclones, multiplicity, and allocation (Little et al. (2019) <doi:10.1186/s13073-019-0643-9>).
This package implements S3 classes for storing dates and date-times based on the Jalali calendar. The main design goal of shide is consistency with base R's Date and POSIXct. It provides features such as date-time parsing, formatting and arithmetic.
Hierarchical models for the analysis of species-area relationships (SARs) that combine several data sets and covariates, with a global data set combining individual SAR studies, as described in Solymos and Lele (2012) <doi:10.1111/j.1466-8238.2011.00655.x>.
This package provides a set of functions that allow users to style their R code according to the tidyverse style guide. The package uses a native Rust implementation to ensure the highest performance. Learn more about tergo at <https://rtergo.pagacz.io>.
Encapsulates the pattern of untidying data into a wide matrix, performing some processing, then turning it back into a tidy form. This is useful for several operations such as co-occurrence counts, correlations, or clustering that are mathematically convenient on wide matrices.
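The widen, process, re-tidy pattern described above can be sketched in plain Python for the co-occurrence case. This is a minimal illustration of the idea, not the package's implementation. The function name `pairwise_cooccurrence` and the sample rows are hypothetical.

```python
def pairwise_cooccurrence(rows):
    """rows: tidy (item, feature) pairs.
    Returns tidy (item1, item2, n) rows, where n is the number of
    features the two items share."""
    rows = list(rows)
    items = sorted({item for item, _ in rows})
    features = sorted({feature for _, feature in rows})

    # Step 1: untidy into a wide 0/1 matrix (one row per item, one column per feature)
    present = set(rows)
    wide = [[1 if (i, f) in present else 0 for f in features] for i in items]

    # Step 2: process on the wide matrix (co-occurrence = wide times its transpose)
    n = len(items)
    co = [
        [sum(a * b for a, b in zip(wide[r], wide[c])) for c in range(n)]
        for r in range(n)
    ]

    # Step 3: turn the result back into tidy rows, dropping self-pairs and zeros
    return [
        (items[r], items[c], co[r][c])
        for r in range(n)
        for c in range(n)
        if r != c and co[r][c] > 0
    ]

tidy = [("a", "x"), ("b", "x"), ("a", "y"), ("c", "y"), ("b", "z")]
result = pairwise_cooccurrence(tidy)
print(result)
```

The matrix product in step 2 is the part that is awkward to express directly on tidy data, which is the motivation for the round trip.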
Simplify the process of extracting and processing Clinical Practice Research Datalink (CPRD) data in order to build datasets ready for statistical analysis. This process is difficult in R, as the raw data is very large and cannot be read into the R workspace. rcprd utilises RSQLite to create SQLite databases which are stored on the hard disk. These are then queried to extract the required information for a cohort of interest, and create datasets ready for statistical analysis. The process closely follows that of the rEHR package; see Springate et al. (2017) <doi:10.1371/journal.pone.0171784>.
An implementation of a probabilistic modeling framework that jointly analyzes personal genome and transcriptome data to estimate the probability that a variant has regulatory impact in that individual. It is based on a generative model that assumes that genomic annotations, such as the location of a variant with respect to regulatory elements, determine the prior probability that the variant is a functional regulatory variant, which is an unobserved variable. The functional regulatory variant status then influences whether nearby genes are likely to display outlier levels of gene expression in that person. See the RIVER website for more information, documentation and examples.
Understanding heterogeneous causal effects based on pretreatment covariates is a crucial step in modern empirical work in data science. Building on the recent developments in Calonico et al. (2025) <https://rdpackages.github.io/references/Calonico-Cattaneo-Farrell-Palomba-Titiunik_2025_HTERD.pdf>, this package provides tools for estimation and inference of heterogeneous treatment effects in Regression Discontinuity (RD) Designs. The package includes three main commands: rdhte, which conducts estimation and robust bias-corrected inference for conditional RD treatment effects (given a choice of bandwidth parameter); rdbwhte, which implements automatic bandwidth selection methods; and rdhte_lincom, which tests linear combinations of parameters.
This package is used for the identification and validation of sequence motifs. It makes use of STAMP for comparing a set of motifs to a given database (e.g. JASPAR). It can also be used to visualize motifs, motif distributions and modules, and to filter motifs.
This package provides functions for fitting and plotting SITAR growth curve models. SITAR is a shape-invariant model with a regression B-spline mean curve and subject-specific random effects on both the measurement and age scales.
This package provides functions to compute insolation on tilted surfaces, atmospheric transmittance, and related parameters such as: Earth radius vector, declination, sunset and sunrise, daylength, equation of time, vector in the direction of the sun, vector normal to surface, and some atmospheric physics.
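Two of the quantities listed above, declination and daylength, can be approximated with textbook formulas. The Python sketch below uses Cooper's approximation for solar declination and the standard sunrise hour-angle equation for daylength. It is an illustration under those assumptions only. The package may rely on different, more accurate algorithms, and the function names here are hypothetical.

```python
import math

def declination_deg(day_of_year):
    """Approximate solar declination in degrees (Cooper's formula)."""
    angle = math.radians(360.0 / 365.0 * (284 + day_of_year))
    return 23.45 * math.sin(angle)

def daylength_hours(latitude_deg, day_of_year):
    """Approximate daylength in hours from the sunrise hour angle,
    ignoring refraction and the size of the solar disc."""
    lat = math.radians(latitude_deg)
    decl = math.radians(declination_deg(day_of_year))
    cos_hour_angle = -math.tan(lat) * math.tan(decl)
    # Clamp to [-1, 1] so polar day and polar night give 24 h and 0 h
    cos_hour_angle = max(-1.0, min(1.0, cos_hour_angle))
    hour_angle_deg = math.degrees(math.acos(cos_hour_angle))
    return 2.0 * hour_angle_deg / 15.0  # the Earth rotates 15 degrees per hour

print(declination_deg(172))        # near the June solstice
print(daylength_hours(60.0, 172))  # a long summer day at 60 degrees north
```

At the equator this approximation returns 12 hours for every day of the year, which is a quick sanity check on the formula.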
Given a protein multiple sequence alignment, it is a daunting task to assess the effects of substitutions along the length of the sequence. The aaSEA package is intended to help researchers rapidly analyze property changes caused by single, multiple and correlated amino acid substitutions in proteins.