While autoregressive distributed lag (ARDL) models allow for extremely flexible dynamics, interpreting the substantive significance of complex lag structures remains difficult. This package is designed to assist users in dynamically simulating and plotting the results of various ARDL models. It also contains post-estimation diagnostics, including a test for cointegration when estimating the error-correction variant of the autoregressive distributed lag model (Pesaran, Shin, and Smith 2001 <doi:10.1002/jae.616>).
This package implements information-theoretic measures to explore variable interactions, including KSG mutual information estimation for continuous variables from Kraskov et al. (2004) <doi:10.1103/PhysRevE.69.066138>, knockoff conditional mutual information described in Zhang & Chen (2025) <doi:10.1126/sciadv.adu6464>, and the synergistic-unique-redundant decomposition introduced by Martinez-Sanchez et al. (2024) <doi:10.1038/s41467-024-53373-4>. Together these allow detection of complex and diverse relationships among variables.
Estimation of life expectancy and Life Years Lost (LYL, or lillies for short) for a given population, for example those with a given disease or condition. In addition, the package can be used to compare estimates from different populations, or to estimate confidence intervals. Technical details of the method are available in Plana-Ripoll et al. (2020) <doi:10.1371/journal.pone.0228073> and Andersen (2017) <doi:10.1002/sim.7357>.
Implementation of the SSR-Algorithm. The Sign-Simplicity-Regression model is a nonparametric statistical model based on residual signs and simplicity assumptions on the regression function. The goal is to calculate the most parsimonious regression function satisfying the statistical adequacy requirements. Theory and functions are specified in Metzner (2020, ISBN: 979-8-68239-420-3, "Trendbasierte Prognostik") and Metzner (2021, ISBN: 979-8-59347-027-0, "Adäquates Maschinelles Lernen").
Calculates a modified Simplified Surface Energy Balance Index (SSEBI) and the Evaporative Fraction (EF) using geospatial raster data such as albedo and surface-air temperature difference (TS-TA). The SSEBI is computed from albedo and TS-TA to estimate surface moisture and evaporative dynamics, providing a robust assessment of surface dryness while accounting for atmospheric variations. Based on Roerink, Su, and Menenti (2000) <doi:10.1016/S1464-1909(99)00128-8>.
Estimating the force of infection from time-varying, age-varying, or constant serocatalytic models fitted to population-based seroprevalence studies using a Bayesian framework, including data simulation functions that generate serological surveys based on these models. This tool also provides a flexible prior specification syntax for the force of infection and the seroreversion rate, as well as methods to assess model convergence, model comparison criteria, and useful visualisation functions.
This package implements an algorithm for Latent Dirichlet Allocation (LDA), Blei et al. (2003) <https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf>, using style conventions from the 'tidyverse', Wickham et al. (2019) <doi:10.21105/joss.01686>, and 'tidymodels', Kuhn et al. <https://tidymodels.github.io/model-implementation-principles/>. Fitting is done via collapsed Gibbs sampling. Also implements several novel features for LDA such as guided models and transfer learning.
Includes: (i) tests and visualisations that can help the modeller explore time series components and perform decomposition; (ii) modelling shortcuts, such as functions to construct lag matrices and seasonal dummy variables of various forms; (iii) an implementation of the Theta method; (iv) tools to facilitate the design of the forecasting process, such as ABC-XYZ analyses; and (v) "quality of life" functions, such as treating time series for trailing and leading values.
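The modelling shortcuts in item (ii) can be illustrated with a minimal sketch. The function names `lag_matrix` and `seasonal_dummies` and the `drop_last` parameter are hypothetical and are not the package's API; the code only shows what a lag matrix and a set of binary seasonal dummies look like.

```python
def lag_matrix(series, lags):
    """Build one row per time index t holding series[t - lag] for each lag.

    Positions that would reach outside the series are filled with None.
    Negative lags act as leads.
    """
    n = len(series)
    return [
        [series[t - lag] if 0 <= t - lag < n else None for lag in lags]
        for t in range(n)
    ]


def seasonal_dummies(n, period, drop_last=True):
    """Build binary seasonal indicators for n observations.

    With drop_last=True the final season is omitted, so the dummies can sit
    next to an intercept without perfect collinearity.
    """
    columns = period - 1 if drop_last else period
    return [[1 if t % period == j else 0 for j in range(columns)] for t in range(n)]


if __name__ == "__main__":
    for row in lag_matrix([1, 2, 3, 4], lags=[0, 1, 2]):
        print(row)
    for row in seasonal_dummies(5, period=4):
        print(row)
```

Dropping one season is a common design choice when the regression also contains a constant; other dummy encodings exist and the package may offer different forms.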
Fit species distribution models (SDMs) using the tidymodels framework, which provides a standardised interface to define models and process their outputs. tidysdm expands tidymodels by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2024) <doi:10.1111/2041-210X.14406>.
This package provides tools to analyze vaccine coverage data and simulate potential disease outbreak scenarios. It allows users to calculate key epidemiological metrics such as the effective reproduction number (Re), outbreak probabilities, and expected infection counts based on county-level vaccination rates, disease characteristics, and vaccine effectiveness. The package includes historical kindergarten vaccination data for Florida counties and offers functions for generating summary tables, visualizations, and exporting the underlying plot data.
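As a rough illustration of the metrics named above, the sketch below uses two textbook approximations: Re = R0 * (1 - coverage * effectiveness), and a branching-process outbreak probability of 1 - (1/Re)^k for k independent introductions when Re > 1. These formulas, the function names, and the input values are assumptions made for illustration; the package's own calculations may differ.

```python
def effective_reproduction_number(r0, coverage, effectiveness):
    """Textbook approximation: scale R0 by the share of the population left unprotected."""
    if not (0.0 <= coverage <= 1.0 and 0.0 <= effectiveness <= 1.0):
        raise ValueError("coverage and effectiveness must lie in [0, 1]")
    return r0 * (1.0 - coverage * effectiveness)


def outbreak_probability(re, introductions=1):
    """Simple branching-process approximation of the chance of a major outbreak.

    Returns 0 when Re <= 1, where sustained transmission is not expected.
    """
    if re <= 1.0:
        return 0.0
    return 1.0 - (1.0 / re) ** introductions


if __name__ == "__main__":
    # Illustrative inputs only, not data shipped with any package.
    re = effective_reproduction_number(r0=15.0, coverage=0.90, effectiveness=0.97)
    print(round(re, 3), round(outbreak_probability(re), 3))
```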
The package ABarray is designed to work with the Applied Biosystems whole genome microarray platform, as well as any other platform whose data can be transformed into an expression data matrix. Functions include data preprocessing, filtering, control probe analysis, and statistical analysis, all in a single function. A graphical user interface (GUI) is also provided. The raw data, processed data, graphics output and statistical results are organized into folders according to the analysis settings used.
Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.
This package implements a variety of methods for combining p-values in differential analyses of genome-scale datasets. Functions can combine p-values across different tests in the same analysis (e.g., genomic windows in ChIP-seq, exons in RNA-seq) or for corresponding tests across separate analyses (e.g., replicated comparisons, effect of different treatment conditions). Support is provided for handling log-transformed input p-values, missing values and weighting where appropriate.
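To make the idea of combining p-values concrete, here is a minimal sketch of two well-known combination rules, Simes' method and Fisher's method. They are chosen as familiar examples; the description above does not say which methods the package provides, and the function names here are hypothetical.

```python
import math


def simes(pvalues):
    """Simes combined p-value: the minimum over ranks i of n * p_(i) / i."""
    ordered = sorted(pvalues)
    n = len(ordered)
    return min(1.0, min(n * p / rank for rank, p in enumerate(ordered, start=1)))


def fisher(pvalues):
    """Fisher's method for independent tests.

    The statistic -2 * sum(log p) is chi-square with 2n degrees of freedom
    under the null. For even degrees of freedom the upper tail has the closed
    form exp(-x/2) * sum_{k<n} (x/2)^k / k!, so no external library is needed.
    Requires every p-value to be strictly positive.
    """
    n = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half_stat / k
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    window_pvalues = [0.01, 0.04, 0.5]  # e.g. windows belonging to one region
    print(simes(window_pvalues), fisher(window_pvalues))
```

Simes' rule remains valid under some forms of positive dependence between tests, while Fisher's method assumes independence, which is one reason a toolkit would offer several rules.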
PathNet uses topological information present in pathways and differential expression levels of genes (obtained from microarray experiments) to identify pathways that are 1) significantly enriched and 2) associated with each other in the context of differential expression. The algorithm is described in: PathNet: A tool for pathway analysis using topological information. Dutta B, Wallqvist A, and Reifman J. Source Code for Biology and Medicine 2012 Sep 24;7(1):10.
This package includes pre-processing and quality control functions that can remove margin events, compensate and transform the data, and apply PeacoQCSignalStability for quality control. This last function first detects peaks in each channel of the flowframe, then removes anomalies based on the IsolationTree function and the MAD outlier detection method. The package can be used for both flow and mass cytometry data.
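The MAD outlier step mentioned above can be sketched generically as follows. This is a plain median-absolute-deviation rule, not the package's implementation; the function name `mad_outliers`, the default `threshold`, and the example values are assumptions for illustration.

```python
import statistics


def mad_outliers(values, threshold=6.0):
    """Flag values lying more than `threshold` scaled MADs from the median.

    The factor 1.4826 makes the MAD comparable to a standard deviation for
    normally distributed data. Returns one boolean per input value.
    """
    center = statistics.median(values)
    deviations = [abs(v - center) for v in values]
    scaled_mad = 1.4826 * statistics.median(deviations)
    if scaled_mad == 0:
        # No spread around the median: this sketch flags nothing.
        return [False] * len(values)
    return [d > threshold * scaled_mad for d in deviations]


if __name__ == "__main__":
    peak_positions = [10, 11, 9, 10, 12, 10, 50]  # hypothetical per-bin peak values
    print(mad_outliers(peak_positions))
```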
Like all gene expression data, single-cell data suffers from batch effects and other unwanted variations that make accurate biological interpretation difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variations and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.
The NGS (Next-Generation Sequencing) reads from FFPE (Formalin-Fixed Paraffin-Embedded) samples contain numerous artifact chimeric reads (ACRs), which can lead to false positive structural variant calls. These ACRs are derived from the combination of two single-stranded DNA (ss-DNA) fragments with short reverse complementary regions (SRCRs). This package simulates these artifact chimeric reads as well as normal reads for FFPE samples on the whole genome / several chromosomes / large regions.
This package provides constructions of series of partially balanced incomplete block designs (PBIB) based on the combinatory method S, introduced by Rezgui et al. (2014) <doi:10.3844/jmssp.2014.45.48>. This package also offers the associated U-type designs. Version 1.1-1 generalizes the approach to designs with v = wnl treatments. It includes various rectangular and generalized rectangular right angular association schemes with 4, 5, and 7 associated classes.
This package provides methods of computerized adaptive testing for survey researchers. See Montgomery and Rossiter (2020) <doi:10.1093/jssam/smz027>. Includes functionality for data fit with the classic item response methods including the latent trait model, the Birnbaum three parameter model, the graded response, and the generalized partial credit model. Additionally, includes several ability parameter estimation and item selection routines. During item selection, all calculations are done in compiled C++ code.
It is an open source insurance claim simulation engine sponsored by the Casualty Actuarial Society. It generates individual insurance claims including open claims, reopened claims, incurred but not reported claims and future claims. It also includes claim data fitting functions to help set simulation assumptions. It is useful for claim level reserving analysis. Parodi (2013) <https://www.actuaries.org.uk/documents/triangle-free-reserving-non-traditional-framework-estimating-reserves-and-reserve-uncertainty>.
Unifying an inconsistently coded categorical variable between two different time points in accordance with a mapping table. The main rule is to replicate an observation if it could be assigned to more than one category. Frequencies or statistical methods are then used to approximate the probabilities of it being assigned to each of them. This procedure was invented and implemented in the paper by Nasinski, Majchrowska, and Broniatowska (2020) <doi:10.24425/cejeme.2020.134747>.
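The replication rule described above can be sketched as follows, using the frequency-based variant. The function name `split_observation`, the mapping table, and the category counts are hypothetical and serve only to illustrate the idea; they are not taken from the package or the paper.

```python
def split_observation(old_code, mapping, new_frequencies):
    """Replicate one observation across every new category it may map to.

    Each replica gets a weight proportional to how often its candidate
    category occurs in the period that already uses the new coding. If no
    candidate has been observed, the weights fall back to equal shares.
    """
    candidates = mapping[old_code]
    counts = [new_frequencies.get(code, 0) for code in candidates]
    total = sum(counts)
    if total == 0:
        return [(code, 1.0 / len(candidates)) for code in candidates]
    return [(code, count / total) for code, count in zip(candidates, counts)]


if __name__ == "__main__":
    # Hypothetical mapping table: old category "A" was split into "A1" and "A2".
    mapping = {"A": ["A1", "A2"], "B": ["B1"]}
    new_frequencies = {"A1": 30, "A2": 10, "B1": 5}
    print(split_observation("A", mapping, new_frequencies))
    print(split_observation("B", mapping, new_frequencies))
```

The weights of the replicas sum to one, so each original observation still counts once in any weighted analysis.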
Bindings for additional classification models for use with the parsnip package. Models include flavors of discriminant analysis, such as linear (Fisher (1936) <doi:10.1111/j.1469-1809.1936.tb02137.x>), regularized (Friedman (1989) <doi:10.1080/01621459.1989.10478752>), and flexible (Hastie, Tibshirani, and Buja (1994) <doi:10.1080/01621459.1994.10476866>), as well as naive Bayes classifiers (Hand and Yu (2001) <doi:10.1111/j.1751-5823.2001.tb00465.x>).
Figures, data sets and examples from the book "A practical guide to ecological modelling - using R as a simulation platform" by Karline Soetaert and Peter MJ Herman (2009). Springer. All figures from chapter x can be generated by "demo(chapx)", where x = 1 to 11. The R-scripts of the model examples discussed in the book are in subdirectory "examples", ordered per chapter. Solutions to model projects are in the same subdirectories.
Designing experimental plans that involve both discrete and continuous factors with general parametric statistical models using the ForLion algorithm and the EW ForLion algorithm. The algorithms search for locally optimal designs and EW optimal designs under the D-criterion. See Huang, Y., Li, K., Mandal, A., & Yang, J. (2024) <doi:10.1007/s11222-024-10465-x> and Lin, S., Huang, Y., & Yang, J. (2025) <doi:10.48550/arXiv.2505.00629>.