It can be necessary to limit the rate of execution of a loop or repeated function call, e.g., to show or gather data only at particular intervals. This package includes two methods for limiting this execution rate: speed governors and timers. A speed governor inserts pauses during execution to meet a user-specified loop time. Timers are alarm clocks that indicate whether a certain time has passed. These mechanisms are implemented in C to minimize processing overhead.
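The two mechanisms described above can be illustrated with a minimal Python sketch. It is not the package's C implementation or its R interface: the class names `SpeedGovernor` and `Timer`, the methods `check` and `expired`, and the helper `run_governed` are hypothetical and exist only for this illustration.

```python
import time


class SpeedGovernor:
    """Hypothetical governor: pads each loop pass so it lasts about `interval` seconds."""

    def __init__(self, interval):
        self.interval = interval
        self._last = None  # time at which the previous check() returned

    def check(self):
        now = time.monotonic()
        if self._last is not None:
            # Sleep only for the part of the interval the loop body did not use.
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


class Timer:
    """Hypothetical alarm clock: reports whether `interval` seconds have passed."""

    def __init__(self, interval):
        self.interval = interval
        self._start = time.monotonic()

    def expired(self):
        if time.monotonic() - self._start >= self.interval:
            self._start = time.monotonic()  # re-arm for the next interval
            return True
        return False


def run_governed(n_iterations, interval):
    """Run an empty loop under a governor and return the elapsed time in seconds."""
    gov = SpeedGovernor(interval)
    start = time.monotonic()
    for _ in range(n_iterations):
        gov.check()  # the first call returns immediately, later calls pause
    return time.monotonic() - start
```

A governor suits loops where every pass should run at a steady pace, while a timer suits loops that should run at full speed and only occasionally perform an action such as printing progress.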
Duct tape the quanteda ecosystem (Benoit et al., 2018) <doi:10.21105/joss.00774> to modern Transformer-based text classification models (Wolf et al., 2020) <doi:10.18653/v1/2020.emnlp-demos.6>, in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of quanteda.textmodels and provides a function to set up the Python environment to use the pretrained models from Hugging Face <https://huggingface.co/>. More information: <doi:10.5117/CCR2023.1.003.CHAN>.
Group SLOPE (Group Sorted L1 Penalized Estimation) is a penalized linear regression method that is used for adaptive selection of groups of significant predictors in a high-dimensional linear model. The Group SLOPE method can control the (group) false discovery rate at a user-specified level (i.e., control the expected proportion of irrelevant groups among all selected groups of predictors). For additional information about the implemented methods please see Brzyski, Gossmann, Su, Bogdan (2018) <doi:10.1080/01621459.2017.1411269>.
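To make the "sorted L1" idea concrete, the following Python sketch evaluates a group sorted-L1 penalty: the group norms are sorted in decreasing order and paired with a non-increasing sequence of penalty values, so the largest group receives the largest penalty. This is an illustration under stated assumptions, not the package's code: the function name `group_sorted_l1_penalty`, its arguments, and the default unit group weights are hypothetical, and the sketch only evaluates the penalty rather than fitting a model.

```python
import math


def group_sorted_l1_penalty(beta, groups, lambdas, weights=None):
    """Evaluate sum_i lambdas[i] * (i-th largest weighted group norm).

    beta    : list of coefficients
    groups  : list of index lists, one per group (hypothetical layout)
    lambdas : non-increasing, non-negative penalty sequence, one per group
    weights : optional per-group weights (assumed to be 1 if omitted)
    """
    if len(lambdas) != len(groups):
        raise ValueError("need exactly one lambda per group")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")
    if weights is None:
        weights = [1.0] * len(groups)

    # Weighted Euclidean norm of each group's coefficients.
    norms = [
        w * math.sqrt(sum(beta[i] ** 2 for i in idx))
        for w, idx in zip(weights, groups)
    ]
    # Pair the largest norm with the largest lambda, and so on.
    norms.sort(reverse=True)
    return sum(lam * norm for lam, norm in zip(lambdas, norms))
```

Because the norms are sorted before they are paired with the penalty sequence, the value does not depend on the order in which the groups are listed.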
Supports modeling health outcomes using Bayesian hierarchical spatio-temporal models with complex covariate effects (e.g., linear, non-linear, interactions, distributed lag linear and non-linear models) in the INLA framework. It is designed to help users identify key drivers and predictors of disease risk by enabling streamlined model exploration, comparison, and visualization of complex covariate effects. See an application of the modelling framework in Lowe, Lee, O'Reilly et al. (2021) <doi:10.1016/S2542-5196(20)30292-8>.
This package provides a fragmentation spectra detection pipeline for high-throughput LC/HRMS data processing using peaklists generated by the IDSL.IPA workflow <doi:10.1021/acs.jproteome.2c00120>. The IDSL.CSA package can deconvolute fragmentation spectra from Composite Spectra Analysis (CSA), Data Dependent Acquisition (DDA) analysis, and various Data-Independent Acquisition (DIA) methods such as MS^E, All-Ion Fragmentation (AIF) and SWATH-MS analysis. The IDSL.CSA package was introduced in <doi:10.1021/acs.analchem.3c00376>.
Routines to handle family data with a pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the pedigree object with various criteria, and kinship for the X chromosome.
Estimation of multi-group count regression models (i.e., Poisson, negative binomial) with latent covariates. This package provides two extensions compared to ordinary count regression models based on a generalized linear model: First, measurement models for the predictors can be specified, which makes it possible to account for measurement error. Second, the count regression can be simultaneously estimated in multiple groups with stochastic group weights. The marginal maximum likelihood estimation is described in Kiefer & Mayer (2020) <doi:10.1080/00273171.2020.1751027>.
Simulate a (bivariate) multivariate renewal Hawkes (MRHawkes) self-exciting process, with given immigrant hazard rate functions and offspring density function. Calculate the likelihood of a MRHawkes process with given hazard rate functions and offspring density function for an (increasing) sequence of event times. Calculate the Rosenblatt residuals of the event times. Predict future event times based on observed event times up to a given time. For details see Stindl and Chen (2018) <doi:10.1016/j.csda.2018.01.021>.
This package provides tools for analyzing data generated from conjoint survey experiments, a method widely used in the social sciences for studying multidimensional preferences. The package implements estimation of marginal means (MMs) and average marginal component effects (AMCEs), with corrections for measurement error. Methods include profile-level and choice-level estimators, bias correction using intra-respondent reliability (IRR), and visualization utilities. For details on the methodology, see Clayton, Horiuchi, Kaufman, King, and Komisarchik (2025) <https://gking.harvard.edu/conjointE>.
This package implements the American Heart Association Predicting Risk of cardiovascular disease EVENTs (PREVENT) equations from Khan SS, Matsushita K, Sang Y, and colleagues (2023) <doi:10.1161/CIRCULATIONAHA.123.067626>, with optional comparison with their de facto predecessor, the Pooled Cohort Equations from the American Heart Association and American College of Cardiology (2013) <doi:10.1161/01.cir.0000437741.48606.98> and the revision to the Pooled Cohort Equations from Yadlowsky and colleagues (2018) <doi:10.7326/M17-3011>.
Monitoring reporting rates of subject-level clinical events (e.g. adverse events, protocol deviations) reported by clinical trial sites is an important aspect of a risk-based quality monitoring strategy. Sites that are under-reporting or over-reporting events can be detected using bootstrap simulations during which patients are redistributed between sites. Site-specific distributions of event reporting rates are generated that are used to assign probabilities to the observed reporting rates. See Koneswarakantha (2024) <doi:10.1007/s43441-024-00631-8>.
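The bootstrap idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the package's algorithm: the function name `prob_rate_as_low`, the use of per-patient event counts as input, and the choice to resample from a single study-wide pool are all assumptions made for this example.

```python
import random


def prob_rate_as_low(site_counts, pool_counts, n_sim=1000, seed=0):
    """Estimate how often a resampled site reports as few events as observed.

    site_counts : events per patient at the site under review
    pool_counts : events per patient across the study (assumed resampling pool)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(site_counts)
    observed = sum(site_counts) / n

    hits = 0
    for _ in range(n_sim):
        # Build a simulated site by drawing n patients with replacement.
        sample = rng.choices(pool_counts, k=n)
        if sum(sample) / n <= observed:
            hits += 1
    # A small value suggests the site reports fewer events than expected.
    return hits / n_sim
```

A low returned probability flags possible under-reporting; the mirror-image check for over-reporting would count simulated rates at or above the observed rate instead.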
This package aims to analyse count-based methylation data on predefined genomic regions, such as those obtained by targeted sequencing, and thus to identify differentially methylated regions (DMRs) that are associated with phenotypes or traits. The method is built on a rich, flexible model that allows the effects of multiple covariates on the methylation levels to vary smoothly along genomic regions. At the same time, this method also allows for sequencing errors and can adjust for variability in cell type mixture.
This program helps researchers quickly and comprehensively acquire disease-related genes in order to understand the mechanisms of disease. The data are integrated from three public databases: eDGAR, DrugBank, and MalaCards. eDGAR is a comprehensive database containing data on the relationship between diseases and genes. DrugBank contains information on 13443 drugs and 5157 targets. MalaCards integrates human disease information, including disease-related genes.
Automatic model selection for structural time series decomposition into trend, cycle, and seasonal components, plus optional structural interpolation, using the Kalman filter. Koopman, Siem Jan and Marius Ooms (2012) "Forecasting Economic Time Series Using Unobserved Components Time Series Models" <doi:10.1093/oxfordhb/9780195398649.013.0006>. Kim, Chang-Jin and Charles R. Nelson (1999) "State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications" <doi:10.7551/mitpress/6444.001.0001> <http://econ.korea.ac.kr/~cjkim/>.
Provides early termination phase II trial designs with a decreasingly informative prior (DIP) or a regular Bayesian prior chosen by the user. The program can determine the minimum planned sample size necessary to achieve the user-specified admissible designs. The program can also perform power and expected sample size calculations for the tests in early termination phase II trials. See Wang C and Sabo RT (2022) <doi:10.18203/2349-3259.ijct20221110>; Sabo RT (2014) <doi:10.1080/10543406.2014.888441>.
BEAST2 (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAUti 2 (which is part of BEAST2) is a GUI tool that allows users to specify the many possible setups and generates the XML file BEAST2 needs to run. This package provides a way to create BEAST2 input files without active user input, using R function calls instead.
This package provides functions to perform the following analyses: i) inferring epistasis from RNAi double knockdown data; ii) identifying gene pairs of multiple mutation patterns; iii) assessing association between gene pairs and survival; and iv) calculating the smallworldness of a graph (e.g., a gene interaction network). Data and analyses are described in Wang, X., Fu, A. Q., McNerney, M. and White, K. P. (2014). Widespread genetic epistasis among breast cancer genes. Nature Communications. 5 4828. <doi:10.1038/ncomms5828>.
This package provides various functions for reading and preparing the Panel Study of Income Dynamics (PSID) for longitudinal analysis, including functions that read the PSID's fixed width format files directly into R, rename all of the PSID's longitudinal variables so that recurring variables have consistent names across years, simplify assembling longitudinal datasets from cross sections of the PSID Family Files, and export the resulting PSID files into file formats common among other statistical programming languages ('SAS', 'STATA', and 'SPSS').
Simultaneously estimates sparse regression coefficients and response network structure in multivariate models with missing data. Unlike traditional approaches requiring imputation, handles missingness natively through unbiased estimating equations (MCAR/MAR compatible). Employs dual L1 regularization with automated selection via cross-validation or information criteria. Includes parallel computation, warm starts, adaptive grids, publication-ready visualizations, and prediction methods. Ideal for genomics, neuroimaging, and multi-trait studies with incomplete high-dimensional outcomes. See Zeng et al. (2025) <doi:10.48550/arXiv.2507.05990>.
This R package calculates the between-cases AUC estimate, the corresponding covariance, and the variance estimate in the nested data problem. The package also includes a function to simulate nested data. The calculated between-cases AUC estimate is used to evaluate the reader's diagnostic performance in clinical tasks with nested data. For more details on the above methods, please refer to the paper by H Du, S Wen, Y Guo, F Jin, BD Gallas (2022) <doi:10.1177/09622802221111539>.
This package provides tools for the analysis of land use and cover (LUC) time series. It includes support for loading spatiotemporal raster data and synthesized spatial plotting. Several LUC change (LUCC) metrics in regular or irregular time intervals can be extracted and visualized through one-step and multistep Sankey and chord diagrams. A complete intensity analysis according to Aldwaik and Pontius (2012) <doi:10.1016/j.landurbplan.2012.02.010> is implemented, including tools for the generation of standardized multilevel output graphics.
Creation of linkage maps in polyploid species from marker dosage scores of an F1 cross from two heterozygous parents. Currently works for outcrossing diploid, autotriploid, autotetraploid and autohexaploid species, as well as segmental allotetraploids. Methods are described in a manuscript of Bourke et al. (2018) <doi:10.1093/bioinformatics/bty371>. Since version 1.1.0, both discrete and probabilistic genotypes are acceptable input; for more details on the latter see Liao et al. (2021) <doi:10.1007/s00122-021-03834-x>.
This package contains functions to fit a proportional hazards (PH) model to partly interval-censored (PIC) data (Pan et al. (2020) <doi:10.1177/0962280220921552>), a PH model with spatial frailty to spatially dependent PIC data (Pan and Cai (2021) <doi:10.1080/03610918.2020.1839497>), and a mixed effects PH model to clustered PIC data. Each random intercept/random effect can follow either a normal prior or a Dirichlet process mixture prior. It also includes the corresponding functions for general interval-censored data.
Various functions for discrete time survival analysis and longitudinal analysis. SIMEX method for correcting bias due to errors-in-variables in a mixed effects model. Asymptotic mean and variance of different proportional hazards test statistics using different ties methods given two survival curves and censoring distributions. Score test and Wald test for regression analysis of grouped survival data. Calculation of survival curves for events defined by the response variable in a mixed effects model crossing a threshold with or without confirmation.