This package provides a toolkit for causal inference in experimental and observational studies. Implements various simple Bayesian models including linear, negative binomial, and logistic regression for impact estimation. Provides functionality for randomization and checking baseline equivalence in experimental designs. The package aims to simplify the process of impact measurement for researchers and analysts across different fields. Examples and detailed usage instructions are available at <https://book.martinez.fyi>.
This package provides a variable selection procedure, dubbed KKO, for nonparametric additive model with finite-sample false discovery rate control guarantee. The method integrates three key components: knockoffs, subsampling for stability, and random feature mapping for nonparametric function approximation. For more information, see the accompanying paper: Dai, X., Lyu, X., & Li, L. (2021). â Kernel Knockoffs Selection for Nonparametric Additive Modelsâ . arXiv preprint <arXiv:2105.11659>.
Fast manipulation of symbolic multivariate polynomials using the Map class of the Standard Template Library. The package uses print and coercion methods from the mpoly package but offers speed improvements. It is comparable in speed to the spray package for sparse arrays, but retains the symbolic benefits of mpoly'. To cite the package in publications, use Hankin 2022 <doi:10.48550/ARXIV.2210.15991>. Uses disordR discipline.
Calculate the maximal fat oxidation, the exercise intensity that elicits the maximal fat oxidation and the SIN model to represent the fat oxidation kinetics. Three variables can be obtained from the SIN model: dilatation, symmetry and translation. Examples of these methods can be found in Montes de Oca et al (2021) <doi:10.1080/17461391.2020.1788650> and Chenevière et al. (2009) <doi:10.1249/MSS.0b013e31819e2f91>.
Offers a gentle introduction to machine learning concepts for practitioners with a statistical pedigree: decomposition of model error (bias-variance trade-off), nonlinear correlations, information theory and functional permutation/bootstrap simulations. Székely GJ, Rizzo ML, Bakirov NK. (2007). <doi:10.1214/009053607000000505>. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC. (2011). <doi:10.1126/science.1205438>.
Simulation, estimation, prediction procedure, and model identification methods for nonlinear time series analysis, including threshold autoregressive models, Markov-switching models, convolutional functional autoregressive models, nonlinearity tests, Kalman filters and various sequential Monte Carlo methods. More examples and details about this package can be found in the book "Nonlinear Time Series Analysis" by Ruey S. Tsay and Rong Chen, John Wiley & Sons, 2018 (ISBN: 978-1-119-26407-1).
This package provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.
This package implements various independence tests for discrete, continuous, and infinite-dimensional data. The tests are based on a U-statistic permutation test, the USP of Berrett, Kontoyiannis and Samworth (2020) <arXiv:2001.05513>, and shown to be minimax rate optimal in a wide range of settings. As the permutation principle is used, all tests have exact, non-asymptotic Type I error control at the nominal level.
Compute the standard expected years of life lost (YLL), as developed by the Global Burden of Disease Study (Murray, C.J., Lopez, A.D. and World Health Organization, 1996). The YLL is based on comparing the age of death to an external standard life expectancy curve. It also computes the average YLL, which highlights premature causes of death and brings attention to preventable deaths (Aragon et al., 2008).
The mzR package provides a unified API to the common file formats and parsers available for mass spectrometry data. It comes with a wrapper for the ISB random access parser for mass spectrometry mzXML, mzData and mzML files. The package contains the original code written by the ISB, and a subset of the proteowizard library for mzML and mzIdentML. The netCDF reading code has previously been used in XCMS.
The compound growth rate indicates the percentage change of a specific variable over a defined period. It is calculated using non-linear models, particularly the exponential model. To estimate the compound growth rates, the growth model is first converted to semilog form and then analyzed using Ordinary Least Squares (OLS) regression. This package has been developed using concept of Shankar et al. (2022)<doi:10.3389/fsufs.2023.1208898>.
Builds on the EMD package to provide additional tools for empirical mode decomposition (EMD) and Hilbert spectral analysis. It also implements the ensemble empirical decomposition (EEMD) and the complete ensemble empirical mode decomposition (CEEMD) methods to avoid mode mixing and intermittency problems found in EMD analysis. The package comes with several plotting methods that can be used to view intrinsic mode functions, the HHT spectrum, and the Fourier spectrum.
Multiscale Graph Correlation (MGC) is a framework developed by Vogelstein et al. (2019) <DOI:10.7554/eLife.41690> that extends global correlation procedures to be multiscale; consequently, MGC tests typically require far fewer samples than existing methods for a wide variety of dependence structures and dimensionalities, while maintaining computational efficiency. Moreover, MGC provides a simple and elegant multiscale characterization of the potentially complex latent geometry underlying the relationship.
Market area models are used to analyze and predict store choices and market areas concerning retail and service locations. This package implements two market area models (Huff Model, Multiplicative Competitive Interaction Model) into R, while the emphases lie on 1.) fitting these models based on empirical data via OLS regression and nonlinear techniques and 2.) data preparation and processing (esp. interaction matrices and data preparation for the MCI Model).
Analyse, plot, and tabulate antimicrobial minimum inhibitory concentration (MIC) data. Validate the results of an MIC experiment by comparing observed MIC values to a gold standard assay, in line with standards from the International Organization for Standardization (2021) <https://www.iso.org/standard/79377.html>. Perform MIC prediction from whole genome sequence data stored in the Pathosystems Resource Integration Center (2013) <doi:10.1093/nar/gkt1099> database or locally.
The goal of pak is to make package installation faster and more reliable. In particular, it performs all HTTP operations in parallel, so metadata resolution and package downloads are fast. Metadata and package files are cached on the local disk as well. pak has a dependency solver, so it finds version conflicts before performing the installation. This version of pak supports CRAN, Bioconductor and GitHub packages as well.
Pathway analysis based on p-values associated to genes from a genes expression analysis of interest. Utility functions enable to extract pathways from the Gene Ontology Biological Process (GOBP), Molecular Function (GOMF) and Cellular Component (GOCC), Kyoto Encyclopedia of Genes of Genomes (KEGG) and Reactome databases. Methodology, and helper functions to display the results as a table, barplot of pathway significance, Gene Ontology graph and pathway significance are available.
Call job::job(<code here>) to run R code as an RStudio job and keep your console free in the meantime. This allows for a productive workflow while testing (multiple) long-running chunks of code. It can also be used to organize results using the RStudio Jobs GUI or to test code in a clean environment. Two RStudio Addins can be used to run selected code as a job.
Fitting, cross-validating, and predicting with Bayesian Knowledge Tracing (BKT) models. It is designed for analyzing educational datasets to trace student knowledge over time. The package includes functions for fitting BKT models, evaluating their performance using various metrics, and making predictions on new data. It provides the similar functionality as the Python package pyBKT authored by Zachary A. Pardos (zp@berkeley.edu) at <https://github.com/CAHLR/pyBKT>.
Numerical integration of cause-specific survival curves to arrive at cause-specific cumulative incidence functions, with three usage modes: 1) Convenient API for parametric survival regression followed by competing-risk analysis, 2) API for CFC, accepting user-specified survival functions in R, and 3) Same as 2, but accepting survival functions in C++. For mathematical details and software tutorial, see Mahani and Sharabiani (2019) <DOI:10.18637/jss.v089.i09>.
Developed as a collaboration between Earth lab and the North Central Climate Adaptation Science Center to help users gain insights from available climate data. Includes tools and instructions for downloading climate data via a USGS API and then organizing those data for visualization and analysis that drive insight. Web interface for USGS API can be found at <http://thredds.northwestknowledge.net:8080/thredds/reacch_climate_CMIP5_aggregated_macav2_catalog.html>.
Mixed model analysis for quantitative genetics with multi-trait responses and pedigree-based partitioning of individual variation into a range of environmental and genetic variance components for individual and maternal effects. Method documented in dmmOverview.pdf; dmm is an implementation of dispersion mean model described by Searle et al. (1992) "Variance Components", Wiley, NY. Dmm() can do MINQUE', bias-corrected-ML', and REML variance and covariance component estimates.
Intensity-duration-frequency (IDF) curves are a widely used analysis-tool in hydrology to assess extreme values of precipitation [e.g. Mailhot et al., 2007, <doi:10.1016/j.jhydrol.2007.09.019>]. The package IDF provides functions to estimate IDF parameters for given precipitation time series on the basis of a duration-dependent generalized extreme value distribution [Koutsoyiannis et al., 1998, <doi:10.1016/S0022-1694(98)00097-3>].
Psychometric analysis and scoring of judgment data using polytomous Item-Response Theory (IRT) models, as described in Myszkowski and Storme (2019) <doi:10.1037/aca0000225> and Myszkowski (2021) <doi:10.1037/aca0000287>. A function is used to automatically compare and select models, as well as to present a variety of model-based statistics. Plotting functions are used to present category curves, as well as information, reliability and standard error functions.