Fits sparse interaction models for continuous and binary responses subject to the strong (or weak) hierarchy restriction that an interaction between two variables is included only if both (or at least one) of the variables are included as main effects. For more details, see Bien, J., Taylor, J., and Tibshirani, R. (2013) "A Lasso for Hierarchical Interactions." Annals of Statistics, 41(3), 1111-1141.
The penalized inverse-variance weighted (pIVW) estimator is a Mendelian randomization method for estimating the causal effect of an exposure variable on an outcome of interest based on summary-level GWAS data. The pIVW estimator accounts for weak instruments and balanced horizontal pleiotropy simultaneously. See Xu S., Wang P., Fung W.K. and Liu Z. (2022) <doi:10.1111/biom.13732>.
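For orientation, the classical (unpenalized) IVW estimator that pIVW builds on can be sketched as follows. This is not the penalized estimator from the cited paper and not the package's code; the function and argument names are hypothetical, and the weights are assumed to be the inverse variances of the variant-outcome estimates.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Classical IVW causal-effect estimate from GWAS summary statistics.

    beta_exposure: variant-exposure effect estimates (one per instrument)
    beta_outcome:  variant-outcome effect estimates
    se_outcome:    standard errors of the variant-outcome estimates
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    # Inverse-variance weights taken from the outcome standard errors.
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return numerator / denominator
```

With weak instruments the denominator is inflated by estimation noise in the exposure effects, which biases this plain ratio toward zero; that is the kind of bias the penalized variant is designed to address.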
Implementation of a next-generation, multi-stock age-structured fisheries assessment model. multiSA is intended for use in mixed fisheries where stock composition cannot be readily identified from fishery data alone, e.g., from catch and age/length composition. Models can be fitted to genetic data, e.g., stock composition of catches and close-kin pairs, with seasonal stock availability and movement.
Extends the mlr3 ecosystem to functional analysis by adding support for irregular and regular functional data as defined in the tf package. The package provides PipeOps for preprocessing functional columns and for extracting scalar features, thereby allowing standard machine learning algorithms to be applied afterwards. Available operations include simple functional features such as the mean or maximum, smoothing, interpolation, flattening, and functional PCA.
Comprehensive network analysis package. Calculates correlation networks quickly and accelerates many analyses through parallel computing. Supports multi-omics data and efficient searching for sub-networks. Handles large data sets with more than 10,000 nodes per omics layer. Offers various layout methods for multi-omics networks and interfaces to other software (Gephi, Cytoscape, ggplot2) for easy visualization. Provides comprehensive calculation of topological indices, including ecological network stability.
This package provides transfusion-related differential tests on near-infrared spectroscopy (NIRS) time series with detection limits. It includes two test statistics: the Mean Area Under the Curve (MAUC) and a slope statistic. The package applies a penalized spline method within an imputation setting, and testing is conducted by a nested permutation approach within imputation. Refer to Guo et al. (2018) <doi:10.1177/0962280218786302> for further details.
Distributed reproducible computing framework, adopting ideas from git, docker and other software. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.
XKCD described a supposedly "bad" colormap that it called a "Painbow" (see <https://xkcd.com/2537/>). But simple tests demonstrate that under some circumstances, the colormap can perform very well, and people can find information that is difficult to detect with the ggplot2 default and even supposedly "good" colormaps like viridis. This library lets you use the Painbow in your own ggplot graphs.
Consolidated data simulation, sample size calculation and analysis functions for several snSMART (small sample sequential, multiple assignment, randomized trial) designs under one library. See Wei, B., Braun, T.M., Tamura, R.N. and Kidwell, K.M. "A Bayesian analysis of small n sequential multiple assignment randomized trials (snSMARTs)." (2018) Statistics in Medicine, 37(26), pp.3723-3732 <doi:10.1002/sim.7900>.
Estimates Shapley values using the algorithm in Liuqing Yang, Yongdao Zhou, Haoda Fu, Min-Qian Liu and Wei Zheng (2024) <doi:10.1080/01621459.2023.2257364> "Fast Approximation of the Shapley Values Based on Order-of-Addition Experimental Designs". You provide the data and define the value function; the package returns the estimated Shapley values based on sampling methods or experimental designs.
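To make the idea of a user-defined value function concrete, here is a minimal sketch of exact and permutation-sampled Shapley values. It illustrates only the generic sampling approach, not the order-of-addition designs from the cited paper or the package's interface; all names are hypothetical.

```python
import itertools
import random


def _marginals(order, value, totals):
    """Add each player's marginal contribution along one ordering."""
    coalition = frozenset()
    previous = value(coalition)
    for player in order:
        coalition = coalition | {player}
        current = value(coalition)
        totals[player] += current - previous
        previous = current


def shapley_exact(players, value):
    """Average marginal contributions over all orderings (small games only)."""
    totals = {p: 0.0 for p in players}
    count = 0
    for order in itertools.permutations(players):
        _marginals(order, value, totals)
        count += 1
    return {p: t / count for p, t in totals.items()}


def shapley_sampled(players, value, n_samples, seed=0):
    """Monte Carlo estimate from randomly sampled orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = list(players)
        rng.shuffle(order)
        _marginals(order, value, totals)
    return {p: t / n_samples for p, t in totals.items()}


# Hypothetical value function: squared total weight of the coalition.
WEIGHTS = {"a": 1.0, "b": 2.0, "c": 3.0}


def squared_weight(coalition):
    return sum(WEIGHTS[p] for p in coalition) ** 2


exact = shapley_exact(list(WEIGHTS), squared_weight)
approx = shapley_sampled(list(WEIGHTS), squared_weight, n_samples=200)
```

The exact version enumerates every ordering, so its cost grows factorially with the number of players; this is why sampling or designed subsets of orderings are used in practice.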
Indirect method for the estimation of reference intervals (RIs) using Real-World Data ('RWD') and methods for comparing and verifying RIs. Estimates RIs by applying advanced statistical methods to routine diagnostic test measurements, which include both pathological and non-pathological samples, to model the distribution of non-pathological samples. This distribution is then used to derive reference intervals and support RI verification, i.e., deciding if a specific RI is suitable for the local population. The package also provides functions for printing and plotting algorithm results. See ?refineR for a detailed description of features. Version 1.0 of the algorithm is described in Ammer et al. (2021) <doi:10.1038/s41598-021-95301-2>. Additional guidance is in Ammer et al. (2023) <doi:10.1093/jalm/jfac101>. The verification method is described in Beck et al. (2025) <doi:10.1515/cclm-2025-0728>.
This package provides a set of tools for creation, manipulation, and modeling of tensors with an arbitrary number of modes. A tensor in the context of data analysis is a multidimensional array. rTensor supports this by providing an S4 class Tensor that wraps around the base array class. rTensor provides common tensor operations as methods, including matrix unfolding, summing/averaging across modes, calculating the Frobenius norm, and taking the inner product between two tensors. Familiar array operations are overloaded, such as index subsetting via [ and element-wise operations. rTensor also implements various tensor decompositions, including CP, GLRAM, MPCA, PVD, and Tucker. For tensors with 3 modes, rTensor also implements transpose, t-product, and t-SVD, as defined in Kilmer et al. (2013). Some auxiliary functions include the Khatri-Rao product, Kronecker product, and the Hadamard product for a list of matrices.
EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.
This package provides a function for estimating the parameters of Structural Bayesian Vector Autoregression models with the method developed by Baumeister and Hamilton (2015) <doi:10.3982/ECTA12356>, Baumeister and Hamilton (2017) <doi:10.3386/w24167>, and Baumeister and Hamilton (2018) <doi:10.1016/j.jmoneco.2018.06.005>. Functions for plotting impulse responses, historical decompositions, and posterior distributions of model parameters are also provided.
This package provides a tool for the preparation and enrichment of health datasets for analysis (Toner et al. (2023) <doi:10.1093/gigascience/giad030>). Provides functionality for assessing data quality and for improving the reliability and machine interpretability of a dataset. eHDPrep also enables semantic enrichment of a dataset where metavariables are discovered from the relationships between input variables determined from user-provided ontologies.
This package provides a shiny-based graphical user interface for the earth package, enabling interactive building and exploration of Multivariate Adaptive Regression Splines (MARS) models. Features include data import from CSV and Excel files, automatic detection of categorical variables, interactive control of interaction terms via an allowed matrix, comprehensive model diagnostics with variable importance and partial dependence plots, and publication-quality report generation via Quarto.
Solves a least squares system Ax~=b (dim(A)=(m,n) with m >= n) with a preconditioner matrix B: BAx=Bb (dim(B)=(n,m)). The implemented method is based on GMRES (Saad, Youcef; Schultz, Martin H. (1986). "GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems" <doi:10.1137/0907058>) with callback functions, i.e., no explicit A, B, or b is required.
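The callback-based formulation can be sketched with a small unrestarted GMRES applied to the square system BAx=Bb. This is an illustrative sketch, not the package's implementation: the function names are hypothetical, and the example assumes B is the transpose of A, in which case BAx=Bb is the normal equations of the least squares problem.

```python
import math


def gmres(apply_op, rhs, max_iter=None, tol=1e-10):
    """Unrestarted GMRES for apply_op(x) = rhs, starting from x = 0."""
    n = len(rhs)
    max_iter = max_iter or n
    beta = math.sqrt(sum(r * r for r in rhs))
    if beta == 0.0:
        return [0.0] * n
    basis = [[r / beta for r in rhs]]   # orthonormal Krylov vectors
    cols = []                           # rotated Hessenberg columns
    g = [beta]                          # rotated right-hand side
    rotations = []
    for k in range(max_iter):
        w = apply_op(basis[k])
        col = []
        for v in basis:                 # modified Gram-Schmidt
            h = sum(wi * vi for wi, vi in zip(w, v))
            col.append(h)
            w = [wi - h * vi for wi, vi in zip(w, v)]
        h_next = math.sqrt(sum(wi * wi for wi in w))
        col.append(h_next)
        for i, (c, s) in enumerate(rotations):   # earlier Givens rotations
            col[i], col[i + 1] = (c * col[i] + s * col[i + 1],
                                  -s * col[i] + c * col[i + 1])
        denom = math.hypot(col[k], col[k + 1])
        if denom == 0.0:                # singular operator: stop early
            break
        c, s = col[k] / denom, col[k + 1] / denom
        rotations.append((c, s))
        col[k], col[k + 1] = denom, 0.0
        g.append(-s * g[k])
        g[k] = c * g[k]
        cols.append(col)
        if abs(g[k + 1]) <= tol * beta or h_next <= 1e-14:
            break
        basis.append([wi / h_next for wi in w])
    m = len(cols)
    y = [0.0] * m
    for i in reversed(range(m)):        # back substitution
        acc = g[i] - sum(cols[j][i] * y[j] for j in range(i + 1, m))
        y[i] = acc / cols[i][i]
    return [sum(y[j] * basis[j][i] for j in range(m)) for i in range(n)]


def solve_least_squares(apply_A, apply_B, b, tol=1e-10):
    """Solve BAx = Bb using only the callbacks x -> Ax and r -> Br."""
    return gmres(lambda x: apply_B(apply_A(x)), apply_B(b), tol=tol)


# Example callbacks; A is only used inside them (assumption: B = A^T).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]


def apply_A(x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def apply_B(r):
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


x = solve_least_squares(apply_A, apply_B, b)
```

Because the solver only ever calls the two callbacks, A and B can be implicit operators (for example, the result of a simulation or a sparse product) rather than stored matrices.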
This package provides tools for manipulating, visualizing, and exporting raster images in R. Designed as an educational resource for students learning the basics of remote sensing, the package provides user-friendly functions to apply color ramps, export RGB composites, and create multi-frame visualizations. Built on top of the terra and ggplot2 packages. See <https://github.com/ducciorocchini/imageRy> for more details and examples.
Estimates Variable Length Markov Chain (VLMC) models and VLMC models with covariates from discrete sequences. Supports model selection via information criteria and simulation of new sequences from an estimated model. See Bühlmann, P. and Wyner, A. J. (1999) <doi:10.1214/aos/1018031204> for VLMC and Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022) <doi:10.1111/jtsa.12615> for VLMC with covariates.
This package provides a collection of methods for large-scale single mediator hypothesis testing. The six included methods for testing the mediation effect are Sobel's test, Max P test, joint significance test under the composite null hypothesis, high dimensional mediation testing, divide-aggregate composite null test, and Sobel's test under the composite null hypothesis. See Du et al. (2023) <doi:10.1002/gepi.22510> for details.
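As a brief illustration of the two simplest tests in that list, the textbook Sobel and Max P statistics for a single mediator can be sketched as below. This uses the standard first-order Sobel standard error and a normal reference distribution; it is not the package's code, the function names are hypothetical, and the composite-null variants are not covered.

```python
import math


def normal_two_sided_p(z):
    """Two-sided p-value of z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sobel_test(a, se_a, b, se_b):
    """Sobel test for the mediation effect a * b.

    a, se_a: exposure -> mediator effect and its standard error
    b, se_b: mediator -> outcome effect and its standard error
    Undefined (division by zero) when both a and b are exactly zero.
    """
    z = (a * b) / math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return z, normal_two_sided_p(z)


def max_p_test(a, se_a, b, se_b):
    """Max P test: the larger of the two path-wise p-values."""
    return max(normal_two_sided_p(a / se_a), normal_two_sided_p(b / se_b))
```

Both tests are known to be conservative when neither path is active, which is the motivation for the composite-null methods listed above.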
This package provides a framework for multiple hypothesis testing based on the distribution of p-values. It is well known that p-values follow different distributions under the null and alternative hypotheses; this package provides functions to detect that change. It also provides a method that uses the change in the distribution of p-values to detect the true signals in the data.
Applies an objective Bayesian method to the Mb capture-recapture model to estimate the population size N. The Mb model is a class of capture-recapture methods used to account for variations in capture probability due to animal behavior. Under the Mb formulation, the initial capture of an animal may affect the probability of subsequent captures because the animal becomes "trap happy" or "trap shy".
An interface to the Apache OpenNLP tools (version 1.5.3). The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text written in Java. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. See <https://opennlp.apache.org/> for more information.
Simple method of purging independent variables of mediating effects. First, regress the direct variable on the indirect variable. Then, use the stored residuals as the new purged (direct) variable in the updated specification. This purging process allows for use of a new direct variable uncorrelated with the indirect variable. Please cite the method and/or package using Waggoner, Philip D. (2018) <doi:10.1177/1532673X18759644>.
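The two-step procedure can be sketched as follows, assuming a simple linear regression with an intercept and a single indirect variable. The function name and data are hypothetical, and the package itself may support other model types.

```python
def purge(direct, indirect):
    """Return residuals from regressing `direct` on `indirect` (with intercept)."""
    if len(direct) != len(indirect):
        raise ValueError("inputs must have the same length")
    n = len(direct)
    mean_x = sum(indirect) / n
    mean_y = sum(direct) / n
    sxx = sum((x - mean_x) ** 2 for x in indirect)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indirect, direct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The residuals are the purged version of the direct variable.
    return [y - (intercept + slope * x) for x, y in zip(indirect, direct)]


# Hypothetical data: the direct variable partly depends on the indirect one.
indirect = [1.0, 2.0, 3.0, 4.0, 5.0]
direct = [2.1, 3.9, 6.2, 7.8, 10.1]
purged = purge(direct, indirect)
```

By construction, the least squares residuals have zero sample covariance with the regressor, which is what makes the purged variable uncorrelated with the indirect variable.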