This package implements the One Rule (OneR) Machine Learning classification algorithm (Holte, R.C. (1993) <doi:10.1023/A:1022631118932>) with enhancements for sophisticated handling of numeric data and missing values together with extensive diagnostic functions. It is useful as a baseline for machine learning models and the rules are often helpful heuristics.
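The core OneR idea can be sketched in a few lines of Python. This is a minimal illustration for purely categorical data, not the package's implementation: the function name `one_r` and the toy `weather` table are hypothetical, and the package's handling of numeric data and missing values is not reproduced. For every feature, the sketch builds a rule that predicts the majority class for each feature value, then keeps the feature whose rule makes the fewest training errors.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (best_feature, rule, errors) for categorical data.

    rows:   list of dicts mapping column name -> categorical value
    target: name of the class column
    (Hypothetical helper for illustration only.)
    """
    features = [name for name in rows[0] if name != target]
    best = None
    for feature in features:
        # Count class frequencies for each value of this feature.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[feature]][row[target]] += 1
        # The rule predicts the majority class for each feature value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors made by this single-feature rule.
        errors = sum(row[target] != rule[row[feature]] for row in rows)
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


# Toy data, invented for this example.
weather = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

feature, rule, errors = one_r(weather, "play")
print(feature, rule, errors)
```

On this toy table the rule built on `outlook` misclassifies one row while the rule built on `windy` misclassifies two, so `outlook` is selected.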
The portmanteau local feature discriminant approach first identifies the local discriminant features and their differential structures, then constructs the discriminant rule by pooling the identified local features together. This method is applicable to high-dimensional matrix-variate data. See the paper by Xu, Luo and Chen (2023, <doi:10.1007/s13171-021-00255-2>).
Given a project schedule and associated costs, this package calculates the earned value to date. It is an implementation of Project Management Body of Knowledge (PMBOK) methodologies (reference Project Management Institute. (2021). A guide to the Project Management Body of Knowledge (PMBOK guide) (7th ed.). Project Management Institute, Newtown Square, PA, ISBN 9781628256673 (pdf)).
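The standard earned value quantities can be illustrated with a short Python sketch. The function `earned_value`, the task fields, and the sample figures are hypothetical and are not the package's interface; the formulas are the usual ones: planned value (PV), earned value (EV), actual cost (AC), cost variance CV = EV - AC, schedule variance SV = EV - PV, and the indices CPI = EV / AC and SPI = EV / PV.

```python
def earned_value(tasks):
    """Compute standard earned value metrics.

    Each task is a dict with (hypothetical) keys:
      budget       budgeted cost of the task
      planned_pct  fraction of the task scheduled to be done by now
      actual_pct   fraction of the task actually done
      actual_cost  money spent on the task so far
    """
    pv = sum(t["budget"] * t["planned_pct"] for t in tasks)  # planned value
    ev = sum(t["budget"] * t["actual_pct"] for t in tasks)   # earned value
    ac = sum(t["actual_cost"] for t in tasks)                # actual cost
    return {
        "PV": pv,
        "EV": ev,
        "AC": ac,
        "CV": ev - ac,   # cost variance (negative means over budget)
        "SV": ev - pv,   # schedule variance (negative means behind schedule)
        "CPI": ev / ac,  # cost performance index
        "SPI": ev / pv,  # schedule performance index
    }


# Invented example: design is finished but over budget, build is behind schedule.
tasks = [
    {"budget": 1000, "planned_pct": 1.0, "actual_pct": 1.0, "actual_cost": 1200},
    {"budget": 4000, "planned_pct": 0.5, "actual_pct": 0.25, "actual_cost": 800},
]

metrics = earned_value(tasks)
print(metrics)
```

Here PV is 3000, EV is 2000 and AC is 2000, so the project is on budget overall (CPI of 1.0) but behind schedule (SPI of about 0.67).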
Quantile-based estimators (Q-estimators) can be used to fit any parametric distribution, using its quantile function. Q-estimators are usually more robust than standard maximum likelihood estimators. The method is described in: Sottile G. and Frumento P. (2022). Robust estimation and regression with parametric quantile functions. <doi:10.1016/j.csda.2022.107471>.
Strength training prescription using a percent-based approach requires numerous computations and assumptions. The STMr package allows users to estimate individual reps-max relationships, implement various progression tables, and create numerous set and rep schemes. The STMr package was originally created as a tool to help write Jovanović M. (2020) Strength Training Manual <ISBN:979-8604459898>.
Offers Bayesian semiparametric density estimation and tail-index estimation for heavy tailed data, by using a parametric, tail-respecting transformation of the data to the unit interval and then modeling the transformed data with a purely nonparametric logistic Gaussian process density prior. Based on Tokdar et al. (2022) <doi:10.1080/01621459.2022.2104727>.
Sparsity Oriented Importance Learning (SOIL) provides a new variable importance measure for high dimensional linear regression and logistic regression from a sparse penalization perspective, by taking into account the variable selection uncertainty via the use of a sensible model weighting. The package is an implementation of Ye, C., Yang, Y., and Yang, Y. (2017+).
Fast, reproducible detection and quantitative analysis of tertiary lymphoid structures (TLS) in multiplexed tissue imaging. Implements the Independent Component Analysis Trace (ICAT) index, local Ripley's K scanning, automated K Nearest Neighbor (KNN)-based TLS detection, and T-cell cluster identification as described in Amiryousefi et al. (2025) <doi:10.1101/2025.09.21.677465>.
This is an interactive statistical tool that provides multivariate statistical tests that are more powerful than the traditional Hotelling T2 test and LRT (likelihood ratio test) for the mean vector of normal populations, with and without contamination, and of non-normal populations (Henrique J. P. Alves & Daniel F. Ferreira (2019) <doi:10.1080/03610918.2019.1693596>).
This package provides a suite of routines for Weyl algebras. Notation follows Coutinho (1995, ISBN 0-521-55119-6, "A Primer of Algebraic D-Modules"). Uses disordR discipline (Hankin 2022 <doi:10.48550/arXiv.2210.03856>). To cite the package in publications, use Hankin 2022 <doi:10.48550/arXiv.2212.09230>.
MDQC is a multivariate quality assessment method for microarrays based on quality control (QC) reports. The Mahalanobis distance of an array's quality attributes is used to measure the similarity of the quality of that array against the quality of the other arrays. Then, arrays with unusually high distances can be flagged as potentially low-quality.
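The distance-based flagging described above can be sketched in plain Python. This is an illustration under stated assumptions, not MDQC itself: it uses the classical sample mean and covariance (the package may use different estimators), the helper names `solve` and `mahalanobis_distances` are hypothetical, the two-attribute quality scores are invented, and the cutoff of 2.0 is an arbitrary choice for the example.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis_distances(rows):
    """Distance of each row from the sample mean, scaled by the sample covariance."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - mean[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    distances = []
    for d in centered:
        x = solve(cov, d)  # x = cov^{-1} d, without forming the inverse
        distances.append(sqrt(sum(di * xi for di, xi in zip(d, x))))
    return distances


# Invented quality scores: nine similar arrays and one unusual array (last row).
qc = [
    [1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0], [0.0, 0.0],
    [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0], [4.0, 4.0],
]

distances = mahalanobis_distances(qc)
cutoff = 2.0  # arbitrary threshold chosen for this example
flagged = [i for i, d in enumerate(distances) if d > cutoff]
print(flagged)
```

With these scores only the last array exceeds the cutoff. Because a single extreme array also inflates the classical covariance estimate, real analyses often prefer robust estimators; that refinement is left out of this sketch.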
Many modern biological datasets consist of small counts that are not well fit by standard linear-Gaussian methods such as principal component analysis. This package provides implementations of count-based feature selection and dimension reduction algorithms. These methods can be used to facilitate unsupervised analysis of any high-dimensional data such as single-cell RNA-seq.
This package is an implementation of about 6 major classes of statistical regression models. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate generalised linear models.
reptyr is a utility for taking an existing running program and attaching it to a new terminal. Started a long-running process over ssh, but have to leave and don't want to interrupt it? Just start a screen, use reptyr to grab it, and then kill the ssh session and head on home.
Discovery of genome-wide variable alternative splicing events from short-read RNA-seq data and visualizations of gene splicing information for publication-quality multi-panel figures in a population. (Warning: the visualization function has been removed because the Sushi package it depends on is deprecated. To use it, please revert to an older version.)
This package provides tools for the quantitative analysis of axon integrity in microscopy images. It implements image pre-processing, adaptive thresholding, feature extraction, and support vector machine-based classification to compute indices such as the Axon Integrity Index (AII) and Degeneration Index (DI). The package is designed for reproducible and automated analysis in neuroscience research.
This package provides a collection of functions related to density estimation by using Chen's (2000) idea. Mean Squared Errors (MSE) are calculated for estimated curves. For this purpose, R functions allow the distribution to be Gamma, Exponential or Weibull. For details see Chen (2000), Scaillet (2004) <doi:10.1080/10485250310001624819> and Khan and Akbar.
This comprehensive framework for periodic time series modeling is designated as "CLIC" (The LIC for Distributed Cosine Regression Analysis) analysis. It is predicated on the assumption that the underlying data exhibits complex periodic structures beyond simple harmonic components. The philosophy of the method is articulated in Guo G. (2020) <doi:10.1080/02664763.2022.2053949>.
Latent process embedding for functional network data with the Functional Adjacency Spectral Embedding. Fits smooth latent processes based on cubic spline bases. Also generates functional network data from three models, and evaluates a network generalized cross-validation criterion for dimension selection. For more information, see MacDonald, Zhu and Levina (2022+) <arXiv:2210.07491>.
The half-weight index gregariousness (HWIG) is an association index used in social network analyses. It extends the half-weight association index (HWI), correcting for level of gregariousness in individuals. It is calculated using group by individual data according to methods described in Godde et al. (2013) <doi:10.1016/j.anbehav.2012.12.010>.
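As a rough illustration of the underlying half-weight index (HWI) computed from group by individual data, consider the Python sketch below. It assumes every observed group comes from a separate sampling occasion, so two individuals are never recorded in different groups at the same time; under that assumption HWI reduces to 2x / (n_a + n_b), where x is the number of groups containing both individuals and n_a, n_b are the numbers of groups containing each. The function name and the sample groups are hypothetical, and the gregariousness correction of Godde et al. (2013) that turns HWI into HWIG is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def half_weight_index(groups):
    """Return {(a, b): HWI} for every pair seen together at least once.

    groups: list of sets of individual IDs, one set per observed group.
    Assumes each group is a separate sampling occasion (see lead-in).
    """
    seen = Counter()      # number of groups each individual appears in
    together = Counter()  # number of groups each pair appears in together
    for group in groups:
        seen.update(group)
        together.update(combinations(sorted(group), 2))
    return {
        (a, b): 2 * x / (seen[a] + seen[b])
        for (a, b), x in together.items()
    }


# Invented observations of three individuals.
groups = [{"A", "B"}, {"A", "B", "C"}, {"C"}, {"A"}]

hwi = half_weight_index(groups)
print(hwi)
```

Pairs that were never observed together do not appear in the result and have an index of zero.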
Creating effective colour palettes for figures is challenging. This package generates and plots palettes of optimally distinct colours in perceptually uniform colour space, based on iwanthue <http://tools.medialab.sciences-po.fr/iwanthue/>. This is done through k-means clustering of CIE Lab colour space, according to user-selected constraints on hue, chroma, and lightness.
This package implements Interpretable Boosted Linear Models (IBLMs). These combine a conventional generalized linear model (GLM) with a machine learning component, such as XGBoost. The package also provides tools for explaining and analyzing these models. For more details see Gawlowski and Wang (2025) <https://ifoa-adswp.github.io/IBLM/reference/figures/iblm_paper.pdf>.
This package provides functions for dimension reduction, using MAVE (Minimum Average Variance Estimation), OPG (Outer Product of Gradient) and KSIR (sliced inverse regression of kernel version). Methods for selecting the best dimension are also included. Xia (2002) <doi:10.1111/1467-9868.03411>; Xia (2007) <doi:10.1214/009053607000000352>; Wang (2008) <doi:10.1198/016214508000000418>.
This package implements an MCMC sampler for the posterior distribution of arbitrary time-homogeneous multivariate stochastic differential equation (SDE) models with possibly latent components. The package provides a simple entry point to integrate user-defined models directly with the sampler's C++ code, and parallelizes large portions of the calculations when compiled with OpenMP.