This package provides new functions info(), warn() and error(), similar to message(), warning() and stop() respectively. However, the new functions can have a level associated with them, so that when executed the global level option determines whether they are shown or not. This allows debug modes, outputting more information. The can also output all messages to a log file.
Ordered homogeneity pursuit lasso (OHPL) algorithm for group variable selection proposed in Lin et al. (2017) <DOI:10.1016/j.chemolab.2017.07.004>. The OHPL method exploits the homogeneity structure in high-dimensional data and enjoys the grouping effect to select groups of important variables automatically. This feature makes it particularly useful for high-dimensional datasets with strongly correlated variables, such as spectroscopic data.
This package provides a system for fast, accurate, and flexible whole genome bisulfite sequencing (WGBS) data analysis of two-condition comparisons. Principal Component BiSulfite, PCBS', assigns methylated loci eigenvector values from the treatment-delineating principal component in lieu of running millions of pairwise statistical tests, which dramatically increases analysis flexibility and reduces computational requirements. Methods: <https://katlande.github.io/PCBS/articles/Differential_Methylation.html>.
This package provides a collection of functions to simulate, estimate and forecast a wide range of regression based dynamic models for positive time series. This package implements the results presented in Prass, T.S.; Pumi, G.; Taufemback, C.G. and Carlos, J.H. (2025). "Positive time series regression models: theoretical and computational aspects". Computational Statistics 40, 1185â 1215. <doi:10.1007/s00180-024-01531-z>.
This package implements recently developed projection pursuit algorithms for finding optimal linear cluster separators. The clustering algorithms use optimal hyperplane separators based on minimum density, Pavlidis et. al (2016) <http://jmlr.org/papers/volume17/15-307/15-307.pdf>; minimum normalised cut, Hofmeyr (2017) <doi:10.1109/TPAMI.2016.2609929>; and maximum variance ratio clusterability, Hofmeyr and Pavlidis (2015) <doi:10.1109/SSCI.2015.116>.
We present a penalized log-density estimation method using Legendre polynomials with lasso penalty to adjust estimate's smoothness. Re-expressing the logarithm of the density estimator via a linear combination of Legendre polynomials, we can estimate parameters by maximizing the penalized log-likelihood function. Besides, we proposed an implementation strategy that builds on the coordinate decent algorithm, together with the Bayesian information criterion (BIC).
Spatial Stochastic Frontier Analysis (SSFA) is an original method for controlling the spatial heterogeneity in Stochastic Frontier Analysis (SFA) models, for cross-sectional data, by splitting the inefficiency term into three terms: the first one related to spatial peculiarities of the territory in which each single unit operates, the second one related to the specific production features and the third one representing the error term.
Parameter estimation for zero-inflated discrete Weibull (ZIDW) regression models, the univariate setting, distribution functions, functions to generate randomized quantile residuals a pseudo R2, and plotting of rootograms. For more details, see Kalktawi (2017) <https://bura.brunel.ac.uk/handle/2438/14476>, Taconeli and Rodrigues de Lara (2022) <doi:10.1080/00949655.2021.2005597>, and Yeh and Young (2025) <doi:10.1080/03610918.2025.2464076>.
We provide a flexible Zero-inflated Poisson-Gamma Model (ZIPG) by connecting both the mean abundance and the variability to different covariates, and build valid statistical inference procedures for both parameter estimation and hypothesis testing. These functions can be used to analyze microbiome count data with zero-inflation and overdispersion. The model is discussed in Jiang et al (2023) <doi:10.1080/01621459.2022.2151447>.
The curl() and curl_download() functions provide highly configurable drop-in replacements for base url() and download.file() with better performance, support for encryption, gzip compression, authentication, and other libcurl goodies. The core of the package implements a framework for performing fully customized requests where data can be processed either in memory, on disk, or streaming via the callback or connection interfaces.
Logging functions in RcppSpdlog provide access to the logging functionality from the spdlog C++ library. This package offers shorter convenience wrappers for the R functions which match the C++ functions, namely via, say, spdl::debug() at the debug level. The actual formatting is done by the fmt::format() function from the fmtlib library (that is also std::format() in C++20 or later).
qsea (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides several functionalities for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.
Single linkage clustering and connected component analyses are often performed on biological images. Bioi provides a set of functions for performing these tasks. This functionality is implemented in several key functions that can extend to from 1 to many dimensions. The single linkage clustering method implemented here can be used on n-dimensional data sets, while connected component analyses are limited to 3 or fewer dimensions.
This package provides a tool to use a principal component analysis on radially averaged two dimensional Fourier spectra to characterize image texture. The method within the context of ecology was first described by Couteron et al. (2005) <doi:10.1111/j.1365-2664.2005.01097.x> and expanded upon by Solorzano et al. (2018) <doi:10.1117/1.JRS.12.036006> using a moving window approach.
Fit a regression model for when the response variable is presented as a ratio or proportion. This adjustment can occur globally, with the same estimate for the entire study space, or locally, where a beta regression model is fitted for each region, considering only influential locations for that area. Da Silva, A. R. and Lima, A. O. (2017) <doi:10.1016/j.spasta.2017.07.011>.
We propose a fully efficient sieve maximum likelihood method to estimate genotype-specific distribution of time-to-event outcomes under a nonparametric model. We can handle missing genotypes in pedigrees. We estimate the time-dependent hazard ratio between two genetic mutation groups using B-splines, while applying nonparametric maximum likelihood estimation to the reference baseline hazard function. The estimators are calculated via an expectation-maximization algorithm.
Computes bootstrapped Monte Carlo estimate of p value of Kolmogorov-Smirnov (KS) test and likelihood ratio test for zero-inflated count data, based on the work of Aldirawi et al. (2019) <doi:10.1109/BHI.2019.8834661>. With the package, user can also find tools to simulate random deviates from zero inflated or hurdle models and obtain maximum likelihood estimate of unknown parameters in these models.
Generates random numbers corresponding to the events on a Poisson point process with changing event rates. This includes the possibility to incorporate additional information such as the number of events occurring or the location of an already known event. It can also generate the probability density functions of specific events in the cases where additional information is available. Based on Hohmann (2019) <arXiv:1901.10754>.
Algorithms and utility functions for indoor positioning using fingerprinting techniques. These functions are designed for manipulation of RSSI (Received Signal Strength Intensity) data sets, estimation of positions,comparison of the performance of different models, and graphical visualization of data. Machine learning algorithms and methods such as k-nearest neighbors or probabilistic fingerprinting are implemented in this package to perform analysis and estimations over RSSI data sets.
An implementation of the Jaya optimization algorithm for both single-objective and multi-objective problems. Jaya is a population-based, gradient-free optimization algorithm capable of solving constrained and unconstrained optimization problems without hyperparameters. This package includes features such as multi-objective Pareto optimization, adaptive population adjustment, and early stopping. For further details, see R.V. Rao (2016) <doi:10.5267/j.ijiec.2015.8.004>.
This package provides tools for working with the Korea Standard Industrial Classification (KSIC). Includes datasets for the 9th, 10th, and 11th revisions. Functions include searching codes and names by keyword, converting codes across revisions, validating KSIC codes, and navigating the classification hierarchy (e.g., identifying parent or child categories). Intended for use in statistical analysis, data processing, and research involving South Koreaâ s industrial classification system.
State space modelling is an efficient and flexible framework for statistical inference of a broad class of time series and other data. KFAS includes computationally efficient functions for Kalman filtering, smoothing, forecasting, and simulation of multivariate exponential family state space models, with observations from Gaussian, Poisson, binomial, negative binomial, and gamma distributions. See the paper by Helske (2017) <doi:10.18637/jss.v078.i10> for details.
This package provides fast and accurate inference for the parameter estimation problem in Ordinary Differential Equations, including the case when there are unobserved system components. Implements the MAGI method (MAnifold-constrained Gaussian process Inference) of Yang, Wong, and Kou (2021) <doi:10.1073/pnas.2020397118>. A user guide is provided by the accompanying software paper Wong, Yang, and Kou (2024) <doi:10.18637/jss.v109.i04>.
We develop Multi-source Graph Synthesis (MUGS), an algorithm designed to create embeddings for pediatric Electronic Health Record (EHR) codes by leveraging graphical information from three distinct sources: (1) pediatric EHR data, (2) EHR data from the general patient population, and (3) existing hierarchical medical ontology knowledge shared across different patient populations. See Li et al. (2024) <doi:10.1038/s41746-024-01320-4> for details.