We propose a fully efficient sieve maximum likelihood method to estimate the genotype-specific distribution of time-to-event outcomes under a nonparametric model, and we can handle missing genotypes in pedigrees. We estimate the time-dependent hazard ratio between two genetic mutation groups using B-splines, while applying nonparametric maximum likelihood estimation to the reference baseline hazard function. The estimators are computed via an expectation-maximization algorithm.
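A minimal sketch of the B-spline building block using base R's splines::bs(); the time grid, degree, and coefficients below are illustrative assumptions, not the package's defaults:

    library(splines)
    tt <- seq(0.1, 10, length.out = 200)      # event-time grid
    B <- bs(tt, df = 5, degree = 3)           # cubic B-spline basis, 5 columns
    gamma <- c(0.2, 0.5, 0.4, 0.1, -0.3)      # hypothetical spline coefficients
    log_hr <- drop(B %*% gamma)               # time-dependent log hazard ratio
    plot(tt, exp(log_hr), type = "l", ylab = "hazard ratio")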
Fits a beta regression model when the response variable is presented as a ratio or proportion. The fit can be global, with a single estimate for the entire study space, or local, where a beta regression model is fitted for each region, considering only influential locations for that area. Da Silva, A. R. and Lima, A. O. (2017) <doi:10.1016/j.spasta.2017.07.011>.
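For the global case, the underlying model family is ordinary beta regression; a minimal sketch with the betareg package and simulated data (the local, geographically weighted fits are this package's own extension and are not shown):

    library(betareg)
    set.seed(1)
    x <- runif(100)
    mu <- plogis(-1 + 2 * x)                  # mean on the (0, 1) scale
    y <- rbeta(100, mu * 20, (1 - mu) * 20)   # simulated proportions
    fit <- betareg(y ~ x)
    summary(fit)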
Algorithms and utility functions for indoor positioning using fingerprinting techniques. These functions are designed for manipulation of RSSI (Received Signal Strength Intensity) data sets, estimation of positions, comparison of the performance of different models, and graphical visualization of data. Machine learning algorithms and methods such as k-nearest neighbors or probabilistic fingerprinting are implemented in this package to perform analysis and estimation over RSSI data sets.
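A base-R sketch of the k-nearest-neighbors fingerprinting idea, not the package's API: the position estimate averages the locations of the training fingerprints closest in RSSI space (all data below are simulated):

    set.seed(1)
    train_rssi <- matrix(rnorm(30, -70, 5), ncol = 3)   # 10 fingerprints x 3 APs
    train_pos  <- matrix(runif(20), ncol = 2)           # known x/y per fingerprint
    new_rssi   <- rnorm(3, -70, 5)                      # observation to locate
    d <- sqrt(colSums((t(train_rssi) - new_rssi)^2))    # distances in RSSI space
    est_pos <- colMeans(train_pos[order(d)[1:3], ])     # average of k = 3 neighbors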
Computes a bootstrapped Monte Carlo estimate of the p-value of the Kolmogorov-Smirnov (KS) test and the likelihood ratio test for zero-inflated count data, based on the work of Aldirawi et al. (2019) <doi:10.1109/BHI.2019.8834661>. The package also provides tools to simulate random deviates from zero-inflated or hurdle models and to obtain maximum likelihood estimates of unknown parameters in these models.
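A base-R sketch of one of the simulation tasks mentioned, drawing zero-inflated Poisson deviates (the function name and parameterization are illustrative, not the package's):

    rzip <- function(n, p0, lambda) {
      # structural zero with probability p0, otherwise Poisson(lambda)
      ifelse(runif(n) < p0, 0L, rpois(n, lambda))
    }
    set.seed(42)
    x <- rzip(1000, p0 = 0.3, lambda = 2)
    mean(x == 0)   # observed zero proportion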
Generates random numbers corresponding to the events of a Poisson point process with a changing event rate. Additional information, such as the number of events occurring or the location of an already known event, can be incorporated. The package can also generate the probability density functions of specific events when such additional information is available. Based on Hohmann (2019) <arXiv:1901.10754>.
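A base-R sketch of simulating such a process by thinning (Lewis-Shedler), shown for intuition rather than as the package's interface:

    rate <- function(t) 2 + sin(t)              # illustrative event-rate function
    t_end <- 10; rate_max <- 3                  # upper bound for rate() on [0, t_end]
    cand <- cumsum(rexp(200, rate_max))         # homogeneous candidate events
    cand <- cand[cand <= t_end]
    events <- cand[runif(length(cand)) < rate(cand) / rate_max]   # keep with prob rate/max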
An implementation of the Jaya optimization algorithm for both single-objective and multi-objective problems. Jaya is a population-based, gradient-free optimization algorithm capable of solving constrained and unconstrained optimization problems without hyperparameters. This package includes features such as multi-objective Pareto optimization, adaptive population adjustment, and early stopping. For further details, see R.V. Rao (2016) <doi:10.5267/j.ijiec.2015.8.004>.
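A bare-bones base-R sketch of the Jaya update, in which each candidate moves toward the current best and away from the current worst solution (the package's own interface and multi-objective features are documented separately):

    f <- function(x) sum(x^2)                   # sphere function to minimize
    n <- 20; d <- 2
    P <- matrix(runif(n * d, -5, 5), n, d)      # population of candidates
    for (iter in 1:200) {
      fit <- apply(P, 1, f)
      B <- matrix(P[which.min(fit), ], n, d, byrow = TRUE)  # best, row-replicated
      W <- matrix(P[which.max(fit), ], n, d, byrow = TRUE)  # worst, row-replicated
      cand <- P + matrix(runif(n * d), n, d) * (B - abs(P)) -
                  matrix(runif(n * d), n, d) * (W - abs(P))
      keep <- apply(cand, 1, f) < fit           # greedy replacement
      P[keep, ] <- cand[keep, ]
    }
    P[which.min(apply(P, 1, f)), ]              # approximate minimizer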
State space modelling is an efficient and flexible framework for statistical inference of a broad class of time series and other data. KFAS includes computationally efficient functions for Kalman filtering, smoothing, forecasting, and simulation of multivariate exponential family state space models, with observations from Gaussian, Poisson, binomial, negative binomial, and gamma distributions. See the paper by Helske (2017) <doi:10.18637/jss.v078.i10> for details.
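A brief local-level example with the built-in Nile data, adapted from the package documentation (unknown variances are marked NA and estimated by fitSSM()):

    library(KFAS)
    model <- SSModel(Nile ~ SSMtrend(1, Q = list(matrix(NA))), H = matrix(NA))
    fit <- fitSSM(model, inits = c(log(var(Nile)), log(var(Nile))), method = "BFGS")
    out <- KFS(fit$model)                     # Kalman filtering and smoothing
    out$alphahat                              # smoothed level estimates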
This package provides tools for working with the Korea Standard Industrial Classification (KSIC). Includes datasets for the 9th, 10th, and 11th revisions. Functions include searching codes and names by keyword, converting codes across revisions, validating KSIC codes, and navigating the classification hierarchy (e.g., identifying parent or child categories). Intended for use in statistical analysis, data processing, and research involving South Korea's industrial classification system.
This package provides fast and accurate inference for the parameter estimation problem in Ordinary Differential Equations, including the case when there are unobserved system components. Implements the MAGI method (MAnifold-constrained Gaussian process Inference) of Yang, Wong, and Kou (2021) <doi:10.1073/pnas.2020397118>. A user guide is provided by the accompanying software paper Wong, Yang, and Kou (2024) <doi:10.18637/jss.v109.i04>.
This package provides an R wrapper for the 'MD4C' ('Markdown for C') library. Functions exist for parsing markdown ('CommonMark' compliant) along with support for other common markdown extensions (e.g. GitHub flavored markdown, LaTeX equation support, etc.). The package also provides a number of higher level functions for exploring and manipulating markdown abstract syntax trees as well as translating and displaying the documents.
This package provides fundamental functions for descriptive statistics, including MODE(), estimate_mode(), center_stats(), position_stats(), pct(), spread_stats(), kurt(), skew(), and shape_stats(), which assist in summarizing the center, spread, and shape of numeric data. For more details, see McCurdy (2025), "Introduction to Data Science with R" <https://jonmccurdy.github.io/Introduction-to-Data-Science/>.
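A usage sketch with the functions named above, assuming each accepts a plain numeric vector:

    x <- c(2, 4, 4, 5, 7, 9, 9, 9, 12)
    MODE(x)             # most frequent value(s)
    center_stats(x)     # central-tendency summaries
    spread_stats(x)     # dispersion summaries
    kurt(x); skew(x)    # shape measures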
We develop Multi-source Graph Synthesis (MUGS), an algorithm designed to create embeddings for pediatric Electronic Health Record (EHR) codes by leveraging graphical information from three distinct sources: (1) pediatric EHR data, (2) EHR data from the general patient population, and (3) existing hierarchical medical ontology knowledge shared across different patient populations. See Li et al. (2024) <doi:10.1038/s41746-024-01320-4> for details.
An implementation of the optimal weight exchange algorithm of Yang (2013) <doi:10.1080/01621459.2013.806268> for three models: the crossover model with subject dropout, the crossover model with proportional first-order residual effects, and the interference model. It can be used to find either A-optimal or D-optimal approximate designs. Exact designs can be automatically rounded from approximate designs, and relative efficiency is provided as well.
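A base-R sketch of one standard largest-remainder rounding from approximate weights to an exact N-run design; the package's own rounding and efficiency computations may differ:

    round_design <- function(w, N) {
      n <- floor(N * w)                                   # guaranteed allocations
      rem <- N - sum(n)                                   # runs left to place
      extra <- order(N * w - n, decreasing = TRUE)[seq_len(rem)]
      n[extra] <- n[extra] + 1L
      n
    }
    round_design(c(0.42, 0.33, 0.25), N = 10)             # 4 3 3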
Computation of second-generation p-values as described in Blume et al. (2018) <doi:10.1371/journal.pone.0188299> and Blume et al. (2019) <doi:10.1080/00031305.2018.1537893>. There are additional functions which provide power and type I error calculations, create graphs (particularly suited for large-scale inference usage), and a function to estimate false discovery rates based on second-generation p-value inference.
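The defining formula from Blume et al. (2018) as a plain-R sketch, not the package's implementation: the second-generation p-value is the fraction of the interval estimate overlapping the interval null, with a correction term that caps very wide, uninformative intervals at 1/2:

    sgpv <- function(lo, hi, null_lo, null_hi) {
      overlap <- max(min(hi, null_hi) - max(lo, null_lo), 0)
      len_i  <- hi - lo
      len_h0 <- null_hi - null_lo
      (overlap / len_i) * max(len_i / (2 * len_h0), 1)    # wide-interval correction
    }
    sgpv(0.1, 0.4, null_lo = -0.2, null_hi = 0.2)         # partial overlap: ~0.33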
Fits Bayesian trophic position models to stable isotope data using 'Stan' via the brms package. Trophic position models are derived using equations from Post (2002) <doi:10.1890/0012-9658(2002)083[0703:USITET]2.0.CO;2>, Vander Zanden and Vadeboncoeur (2002) <doi:10.1890/0012-9658(2002)083[2152:FAIOBA]2.0.CO;2>, and Heuvel et al. (2024) <doi:10.1139/cjfas-2024-0028>.
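The single-baseline Post (2002) equation underlying these models, as a plain-R sketch (the package fits Bayesian versions via brms; lambda = 2 and the 3.4 permil enrichment per trophic step are conventional values, used here as assumptions):

    trophic_position <- function(d15n_consumer, d15n_base, lambda = 2, delta_n = 3.4) {
      # TP = lambda + (d15N_consumer - d15N_base) / delta_n, per Post (2002)
      lambda + (d15n_consumer - d15n_base) / delta_n
    }
    trophic_position(12.1, 5.3)   # = 4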
Analyzes peptide array data and characterizes peptide sequence space. Allows for high-level visualization of global signal, quality control based on replicate correlation and/or relative Kd, calculation of peptide length/charge/Kd parameters, hit selection based on RFU signal, and amino acid composition/basic motif recognition with RFU signal weighting. Basic signal trends can be used to generate peptides that follow the observed compositional trends.
qsea (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides several functionalities for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.
Logging functions in RcppSpdlog provide access to the logging functionality of the spdlog C++ library. This package offers shorter convenience wrappers for the R functions that match the C++ functions, for example spdl::debug() for logging at the debug level. The actual formatting is done by the fmt::format() function from the fmtlib library (which is also std::format() in C++20 or later).
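A brief usage sketch; the setup() call and its arguments are assumptions based on the wrapper's documentation:

    spdl::setup("demo", "debug")                # logger name and level threshold
    spdl::debug("starting run {}", 42L)         # fmt-style '{}' placeholders
    spdl::info("processed {} rows in {} ms", 100L, 12.5)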
The curl() and curl_download() functions provide highly configurable drop-in replacements for base url() and download.file() with better performance, support for encryption, gzip compression, authentication, and other libcurl goodies. The core of the package implements a framework for performing fully customized requests where data can be processed either in memory, on disk, or streaming via the callback or connection interfaces.
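Two common entry points, shown against a public test endpoint (the httpbin.org URLs are placeholders for any reachable server):

    library(curl)
    tmp <- tempfile(fileext = ".html")
    curl_download("https://httpbin.org/html", tmp)    # drop-in for download.file()
    res <- curl_fetch_memory("https://httpbin.org/get")
    res$status_code                                   # e.g. 200
    cat(rawToChar(res$content))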
Enables binary package installations on Linux distributions. Provides functions to manage packages via the distribution's package manager. Also provides transparent integration with R's install.packages() and a fallback mechanism. When installed as a system package, interacts with the system's package manager without requiring administrative privileges via an integrated D-Bus service; otherwise, uses sudo. Currently, the following backends are supported: DNF, APT, ALPM.
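A usage sketch on a supported distribution; the enable() entry point is taken from the package's documented interface, and "units" is just an example package:

    bspm::enable()              # route install.packages() through the system manager
    install.packages("units")   # resolved as a distribution binary when available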
This package provides a shortcut procedure to implement closed testing for large-scale multiple testing, especially with the global test. The shortcut is asymptotically equivalent to closed testing and remains valid post hoc: users can test any possible set of features or pathways with the family-wise error rate controlled. The global test is powerful for detecting associations between a group of features and an outcome of interest.
This package provides a revision to the stats::ks.test() function and the associated ks.test.Rd help page. With one minor exception, it does not change the existing behavior of ks.test(), and it adds features necessary for doing one-sample tests with hypothesized discrete distributions. The package also contains cvm.test(), for doing one-sample Cramer-von Mises goodness-of-fit tests.
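A sketch of the discrete one-sample tests, with the hypothesized distribution supplied as a step function so the discrete treatment is used (assumes the package is attached so its ks.test() masks stats::ks.test()):

    set.seed(1)
    x <- sample(1:10, 25, replace = TRUE)
    ks.test(x, ecdf(1:10))     # KS test against the discrete uniform on 1..10
    cvm.test(x, ecdf(1:10))    # companion Cramer-von Mises test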
Wrapper functions that interface with FSL <http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/>, a powerful and commonly-used neuroimaging software, using system commands. The goal is to be able to interface with FSL completely in R, where you pass R objects of class 'nifti', implemented by the package 'oro.nifti', and the function executes an FSL command and returns an R object of class 'nifti' if desired.
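A usage sketch that requires a local FSL installation; the input path is a placeholder:

    library(fslr)
    have.fsl()                                  # TRUE if FSL was found
    img <- fslbet("t1.nii.gz", retimg = TRUE)   # brain extraction, returns a 'nifti'
    oro.nifti::orthographic(img)                # display the result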
Implementation of the GTE (Group Technical Effects) model for single-cell data. GTE is a quantitative metric to assess batch effects for individual genes in single-cell data. For a single-cell dataset, the user can calculate the GTE value for individual features (such as genes), and then identify the highly batch-sensitive features. Removing these highly batch-sensitive features results in datasets with low batch effects.