This package provides a set of functions for working with American postal codes, which are known as ZIP Codes. These include accessing ZIP Code to ZIP Code Tabulation Area (ZCTA) crosswalks, retrieving demographic data for ZCTAs, and tabulating demographic data for three-digit ZCTAs.
This package contains functions for reading raw data in ImaGene TXT format obtained from Exiqon miRCURY LNA arrays, annotating them with appropriate GAL files, and normalizing them using a spike-in probe-based method. Other platforms and data formats are also supported.
This package provides a client to simplify fetching predictions from the Koina web service. Koina is a model repository enabling the remote execution of models. Predictions are generated as a response to HTTP/S requests, the standard protocol used for nearly all web traffic.
There is increasing demand for designing virus mutants with specific dinucleotide or codon composition. This tool can take dinucleotide preference, codon usage bias, or both into account while designing mutants. It is intended for in silico design of DNA sequence mutants.
An unsupervised cross-validation method to select the optimal number of mutational signatures. A data set of mutational counts is split into training and validation data. Signatures are estimated in the training data and then used to predict the mutations in the validation data.
The Shaman package implements functions for resampling Hi-C matrices in order to generate expected contact distributions given constraints on marginal coverage and contact-distance probability distributions. The package also provides support for visualizing normalized matrices and statistical analysis of contact distributions around selected landmarks.
The glmnet package provides efficient procedures for fitting the entire lasso or elastic-net regularization path for linear and Poisson regression, as well as logistic, multinomial, Cox, multiple-response Gaussian and grouped multinomial models. The algorithm uses cyclical coordinate descent in a path-wise fashion.
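As an illustration of cyclical coordinate descent applied path-wise, here is a minimal sketch for the plain lasso with a Gaussian response. It assumes the objective (1/2n)·||y − Xβ||² + λ·||β||₁ with no intercept and no standardization. The function names, the fixed number of sweeps, and the list-of-lists data layout are assumptions for illustration and are not the glmnet API or implementation.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the lasso proximal step)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, beta_init=None, n_sweeps=100):
    """Cycle over coordinates, updating one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = list(beta_init) if beta_init is not None else [0.0] * p
    # Residual y - X @ beta for the starting coefficients.
    residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(n_sweeps):  # fixed sweep count; no convergence check
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm_sq = sum(v * v for v in col) / n
            if norm_sq == 0.0:
                continue
            # Covariance of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / norm_sq
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta


def lasso_path(X, y, lambdas):
    """Solve from the largest lambda down, warm-starting each fit."""
    beta = None
    path = []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_coordinate_descent(X, y, lam, beta_init=beta)
        path.append((lam, beta))
    return path
```

Warm starts are what make the path-wise strategy cheap: each solution is usually close to the one for the next smaller λ, so few sweeps are needed per step.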
This package provides an implementation of both the exact and approximation methods for computing the cumulative distribution function (CDF) of the Poisson binomial distribution. It also provides the probability mass function (PMF), quantile function, and random number generation for the Poisson binomial distribution.
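To make the distribution concrete, the sketch below computes an exact PMF and CDF for the number of successes among independent Bernoulli trials with different success probabilities. It uses a direct convolution recursion, which is one standard exact approach and not necessarily the algorithm this package implements. The function names are hypothetical.

```python
from typing import List


def poisson_binomial_pmf(probs: List[float]) -> List[float]:
    """Return [P(K = 0), ..., P(K = n)] for n independent trials."""
    pmf = [1.0]  # with zero trials, zero successes is certain
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def poisson_binomial_cdf(probs: List[float], k: int) -> float:
    """Return P(K <= k) by summing the exact PMF."""
    if k < 0:
        return 0.0
    pmf = poisson_binomial_pmf(probs)
    return min(1.0, sum(pmf[: k + 1]))
```

For two trials with success probability 0.5 each, the PMF is 0.25, 0.5, 0.25, which matches the ordinary binomial distribution as expected. The recursion costs O(n²) operations, which is why approximation methods become attractive for large n.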
The httpuv package provides low-level socket and protocol support for handling HTTP and WebSocket requests directly from within R. It is primarily intended as a building block for other packages, rather than as an easy way to create complete web applications using httpuv alone.
This package provides tools for the analysis of high-dimensional data developed/implemented at the group "Statistical Complexity Reduction In Molecular Epidemiology" (SCRIME). The main focus is on SNP data, but most of the functions can also be applied to other types of categorical data.
This package contains a variety of functions based around regime shift analysis of paleoecological data. Citations: Rodionov() from Rodionov (2004) <doi:10.1029/2004GL019448>; Lanzante() from Lanzante (1996) <doi:10.1002/(SICI)1097-0088(199611)16:11%3C1197::AID-JOC89%3E3.0.CO;2-L>; Hellinger_trans from Numerical Ecology, Legendre & Legendre (ISBN 9780444538680); rolling_autoc from Liu, Gao & Wang (2018) <doi:10.1016/j.scitotenv.2018.06.276>. Sample data sets lake_data and lake_RSI were processed from Bush, Silman & Urrego (2004) <doi:10.1126/science.1090795>; sample data set January_PDO is from NOAA: <https://www.ncei.noaa.gov/access/monitoring/pdo/>.
This package provides tools for implementing Retrieval-Augmented Generation (RAG) workflows with Large Language Models (LLMs). Includes functions for document processing, text chunking, embedding generation, storage management, and content retrieval. Supports various document types and embedding providers ('Ollama', 'OpenAI'), with DuckDB as the default storage backend. Integrates with the ellmer package to equip chat objects with retrieval capabilities. Designed to offer both sensible defaults and customization options with transparent access to intermediate outputs. For a review of retrieval-augmented generation methods, see Gao et al. (2023) "Retrieval-Augmented Generation for Large Language Models: A Survey" <doi:10.48550/arXiv.2312.10997>.
Provides: (1) Tools to infer dominance hierarchies based on calculating Elo scores, but with custom functions to improve estimates in animals with relatively stable dominance ranks. (2) Tools to plot the shape of the dominance hierarchy and estimate the uncertainty of a given data set.
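For readers unfamiliar with Elo scores, the sketch below shows the standard Elo update after a single contest. The step size k = 32 and the scale of 400 are conventional values assumed here for illustration, and the function names are hypothetical. The package's custom functions for stable hierarchies are not reproduced.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(winner: float, loser: float, k: float = 32.0):
    """Return the new (winner, loser) ratings after one interaction."""
    # The winner gains more when the win was unexpected.
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta
```

Two individuals starting at 1000 move to 1016 and 984 after one contest, because the expected score between equals is 0.5. Note that the outcome depends on the order in which interactions are processed.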
This package provides tools for working with data from ACLED (Armed Conflict Location and Event Data). Functions include simplified access to ACLED's API (<https://apidocs.acleddata.com/>), methods for keeping local versions of ACLED data up-to-date, and functions for common ACLED data transformations.
Modeling associations between covariates and power spectra of replicated time series using a cepstral-based semiparametric framework. Implements a fast two-stage estimation procedure via Whittle likelihood and multivariate regression. The methodology is based on Li and Dong (2025) <doi:10.1080/10618600.2025.2473936>.
Downloads the Representative Market Rate Exchange (RMRE) from the <www.datos.gov.co> source. Allows setting the data series to different time frequencies, splitting the time series through start and end functions, transforming the data set into log returns or levels, and making a dynamic graph.
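The log-return transformation mentioned above can be sketched as follows. The function name is hypothetical, the input is assumed to be a list of strictly positive levels in time order, and this is not the package's own code.

```python
import math
from typing import List


def log_returns(levels: List[float]) -> List[float]:
    """Return log(level_t / level_{t-1}) for each consecutive pair."""
    return [math.log(curr / prev) for prev, curr in zip(levels, levels[1:])]
```

A series of n levels yields n − 1 returns, and the returns sum to the log of the last level over the first, which is why log returns are convenient to aggregate across time frequencies.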
This package implements the DAAREM method for accelerating the convergence of slow, monotone sequences from smooth, fixed-point iterations such as the EM algorithm. For further details about the DAAREM method, see Henderson, N.C. and Varadhan, R. (2019) <doi:10.1080/10618600.2019.1594835>.
This package provides tools for automatic model selection and diagnostics for climate and environmental data. In particular, the envcpt() function does automatic model selection between a variety of trend, changepoint and autocorrelation models. The envcpt() function should be your first port of call.
Estimates item and person parameters for the Continuous Response Model (CRM; Samejima, 1973, <doi:10.1007/BF02291114>), computes item fit residual statistics, draws empirical 3D item category response curves, draws theoretical 3D item category response curves, and generates data under the CRM for simulation studies.
Computes various effect sizes of the difference, their variances, and confidence intervals. This package treats Cohen's d, Hedges' d, biased/unbiased c (an effect size between a mean and a constant) and e (an effect size between means without assuming the variance equality).
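As a point of reference, the sketch below computes Cohen's d with a pooled standard deviation and applies the common approximate small-sample correction factor 1 − 3/(4·df − 1). These are textbook definitions assumed for illustration. The function names are hypothetical, and the package's own estimators, including its c and e statistics, may be defined differently.

```python
import math
from statistics import mean, variance
from typing import Sequence


def cohens_d(x: Sequence[float], y: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def small_sample_corrected_d(d: float, nx: int, ny: int) -> float:
    """Shrink d by the approximate bias-correction factor."""
    df = nx + ny - 2
    return d * (1.0 - 3.0 / (4.0 * df - 1.0))
```

With x = [2, 4, 6] and y = [1, 3, 5], both groups have sample variance 4, so d = (4 − 3) / 2 = 0.5, and the correction with df = 4 scales it by 0.8 to give 0.4.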
Streamlines the training, evaluation, and comparison of multiple machine learning models with minimal code by providing comprehensive data preprocessing and support for a wide range of algorithms with hyperparameter tuning. It offers performance metrics and visualization tools to facilitate efficient and effective machine learning workflows.
The gap encodes the distance between clusters and improves interpretation of cluster heatmaps. The gaps can be of the same distance based on a height threshold to cut the dendrogram. Another option is to vary the size of gaps based on the distance between clusters.
This package provides a function that generates a customized correlation matrix based on limit values and proportions for the intervals defined by those limits. It can also generate random matrices with low, medium, and high correlations, in which low, medium, and high thresholds are user-defined.
Various functions and algorithms are provided here for solving optimal matching tasks in the context of preclinical cancer studies. Further, various helper and plotting functions are provided for unsupervised and supervised machine learning as well as longitudinal mixed-effects modeling of tumor growth response patterns.