An easy way to work with census, survey, and geographic data provided by IPUMS in R. Generate and download data through the IPUMS API and load IPUMS files into R with their associated metadata to make analysis easier. IPUMS data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from the IPUMS website <https://www.ipums.org>.
Vector operations between grapes: An infix-only package! The invctr functions perform common and less common operations on vectors, data frames, matrices and list objects: - Extracting a value (range), or finding the indices of a value (range). - Trimming, or padding a vector with a value of your choice. - Simple polynomial regression. - Set and membership operations. - General check & replace function for NAs, Inf and other values.
This package implements penalised multivariate regression (i.e., for multiple outcomes and many features) by stacked generalisation (<doi:10.1093/bioinformatics/btab576>). For positively correlated outcomes, a single multivariate regression is typically more predictive than multiple univariate regressions. Includes functions for model fitting, extracting coefficients, outcome prediction, and performance measurement. For optional comparisons, install remMap from GitHub (<https://github.com/cran/remMap>).
Interact with the Entrez API hosted by the National Center for Biotechnology Information (NCBI), <https://www.ncbi.nlm.nih.gov/books/NBK25499/>. This package is focused on working with sequence metadata and links. It handles pagination and compensates for some API limitations to simplify these tasks. API calls are printed to the console to highlight how high-level queries are translated into individual HTTP requests.
R code with a GUI for microclimate time series, with an emphasis on underground environments. KarsTS provides linear and nonlinear methods, including recurrence analysis (Marwan et al. (2007) <doi:10.1016/j.physrep.2006.11.001>) and filling methods (Moffat et al. (2007) <doi:10.1016/j.agrformet.2007.08.011>), as well as tools to easily manipulate time series and gap sets.
This package provides extensions for packages leaflet & mapdeck, many of which are used by package mapview. Focus is on functionality readily available in Geographic Information Systems such as Quantum GIS. Includes functions to display coordinates of mouse pointer position, query image values via mouse pointer and zoom-to-layer buttons. Additionally, provides a feature type agnostic function to add points, lines, polygons to a map.
Extends the mlr3 package with a backend to transparently work with databases such as SQLite, DuckDB, MySQL, MariaDB, or PostgreSQL. The package provides three additional backends: DataBackendDplyr relies on the abstraction of package dbplyr to interact with most DBMS. DataBackendDuckDB operates on DuckDB databases and also on Apache Parquet files. DataBackendPolars operates on Polars data frames.
Import bathymetric and hypsometric data from the NOAA (National Oceanic and Atmospheric Administration, <https://www.ncei.noaa.gov/products/etopo-global-relief-model>), GEBCO (General Bathymetric Chart of the Oceans, <https://www.gebco.net>) and other sources, plot xyz data to prepare publication-ready figures, analyze xyz data to extract transects, get depth / altitude based on geographical coordinates, or calculate z-constrained least-cost paths.
The QRI_func() function performs quantile regression analysis using age and sex as predictors to calculate the Quantile Regression Index (QRI) score for each individual's regional brain imaging metrics and then averages across the regional scores to generate an average tissue-specific score for each subject. The QRI_plot() function is used to plot QRI and generate the normative curves for individual measurements.
Predicts the occurrence times (in day-of-year) of spring phenological events. Three methods are provided: the accumulated degree days (ADD) method, the accumulated days transferred to a standardized temperature (ADTS) method, and the accumulated developmental progress (ADP) method. See Shi et al. (2017a) <doi:10.1016/j.agrformet.2017.04.001> and Shi et al. (2017b) <doi:10.1093/aesa/sax063> for details.
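As a rough illustration of the accumulated degree days idea, the Python sketch below sums daily mean temperature above a lower developmental threshold from a starting day, and reports the first day-of-year at which the sum reaches a required total. The function name, the threshold, and the required total are hypothetical and are not taken from the package; the published methods may differ in detail.

```python
from typing import Optional, Sequence


def predict_doy_add(daily_mean_temp: Sequence[float],
                    start_doy: int,
                    base_temp: float,
                    required_add: float) -> Optional[int]:
    """Return the first day-of-year on which the accumulated degree days
    reach required_add, or None if the total is never reached.

    daily_mean_temp[i] holds the mean temperature of day-of-year i + 1.
    All parameter names are illustrative assumptions.
    """
    total = 0.0
    for doy in range(start_doy, len(daily_mean_temp) + 1):
        # Only temperature above the lower threshold contributes.
        total += max(daily_mean_temp[doy - 1] - base_temp, 0.0)
        if total >= required_add:
            return doy
    return None


# Toy series: 10 cold days followed by 10 warm days.
temps = [2.0] * 10 + [12.0] * 10
predicted = predict_doy_add(temps, start_doy=1, base_temp=5.0, required_add=20.0)
print(predicted)
```

With this toy series each warm day contributes 7 degree days, so the required total of 20 is first reached on the third warm day.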
This package provides models to identify bimodally expressed genes from RNAseq data based on the Bimodality Index. SIBERG models the RNAseq data in the finite mixture modeling framework and incorporates mechanisms for dealing with RNAseq normalization. Three types of mixture models are implemented, namely, the mixture of log normal, negative binomial, or generalized Poisson distribution. See Tong et al. (2013) <doi:10.1093/bioinformatics/bts713>.
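The Bimodality Index can be illustrated with a small calculation. The Python sketch below assumes the commonly cited two-component form with a shared standard deviation, BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma; the mixture models in the package may generalize this, and the function name is hypothetical.

```python
import math


def bimodality_index(pi: float, mu1: float, mu2: float, sigma: float) -> float:
    """Bimodality Index of a two-component mixture with common sd sigma.

    pi is the proportion of samples in the first component.
    Assumed equal-variance form; not the package's own implementation.
    """
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # Standardized distance between the two component means.
    delta = abs(mu1 - mu2) / sigma
    return math.sqrt(pi * (1.0 - pi)) * delta


# Balanced components separated by four standard deviations.
bi = bimodality_index(pi=0.5, mu1=0.0, mu2=4.0, sigma=1.0)
print(bi)
```

Under this form the index grows with the separation of the modes and shrinks as one component becomes rare, which is why it is used to rank candidate bimodal genes.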
Estimate vaccine efficacy (VE) using immunogenicity data. The inclusion of immunogenicity data in regression models can increase precision in VE. The methods are described in the publications "Elucidating vaccine efficacy using a correlate of protection, demographics, and logistic regression" and "Improving precision of vaccine efficacy evaluation using immune correlate data in time-to-event models" by Julie Dudasova, Zdenek Valenta, and Jeffrey R. Sachs (2024).
Hamiltonian Monte Carlo for both continuous and discontinuous posterior distributions with a customizable trajectory length termination criterion. See Nishimura et al. (2020) <doi:10.1093/biomet/asz083> for the original Discontinuous Hamiltonian Monte Carlo; Hoffman et al. (2014) <doi:10.48550/arXiv.1111.4246> and Betancourt (2016) <doi:10.48550/arXiv.1601.00225> for the definition of possible Hamiltonian Monte Carlo termination criteria.
This package provides a comprehensive suite of utilities for univariate continuous probability distributions and reliability models. Includes functions to compute the probability density, cumulative distribution, quantile, reliability, and hazard functions, along with random variate generation. Also offers diagnostic and model assessment tools such as Quantile-Quantile (Q-Q) and Probability-Probability (P-P) plots, the Kolmogorov-Smirnov goodness-of-fit test, and model selection criteria including the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Currently implements the following distributions: Burr X, Chen, Exponential Extension, Exponentiated Logistic, Exponentiated Weibull, Exponential Power, Flexible Weibull, Generalized Exponential, Gompertz, Generalized Power Weibull, Gumbel, Inverse Generalized Exponential, Linear Failure Rate, Log-Gamma, Logistic-Exponential, Logistic-Rayleigh, Log-log, Marshall-Olkin Extended Exponential, Marshall-Olkin Extended Weibull, and Weibull Extension distributions. Serves as a valuable resource for teaching and research in probability theory, reliability analysis, and applied statistical modeling.
ASURAT is software for single-cell data analysis. Using ASURAT, one can simultaneously perform unsupervised clustering and biological interpretation in terms of cell type, disease, biological process, and signaling pathway activity. Given single-cell RNA-seq data and knowledge-based databases, such as Cell Ontology, Gene Ontology, KEGG, etc., ASURAT transforms gene expression tables into original multivariate tables, termed sign-by-sample matrices (SSMs).
scider is a user-friendly R package providing functions to model the global density of cells in a slide of spatial transcriptomics data. All functions in the package are built based on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. After modelling density, the package allows for several downstream analyses, including colocalization analysis, boundary detection analysis and differential density analysis.
Signal-to-Noise applied to Gene Expression Experiments. Signal-to-noise ratios can be used as a proxy for quality of gene expression studies and samples. The SNRs can be calculated on any gene expression data set as long as gene IDs are available; no access to the raw data files is necessary. This makes it possible to flag problematic studies and samples in any public data set.
Various mRNA sequencing library preparation methods generate sequencing reads specifically from the transcript ends. Analyses that focus on quantification of isoform usage from such data can be aided by using truncated versions of transcriptome annotations, at the alignment or pseudo-alignment stage as well as in downstream analysis. This package implements some convenience methods for readily generating such truncated annotations and their corresponding sequences.
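The idea of truncating an annotation to the transcript end can be sketched in a few lines of Python. The example assumes exons are given as 1-based inclusive genomic intervals and keeps at most a fixed number of transcript bases nearest the 3' end; the function name and data layout are hypothetical and do not reflect the package's interface.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive genomic coordinates


def truncate_to_three_prime(exons: List[Exon], strand: str, max_len: int) -> List[Exon]:
    """Keep at most max_len transcript bases nearest the 3' end.

    Returns the retained exon intervals sorted by genomic start.
    Illustrative sketch only.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Visit exons starting from the 3' end: highest coordinates first on
    # the plus strand, lowest coordinates first on the minus strand.
    ordered = sorted(exons, reverse=(strand == "+"))
    kept: List[Exon] = []
    remaining = max_len
    for start, end in ordered:
        if remaining <= 0:
            break
        width = end - start + 1
        if width <= remaining:
            kept.append((start, end))
            remaining -= width
        else:
            # Partial exon: keep only the part adjacent to the 3' end.
            if strand == "+":
                kept.append((end - remaining + 1, end))
            else:
                kept.append((start, start + remaining - 1))
            remaining = 0
    return sorted(kept)


exons = [(100, 199), (300, 399), (500, 549)]
plus_kept = truncate_to_three_prime(exons, "+", 120)
minus_kept = truncate_to_three_prime(exons, "-", 120)
print(plus_kept)
print(minus_kept)
```

On the plus strand the 3' end lies at the highest genomic coordinate, on the minus strand at the lowest, so the same exon set yields different truncated intervals depending on strand.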
This package provides a collection of tools that support data splitting, predictive modeling, and model evaluation. A typical use is to split a dataset into a training dataset and a test dataset, and then compare the data distributions of the two datasets. Another feature is to support the development of predictive models and to compare the performance of several predictive models, helping to select the best model.
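The split-then-compare workflow can be sketched with the Python standard library. The example shuffles a dataset with a fixed seed, holds out a fraction as the test set, and compares simple summary statistics of the two parts; the function name and the choice of statistics are assumptions for illustration, not the package's interface.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def train_test_split(rows: Sequence[float],
                     test_fraction: float = 0.3,
                     seed: int = 0) -> Tuple[List[float], List[float]]:
    """Shuffle rows reproducibly and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


values = [float(i) for i in range(100)]
train, test = train_test_split(values, test_fraction=0.3, seed=42)

# Compare the two distributions with basic summary statistics.
for name, part in (("train", train), ("test", test)):
    print(name, len(part), statistics.mean(part), statistics.stdev(part))
```

Large differences between the training and test summaries would suggest that the split is not representative and that the model evaluation could be misleading.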
Provides computational methods for immune cell-type subsets, including: (1) DCQ (Digital Cell Quantifier) to infer global dynamic changes in immune cell quantities within a complex tissue; and (2) VoCAL (Variation of Cell-type Abundance Loci), a deconvolution-based method that utilizes transcriptome data to infer the quantities of immune-cell types, and then uses these quantitative traits to uncover the underlying DNA loci.
This package provides estimation utilities for binary Emax dose-response models. Includes Expectation-Maximization based maximum likelihood estimation when the binary response is missing, as well as bias-reduced estimators including Jeffreys-penalized likelihood, Firth-score, and Cox-Snell corrections. The methodology is described in Zhang, Pradhan, and Zhao (2025) <doi:10.1177/09622802251403356> and Zhang, Pradhan, and Zhao (2026) <doi:10.1080/10543406.2026.2627387>.
Implements three approaches for Gene-Environment interaction analysis. All of them adopt the Sparse Group Minimax Concave Penalty to identify important G variables and G-E interactions, and simultaneously respect the hierarchy between main G and G-E interaction effects. All three approaches are available for Linear, Logistic, and Poisson regression. Also provides tools to mine and construct prior information for G variables and G-E interactions.
Processing, analysis and visualization of Hydrogen Deuterium eXchange monitored by Mass Spectrometry experiments (HDX-MS). HaDeX2 introduces a new standardized and reproducible workflow for the analysis of the HDX-MS data, including uncertainty propagation, data aggregation and visualization on 3D structure. Additionally, it covers data exploration, quality control and generation of publication-quality figures. All functionalities are also available in the accompanying shiny app.
General purpose TIFF file I/O for R users. Currently the only such package with read and write support for TIFF files with floating point (real-numbered) pixels, and the only package that can correctly import TIFF files that were saved from ImageJ and write TIFF files that can be correctly read by ImageJ <https://imagej.net/ij/>. Also supports text image I/O.