Vector operations between grapes: An infix-only package! The invctr functions perform common and less common operations on vectors, data frames, matrices, and list objects: - Extracting a value (range), or finding the indices of a value (range). - Trimming or padding a vector with a value of your choice. - Simple polynomial regression. - Set and membership operations. - General check & replace function for NAs, Inf and other values.
An easy way to work with census, survey, and geographic data provided by IPUMS in R. Generate and download data through the IPUMS API and load IPUMS files into R with their associated metadata to make analysis easier. IPUMS data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from the IPUMS website <https://www.ipums.org>.
This package implements penalised multivariate regression (i.e., for multiple outcomes and many features) by stacked generalisation (<doi:10.1093/bioinformatics/btab576>). For positively correlated outcomes, a single multivariate regression is typically more predictive than multiple univariate regressions. Includes functions for model fitting, extracting coefficients, outcome prediction, and performance measurement. For optional comparisons, install remMap from GitHub (<https://github.com/cran/remMap>).
R code with a GUI for microclimate time series, with an emphasis on underground environments. KarsTS provides linear and nonlinear methods, including recurrence analysis (Marwan et al. (2007) <doi:10.1016/j.physrep.2006.11.001>) and filling methods (Moffat et al. (2007) <doi:10.1016/j.agrformet.2007.08.011>), as well as tools to easily manipulate time series and gap sets.
This package provides extensions for packages 'leaflet' & 'mapdeck', many of which are used by package 'mapview'. Focus is on functionality readily available in Geographic Information Systems such as 'Quantum GIS'. Includes functions to display coordinates of mouse pointer position, query image values via mouse pointer and zoom-to-layer buttons. Additionally, provides a feature type agnostic function to add points, lines, polygons to a map.
Import bathymetric and hypsometric data from the NOAA (National Oceanic and Atmospheric Administration, <https://www.ncei.noaa.gov/products/etopo-global-relief-model>), GEBCO (General Bathymetric Chart of the Oceans, <https://www.gebco.net>) and other sources, plot xyz data to prepare publication-ready figures, analyze xyz data to extract transects, get depth / altitude based on geographical coordinates, or calculate z-constrained least-cost paths.
Extends the mlr3 package with a backend to transparently work with databases such as 'SQLite', 'DuckDB', 'MySQL', 'MariaDB', or 'PostgreSQL'. The package provides three additional backends: DataBackendDplyr relies on the abstraction of package dbplyr to interact with most DBMS. DataBackendDuckDB operates on DuckDB databases and also on Apache Parquet files. DataBackendPolars operates on Polars data frames.
Robust nonparametric bootstrap and permutation tests for location, correlation, and regression problems, as described in Helwig (2019a) <doi:10.1002/wics.1457> and Helwig (2019b) <doi:10.1016/j.neuroimage.2019.116030>. Univariate and multivariate tests are supported. For each problem, exact tests and Monte Carlo approximations are available. Five different nonparametric bootstrap confidence intervals are implemented. Parallel computing is implemented via the parallel package.
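As a rough illustration of the Monte Carlo approximation mentioned above, the following Python sketch runs a plain permutation test for a Pearson correlation using only the standard library. It is not the package's implementation and does not include the robust test statistics described by Helwig; the function names `pearson_r` and `permutation_pvalue`, the default of 2000 permutations, and the seed are assumptions made for this example.

```python
import random


def pearson_r(x, y):
    """Plain Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for H0: no association.

    Hypothetical helper for illustration: y is shuffled n_perm times and the
    absolute correlation of each shuffle is compared with the observed one.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)


x = list(range(1, 11))
p_value = permutation_pvalue(x, x)  # perfectly correlated toy data
```

An exact test would enumerate every permutation instead of sampling them, which is only practical for small samples.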
The QRI_func() function performs quantile regression analysis using age and sex as predictors to calculate the Quantile Regression Index (QRI) score for each individual's regional brain imaging metrics and then averages across the regional scores to generate an average tissue-specific score for each subject. The QRI_plot() function is used to plot QRI and generate the normative curves for individual measurements.
This package provides models to identify bimodally expressed genes from RNAseq data based on the Bimodality Index. SIBERG models the RNAseq data in the finite mixture modeling framework and incorporates mechanisms for dealing with RNAseq normalization. Three types of mixture models are implemented, namely, the mixture of log normal, negative binomial, or generalized Poisson distribution. See Tong et al. (2013) <doi:10.1093/bioinformatics/bts713>.
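For orientation, the Bimodality Index of a two-component mixture with a common standard deviation is commonly defined as BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma, where pi is the proportion of observations in one component. The Python sketch below only evaluates that formula from already-estimated parameters; the function name and argument names are assumptions for this example, and fitting the log normal, negative binomial, or generalized Poisson mixtures is not shown.

```python
import math


def bimodality_index(mu1, mu2, sigma, pi):
    """Evaluate BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma.

    Hypothetical helper: mu1 and mu2 are the component means, sigma the
    common standard deviation, and pi the mixing proportion of one component.
    """
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    delta = abs(mu1 - mu2) / sigma  # standardized distance between the means
    return math.sqrt(pi * (1.0 - pi)) * delta


bi_balanced = bimodality_index(0.0, 4.0, 1.0, 0.5)
bi_unbalanced = bimodality_index(0.0, 4.0, 1.0, 0.1)
```

The index is largest when the two components are well separated and of similar size, so `bi_balanced` exceeds `bi_unbalanced` here.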
Predicts the occurrence times (in day-of-year) of spring phenological events. Three methods are provided: the accumulated degree days (ADD) method, the accumulated days transferred to a standardized temperature (ADTS) method, and the accumulated developmental progress (ADP) method. See Shi et al. (2017a) <doi:10.1016/j.agrformet.2017.04.001> and Shi et al. (2017b) <doi:10.1093/aesa/sax063> for details.
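The general idea behind an accumulated degree days calculation can be sketched in a few lines of Python. This is a generic textbook version, not the package's code: the function name, the use of daily mean temperatures, and the parameters `base_temp`, `required_degree_days`, and `start_day` are all assumptions made for this example.

```python
def predict_event_day(daily_mean_temps, base_temp, required_degree_days, start_day=1):
    """Return the first day-of-year on which accumulated degree days reach
    the required total, or None if the total is never reached.

    Hypothetical helper: daily_mean_temps[0] is taken to be day-of-year 1,
    and only temperature above base_temp contributes to the accumulation.
    """
    accumulated = 0.0
    for day, temp in enumerate(daily_mean_temps, start=1):
        if day < start_day:
            continue  # accumulation has not started yet
        accumulated += max(temp - base_temp, 0.0)
        if accumulated >= required_degree_days:
            return day
    return None


temps = [2.0, 6.0, 8.0, 10.0, 12.0]  # toy daily mean temperatures
event_day = predict_event_day(temps, base_temp=5.0, required_degree_days=9.0)
never_reached = predict_event_day(temps, base_temp=5.0, required_degree_days=100.0)
```

With the toy series above, the accumulated totals are 0, 1, 4, and 9 degree days over the first four days, so the threshold of 9 is reached on day 4.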
Estimate vaccine efficacy (VE) using immunogenicity data. The inclusion of immunogenicity data in regression models can increase precision in VE. The methods are described in the publications "Elucidating vaccine efficacy using a correlate of protection, demographics, and logistic regression" and "Improving precision of vaccine efficacy evaluation using immune correlate data in time-to-event models" by Julie Dudasova, Zdenek Valenta, and Jeffrey R. Sachs (2024).
Hamiltonian Monte Carlo for both continuous and discontinuous posterior distributions with a customizable trajectory length termination criterion. See Nishimura et al. (2020) <doi:10.1093/biomet/asz083> for the original Discontinuous Hamiltonian Monte Carlo; Hoffman et al. (2014) <doi:10.48550/arXiv.1111.4246> and Betancourt (2016) <doi:10.48550/arXiv.1601.00225> for the definition of possible Hamiltonian Monte Carlo termination criteria.
This package provides a comprehensive suite of utilities for univariate continuous probability distributions and reliability models. Includes functions to compute the probability density, cumulative distribution, quantile, reliability, and hazard functions, along with random variate generation. Also offers diagnostic and model assessment tools such as Quantile-Quantile (Q-Q) and Probability-Probability (P-P) plots, the Kolmogorov-Smirnov goodness-of-fit test, and model selection criteria including the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Currently implements the following distributions: Burr X, Chen, Exponential Extension, Exponentiated Logistic, Exponentiated Weibull, Exponential Power, Flexible Weibull, Generalized Exponential, Gompertz, Generalized Power Weibull, Gumbel, Inverse Generalized Exponential, Linear Failure Rate, Log-Gamma, Logistic-Exponential, Logistic-Rayleigh, Log-log, Marshall-Olkin Extended Exponential, Marshall-Olkin Extended Weibull, and Weibull Extension distributions. Serves as a valuable resource for teaching and research in probability theory, reliability analysis, and applied statistical modeling.
ASURAT is software for single-cell data analysis. Using ASURAT, one can simultaneously perform unsupervised clustering and biological interpretation in terms of cell type, disease, biological process, and signaling pathway activity. Given single-cell RNA-seq data and knowledge-based databases, such as Cell Ontology, Gene Ontology, KEGG, etc., ASURAT transforms gene expression tables into original multivariate tables, termed sign-by-sample matrices (SSMs).
scider is a user-friendly R package providing functions to model the global density of cells in a slide of spatial transcriptomics data. All functions in the package are built based on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. After modelling density, the package allows for several downstream analyses, including colocalization analysis, boundary detection analysis and differential density analysis.
Signal-to-Noise applied to Gene Expression Experiments. Signal-to-noise ratios (SNRs) can be used as a proxy for quality of gene expression studies and samples. The SNRs can be calculated on any gene expression data set as long as gene IDs are available; no access to the raw data files is necessary. This makes it possible to flag problematic studies and samples in any public data set.
Various mRNA sequencing library preparation methods generate sequencing reads specifically from the transcript ends. Analyses that focus on quantification of isoform usage from such data can be aided by using truncated versions of transcriptome annotations, both at the alignment or pseudo-alignment stage and in downstream analysis. This package implements some convenience methods for readily generating such truncated annotations and their corresponding sequences.
Developed for use by those tasked with the routine detection, characterisation and quantification of discrete changes in air quality time-series, such as identifying the impacts of air quality policy interventions. The main functions use signal isolation then break-point/segment (BP/S) methods based on strucchange and segmented methods to detect and quantify change events (Ropkins & Tate, 2021, <doi:10.1016/j.scitotenv.2020.142374>).
This package provides a collection of tools that support data splitting, predictive modeling, and model evaluation. A typical use is to split a dataset into a training dataset and a test dataset, and then compare the data distributions of the two. Another feature is support for developing predictive models and comparing the performance of several of them, helping to select the best model.
Provides computational methods for immune cell-type subsets, including: (1) DCQ (Digital Cell Quantifier) to infer global dynamic changes in immune cell quantities within a complex tissue; and (2) VoCAL (Variation of Cell-type Abundance Loci), a deconvolution-based method that utilizes transcriptome data to infer the quantities of immune-cell types, and then uses these quantitative traits to uncover the underlying DNA loci.
Includes R functions for the estimation of tumor clone percentages for both SNP data and (whole) genome sequencing data. See Cheng, Y., Dai, J. Y., Paulson, T. G., Wang, X., Li, X., Reid, B. J., & Kooperberg, C. (2017). Quantification of multiple tumor clones using gene array and sequencing data. The Annals of Applied Statistics, 11(2), 967-991, <doi:10.1214/17-AOAS1026> for more details.
Implements three approaches for Gene-Environment (G-E) interaction analysis. All of them adopt the Sparse Group Minimax Concave Penalty to identify important G variables and G-E interactions, and simultaneously respect the hierarchy between main G and G-E interaction effects. All three approaches are available for Linear, Logistic, and Poisson regression. Also provides functions to mine and construct prior information for G variables and G-E interactions.
Implementation of two multi-criteria decision making (MCDM) methods: Intuitionistic Fuzzy Synthetic Measure (IFSM) and Intuitionistic Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (IFTOPSIS) for intuitionistic fuzzy data sets for multi-criteria decision making problems. References describing the methods: Jefmański (2020) <doi:10.1007/978-3-030-52348-0_4>; Jefmański, Roszkowska, Kusterka-Jefmańska (2021) <doi:10.3390/e23121636>.