This package provides a collection of tools that support data splitting, predictive modeling, and model evaluation. A typical function is to split a dataset into a training dataset and a test dataset. Then compare the data distribution of the two datasets. Another feature is to support the development of predictive models and to compare the performance of several predictive models, helping to select the best model.
Developed for use by those tasked with the routine detection, characterisation and quantification of discrete changes in air quality time-series, such as identifying the impacts of air quality policy interventions. The main functions use signal isolation then break-point/segment (BP/S) methods based on strucchange and segmented methods to detect and quantify change events (Ropkins & Tate, 2021, <doi:10.1016/j.scitotenv.2020.142374>).
Provided are Computational methods for Immune Cell-type Subsets, including:(1) DCQ (Digital Cell Quantifier) to infer global dynamic changes in immune cell quantities within a complex tissue; and (2) VoCAL (Variation of Cell-type Abundance Loci) a deconvolution-based method that utilizes transcriptome data to infer the quantities of immune-cell types, and then uses these quantitative traits to uncover the underlying DNA loci.
Realize three approaches for Gene-Environment interaction analysis. All of them adopt Sparse Group Minimax Concave Penalty to identify important G variables and G-E interactions, and simultaneously respect the hierarchy between main G and G-E interaction effects. All the three approaches are available for Linear, Logistic, and Poisson regression. Also realize to mine and construct prior information for G variables and G-E interactions.
Processing, analysis and visualization of Hydrogen Deuterium eXchange monitored by Mass Spectrometry experiments (HDX-MS). HaDeX2 introduces a new standardized and reproducible workflow for the analysis of the HDX-MS data, including uncertainty propagation, data aggregation and visualization on 3D structure. Additionally, it covers data exploration, quality control and generation of publication-quality figures. All functionalities are also available in the accompanying shiny app.
Implementation of two multi-criteria decision making methods (MCDM): Intuitionistic Fuzzy Synthetic Measure (IFSM) and Intuitionistic Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (IFTOPSIS) for intuitionistic fuzzy data sets for multi-criteria decision making problems. References describing the methods: JefmaÅ ski (2020) <doi:10.1007/978-3-030-52348-0_4>; JefmaÅ ski, Roszkowska, Kusterka-JefmaÅ ska (2021) <doi:10.3390/e23121636>.
General purpose TIFF file I/O for R users. Currently the only such package with read and write support for TIFF files with floating point (real-numbered) pixels, and the only package that can correctly import TIFF files that were saved from ImageJ and write TIFF files than can be correctly read by ImageJ <https://imagej.net/ij/>. Also supports text image I/O.
This package provides tools for creating and using lenses to simplify data manipulation. Lenses are composable getter/setter pairs for working with data in a purely functional way. Inspired by the Haskell library lens (Kmett, 2012) <https://hackage.haskell.org/package/lens>. For a fairly comprehensive (and highly technical) history of lenses please see the lens wiki <https://github.com/ekmett/lens/wiki/History-of-Lenses>.
Linear Liu regression coefficient's estimation and testing with different Liu related measures such as MSE, R-squared etc. REFERENCES i. Akdeniz and Kaciranlar (1995) <doi:10.1080/03610929508831585> ii. Druilhet and Mom (2008) <doi:10.1016/j.jmva.2006.06.011> iii. Imdadullah, Aslam, and Saima (2017) iv. Liu (1993) <doi:10.1080/03610929308831027> v. Liu (2001) <doi:10.1016/j.jspi.2010.05.030>.
This package provides functions for fitting various models to capture-recapture data including mixed-effects Cormack-Jolly-Seber(CJS) and multistate models and the multi-variate state model structure for survival estimation and POPAN structured Jolly-Seber models for abundance estimation. There are also Hidden Markov model (HMM) implementations of CJS and multistate models with and without state uncertainty and a simulation capability for HMM models.
This package provides a client that grants access to the power of the ohsome API from R. It lets you analyze the rich data source of the OpenStreetMap (OSM) history. You can retrieve the geometry of OSM data at specific points in time, and you can get aggregated statistics on the evolution of OSM elements and specify your own temporal, spatial and/or thematic filters.
The online principal component regression method can process the online data set. OPCreg implements the online principal component regression method, which is specifically designed to process online datasets efficiently. This method is particularly useful for handling large-scale, streaming data where traditional batch processing methods may be computationally infeasible.The philosophy of the package is described in Guo (2025) <doi:10.1016/j.physa.2024.130308>.
This package provides functions for unconditional and conditional quantiles. These include methods for transformation-based quantile regression, quantile-based measures of location, scale and shape, methods for quantiles of discrete variables, quantile-based multiple imputation, restricted quantile regression, directional quantile classification, and quantile ratio regression. A vignette is given in Geraci (2016, The R Journal) <doi:10.32614/RJ-2016-037> and included in the package.
Semiparametric and parametric estimation of INAR models including a finite sample refinement (Faymonville et al. (2022) <doi:10.1007/s10260-022-00655-0>) for the semiparametric setting introduced in Drost et al. (2009) <doi:10.1111/j.1467-9868.2008.00687.x>, different procedures to bootstrap INAR data (Jentsch, C. and Weià , C.H. (2017) <doi:10.3150/18-BEJ1057>) and flexible simulation of INAR data.
Some tools for cleaning up messy Excel files to be suitable for R. People who have been working with Excel for years built more or less complicated sheets with names, characters, formats that are not homogeneous. To be able to use them in R nowadays, we built a set of functions that will avoid the majority of importation problems and keep all the data at best.
Carries out analyses of two-way tables with one observation per cell, together with graphical displays for an additive fit and a diagnostic plot for removable non-additivity via a power transformation of the response. It implements Tukey's Exploratory Data Analysis (1973) <ISBN: 978-0201076165> methods, including a 1-degree-of-freedom test for row*column non-additivity', linear in the row and column effects.
DegCre generates associations between differentially expressed genes (DEGs) and cis-regulatory elements (CREs) based on non-parametric concordance between differential data. The user provides GRanges of DEG TSS and CRE regions with differential p-value and optionally log-fold changes and DegCre returns an annotated Hits object with associations and their calculated probabilities. Additionally, the package provides functionality for visualization and conversion to other formats.
In metabolic flux experiments tracer molecules (often glucose containing labelled carbon) are incorporated in compounds measured using mass spectrometry. The mass isotopologue distributions of these compounds needs to be corrected for natural abundance of labelled carbon and other effects, which are specific on the compound and ionization technique applied. This package provides functions to correct such effects in gas chromatography atmospheric pressure chemical ionization mass spectrometry analyses.
Perform censored quantile regression of Huang (2010) <doi:10.1214/09-AOS771>, and restore monotonicity respecting via adaptive interpolation for dynamic regression of Huang (2017) <doi:10.1080/01621459.2016.1149070>. The monotonicity-respecting restoration applies to general dynamic regression models including (uncensored or censored) quantile regression model, additive hazards model, and dynamic survival models of Peng and Huang (2007) <doi:10.1093/biomet/asm058>, among others.
This package provides a simple interface for multivariate correlation analysis that unifies various classical statistical procedures including t-tests, tests in univariate and multivariate linear models, parametric and nonparametric tests for correlation, Kruskal-Wallis tests, common approximate versions of Wilcoxon rank-sum and signed rank tests, chi-squared tests of independence, score tests of particular hypotheses in generalized linear models, canonical correlation analysis and linear discriminant analysis.
Decorrelates a set of summary statistics (i.e., Z-scores or P-values per SNP) via Decorrelation by Orthogonal Transformation (DOT) approach and performs gene-set analyses by combining transformed statistic values; operations are performed with algorithms that rely only on the association summary results and the linkage disequilibrium (LD). For more details on DOT and its power, see Olga (2020) <doi:10.1371/journal.pcbi.1007819>.
Replication methods to compute some basic statistic operations (means, standard deviations, frequency tables, percentiles, mean comparisons using weighted effect coding, generalized linear models, and linear multilevel models) in complex survey designs comprising multiple imputed or nested imputed variables and/or a clustered sampling structure which both deserve special procedures at least in estimating standard errors. See the package documentation for a more detailed description along with references.
Lightweight utilities to estimate autoregressive (AR) and autoregressive moving average (ARMA) noise models from residuals and apply matched generalized least squares to whiten functional magnetic resonance imaging (fMRI) design and data matrices. The ARMA estimator follows a classic 1982 approach <doi:10.1093/biomet/69.1.81>, and a restricted AR family mirrors workflows described by Cox (2012) <doi:10.1016/j.neuroimage.2011.08.056>.
Fair machine learning regression models which take sensitive attributes into account in model estimation. Currently implementing Komiyama et al. (2018) <http://proceedings.mlr.press/v80/komiyama18a/komiyama18a.pdf>, Zafar et al. (2019) <https://www.jmlr.org/papers/volume20/18-262/18-262.pdf> and my own approach from Scutari, Panero and Proissl (2022) <doi:10.1007/s11222-022-10143-w> that uses ridge regression to enforce fairness.