This package provides a fast implementation of the SWAG algorithm for Generalized Linear Models. SWAG is a meta-learning procedure that combines screening and wrapper methods to find a set of extremely low-dimensional attribute combinations. The package then performs tests on the network of selected models to identify the variables that are highly predictive by using entropy-based network measures.
This package provides tools for modeling non-continuous linear responses of ecological communities to environmental data. The workflow consists of three steps: (1) data ordering (function OrdData()), (2) split-moving-window analysis (function SMW()) and (3) piecewise redundancy analysis (function pwRDA()). Relevant references include Cornelius and Reynolds (1991) <doi:10.2307/1941559> and Legendre and Legendre (2012, ISBN: 9780444538697).
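As a rough illustration of the split-moving-window idea named in the entry above, the following Python sketch slides an even-sized window along already ordered samples and measures how different the two halves are. It is not the package's SMW() function: the function name, the Euclidean distance between half-window centroids, and the toy data are all assumptions made for illustration.

```python
from math import dist

def split_moving_window(samples, window):
    """Dissimilarity between the two halves of a window slid along ordered samples.

    Assumes `window` is even and each sample is a list of abundances.
    """
    half = window // 2
    profile = []
    for start in range(len(samples) - window + 1):
        first = samples[start:start + half]
        second = samples[start + half:start + window]
        # Centroid (mean abundance per species) of each half-window.
        centroid_a = [sum(col) / half for col in zip(*first)]
        centroid_b = [sum(col) / half for col in zip(*second)]
        # Record the index of the first sample in the second half.
        profile.append((start + half, dist(centroid_a, centroid_b)))
    return profile

# Hypothetical community: two species, with a shift after the fourth sample.
community = [[9, 1], [8, 2], [9, 1], [8, 1], [1, 9], [2, 8], [1, 9], [2, 9]]
profile = split_moving_window(community, window=4)
breakpoint_index = max(profile, key=lambda item: item[1])[0]
```

The position with the largest dissimilarity is a candidate break in the ordered sequence; a real analysis would also assess such peaks against a null model.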
This package provides functions to perform split robust least angle regression. The approach first uses the least angle regression algorithm, together with robust estimates of the correlation between predictors, to split the variables into the models of an ensemble. An elastic net estimator is then applied to the selected predictors in each model using the imputed data from the Detecting Deviating Cells (DDC) method.
The goal of SAFEPG is to predict climate-related extreme losses by fitting a frequency-severity model. It improves predictive performance by introducing a sign-aligned regularization term, which ensures consistent signs for the coefficients across the frequency and severity components. This enhancement not only increases model accuracy but also enhances its interpretability, making it more suitable for practical applications in risk assessment.
This package infers the V genotype of an individual from immunoglobulin (Ig) repertoire sequencing data (AIRR-Seq, Rep-Seq). Includes detection of any novel alleles. This information is then used to correct existing V allele calls from among the sample sequences. Citations: Gadala-Maria, et al (2015) <doi:10.1073/pnas.1417683112>, Gadala-Maria, et al (2019) <doi:10.3389/fimmu.2019.00129>.
This package provides a Tcl/Tk Graphical User Interface (GUI) to display images that can be zoomed and panned using the mouse and keyboard shortcuts. tkImgR reads and writes different image formats (PPM/PGM, PNG and GIF) using the standard Tcl/Tk distribution (>=8.6), but other formats (JPEG, TIFF, CR2) can be handled using the tkImg package for Tcl/Tk.
This package provides functions for defining and conducting a time series prediction process including pre(post)processing, decomposition, modelling, prediction and accuracy assessment. The generated models and their prediction errors can be used for benchmarking other time series prediction methods and for creating a demand for the refinement of such methods. For this purpose, benchmark data from prediction competitions may be used.
An implementation of three procedures developed by John Tukey: FUNOP (FUll NOrmal Plot), FUNOR-FUNOM (FUll NOrmal Rejection-FUll NOrmal Modification), and vacuum cleaner. Combined, they provide a way to identify, treat, and analyze outliers in two-way (i.e., contingency) tables, as described in his landmark paper "The Future of Data Analysis", Tukey, John W. (1962) <https://www.jstor.org/stable/2237638>.
The vcfpp.h library (<https://github.com/Zilong-Li/vcfpp>) provides an easy-to-use C++ API of htslib, offering full functionality for manipulating Variant Call Format (VCF) files. The vcfppR package serves as the R bindings of the vcfpp.h library, enabling rapid processing of both compressed and uncompressed VCF files. Explore a range of powerful features for efficient VCF data manipulation.
Inverse normal transformation (INT) based genetic association testing. These tests are recommended for continuous traits with non-normally distributed residuals. INT-based tests robustly control the type I error in settings where standard linear regression does not, as when the residual distribution exhibits excess skew or kurtosis. Moreover, INT-based tests outperform standard linear regression in terms of power. These tests may be classified into two types. In direct INT (D-INT), the phenotype is itself transformed. In indirect INT (I-INT), phenotypic residuals are transformed. The omnibus test (O-INT) adaptively combines D-INT and I-INT into a single robust and statistically powerful approach. See McCaw ZR, Lane JM, Saxena R, Redline S, Lin X. "Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies" <doi:10.1111/biom.13214>.
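To make the transformation in the entry above concrete, here is a minimal Python sketch of a rank-based inverse normal transformation. It is not the package's code: the function name is hypothetical, and the offset of 3/8 (the Blom offset) is assumed here as a common default. Applying it to the phenotype corresponds to the direct variant; applying it to regression residuals corresponds to the indirect variant.

```python
from statistics import NormalDist

def rank_int(values, offset=0.375):
    """Rank-based inverse normal transform; ties receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # Map each rank to a probability in (0, 1), then to a normal quantile.
    return [
        standard_normal.inv_cdf((r - offset) / (n - 2 * offset + 1))
        for r in ranks
    ]

# Hypothetical skewed trait values.
skewed = [0.1, 0.2, 0.4, 1.5, 30.0]
transformed = rank_int(skewed)
```

The transformed values depend only on the ranks, so the extreme observation 30.0 is pulled in to the same distance from zero as the smallest observation.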
This package provides support for numerical and graphical summaries of RNA-Seq genomic read data. It provides within-lane normalization procedures to adjust for GC-content effect (or other gene-level effects) on read counts: loess robust local regression, global-scaling, and full-quantile normalization. It also provides between-lane normalization procedures to adjust for distributional differences between lanes (e.g., sequencing depth): global-scaling and full-quantile normalization.
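The two between-lane procedures named in the entry above can be sketched in a few lines of Python. This is an illustration under assumptions, not the package's implementation: the function names are hypothetical, the median is assumed as the scaling summary, every lane is assumed to have a nonzero median and the same number of genes, and ties are broken by position rather than averaged.

```python
from statistics import mean, median

def global_scale(lanes):
    """Scale each lane so its median equals the mean of all lane medians."""
    medians = [median(lane) for lane in lanes]
    target = mean(medians)
    return [[x * target / m for x in lane] for lane, m in zip(lanes, medians)]

def full_quantile(lanes):
    """Give every lane the same distribution: the per-rank mean across lanes."""
    n = len(lanes[0])
    sorted_lanes = [sorted(lane) for lane in lanes]
    reference = [mean(s[rank] for s in sorted_lanes) for rank in range(n)]
    result = []
    for lane in lanes:
        # Genes ordered from lowest to highest count within this lane.
        order = sorted(range(n), key=lambda gene: lane[gene])
        normalized = [0.0] * n
        for rank, gene in enumerate(order):
            normalized[gene] = reference[rank]
        result.append(normalized)
    return result

# Hypothetical counts: four genes in two lanes of different depth.
lanes = [[10, 20, 30, 40], [100, 400, 200, 300]]
scaled = global_scale(lanes)
quantile_normalized = full_quantile(lanes)
```

Global scaling preserves the shape of each lane's distribution and only matches one summary statistic, whereas full-quantile normalization forces all lanes onto an identical distribution while keeping each gene's rank within its lane.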
satuRn provides a framework for performing differential transcript usage analyses. The package consists of three main functions. The first function, fitDTU, fits quasi-binomial generalized linear models that model transcript usage in different groups of interest. The second function, testDTU, tests for differential usage of transcripts between groups of interest. Finally, plotDTU visualizes the usage profiles of transcripts in groups of interest.
This package is used to detect combinations of genomic coordinates falling within a user-defined window size, along with a user-defined overlap between identified neighboring clusters. It can be used for genomic data where the clusters are built on a specific chromosome or specific strand. Clustering can be performed with a "greedy" option, thus allowing the presence of additional sites within the allowed window size.
This package provides a graphics device for R that is accessible via network protocols. This package was created to make it easier to embed live R graphics in integrated development environments and other applications. The included HTML/JavaScript client (plot viewer) aims to provide a better overall user experience when dealing with R graphics. The device asynchronously serves graphics via HTTP and WebSockets.
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
`orthos` decomposes RNA-seq contrasts, for example obtained from a gene knock-out or compound treatment experiment, into unspecific and experiment-specific components. Original and decomposed contrasts can be efficiently queried against a large database of contrasts (derived from ARCHS4, https://maayanlab.cloud/archs4/) to identify similar experiments. `orthos` furthermore provides plotting functions to visualize the results of such a search for similar contrasts.
This package provides methods to efficiently detect competitive endogenous RNA (ceRNA) interactions between two genes. Such interactions are mediated by one or several miRNAs, so both gene and miRNA expression data for a larger number of samples are needed as input. The SPONGE package now also includes spongEffects: ceRNA modules offer patient-specific insights into the miRNA regulatory landscape.
R package for differential composition and variability analysis in single-cell RNA sequencing, CyTOF, and microbiome data. Provides robust Bayesian modeling with outlier detection, random effects, and advanced statistical methods for cell type proportion analysis. Features include probabilistic outlier identification, mixed-effect modeling, differential variability testing, and visualization tools. Intended applications include cancer research, immunology, developmental biology, and single-cell genomics.
For a binary classification, the adjusted sensitivity and specificity are measured for a given fixed threshold. If the threshold for either sensitivity or specificity is not given, the crossing point between the sensitivity and specificity curves is returned. For bootstrap procedures, mean and CI bootstrap values of sensitivity, specificity, and the crossing point between sensitivity and specificity, as well as AUC and AUCPR, can be evaluated.
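The quantities in the entry above can be illustrated with a small Python sketch. It computes plain (unadjusted) sensitivity and specificity at a fixed threshold and locates the observed score where the two curves come closest. The adjustment the package applies is not reproduced here, and the function names, the "score >= threshold is positive" rule, and the toy data are assumptions.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold predicts positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def crossing_threshold(scores, labels):
    """Observed score at which |sensitivity - specificity| is smallest."""
    def gap(threshold):
        sensitivity, specificity = sens_spec(scores, labels, threshold)
        return abs(sensitivity - specificity)
    return min(sorted(set(scores)), key=gap)

# Hypothetical classifier scores with true labels (1 = positive).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
sensitivity, specificity = sens_spec(scores, labels, threshold=0.5)
crossing = crossing_threshold(scores, labels)
```

Searching only over observed scores keeps the sketch simple; an interpolated crossing point between neighbouring scores would be a natural refinement.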
The agghoo procedure is an alternative to usual cross-validation. Instead of choosing the best model trained on V subsamples, it determines a winner model for each subsample, and then aggregates the V outputs. For the details, see "Aggregated hold-out" by Guillaume Maillard, Sylvain Arlot, Matthieu Lerasle (2021) <arXiv:1909.04890> published in Journal of Machine Learning Research 22(20):1--55.
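The procedure described in the entry above can be sketched as follows in Python: for each of V random train/hold-out splits, every candidate model is fitted on the training part, the candidate with the lowest hold-out error wins, and the V winners are then averaged. This is an illustrative sketch, not the package's interface. The one-dimensional k-nearest-neighbour candidates, the squared-error criterion, the 80/20 split, and all names are assumptions.

```python
import random
from statistics import mean

def knn_fit(train, k):
    """Return a 1-D k-nearest-neighbour regressor trained on (x, y) pairs."""
    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def agghoo(data, ks, n_splits=5, train_fraction=0.8, seed=0):
    """Aggregate the hold-out winners of n_splits random splits."""
    rng = random.Random(seed)
    winners = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(train_fraction * len(shuffled))
        train, held_out = shuffled[:cut], shuffled[cut:]

        def holdout_error(k):
            # Fit on the training part, score on the held-out part.
            model = knn_fit(train, k)
            return mean((model(x) - y) ** 2 for x, y in held_out)

        best_k = min(ks, key=holdout_error)
        winners.append(knn_fit(train, best_k))
    # Aggregate by averaging the predictions of the per-split winners.
    return lambda x: mean(model(x) for model in winners)

# Hypothetical noise-free data on the line y = 2x + 1.
data = [(x / 10, 2 * (x / 10) + 1) for x in range(30)]
predictor = agghoo(data, ks=[1, 3, 5])
estimate = predictor(1.0)
```

Unlike ordinary cross-validation, no single candidate is refitted on the full data; the final predictor is the average of models that were each trained on one subsample. For classification, a majority vote would replace the average.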
This package implements the Bayesian Synthetic Control method for causal inference in comparative case studies. This package provides tools for estimating treatment effects in settings with a single treated unit and multiple control units, allowing for uncertainty quantification and flexible modeling of time-varying effects. The methodology is based on the paper by Vives and Martinez (2022) <doi:10.48550/arXiv.2206.01779>.
Bindings for additional tree-based model engines for use with the parsnip package. Models include gradient boosted decision trees with LightGBM (Ke et al., 2017), conditional inference trees and conditional random forests with partykit (Hothorn and Zeileis, 2015; Hothorn et al., 2006 <doi:10.1198/106186006X133933>), and accelerated oblique random forests with aorsf (Jaeger et al., 2022 <doi:10.5281/zenodo.7116854>).
This package contains functions to estimate a smoothed and a non-smoothed (empirical) time-dependent receiver operating characteristic curve, the corresponding area under the receiver operating characteristic curve, and the optimal cutoff point for right- and interval-censored survival data. See Beyene and El Ghouch (2020) <doi:10.1002/sim.8671> and Beyene and El Ghouch (2022) <doi:10.1002/bimj.202000382>.
This package provides a framework for specifying and running flexible linear-time reachability-based algorithms for graphical causal inference. Rule tables are used to encode and customize the reachability algorithm to typical causal and probabilistic reasoning tasks such as finding d-connected nodes or more advanced applications. For more information, see Wienöbst, Weichwald and Henckel (2025) <doi:10.48550/arXiv.2506.15758>.