This package provides a toolbox for sparse contrastive principal component analysis (scPCA) of high-dimensional biological data. scPCA combines the stability and interpretability of sparse PCA with contrastive PCA's ability to disentangle biological signal from unwanted variation through the use of control data. The package also implements and extends cPCA.
This package provides a pipeline for analysis of GC-MS data acquired in selected ion monitoring (SIM) mode. The tool also provides guidance in choosing appropriate fragments for the targets of interest by using an optimization algorithm that considers overlapping peaks in a user-provided library.
The tigre package implements our methodology of Gaussian process differential equation models for analysis of gene expression time series from single input motif networks. The package can be used for inferring unobserved transcription factor (TF) protein concentrations from expression measurements of known target genes, or for ranking candidate targets of a TF.
This package provides a method for the Bayesian functional linear regression model (scalar-on-function), including two estimators of the coefficient function and an estimator of its support. A representation of the posterior distribution is also available. Grollemund P-M., Abraham C., Baragatti M., Pudlo P. (2019) <doi:10.1214/18-BA1095>.
An interface to explore, analyze, and visualize droplet digital PCR (ddPCR) data in R. This is the first non-proprietary software for analyzing two-channel ddPCR data. An interactive tool was also created and is available online to facilitate this analysis for anyone who is not comfortable with using R.
Fast computation of the distance covariance 'dcov' and distance correlation 'dcor'. The computational cost is only O(n log(n)) for the distance correlation (see Chaudhuri, Hu (2019) <arXiv:1810.11332> <doi:10.1016/j.csda.2019.01.016>). The functions are written entirely in C++ to speed up the computation.
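For orientation, the quantity being computed can be sketched with the textbook O(n^2) definition of the sample distance correlation for two univariate samples. This is not the package's O(n log(n)) algorithm and not its C++ code; the function names below are hypothetical and chosen only for this illustration.

```python
import math


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute distances."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_means = [sum(r) / n for r in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def dcov_sq(x, y):
    """Squared sample distance covariance (naive O(n^2) V-statistic)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2


def dcor(x, y):
    """Sample distance correlation, a value in [0, 1]."""
    denom = math.sqrt(dcov_sq(x, x) * dcov_sq(y, y))
    if denom == 0:
        return 0.0  # one of the samples is constant
    return math.sqrt(max(dcov_sq(x, y), 0.0) / denom)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2 * v + 1 for v in x]  # exact linear relationship
print(dcor(x, y))
```

An exact linear relationship gives a distance correlation of 1 up to floating-point error. The quadratic cost of this naive form is what the faster algorithm cited above avoids.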
Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.
An easy-to-use web client/wrapper for the Figma API <https://www.figma.com/developers/api>. It allows you to bring all data from a Figma file to your R session. This includes the data of all objects that you have drawn in this file, and their respective canvas/page metadata.
Improved version of the GRIN software that streamlines its use in practice for analyzing genomic lesion data, accelerates its computation, and expands its analysis capabilities to answer additional scientific questions, including a rigorous evaluation of the association of genomic lesions with RNA expression. Pounds, Stan, et al. (2013) <DOI:10.1093/bioinformatics/btt372>.
This package provides a ggplot2 geom and position for visualizing brain region data on cortical, subcortical, and white matter tract atlases. Brain atlas geometries are stored as simple features ('sf'), enabling seamless integration with the ggplot2 ecosystem including faceting, custom scales, and themes. Mowinckel & Vidal-Piñeiro (2020) <doi:10.1177/2515245920928009>.
Set of routines for influence diagnostics by using case-deletion in ordinary least squares, nonlinear regression [Ross (1987). <doi:10.2307/3315198>], ridge estimation [Walker and Birch (1988). <doi:10.1080/00401706.1988.10488370>] and least absolute deviations (LAD) regression [Sun and Wei (2004). <doi:10.1016/j.spl.2003.08.018>].
This package provides a streamlined cross-referencing system for R Markdown documents generated with knitr. R Markdown is an authoring format for generating dynamic content from R. kfigr provides a hook for anchoring code chunks and a function to cross-reference document elements generated from said chunks, e.g. figures and tables.
This package provides utilities to detect common data leakage patterns including train/test contamination, temporal leakage, and data duplication, enhancing model reliability and reproducibility in machine learning workflows. Generates diagnostic reports and visual summaries to support data validation. Methods based on best practices from Hastie, Tibshirani, and Friedman (2009, ISBN:978-0387848570).
Common mass spectrometry tools described in John Roboz (2013) <doi:10.1201/b15436>. It allows checking element isotopes; calculating (isotope-labelled) exact monoisotopic mass, m/z values, and mass accuracy; inspecting possible contaminant mass peaks; and examining possible adducts in electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI) ion sources.
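The mass calculations mentioned above follow standard formulas, which can be sketched as below. This is a minimal illustration under stated assumptions, not the package's API: the function names and the small isotope table are hypothetical, only protonated ions [M+zH]z+ are handled, and the masses are rounded reference values for the most abundant isotope of each element.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
}
PROTON_MASS = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(composition):
    """Sum isotope masses for a composition such as {"C": 6, "H": 12, "O": 6}."""
    return sum(MONOISOTOPIC[el] * count for el, count in composition.items())


def mz_protonated(mass, charge=1):
    """m/z of the [M + zH]z+ ion for a neutral monoisotopic mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON_MASS) / charge


def mass_accuracy_ppm(measured, theoretical):
    """Mass accuracy as a signed error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


glucose = monoisotopic_mass({"C": 6, "H": 12, "O": 6})
print(round(glucose, 4))                 # neutral monoisotopic mass
print(round(mz_protonated(glucose), 4))  # [M+H]+ m/z
```

Other adducts (for example sodium or ammonium) would replace the proton mass with the adduct mass minus one electron mass per charge; they are omitted here to keep the sketch short.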
Multiple contrast tests and simultaneous confidence intervals based on normal approximation, with implementations for binomial proportions in a 2xk setting (risk difference and odds ratio), poly-3-adjusted tumour rates, biodiversity indices (multinomial data) and expected values under lognormal assumption. Approximate power calculation for multiple contrast tests of binomial and Gaussian data.
Access the Red List of Montane Tree Species of the Tropical Andes, Tejedor Garavito et al. (2014, ISBN:978-1-905164-60-8). This package allows users to search for globally threatened tree species within the Andean montane forests, including cloud forests and seasonal (wet) forests above 1500 m a.s.l.
Automatic time series modelling with neural networks. Allows fully automatic, semi-manual or fully manual specification of networks. For details of the specification methodology see: (i) Crone and Kourentzes (2010) <doi:10.1016/j.neucom.2010.01.017>; and (ii) Kourentzes et al. (2014) <doi:10.1016/j.eswa.2013.12.011>.
QuantLib bindings are provided for R using Rcpp via an evolved version of the initial header-only Quantuccia project offering a subset of QuantLib (now maintained separately just for the calendaring subset). See the included file AUTHORS for a full list of contributors to QuantLib (and hence also Quantuccia).
Procedure to optimally split a dataset for training and testing. SPlit is based on the method of support points, which is independent of modeling methods. Please see Joseph and Vakayil (2021) <doi:10.1080/00401706.2021.1921037> for details. This work is supported by U.S. National Science Foundation grant DMREF-1921873.
This package provides a collection of functions for processing raw data from Stream Temperature, Intermittency, and Conductivity (STIC) loggers. STICr (pronounced "sticker") includes functions for tidying, calibrating, classifying, and doing quality checks on data from STIC sensors. Some package functionality is described in Wheeler/Zipper et al. (2023) <doi:10.31223/X5636K>.
Computes value at risk and expected shortfall, two of the most popular measures of financial risk, for over one hundred parametric distributions, including all commonly known distributions. Also computed are the corresponding probability density function and cumulative distribution function. See Chan, Nadarajah and Afuecheta (2015) <doi:10.1080/03610918.2014.944658> for more details.
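As a worked illustration of the two risk measures, the sketch below evaluates them in closed form for a normally distributed loss. It assumes the loss-distribution convention (value at risk is the upper quantile of the loss; expected shortfall is the mean loss beyond that quantile), which may differ from the convention used by the package. The function names are hypothetical and are not the package's API.

```python
from statistics import NormalDist


def normal_var(alpha, mu=0.0, sigma=1.0):
    """Value at risk: the alpha-quantile of a Normal(mu, sigma) loss."""
    return NormalDist(mu, sigma).inv_cdf(alpha)


def normal_es(alpha, mu=0.0, sigma=1.0):
    """Expected shortfall: mean loss given that the loss exceeds the VaR.

    For a normal loss this has the closed form
    mu + sigma * phi(z_alpha) / (1 - alpha),
    where phi is the standard normal density and z_alpha its alpha-quantile.
    """
    std = NormalDist()
    z = std.inv_cdf(alpha)
    return mu + sigma * std.pdf(z) / (1.0 - alpha)


# 95% risk measures for a standard normal loss
print(round(normal_var(0.95), 4))
print(round(normal_es(0.95), 4))
```

Expected shortfall is never smaller than value at risk at the same level, because it averages over the tail beyond the quantile.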
High-level functions to render LaTeX fragments in plots, including as labels and data symbols in ggplot2 plots, plus low-level functions to author LaTeX fragments (to produce LaTeX documents), typeset LaTeX documents (to produce DVI files), read DVI files (to produce "DVI" objects), and render "DVI" objects.
Regularised discriminant analysis functions. The classical regularised discriminant analysis proposed by Friedman in 1989, including cross-validation, of which the linear and quadratic discriminant analyses are special cases. Further, the regularised maximum likelihood linear discriminant analysis, including cross-validation. References: Friedman J.H. (1989): "Regularized Discriminant Analysis". Journal of the American Statistical Association 84(405): 165--175. <doi:10.2307/2289860>. Friedman J., Hastie T. and Tibshirani R. (2009). "The elements of statistical learning", 2nd edition. Springer, Berlin. <doi:10.1007/978-0-387-84858-7>. Tsagris M., Preston S. and Wood A.T.A. (2016). "Improved classification for compositional data using the alpha-transformation". Journal of Classification, 33(2): 243--261. <doi:10.1007/s00357-016-9207-5>.
Implementation of kernelized score functions and other semi-supervised learning algorithms for node label ranking to analyze biomolecular networks. RANKS can be easily applied to a large set of relevant problems in computational biology, ranging from automatic protein function prediction to gene-disease prioritization and drug repositioning, and more generally to any bioinformatics problem that can be formalized as a node label ranking problem in a graph. The modular nature of the implementation allows users to experiment with different score functions and kernels, to easily compare the results with baseline network-based methods such as label propagation and random walk algorithms, and to extend the algorithmic scheme by adding novel user-defined score functions and kernels.