This package is for searching for datasets in EMBL-EBI Expression Atlas, and downloading them into R for further analysis. Each Expression Atlas dataset is represented as a SimpleList object with one element per platform. Sequencing data is contained in a SummarizedExperiment object, while microarray data is contained in an ExpressionSet or MAList object.
This package detects naive associations between omics features and metadata in cross-sectional datasets using non-parametric tests. In a second step, confounding effects between metadata associated with the same omics feature are detected and labeled using nested post-hoc model comparison tests. The generated output can be graphically summarized using the built-in plotting function.
Set of tools to help interested researchers to build hospital networks from data on hospitalized patients transferred between hospitals. Methods provided have been used in Donker T, Wallinga J, Grundmann H. (2010) <doi:10.1371/journal.pcbi.1000715>, and Nekkab N, Crépey P, Astagneau P, Opatowski L, Temime L. (2020) <doi:10.1038/s41598-020-71212-6>.
This package provides statistical methods for the design and analysis of a calibration study, which aims to calibrate measurements obtained using two different methods. The package includes sample size calculation, sample selection, regression analysis with errors in measurements and change-point regression. The method is described in Tian, Durazo-Arvizu, Myers, et al. (2014) <DOI:10.1002/sim.6235>.
This package is a gene/phenotype prioritization tool that utilizes a multiplex heterogeneous gene-phenotype network. PhenoGeneRanker allows multi-layer gene and phenotype networks. It also calculates empirical p-values of gene/phenotype ranking using random stratified sampling of genes/phenotypes based on their connectivity degree in the network. See https://dl.acm.org/doi/10.1145/3307339.3342155.
Spatial allelic expression counts from Combs & Fraser (2018), compiled into a SummarizedExperiment object. This package contains data of allelic expression counts of spatial slices of a fly embryo, a Drosophila melanogaster x Drosophila simulans cross. See the CITATION file for the data source, and the associated script for how the object was constructed from publicly available data.
Run Leslie Matrix models using Monte Carlo simulations for any specified shark species. This package was developed during the publication of Smart, JJ, White, WT, Baje, L, et al. (2020) "Can multi-species shark longline fisheries be managed sustainably using size limits? Theoretically, yes. Realistically, no". J Appl Ecol. 2020; 57; 1847–1860. <doi:10.1111/1365-2664.13659>.
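The general idea of a Monte Carlo Leslie matrix analysis can be sketched as follows. This is a minimal illustration in Python, not the package's R interface: the function names, the three age classes, and the survival and fecundity values are all hypothetical assumptions chosen only to make the example run. Each simulation draws perturbed vital rates, builds a Leslie matrix, and estimates the population growth rate (the dominant eigenvalue) by power iteration.

```python
import random


def leslie_matrix(fecundity, survival):
    """Leslie matrix: fecundities in the first row, survival on the sub-diagonal."""
    n = len(fecundity)
    matrix = [[0.0] * n for _ in range(n)]
    matrix[0] = [float(f) for f in fecundity]
    for age, s in enumerate(survival):  # survival has n - 1 entries
        matrix[age + 1][age] = float(s)
    return matrix


def growth_rate(matrix, iterations=500):
    """Estimate the dominant eigenvalue (lambda) by power iteration."""
    n = len(matrix)
    vec = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        lam = sum(nxt)  # vec sums to 1, so this sum converges to lambda
        vec = [x / lam for x in nxt]
    return lam


def simulate(n_sims=200, seed=1):
    """Monte Carlo draws of lambda under hypothetical, illustrative vital rates."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sims):
        # Assumed means and spreads; real analyses would use species-specific data.
        survival = [min(1.0, max(0.0, rng.gauss(mu, 0.05))) for mu in (0.6, 0.8)]
        fecundity = [0.0] + [max(0.0, rng.gauss(mu, 0.2)) for mu in (1.5, 2.0)]
        rates.append(growth_rate(leslie_matrix(fecundity, survival)))
    return rates


if __name__ == "__main__":
    rates = sorted(simulate())
    print("median lambda:", rates[len(rates) // 2])
```

A growth rate above 1 indicates a growing population under the drawn rates; the spread of the simulated values reflects the uncertainty placed on the inputs.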
Location-Scale based distributions parameterized in terms of mean, standard deviation, skew and shape parameters and estimation using automatic differentiation. Distributions include the Normal, Student and GED as well as their skewed variants ('Fernandez and Steel'), the Johnson SU, and the Generalized Hyperbolic. Also included is the semi-parametric piecewise distribution ('spd') with Pareto tails and kernel interior.
An interactive introduction to Life Data Analysis that depends on WeibullR by David Silkworth and Jurgen Symynck (2022) <https://CRAN.R-project.org/package=WeibullR>, an R package for Weibull Analysis, and learnr by Garrick Aden-Buie et al. (2023) <https://CRAN.R-project.org/package=learnr>, a framework for building interactive learning modules in R.
This package provides computational tools for working with the Extended Laplace distribution, including the probability density function, cumulative distribution function, quantile function, random variate generation based on convolution with Uniform noise and the quantile-quantile plot. Useful for modeling contaminated Laplace data and other applications in robust statistics. See Saah and Kozubowski (2025) <doi:10.1016/j.cam.2025.116588>.
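Random variate generation by convolution can be illustrated with a short Python sketch. It assumes one plausible parameterization, a Laplace variate with location `mu` and scale `sigma` plus independent Uniform noise on `[-delta, delta]`; the parameterization used by Saah and Kozubowski may differ, and the function name `r_extended_laplace` is hypothetical rather than part of the package.

```python
import random


def r_extended_laplace(n, mu=0.0, sigma=1.0, delta=0.5, seed=None):
    """Draw n variates as Laplace(mu, sigma) + Uniform(-delta, delta).

    The parameterization is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # The difference of two unit exponentials is a standard Laplace variate.
        laplace = mu + sigma * (rng.expovariate(1.0) - rng.expovariate(1.0))
        noise = rng.uniform(-delta, delta)
        draws.append(laplace + noise)
    return draws


if __name__ == "__main__":
    sample = r_extended_laplace(10000, mu=2.0, seed=42)
    print("sample mean:", sum(sample) / len(sample))
```

Because both components are symmetric about their centres, the sample mean should sit close to `mu`, which gives a quick sanity check on the generator.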
Necessary functions for optimized automated evaluation of the number and parameters of Gaussian mixtures in one-dimensional data. Various methods are available for parameter estimation and for determining the number of modes in the mixture. A detailed description of the methods can be found in Lotsch, J., Malkusch, S. and A. Ultsch. (2022) <doi:10.1016/j.imu.2022.101113>.
Extracted data from 369 TCGA Head and Neck Cancer DNA methylation samples. The extracted data serve as an example dataset for the package shinyMethyl. Original samples are from 450k methylation arrays, and were obtained from The Cancer Genome Atlas (TCGA). 310 samples are from tumor, 50 are matched normals and 9 are technical replicates of a control cell line.
This package provides publicly available data from The Cancer Genome Atlas (TCGA) as MultiAssayExperiment objects. MultiAssayExperiment integrates multiple assays (e.g., RNA-seq, copy number, mutation, microRNA, protein, and others) with clinical / pathological data. It also links assay barcodes with patient identifiers, enabling harmonized subsetting of rows (features) and columns (patients / samples) across the entire multi-'omics experiment.
Curates biological sequences massively, quickly, without errors and without internet connection. Biological sequence curation is performed by aligning the forward and / or reverse primers or ends of cloning vectors with the sequences to be cleaned. After the alignment, new subsequences are generated without the biological fragments not desired by the user. Pozzi et al (2020) <doi:10.1007/s00438-020-01671-z>.
This package provides weighted versions of several metrics and performance measures used in machine learning, including average unit deviances of the Bernoulli, Tweedie, Poisson, and Gamma distributions, see Jorgensen B. (1997, ISBN: 978-0412997112). The package also contains a weighted version of generalized R-squared, see e.g. Cohen, J. et al. (2002, ISBN: 978-0805822236). Furthermore, dplyr chains are supported.
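As an illustration of a weighted average unit deviance, the Python sketch below computes the Poisson case, using the unit deviance d(y, mu) = 2 * (y * log(y / mu) - (y - mu)) with the convention that y * log(y / mu) is 0 when y = 0. The function names are hypothetical and are not the package's R interface.

```python
import math


def weighted_mean(values, weights=None):
    """Weighted arithmetic mean; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def poisson_unit_deviance(y, mu):
    """Unit deviance of the Poisson distribution for one observation (mu > 0)."""
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (log_term - (y - mu))


def weighted_poisson_deviance(actual, predicted, weights=None):
    """Weighted average of Poisson unit deviances."""
    deviances = [poisson_unit_deviance(y, mu) for y, mu in zip(actual, predicted)]
    return weighted_mean(deviances, weights)


if __name__ == "__main__":
    print(weighted_poisson_deviance([0, 1], [1, 1], weights=[1, 3]))
```

With weights of 1 and 3, the observation that is predicted exactly contributes a deviance of 0 with three times the weight of the mismatched one, so the weighted average falls from 1.0 (unweighted) to 0.5.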
This package provides support for building pkgdown websites without an internet connection. Works by bundling cached dependencies and implementing drop-in replacements for key pkgdown functions. Enables package documentation websites to be built in environments where internet access is unavailable or restricted. For more details on generating pkgdown websites, see Wickham et al. (2025) <doi:10.32614/CRAN.package.pkgdown>.
Apply the spectral residual algorithm to data, such as a time series, to detect anomalies. Anomaly scores can be used to determine outliers based upon a threshold or fed into more sophisticated prediction models. Methods are based upon "Time-Series Anomaly Detection Service at Microsoft", Ren, H., Xu, B., Wang, Y., et al., (2019) <doi:10.48550/arXiv.1906.03821>.
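The spectral residual idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the package's implementation: it uses a naive O(n^2) discrete Fourier transform from the standard library, trailing moving averages, and hypothetical function names and window sizes. The log-amplitude spectrum is smoothed, the smoothed version is subtracted to leave the residual, and the residual is transformed back with the original phase to give a saliency map, which is then compared with its local average.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform (adequate for short series)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def moving_average(values, window):
    """Trailing mean over up to `window` most recent values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def saliency_map(series, window=3):
    """Inverse transform of exp(spectral residual) with the original phase."""
    spectrum = dft(series)
    log_amp = [math.log(abs(c) + 1e-8) for c in spectrum]
    residual = [l - a for l, a in zip(log_amp, moving_average(log_amp, window))]
    rebuilt = [cmath.rect(math.exp(r), cmath.phase(c))
               for r, c in zip(residual, spectrum)]
    return [abs(c) for c in dft(rebuilt, inverse=True)]


def anomaly_scores(series, window=3, score_window=21):
    """Relative deviation of each saliency value from its local average."""
    sal = saliency_map(series, window)
    local = moving_average(sal, score_window)
    return [(s - m) / m for s, m in zip(sal, local)]


if __name__ == "__main__":
    series = [1.0] * 64
    series[40] = 10.0  # injected spike
    scores = anomaly_scores(series)
    print("most anomalous index:", max(range(len(scores)), key=scores.__getitem__))
```

Points whose score exceeds a chosen threshold can be flagged as outliers, or the scores can be passed on as features to a downstream model.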
Extends beachmat to initialize tatami matrices from TileDB-backed arrays. This allows C++ code in downstream packages to directly call the TileDB C/C++ library to access array data, without the need for block processing via DelayedArray. Developers only need to import this package to automatically extend the capabilities of beachmat::initializeCpp to TileDBArray instances.
This package provides tools for the statistical modelling of spatial extremes using max-stable processes, copula or Bayesian hierarchical models. More precisely, this package allows (conditional) simulations from various parametric max-stable models, analysis of the extremal spatial dependence, the fitting of such processes using composite likelihoods or least squares (simple max-stable processes only), model checking and selection, and prediction.
This package provides functions to estimate parameters in linear spatial models with censored/missing responses via the Expectation-Maximization (EM), the Stochastic Approximation EM (SAEM), or the Monte Carlo EM (MCEM) algorithm. These algorithms are widely used to compute the maximum likelihood (ML) estimates in problems with incomplete data. The EM algorithm computes the ML estimates when a closed expression for the conditional expectation of the complete-data log-likelihood function is available. In the MCEM algorithm, the conditional expectation is substituted by a Monte Carlo approximation based on many independent simulations of the missing data. In contrast, the SAEM algorithm splits the E-step into simulation and integration steps. This package also approximates the standard error of the estimates using the Louis method. Moreover, it has a function that performs spatial prediction in new locations.
Package takes frequencies of mutations as reported by high throughput sequencing data from cancer and fits a theoretical neutral model of tumour evolution. Package outputs summary statistics and contains code for plotting the data and model fits. See Williams et al 2016 <doi:10.1038/ng.3489> and Williams et al 2017 <doi:10.1101/096305> for further details of the method.
An implementation of reliability estimation methods described in the paper (Bosnic, Z., & Kononenko, I. (2008) <doi:10.1007/s10489-007-0084-9>), which allows you to test the reliability of a single predicted instance made by your model and prediction function. It also allows you to make a correlation test to estimate which reliability estimate is the most accurate for your model.
Analysis and measurement of promotion effectiveness on a given target variable (e.g. daily sales). After converting promotion schedule into dummy or smoothed predictor variables, the package estimates the effects of these variables controlled for trend/periodicity/structural change using prophet by Taylor and Letham (2017) <doi:10.7287/peerj.preprints.3190v2> and some prespecified variables (e.g. start of a month).
This package provides a set of Study Data Tabulation Model (SDTM) datasets from the Clinical Data Interchange Standards Consortium (CDISC) pilot project used for testing and developing Analysis Data Model (ADaM) datasets inside the pharmaverse family of packages. SDTM dataset specifications are described in the CDISC SDTM implementation guide, accessible by creating a free account on <https://www.cdisc.org/>.