Used to determine which cell types are enriched within gene lists. The package provides tools for testing enrichments within simple gene lists (such as human disease associated genes) and those resulting from differential expression studies. The package does not depend upon any particular Single Cell Transcriptome dataset and user defined datasets can be loaded in and used in the analyses.
This package provides the output of running Salmon on a set of 12 RNA-seq samples from King & Klose, "The pioneer factor OCT4 requires the chromatin remodeller BRG1 to support gene regulatory element function in mouse embryonic stem cells", published in eLIFE, March 2017. For details on version numbers and how the samples were processed see the package vignette.
Subtyping via Consensus Factor Analysis (SCFA) can efficiently remove noisy signals from consistent molecular patterns in multi-omics data. SCFA first uses an autoencoder to select only important features and then repeatedly performs factor analysis to represent the data with different numbers of factors. Using these representations, it can reliably identify cancer subtypes and accurately predict risk scores of patients.
It performs Canonical Correlation Analysis and provides inferential guaranties on the correlation components. The p-values are computed following the resampling method developed in Winkler, A. M., Renaud, O., Smith, S. M., & Nichols, T. E. (2020). Permutation inference for canonical correlation analysis. NeuroImage, <doi:10.1016/j.neuroimage.2020.117065>. Furthermore, it provides plotting tools to visualize the results.
This package provides functions for data preparation, parameter estimation, scoring, and plotting for the BG/BB (Fader, Hardie, and Shang 2010 <doi:10.1287/mksc.1100.0580>), BG/NBD (Fader, Hardie, and Lee 2005 <doi:10.1287/mksc.1040.0098>) and Pareto/NBD and Gamma/Gamma (Fader, Hardie, and Lee 2005 <doi:10.1509/jmkr.2005.42.4.415>) models.
Run computer experiments using the adaptive composite grid algorithm with a Gaussian process model. The algorithm works best when running an experiment that can evaluate thousands of points from a deterministic computer simulation. This package is an implementation of a forthcoming paper by Plumlee, Erickson, Ankenman, et al. For a preprint of the paper, contact the maintainer of this package.
Works with the Citizen Voting Age Population special tabulation from the US Census Bureau <https://www.census.gov/programs-surveys/decennial-census/about/voting-rights/cvap.html>. Provides tools to download and process raw data. Also provides a downloading interface to processed data. Implements a very basic approach to estimate block level citizen voting age population from block group data.
This package implements the exponential Factor Copula Model (eFCM) of Castro-Camilo, D. and Huser, R. (2020) for spatial extremes, with tools for dependence estimation, tail inference, and visualization. The package supports likelihood-based inference, Gaussian process modeling via Matérn covariance functions, and bootstrap uncertainty quantification. See Castro-Camilo and Huser (2020) <doi:10.1080/01621459.2019.1647842>.
When fitting a set of linear regressions which have some same variables, we can separate the matrix and reduce the computation cost. This package aims to fit a set of repeated linear regressions faster. More details can be found in this blog Lijun Wang (2017) <https://stats.hohoweiya.xyz/regression/2017/09/26/An-R-Package-Fit-Repeated-Linear-Regressions/>.
Computes and tests individual (species, phylogenetic and functional) diversity-area relationships, i.e., how species-, phylogenetic- and functional-diversity varies with spatial scale around the individuals of some species in a community. See applications of these methods in Wiegand et al. (2007) <doi:10.1073/pnas.0705621104> or Chacon-Labella et al. (2016) <doi:10.1007/s00442-016-3547-z>.
An implementation of generalized linear models (GLMs) for studying relationships among attributes in connected populations, where responses of connected units can be dependent, as introduced by Fritz et al. (2025) <doi:10.1080/01621459.2025.2565851>. igml extends GLMs for independent responses to dependent responses and can be used for studying spillover in connected populations and other network-mediated phenomena.
This package implements a Gibbs sampler to do linear regression with multiple covariates, multiple responses, Gaussian measurement errors on covariates and responses, Gaussian intrinsic scatter, and a covariate prior distribution which is given by either a Gaussian mixture of specified size or a Dirichlet process with a Gaussian base distribution. Described further in Mantz (2016) <DOI:10.1093/mnras/stv3008>.
This package provides functions and wrappers for using the Multiple Aggregation Prediction Algorithm (MAPA) for time series forecasting. MAPA models and forecasts time series at multiple temporal aggregation levels, thus strengthening and attenuating the various time series components for better holistic estimation of its structure. For details see Kourentzes et al. (2014) <doi:10.1016/j.ijforecast.2013.09.006>.
Nonparametric test of independence between a pair of spatial objects (random fields, point processes) based on random shifts with torus or variance correction. See MrkviÄ ka et al. (2021) <doi:10.1016/j.spasta.2020.100430>, DvoŠák et al. (2022) <doi:10.1111/insr.12503>, DvoŠák and MrkviÄ ka (2024) <doi:10.1080/10618600.2024.2357626>.
Analyse common types of plant phenotyping data, provide a simplified interface to longitudinal growth modeling and select Bayesian statistics, and streamline use of PlantCV output. Several Bayesian methods and reporting guidelines for Bayesian methods are described in Kruschke (2018) <doi:10.1177/2515245918771304>, Kruschke (2013) <doi:10.1037/a0029146>, and Kruschke (2021) <doi:10.1038/s41562-021-01177-7>.
This package implements a three-dimensional stochastic model of cancer growth and mutation similar to the one described in Waclaw et al. (2015) <doi:10.1038/nature14971>. Allows for interactive 3D visualizations of the simulated tumor. Provides a comprehensive summary of the spatial distribution of mutants within the tumor. Contains functions which create synthetic sequencing datasets from the generated tumor.
Seven different methods for multiple testing problems. The SGoF-type methods (see for example, Carvajal Rodrà guez et al., 2009 <doi:10.1186/1471-2105-10-209>; de Uña à lvarez, 2012 <doi:10.1515/1544-6115.1812>; Castro Conde et al., 2015 <doi:10.1177/0962280215597580>) and the BH and BY false discovery rate controlling procedures.
This package provides functions for implementing the targeted gold standard (GS) testing. You provide the true disease or treatment failure status and the risk score, tell TGST the availability of GS tests and which method to use, and it returns the optimal tripartite rules. Please refer to Liu et al. (2013) <doi:10.1080/01621459.2013.810149> for more details.
The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but tcpl can be used to support diverse chemical screening efforts.
Population genetic data such as Single Nucleotide Polymorphisms (SNPs) is often used to identify genomic regions that have been under recent natural or artificial selection and might provide clues about the molecular mechanisms of adaptation. One approach, the concept of an Extended Haplotype Homozygosity (EHH), introduced by (Sabeti 2002) <doi:10.1038/nature01140>, has given rise to several statistics designed for whole genome scans. The package provides functions to compute three of these, namely: iHS (Voight 2006) <doi:10.1371/journal.pbio.0040072> for detecting positive or Darwinian selection within a single population as well as Rsb (Tang 2007) <doi:10.1371/journal.pbio.0050171> and XP-EHH (Sabeti 2007) <doi:10.1038/nature06250>, targeted at differential selection between two populations. Various plotting functions are included to facilitate visualization and interpretation of these statistics.
The Recode library converts files between character sets and usages. It recognises or produces over 200 different character sets (or about 300 if combined with an iconv library) and transliterates files between almost any pair. When exact transliteration are not possible, it gets rid of offending characters or falls back on approximations. The recode program is a handy front-end to the library.
Defining the identity of a cell is fundamental to understand the heterogeneity of cells to various environmental signals and perturbations. We present Cepo, a new method to explore cell identities from single-cell RNA-sequencing data using differential stability as a new metric to define cell identity genes. Cepo computes cell-type specific gene statistics pertaining to differential stable gene expression.
The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define news classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).
Target capture experiments combine hybridization-based (in solution or on microarrays) capture and enrichment of genomic regions of interest (e.g. the exome) with high throughput sequencing of the captured DNA fragments. This package provides functionalities for assessing and visualizing the quality of the target enrichment process, like specificity and sensitivity of the capture, per-target read coverage and so on.