GEMINI uses log-fold changes to model sample-dependent and sample-independent effects, and infers these effects with a variational Bayes approach. The inferred effects are used to score and identify genetic interactions, such as lethality and recovery. More details can be found in Zamanighomi et al. 2019 (in press).
The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.
This package provides a preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.
This package includes positive ionization mode data in NetCDF file format. The data are a centroided subset from 200-600 m/z and 2500-4500 seconds. Data originally reported in "Assignment of Endogenous Substrates to Enzymes by Global Metabolite Profiling" Biochemistry; 2004; 43(45). It also includes detected peaks in an xcmsSet.
This package provides tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity and sex, and advanced quality control routines.
This package provides classes and methods for spatial objects that have a registered time column, in particular for irregular spatiotemporal data. The time column can be of any type, but needs to be ordinal. Regularly laid out spatiotemporal data (vector or raster data cubes) are handled by package stars.
This package provides primitives for visualizing distributions using ggplot2 that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized.
Pure Rust implementation of the EAX Authenticated Encryption with Associated Data (AEAD) Cipher with optional architecture-specific hardware acceleration. This scheme is based only on a block cipher. It uses counter mode (CTR) for encryption and CBC mode for generating an OMAC/CMAC/CBCMAC (all names for the same thing).
GNU Recutils is a set of tools and libraries for creating and manipulating text-based, human-editable databases. Despite being text-based, databases created with Recutils carry all of the expected features such as unique fields, primary keys, time stamps and more. Many different field types are supported, as is encryption.
This package provides a set of functions for receiver operating characteristic (ROC) curve estimation and area under the curve (AUC) calculation. All functions are designed to work with aggregated data; nevertheless, they can also handle raw samples. In ROCket, we distinguish two types of ROC curve representations: 1) parametric curves - the true positive rate (TPR) and the false positive rate (FPR) are functions of a parameter (the score), 2) functions - TPR is a function of FPR. There are several ROC curve estimation methods available. An introduction to the mathematical background of the implemented methods (and much more) can be found in de Zea Bermudez, Gonçalves, Oliveira & Subtil (2014) and Cai & Pepe (2004).
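The parametric representation described above (TPR and FPR as functions of the score threshold) can be sketched for aggregated data as follows. This is a minimal illustration in Python, not the package's R interface: the function names `roc_from_aggregated` and `auc_trapezoid` and the `(score, n_positive, n_negative)` input layout are assumptions made for this example, and the trapezoidal rule stands in for whichever estimation methods the package implements.

```python
from collections import defaultdict


def roc_from_aggregated(rows):
    """Build an empirical ROC curve from aggregated counts.

    rows: iterable of (score, n_positive, n_negative) tuples
    (hypothetical layout). Returns (fpr, tpr) lists starting at (0, 0).
    """
    # Merge rows that share a score so each threshold appears once.
    pos_by_score = defaultdict(int)
    neg_by_score = defaultdict(int)
    for score, n_pos, n_neg in rows:
        pos_by_score[score] += n_pos
        neg_by_score[score] += n_neg

    total_pos = sum(pos_by_score.values())
    total_neg = sum(neg_by_score.values())
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative")

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    # Lower the threshold step by step, from the highest score down.
    for score in sorted(pos_by_score, reverse=True):
        tp += pos_by_score[score]
        fp += neg_by_score[score]
        tpr.append(tp / total_pos)
        fpr.append(fp / total_neg)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear curve through the ROC points."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


if __name__ == "__main__":
    data = [(0.9, 2, 0), (0.5, 1, 1), (0.1, 0, 2)]
    fpr, tpr = roc_from_aggregated(data)
    print(list(zip(fpr, tpr)))
    print(auc_trapezoid(fpr, tpr))
```

Joining the points linearly treats tied scores as half-correct, so the trapezoidal area agrees with the Mann-Whitney statistic computed on the underlying raw samples.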
This package provides ANOCVA (ANalysis Of Cluster VAriability), a non-parametric statistical test to compare clustering structures with applications in functional magnetic resonance imaging (fMRI) data. The ANOCVA allows us to compare the clustering structure of multiple groups simultaneously and also to identify features that contribute to the differential clustering.
This package provides advanced Bayesian methods to estimate abundance and run-timing from temporally-stratified Petersen mark-recapture experiments. Methods include hierarchical modelling of the capture probabilities and spline smoothing of the daily run size. Theory described in Bonner and Schwarz (2011) <doi:10.1111/j.1541-0420.2011.01599.x>.
Test the robustness of a user's Qualitative Comparative Analysis solutions to randomness, using the bootstrapped assessment: baQCA(). This package also includes a function that provides recommendations for improving solutions to reach typical significance levels: brQCA(). Data included come from McVeigh et al. (2014) <doi:10.1177/0003122414534065>.
The Certifiably Optimal RulE ListS (Corels) learner by Angelino et al., described in <doi:10.48550/arXiv.1704.01701>, provides interpretable decision rules with an optimality guarantee, and is made available to R with this package. See the file AUTHORS for a list of copyright holders and contributors.
This package provides datasets containing preformatted maps of Norway at the county, municipality, and ward (Oslo only) level for redistricting in 2024, 2020, 2018, and 2017. Multiple layouts are provided (normal, split, and with an insert for Oslo), allowing the user to rapidly create choropleth maps of Norway without any geolibraries.
Makes deck.gl <https://deck.gl/>, a WebGL-powered open-source JavaScript framework for visual exploratory data analysis of large datasets, available within R via the htmlwidgets package. Furthermore, it supports basemaps from mapbox <https://www.mapbox.com/> via mapbox-gl-js <https://github.com/mapbox/mapbox-gl-js>.
Package EDISON (Estimation of Directed Interactions from Sequences Of Non-homogeneous gene expression) runs an MCMC simulation to reconstruct networks from time series data, using a non-homogeneous, time-varying dynamic Bayesian network. Network segments and changepoints are inferred concurrently, and information sharing priors provide a reduction of the inference uncertainty.
High-dimensional data integration is a critical but difficult problem in genomics research because of potential biases from high-throughput experiments. We present MANCIE, a computational method for integrating two genomic data sets with homogeneous dimensions from different sources based on a PCA procedure as an approximation to a Bayesian approach.
Equivalence tests and related confidence intervals for the comparison of two treatments, simultaneously for one or many normally distributed, primary response variables (endpoints). The step-up procedure of Quan et al. (2001) is both applied for differences and extended to ratios of means. A related single-step procedure is also available.
Fits Bayesian regression models based on latent Meshed Gaussian Processes (MGP) as described in Peruzzi, Banerjee, Finley (2020) <doi:10.1080/01621459.2020.1833889>, Peruzzi, Banerjee, Dunson, and Finley (2021) <arXiv:2101.03579>, Peruzzi and Dunson (2022) <arXiv:2201.10080>. Funded by ERC grant 856506 and NIH grant R01ES028804.
This package provides tools for computing Monte Carlo standard errors (MCSE) in Markov chain Monte Carlo (MCMC) settings. MCSE computation for expectation and quantile estimators is supported as well as multivariate estimations. The package also provides functions for computing effective sample size and for plotting Monte Carlo estimates versus sample size.
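As an illustration of what a Monte Carlo standard error for an expectation estimator measures, the following Python sketch uses the non-overlapping batch means estimator, a standard technique for correlated MCMC output. It is not the package's R interface: the function name `mcse_batch_means`, its arguments, and the default batch size of floor(sqrt(n)) are assumptions chosen for this example.

```python
import math


def mcse_batch_means(chain, batch_size=None):
    """Estimate the mean of a chain and its Monte Carlo standard error.

    Uses non-overlapping batch means. `chain` is a sequence of floats;
    `batch_size` defaults to floor(sqrt(n)) (an assumption for this sketch).
    Returns (mean, mcse).
    """
    n = len(chain)
    if batch_size is None:
        batch_size = int(math.floor(math.sqrt(n)))
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    n_batches = n // batch_size
    if n_batches < 2:
        raise ValueError("need at least two full batches")

    # Drop leading draws that do not fill a complete batch.
    used = list(chain[n - n_batches * batch_size:])
    mean = sum(used) / len(used)

    batch_means = [
        sum(used[i * batch_size:(i + 1) * batch_size]) / batch_size
        for i in range(n_batches)
    ]
    # batch_size * sample variance of the batch means estimates the
    # asymptotic variance in the Markov chain central limit theorem.
    var_hat = batch_size * sum((m - mean) ** 2 for m in batch_means) / (n_batches - 1)
    return mean, math.sqrt(var_hat / len(used))


if __name__ == "__main__":
    # A strongly autocorrelated toy chain: 50 zeros followed by 50 ones.
    toy_chain = [0.0] * 50 + [1.0] * 50
    print(mcse_batch_means(toy_chain, batch_size=10))
```

For the toy chain, the naive standard error that assumes independent draws is about 0.05, while the batch means estimate is about 0.17, which shows how autocorrelation inflates the uncertainty of the estimated mean.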
mergen employs artificial intelligence to convert data analysis questions into executable code, explanations, and algorithms. The self-correction feature is intended to improve the performance and accuracy of the generated code. mergen features a user-friendly chat interface, enabling users to interact with the AI agent and extract insights from their data.
This package contains a dataset of morphological and structural features of Medicinal LEAves (MedLEA). The features of each species are recorded by manually viewing the medicinal plant repository available at <http://www.instituteofayurveda.org/plants/>. You can also download a repository of leaf images of 1099 medicinal plants in Sri Lanka.
Fit flexible (excess) hazard regression models with the possibility of including non-proportional effects of covariables and of adding a random effect at the cluster level (corresponding to a shared frailty). A detailed description of the package functionalities is provided in Charvat and Belot (2021) <doi:10.18637/jss.v098.i14>.