Reprepro is a tool to manage a repository of Debian packages (.deb, .udeb, .dsc, ...). It stores files, either injected manually or downloaded from another (partially) mirrored repository, in a single pool/ hierarchy. Managed packages and files are stored in a Berkeley DB, so no database server is needed. Checking signatures of mirrored repositories and creating signatures of the generated Package indices is supported.
The xvfb-run wrapper simplifies running commands and scripts within a virtual X server environment. It sets up an X authority file or uses an existing user-specified one, writes a cookie to it, and then starts the Xvfb X server as a background process. It also takes care of killing the server and cleaning up before returning the exit status of the command.
Signal-to-Noise applied to Gene Expression Experiments. Signal-to-noise ratios (SNRs) can be used as a proxy for the quality of gene expression studies and samples. The SNRs can be calculated on any gene expression data set as long as gene IDs are available; no access to the raw data files is necessary. This makes it possible to flag problematic studies and samples in any public data set.
scider is a user-friendly R package providing functions to model the global density of cells in a slide of spatial transcriptomics data. All functions in the package are built on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. After modelling density, the package allows for several downstream analyses, including colocalization analysis, boundary detection analysis and differential density analysis.
Various mRNA sequencing library preparation methods generate sequencing reads specifically from the transcript ends. Analyses that focus on quantification of isoform usage from such data can be aided by using truncated versions of transcriptome annotations, both at the alignment or pseudo-alignment stage and in downstream analysis. This package implements some convenience methods for readily generating such truncated annotations and their corresponding sequences.
AbSeq is a comprehensive bioinformatic pipeline for the analysis of sequencing datasets generated from antibody libraries, and abseqR is one of its packages. abseqR provides the users of abseqPy with plotting and reporting capabilities and allows them to generate interactive HTML reports for the convenience of viewing and sharing with other researchers. Additionally, abseqR extends abseqPy to compare multiple repertoire analyses and perform further downstream analysis on its output.
This package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allow the user to make the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results.
This package includes functions and reference data to generate and manipulate log-ratios (also known as log size index (LSI) values) from measurements obtained on zooarchaeological material. Log ratios are used to compare the relative (rather than the absolute) dimensions of animals from archaeological contexts. The zoolog package is also able to seamlessly integrate data and references with heterogeneous nomenclature, which is internally managed by a zoolog thesaurus.
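As an illustration of the log-ratio idea described above, here is a minimal Python sketch. It assumes the common definition of a log size index as the base-10 logarithm of a measurement divided by the matching reference (standard) value; the function name `log_ratio` and the sample numbers are hypothetical and are not part of the zoolog package.

```python
import math


def log_ratio(measurement, reference):
    """Return log10(measurement / reference).

    Assumption: base-10 logarithms and strictly positive inputs.
    A value of 0 means the specimen matches the reference size;
    positive values are larger, negative values are smaller.
    """
    if measurement <= 0 or reference <= 0:
        raise ValueError("measurement and reference must be positive")
    return math.log10(measurement / reference)


# Hypothetical bone measurement (mm) compared with a hypothetical standard.
value = log_ratio(27.5, 25.0)
print(f"log-ratio: {value:.4f}")
```

Because the result is relative to the reference, values from different skeletal elements can be compared on one scale, which is the point of using relative rather than absolute dimensions.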
This package creates square pie charts, also known as waffle charts. These can be used to communicate parts of a whole for categorical quantities. To emulate the percentage view of a pie chart, a 10x10 grid should be used. In this way each square represents 1% of the total. Waffle provides tools to create charts as well as stitch them together. Isotype pictograms can be made by using glyphs.
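The 10x10 grid described above can be sketched in Python as follows. This is only an illustration of how category counts might be mapped onto 100 squares; the function name `waffle_squares` and the largest-remainder rounding are assumptions made for this example, and the waffle package itself may allocate squares differently.

```python
def waffle_squares(counts, total_squares=100):
    """Allocate grid squares to categories in proportion to their counts.

    Assumption: non-negative counts with a positive sum. Largest-remainder
    rounding guarantees that exactly `total_squares` squares are assigned.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    exact = {k: v * total_squares / total for k, v in counts.items()}
    squares = {k: int(x) for k, x in exact.items()}  # round down first
    leftover = total_squares - sum(squares.values())
    # Hand the remaining squares to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - squares[k], reverse=True)
    for k in by_remainder[:leftover]:
        squares[k] += 1
    return squares


# Hypothetical category counts; with 100 squares each square is 1% of the total.
print(waffle_squares({"A": 50, "B": 30, "C": 20}))
```

With `total_squares=100`, the number of squares per category can be read directly as a percentage.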
This package provides Ion Trap positive ionization mode data in mzML file format. It includes a subset from 500-850 m/z and 1190-1310 seconds, including MS2 and MS3, intensity threshold 100.000; extracts from FTICR Apex III, m/z 400-450; a subset of UPLC - Bruker micrOTOFq data, both mzML and mz5; LC-MSMS and MRM files from proteomics experiments; and PSI mzIdentML example files for various search engines.
Network Common Data Form (netCDF) files are widely used for scientific data. Library-level access in R is provided through packages RNetCDF and ncdf4. The package ncdfCF is built on top of RNetCDF and makes the data and its attributes available as a set of R6 classes that are informed by the Climate and Forecasting Metadata Conventions. Access to the data uses standard R subsetting operators and common function forms.
This package contains the Summix2 method for estimating and adjusting for substructure in genetic summary allele frequency data. The function summix() estimates reference group proportions using a mixture model. The adjAF() function produces adjusted allele frequencies for an observed group with reference group proportions matching a target individual or sample. The summix_local() function estimates local ancestry mixture proportions and performs selection scans in genetic summary data.
Uniparental disomy (UPD) is a genetic condition where an individual inherits both copies of a chromosome or part of it from one parent, rather than one copy from each parent. This package contains an HMM for detecting UPDs through HTS (High Throughput Sequencing) data from trio assays. By analyzing the genotypes in the trio, the model infers a hidden state (normal, father isodisomy, mother isodisomy, father heterodisomy or mother heterodisomy).
This is an alternative mechanism for importing objects from packages. The syntax allows for importing multiple objects from a package with a single command in an expressive way. The import package bridges some of the gap between using library (or require) and direct (single-object) imports. Furthermore, the imported objects are not placed in the current environment. It is also possible to import objects from stand-alone .R files.
This package implements core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with DE matrices and count matrices, a collection of functions for manipulating and plotting data via ggplot2, and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP, collapsing vertices of each cluster in the graph, and propagating graph labels.
Circle Manhattan Plot is an R package that can lay out genome-wide association study P-value results in traditional rectangular patterns, QQ-plots and novel circular ones. United in a single bull's eye style plot, association results from multiple traits can be compared interactively, revealing both similarities and differences between signals. Additional functions include: highlighting signals or a group of SNPs, chromosome visualization and candidate genes around SNPs.
Infer biological pathway activity of cells from single-cell RNA-sequencing data by calculating a pathway score for each cell (pathway genes are specified by the user). It is recommended to have the data in Transcripts-Per-Million (TPM) or Counts-Per-Million (CPM) units for best results. Scores may change when adding cells to or removing cells from the data. SiPSiC stands for Single Pathway analysis in Single Cells.
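For readers unfamiliar with the CPM units mentioned above, the following Python sketch shows the usual scaling: each gene count in a cell is divided by that cell's total count and multiplied by one million. The function name `counts_per_million` and the example counts are hypothetical; this is not SiPSiC code and does not reproduce its pathway score.

```python
def counts_per_million(counts):
    """Scale one cell's raw gene counts so that they sum to 1,000,000.

    Assumption: `counts` holds the non-negative raw counts of a single cell.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("the cell has no counts")
    return [c * 1_000_000 / total for c in counts]


# Hypothetical raw counts for three genes in one cell.
print(counts_per_million([1, 1, 2]))
```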
This package provides the timing functions tic and toc that can be nested. One can record all timings while a complex script is running, and examine the values later. It is also possible to instrument the timing call with custom callbacks. In addition, this package provides class 'Stack', implemented as a vector, and class 'List', implemented as a list, both of which support operations 'push', 'pop', 'first', 'last' and 'clear'.
This package provides tools for the analysis of complex survey samples. The provided features include: summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples; variances by Taylor series linearisation or replicate weights; post-stratification, calibration, and raking; two-phase subsampling designs; graphics; PPS sampling without replacement; principal components, and factor analysis.
BSeq-sc is a bioinformatics analysis pipeline that leverages single-cell sequencing data to estimate cell type proportion and cell type-specific gene expression differences from RNA-seq data from bulk tissue samples. This is a companion package to the publication "A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure." Baron et al. Cell Systems (2016) https://www.ncbi.nlm.nih.gov/pubmed/27667365.
This package provides an implementation of sparse linear discriminant analysis, which is a supervised classification method for multiple classes. Various novel optimization approaches to this problem are implemented including alternating direction method of multipliers (ADMM), proximal gradient (PG) and accelerated proximal gradient (APG). Functions for performing cross validation are also supplied along with basic prediction and plotting functions. Sparse zero variance discriminant (SZVD) analysis is also included in the package.
This package provides e-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation and one-sample, two-sample, and multi-sample tests for comparing multivariate distributions. It also implements measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods.
This is a package containing diverse spatial datasets for demonstrating, benchmarking and teaching spatial data analysis. It includes R data of class sf, Spatial, and nb. It also contains data stored in a range of file formats including GeoJSON, ESRI Shapefile and GeoPackage. Some of the datasets are designed to illustrate specific analysis techniques. cycle_hire() and cycle_hire_osm(), for example, are designed to illustrate point pattern analysis techniques.