This package provides statistical tests for label-free LC-MS/MS data by spectral counts, to discover differentially expressed proteins between two biological conditions. Three tests are available: Poisson GLM regression, quasi-likelihood GLM regression, and the negative binomial test of the edgeR package. All three models admit blocking factors to control for nuisance variables. To ensure a good level of reproducibility, a post-test filter is available, in which the user may set the minimum effect size considered biologically relevant and the minimum expression of the most abundant condition.
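The post-test filter described above can be illustrated with a minimal Python sketch. It is not the package's R interface: the function name, the default thresholds, and the pseudocount used to keep the log fold change finite are all assumptions made for illustration.

```python
import math

def passes_post_test_filter(adj_p, mean_a, mean_b,
                            alpha=0.05, min_abs_log2fc=1.0,
                            min_counts=2.0, pseudocount=0.5):
    """Keep a protein only if it is significant AND biologically relevant.

    All names and defaults here are hypothetical, not taken from the package.
    """
    # Effect size: log2 fold change between the two condition means.
    # The pseudocount (an assumption) avoids division by zero for absent proteins.
    log2fc = math.log2((mean_a + pseudocount) / (mean_b + pseudocount))
    # Expression of the most abundant condition, in mean spectral counts.
    most_abundant = max(mean_a, mean_b)
    return (adj_p <= alpha
            and abs(log2fc) >= min_abs_log2fc
            and most_abundant >= min_counts)

# Hypothetical proteins: (adjusted p-value, mean counts in A, mean counts in B)
candidates = {
    "strong_hit": (0.01, 10.0, 2.0),
    "low_abundance": (0.01, 1.5, 0.2),
    "small_effect": (0.01, 10.0, 8.0),
    "not_significant": (0.20, 10.0, 2.0),
}
kept = [name for name, args in candidates.items()
        if passes_post_test_filter(*args)]
print(kept)
```

Requiring all three conditions at once is what makes the filter conservative: a small p-value alone is not enough when the counts are too low or the change too small to be reproducible.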
This package provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to:
Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions.
Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws.
Provide various summaries of draws in convenient formats.
Provide lightweight implementations of state-of-the-art posterior inference diagnostics.
This package provides probability mass, distribution, quantile, random-variate generation, and method-of-moments parameter-estimation functions for the Delaporte distribution with parameterization based on Vose (2008). The Delaporte is a discrete probability distribution which can be considered the convolution of a negative binomial distribution with a Poisson distribution. Alternatively, it can be considered a counting distribution with both Poisson and negative binomial components. It has been studied in actuarial science as a frequency distribution which has more variability than the Poisson, but less than the negative binomial.
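The convolution view above can be written out directly. The following Python sketch is an illustration, not the package's implementation. It assumes a negative binomial component with shape `alpha` and scale `beta` and a Poisson component with mean `lam`; these parameter names and the function names are assumptions of the sketch.

```python
import math

def neg_binomial_pmf(i, alpha, beta):
    """P(N = i) for a negative binomial with shape alpha and scale beta."""
    log_p = (math.lgamma(alpha + i) - math.lgamma(alpha) - math.lgamma(i + 1)
             + i * math.log(beta / (1.0 + beta))
             - alpha * math.log(1.0 + beta))
    return math.exp(log_p)

def poisson_pmf(j, lam):
    """P(M = j) for a Poisson with mean lam > 0."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))

def delaporte_pmf(k, alpha, beta, lam):
    """P(N + M = k): discrete convolution of the two components."""
    return sum(neg_binomial_pmf(i, alpha, beta) * poisson_pmf(k - i, lam)
               for i in range(k + 1))

alpha, beta, lam = 2.0, 1.5, 3.0
support = range(200)  # truncation point chosen so the remaining tail is negligible
pmf = [delaporte_pmf(k, alpha, beta, lam) for k in support]
total = sum(pmf)
mean = sum(k * p for k, p in zip(support, pmf))
print(total, mean)  # mean should be close to lam + alpha * beta
```

Under this parameterization the mean of the sum is the Poisson mean plus the negative binomial mean, `lam + alpha * beta`, which gives a quick sanity check on the convolution.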
Dunn's test tests for stochastic dominance and reports the results of multiple pairwise comparisons. It is performed following a Kruskal-Wallis test (Kruskal and Wallis, 1952). It employs Dunn's z-test-statistic approximations to the rank statistics, conducting k(k-1)/2 pairwise comparisons among k groups. The null hypothesis for each comparison is that the probability of a randomly selected value from the first group being larger than a randomly selected value from the second group is one half, as in the Wilcoxon-Mann-Whitney test. If the distributions are assumed to be identical except for a difference in location, Dunn's test can be interpreted as a test for median difference. It takes tied ranks into account.
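As a rough illustration of the k(k-1)/2 comparisons and the handling of tied ranks, here is a minimal Python sketch of Dunn's z statistic. It is not the package's code: the function names are hypothetical, the p-values are two-sided (an assumption of the sketch), and no multiple-comparison adjustment is applied.

```python
import math
from collections import Counter
from itertools import combinations

def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks

def dunn_z(groups):
    """Return {(a, b): (z, two_sided_p)} for every pair of groups."""
    pooled = [v for g in groups.values() for v in g]
    labels = [name for name, g in groups.items() for _ in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    mean_rank = {name: sum(r for r, l in zip(ranks, labels) if l == name) / len(g)
                 for name, g in groups.items()}
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values()) / (12.0 * (n - 1))
    base_var = n * (n + 1) / 12.0 - tie_term
    results = {}
    for a, b in combinations(groups, 2):  # k(k-1)/2 pairs
        se = math.sqrt(base_var * (1.0 / len(groups[a]) + 1.0 / len(groups[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        results[(a, b)] = (z, math.erfc(abs(z) / math.sqrt(2.0)))
    return results

# Hypothetical data: three groups, so 3 * 2 / 2 = 3 comparisons.
data = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
for pair, (z, p) in dunn_z(data).items():
    print(pair, round(z, 4), round(p, 4))
```

The ranks are computed once over the pooled sample, as in the Kruskal-Wallis test, which is why the pairwise statistics share a common variance term.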
This package provides an R client for the Europe PubMed Central RESTful Web Service. It gives access to both metadata on life science literature and open access full texts. Europe PMC indexes all PubMed content and other literature sources, including Agricola, a bibliographic database of citations to the agricultural literature, and Biological Patents. In addition to bibliographic metadata, the client allows users to fetch citations and reference lists. Links between life-science literature and other EBI databases, including ENA, PDB or ChEMBL, are also accessible.
The googleVis package provides an interface between R and the Google Charts API. Google Charts offer interactive charts which can be embedded into web pages. The functions of the googleVis package allow the user to visualise data stored in R data frames with Google Charts without uploading the data to Google. The output of a googleVis function is HTML code that contains the data and references to JavaScript functions hosted by Google. googleVis makes use of the internal R HTTP server to display the output locally.
This package provides methods to infer clonal tree configuration for a population of cells using single-cell RNA-seq data (scRNA-seq), and possibly other data modalities. Methods are also provided to assign cells to inferred clones and explore differences in gene expression between clones. These methods can flexibly integrate information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. A flexible beta-binomial error model that accounts for stochastic dropout events as well as systematic allelic imbalance is used.
This package provides pure C++ implementations for reading and writing several common data formats based on Google protocol-buffers. It currently supports rexp.proto for serialized R objects, geobuf.proto for binary geojson, and mvt.proto for vector tiles. This package uses the C++ code auto-generated by protobuf-compiler, hence the entire serialization is optimized at compile time. The RProtoBuf package, on the other hand, uses the protobuf runtime library to provide a general-purpose toolkit for reading and writing arbitrary protocol-buffer data in R.
scDDboost is an R package to analyze changes in the distribution of single-cell expression data between two experimental conditions. Compared to other methods that assess differential expression, scDDboost benefits uniquely from information conveyed by the clustering of cells into cellular subtypes. Through a novel empirical Bayesian formulation it calculates gene-specific posterior probabilities that the marginal expression distribution is the same (or different) between the two conditions. The implementation in scDDboost treats gene-level expression data within each condition as a mixture of negative binomial distributions.
Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. These signals serve as the observed states in a multivariate Markov model that predicts the underlying chromatin states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de novo. The goal of this package is to call ChromHMM from within R, capture the output files in an S4 object, and interface with other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.
TEKRABber provides a user-friendly pipeline for comparing orthologs and transposable elements (TEs) between two species. It uses the orthology confidence between the two species from BioMart to normalize expression counts and detect differentially expressed orthologs/TEs. It then provides one-to-one correlation analysis for the desired orthologs and TEs. There is also an app function for a first look at the results. Users can prepare ortholog/TE RNA-seq expression data with their preferred tools and run TEKRABber by following the data structure described in the vignettes.
This package provides functions for importing external vector images and drawing them as part of R plots. This package is different from the grImport package because, where that package imports PostScript format images, this package imports SVG format images. Furthermore, this package imports a specific subset of SVG, so external images must be preprocessed using a package like rsvg to produce SVG that this package can import. SVG features that are not supported by R graphics, such as gradient fills, can be imported and then exported via the gridSVG package.
This package provides a Shiny app for visual exploration of omic datasets as compositions, and differential abundance analysis using ALDEx2. It is useful for exploring RNA-seq, meta-RNA-seq, and 16S rRNA gene sequencing data with visualizations such as principal component analysis biplots (coloured using metadata for visualizing each variable), dendrograms and stacked bar plots, and effect plots (ALDEx2). Input is a table of counts and a metadata file (if metadata exists), with options to filter data by count or by metadata to remove low counts, or to visualize select samples according to selected metadata.
SpotClean is a computational method to adjust for spot swapping in spatial transcriptomics data. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind mRNA. Ideally, unique molecular identifiers at a spot measure spot-specific expression, but this is often not the case due to bleed from nearby spots, an artifact we refer to as spot swapping. SpotClean is able to estimate the contamination rate in observed data and decontaminate the spot swapping effect, thus increasing the sensitivity and precision of downstream analyses.
Data exploration and modelling is a process in which many data artifacts are produced, such as subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. Archivist helps to store and manage artifacts created in R. It allows you to store selected artifacts as binary files together with their metadata and relations. Archivist allows sharing artifacts with others. It can look up previously created artifacts by their class, name, date of creation or other properties. It also makes it easy to restore such artifacts.
This package provides a client for the OmniPath web service and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation nichenetr.
This package enables you to create interactive cluster heatmaps that can be saved as a stand-alone HTML file, embedded in R Markdown documents or in a Shiny app, and made available in the RStudio viewer pane. Hover the mouse pointer over a cell to show details or drag a rectangle to zoom. A heatmap is a popular graphical method for visualizing high-dimensional data, in which a table of numbers is encoded as a grid of colored cells. The rows and columns of the matrix are ordered to highlight patterns and are often accompanied by dendrograms.
This package delivers algebraic procedures for the analysis of multiple social networks. Among other things, multiplex makes it possible to create and manipulate multiplex, multimode, and multilevel network data in different formats. Effective ways are available to treat multiple networks with routines that combine algebraic systems like the partially ordered semigroup with decomposition procedures or semiring structures with the relational bundles occurring in different types of multivariate networks. multiplex also provides an algebraic approach for affiliation networks through Galois derivations between families of the pairs of subsets in the two domains of the network with visualization options.
The package ptairData contains two raw datasets from Proton-Transfer-Reaction Time-of-Flight mass spectrometer acquisitions (PTR-TOF-MS), in the HDF5 format: one from the exhaled air of two healthy volunteers with three replicates, and one from the cell culture headspace of two mycobacteria species and one control (culture medium only) with two replicates. These datasets are used in the examples and in the vignette of the ptairMS package (PTR-TOF-MS data pre-processing). They are also used to generate the ptrSet objects in the ptairMS data: exhaledPtrset and mycobacteriaSet.
The anota2seq package provides analysis of translational efficiency and differential expression analysis for polysome-profiling and ribosome-profiling studies (two or more sample classes) quantified by RNA sequencing or DNA-microarray. Polysome-profiling and ribosome-profiling typically generate data for two RNA sources, translated mRNA and total mRNA. Analysis of differential expression is used to estimate changes within each RNA source. Analysis of translational efficiency aims to identify changes in translation efficiency leading to altered protein levels that are independent of total mRNA levels or buffering, a mechanism regulating translational efficiency so that protein levels remain constant despite fluctuating total mRNA levels.
iheatmapr is an R package for building complex, interactive heatmaps using modular building blocks. "Complex" heatmaps are heatmaps in which subplots along the rows or columns of the main heatmap add more information about each row or column. For example, an additional one-column heatmap may indicate which group a particular row or column belongs to. Complex heatmaps may also include multiple side-by-side heatmaps which show different types of data for the same conditions. Interactivity can improve complex heatmaps by providing tooltips with information about each cell and enabling zooming into interesting features. iheatmapr uses the plotly library for interactivity.
This package is designed for the import, quality control, analysis, and visualization of methylation data generated using Sequenom's MassArray platform. The tools herein contain a highly detailed amplicon prediction for optimal assay design. Also included are quality control measures of data, such as primer dimer and bisulfite conversion efficiency estimation. Methylation data are calculated using the same algorithms contained in the EpiTyper software package. Additionally, automatic SNP-detection can be used to flag potentially confounded data from specific CG sites. Visualization includes barplots of methylation data as well as UCSC Genome Browser-compatible BED tracks. Multiple assays can be positionally combined for integrated analysis.
The extrafont package makes it easier to use fonts other than the basic PostScript fonts that R uses. Fonts that are imported into extrafont can be used with PDF or PostScript output files. There are two hurdles for using fonts in PDF (or PostScript) output files:
Making R aware of the font and the dimensions of the characters.
Embedding the fonts in the PDF file so that the PDF can be displayed properly on a device that doesn't have the font. This is usually needed if you want to print the PDF file or share it with others.
The extrafont package makes both of these things easier.
This is a data-only package providing the algorithmic complexity of short strings, computed using the coding theorem method. For a given alphabet, either all possible Turing machines or a large random sample of them, with a given number of states (e.g., 5) and as many symbols as the alphabet of the strings, were simulated until they reached a halting state or failed to halt. This package contains data on 4.5 million strings of length 1 to 12 simulated on Turing machines with 2, 4, 5, 6, and 9 symbols. The complexity of a string corresponds to the distribution of the halting states.
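The idea behind the coding theorem method can be sketched in a few lines of Python. The sketch assumes the usual formulation, in which the complexity of a string is estimated as the negative base-2 logarithm of the fraction of halting machines that output it. The function name and the toy outputs below are hypothetical and are not taken from the package's data.

```python
import math
from collections import Counter

def ctm_complexity(halting_outputs):
    """Estimate complexity as -log2(frequency) of each output string.

    halting_outputs: one output string per machine that halted.
    """
    counts = Counter(halting_outputs)
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}

# Toy data (made up): outputs of 8 hypothetical halting machines.
outputs = ["0"] * 4 + ["1"] * 2 + ["01"] + ["10"]
complexity = ctm_complexity(outputs)
for s in sorted(complexity, key=complexity.get):
    print(s, complexity[s])
```

Strings produced by many machines receive a low complexity estimate, while strings produced by few machines receive a high one.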