Org Ref is an Emacs library that provides rich support for citations, labels and cross-references in Org mode.
The basic idea of Org Ref is that it defines a convenient interface to insert citations from a reference database (e.g., from BibTeX files), and a set of functional Org links for citations, cross-references and labels that export properly to LaTeX, and that provide clickable functionality to the user. Org Ref interfaces with Helm BibTeX to facilitate citation entry, and it can also use RefTeX.
It also provides a fairly large number of utilities for finding bad citations, extracting BibTeX entries from citations in an Org file, and functions to create and modify BibTeX entries from a variety of sources, most notably from a DOI.
Org Ref is especially suitable for Org documents destined for LaTeX export and scientific publication. Org Ref is also useful for research documents and notes.
Org-Babel support for evaluating Rust code. Much of this is modeled after `ob-C'. As with `ob-C', you can specify :flags headers when compiling with the "rust run" command. Unlike `ob-C', you can also specify :args, which can be a list of arguments to pass to the binary. If you quote the value passed into the list, it will use `ob-ref' to find the reference data. If you do not include a main function or a package name, `ob-rust' will provide it for you. This is a very limited implementation: it currently supports only :results output. Requirements: - rust and cargo must be installed and available in your `exec-path'. - rust-script. - `rust-mode' is also recommended for syntax highlighting and formatting; `ob-rust' does not strictly need it, it just assumes you have it.
stJoincount facilitates the application of join count analysis to spatial transcriptomic data generated from the 10x Genomics Visium platform. This tool first converts a labeled spatial tissue map into a raster object, in which each spatial feature is represented by a pixel coded by label assignment. This process includes automatic calculation of optimal raster resolution and extent for the sample. A neighbors list is then created from the rasterized sample, in which adjacent and diagonal neighbors for each pixel are identified. After adding binary spatial weights to the neighbors list, a multi-categorical join count analysis is performed to tabulate "joins" between all possible combinations of label pairs. The function returns the observed join counts, the expected count under conditions of spatial randomness, and the variance calculated under non-free sampling. The z-score is then calculated as the difference between observed and expected counts, divided by the square root of the variance.
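The z-score step described above can be sketched in a few lines of Python. This is only an illustration of the formula stated in the description, not stJoincount's own code; the function name and the example numbers are hypothetical.

```python
import math


def join_count_z_score(observed: float, expected: float, variance: float) -> float:
    """Return (observed - expected) / sqrt(variance) for one label pair."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return (observed - expected) / math.sqrt(variance)


# Hypothetical values for a single label pair (not real data).
z = join_count_z_score(observed=120.0, expected=100.0, variance=25.0)
print(z)  # 4.0
```

A positive z-score means the pair of labels is joined more often than expected under spatial randomness; a negative one means less often.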
This package extends sparse matrix and vector classes from the Matrix package by providing:
- Methods and operators that work natively on CSR formats (compressed sparse row, a.k.a. RsparseMatrix) such as slicing/sub-setting, assignment, rbind(), mathematical operators for CSR and COO such as addition or sqrt(), and methods such as diag();
- Multi-threaded matrix multiplication and cross-product for many <sparse, dense> types, including the float32 type from float;
- Coercion methods between pairs of classes which are not present in Matrix, such as from dgCMatrix to ngRMatrix, as well as convenience conversion functions;
- Utility functions for sparse matrices such as sorting the indices or removing zero-valued entries;
- Fast transposes that work by outputting in the opposite storage format;
- Faster replacements for many Matrix methods for all sparse types, such as slicing and elementwise multiplication;
- Convenience functions for sparse objects, such as mapSparse or a shorter show method.
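The "fast transpose" idea mentioned in the list can be illustrated with a small, self-contained Python sketch. It is not the package's implementation (which is in R and compiled code); the Compressed type and both helper functions are hypothetical names used only to show why relabelling CSR storage as CSC yields the transpose without moving any data.

```python
from typing import List, NamedTuple, Tuple


class Compressed(NamedTuple):
    """Hypothetical compressed sparse matrix in CSR or CSC layout."""
    fmt: str                # "csr" or "csc"
    shape: Tuple[int, int]  # (rows, cols)
    indptr: List[int]
    indices: List[int]
    data: List[float]


def transpose(m: Compressed) -> Compressed:
    """Transpose by reusing the same arrays under the opposite format."""
    flipped = "csc" if m.fmt == "csr" else "csr"
    rows, cols = m.shape
    return Compressed(flipped, (cols, rows), m.indptr, m.indices, m.data)


def to_dense(m: Compressed) -> List[List[float]]:
    """Expand to a dense list of lists, used here only to check the result."""
    rows, cols = m.shape
    dense = [[0.0] * cols for _ in range(rows)]
    for major in range(len(m.indptr) - 1):
        for k in range(m.indptr[major], m.indptr[major + 1]):
            minor = m.indices[k]
            if m.fmt == "csr":
                dense[major][minor] = m.data[k]
            else:
                dense[minor][major] = m.data[k]
    return dense


# The 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form.
a = Compressed("csr", (2, 3), [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0])
print(to_dense(transpose(a)))  # [[1.0, 0.0], [0.0, 3.0], [2.0, 0.0]]
```

The cost is constant regardless of the number of non-zero entries, at the price of the result being in the other storage format.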
The differences in the RNA types being sequenced have an impact on the resulting sequencing profiles. mRNA-seq data is enriched with reads derived from exons, while GRO-, nucRNA- and chrRNA-seq demonstrate a substantially broader coverage of both exonic and intronic regions. The presence of intronic reads in GRO-seq type of data makes it possible to use it to computationally identify and quantify all de novo continuous regions of transcription distributed across the genome. This type of data, however, is more challenging to interpret and less commonly used than mRNA-seq. One of the challenges for primary transcript detection concerns the simultaneous transcription of closely spaced genes, which needs to be properly divided into individually transcribed units. The R package transcriptR combines RNA-seq data with ChIP-seq data of histone modifications that mark active Transcription Start Sites (TSSs), such as H3K4me3 or H3K9/14Ac, to overcome this challenge. The advantage of this approach over the use of, for example, gene annotations is that it is data driven and therefore also able to deal with novel and case-specific events.
Data quality assessment is an integral part of preparatory data analysis to ensure sound biological information retrieval. We present here the MatrixQCvis package, which provides shiny-based interactive visualization of data quality metrics at the per-sample and per-feature level. It is broadly applicable to quantitative omics data types that come in matrix-like format (features x samples). It enables the detection of low-quality samples, drifts, outliers and batch effects in data sets. Visualizations include, among others, bar and violin plots of the (count/intensity) values, mean vs standard deviation plots, MA plots, empirical cumulative distribution function (ECDF) plots, visualizations of the distances between samples, and multiple types of dimension reduction plots. Furthermore, MatrixQCvis allows for differential expression analysis based on the limma (moderated t-tests) and proDA (Wald tests) packages. MatrixQCvis builds upon the popular Bioconductor SummarizedExperiment S4 class and thus integrates easily into existing workflows. The package is especially tailored towards metabolomics and proteomics mass spectrometry data, but can also be used to assess the data quality of other data types that can be represented in a SummarizedExperiment object.
A first step in the data analysis of Mass Spectrometry (MS) based proteomics data is to identify peptides and proteins. In this respect, the huge number of experimental mass spectra typically have to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide to spectrum match (PSM). It is of course crucial for the downstream analysis to evaluate the quality of these matches. Therefore False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits and the distribution of their scores is used to estimate the distribution of the bad scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have similar properties as bad target hits so that the decoy PSM scores are a good simulation of the bad target PSM scores. Users, however, typically do not evaluate these assumptions. To this end, this package provides TargetDecoy, which generates diagnostic plots to evaluate the quality of the target decoy method.
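To make the role of the decoys concrete, here is a minimal Python sketch of one commonly used TDA estimator: the FDR at a score threshold is approximated by the number of decoy PSMs at or above the threshold divided by the number of target PSMs at or above it. This is not TargetDecoy's code (the package produces diagnostic plots); the function name and scores are hypothetical, and the sketch assumes that higher scores are better and that decoy scores behave like bad target scores, which is exactly the assumption the package helps to check.

```python
from typing import Sequence


def estimate_fdr(target_scores: Sequence[float],
                 decoy_scores: Sequence[float],
                 threshold: float) -> float:
    """Estimate the FDR among target PSMs scoring at or above `threshold`."""
    targets = sum(1 for s in target_scores if s >= threshold)
    decoys = sum(1 for s in decoy_scores if s >= threshold)
    if targets == 0:
        return 0.0  # nothing is reported, so nothing can be a false discovery
    return min(1.0, decoys / targets)


# Hypothetical search-engine scores (not real data).
targets = [9.1, 8.4, 7.7, 6.0, 3.2, 2.5]
decoys = [6.5, 3.0, 2.0, 1.1]
print(estimate_fdr(targets, decoys, threshold=6.0))  # 0.25
```

If the decoys score systematically differently from bad target hits, this ratio is biased, which is why the assumption deserves a visual check.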
omicsViewer visualizes an ExpressionSet (or SummarizedExperiment) in an interactive way. The omicsViewer has a separate back- and front-end. In the back-end, users need to prepare an ExpressionSet that contains all the necessary information for the downstream data interpretation. Some extra requirements on the headers of phenotype data or feature data are imposed so that the provided information can be clearly recognized by the front-end while keeping modifications to the existing ExpressionSet object to a minimum. The pure dependency on R/Bioconductor guarantees maximum flexibility in the statistical analysis in the back-end. Once the ExpressionSet is prepared, it can be visualized using the front-end, implemented with shiny and plotly. Both features and samples can be selected from (data) tables or graphs (scatter plot/heatmap). Different types of analyses, such as enrichment analysis (using the Bioconductor package fgsea or Fisher's exact test) and STRING network analysis, are performed on the fly and the results are visualized simultaneously. When a subset of samples and a phenotype variable is selected, a significance test on means (t-test or rank-based test; when the phenotype variable is quantitative) or a test of independence (chi-square or Fisher's exact test; when the phenotype variable is categorical) is performed to test the association between the phenotype of interest and the selected samples. Additionally, other analyses can be easily added as extra shiny modules. Therefore, omicsViewer greatly facilitates data exploration: many different hypotheses can be explored in a short time without the need for knowledge of R. In addition, the resulting data can be easily shared using a shiny server. Alternatively, a standalone version of omicsViewer together with designated omics data can be created by integrating it with portable R, which can be shared with collaborators or submitted as supplementary data together with a manuscript.
Understanding cancer mechanisms requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/Bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classify them into driver and passenger mutations using the driver mutation prediction tool CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology not only identifies genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type.
This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.
A Scheme runtime compiler.
Use Pry as your Rails console.
An efficient implementation of an LRU cache.
Fast implementation of Gruber's Markdown in C.
RnBeads annotation package for the assembly hg38.
Automatically generated RnBeads annotation package for the assembly mm10.
This package provides a collection of compression filters for use with HDF5 datasets.
Agilent annotation data (chip rgug4105a) assembled using data from public repositories.
Redcarpet is an extensible Ruby library for Markdown processing and conversion to (X)HTML.
Agilent Rat annotation data (chip rgug4130a) assembled using data from public repositories.
This package is an automatically generated RnBeads annotation package for the assembly hg19.
This gem extends ruby-rdf with several common RDF vocabularies.
Agilent "Rat Genome, Whole" annotation data (chip rgug4131a) assembled using data from public repositories.
Genome wide annotation for Rat, primarily based on mapping using Entrez Gene identifiers.
Rust programming language toolchain