This package is a collection of baseline correction algorithms. Besides these, it provides a framework and a Tcl/Tk enabled GUI for optimizing baseline algorithm parameters. Typical use is the removal of background effects from spectra originating from various types of spectroscopy and spectrometry. The parameters can also be optimized with regard to regression or classification results. Correction methods include polynomial fitting, weighted local smoothers and many more.
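As an illustration of polynomial baseline fitting, the following is a minimal Python sketch of one common variant: fit a polynomial, clip the working signal to the fit wherever the signal lies above it, and repeat. It is not the package's implementation; the names `polyfit`, `polyval` and `polynomial_baseline`, the default degree and the iteration count are all assumptions made for this example.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first)."""
    m = degree + 1
    # Augmented normal equations [A^T A | A^T y] for the Vandermonde matrix A.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        rest = sum(aug[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (aug[i][m] - rest) / aug[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate a polynomial given lowest-order-first coefficients."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def polynomial_baseline(xs, ys, degree=1, iterations=50):
    """Estimate a baseline by repeatedly fitting and clipping peaks (hypothetical helper)."""
    work = list(ys)
    fit = work
    for _ in range(iterations):
        coeffs = polyfit(xs, work, degree)
        fit = [polyval(coeffs, x) for x in xs]
        # Points above the fit are treated as peak signal and pulled down to it.
        work = [min(w, f) for w, f in zip(work, fit)]
    return fit


# Synthetic spectrum: a linear background with a single peak at x = 5.
xs = list(range(11))
ys = [2.0 + 0.5 * x for x in xs]
ys[5] += 10.0
baseline = polynomial_baseline(xs, ys)
corrected = [y - b for y, b in zip(ys, baseline)]
```

The normal-equation solver keeps the sketch dependency-free; for high polynomial degrees or widely spread x values a numerically more stable least-squares routine would be preferable.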
This package aims to make it easy to use various types of fonts (TrueType, OpenType, Type 1, web fonts, etc.) in R graphs, and supports most output formats of R graphics including PNG, PDF and SVG. Text glyphs will be converted into polygons or raster images, hence after the plot has been created, it no longer relies on the font files. No external software such as Ghostscript is needed to use this package.
For distributions whose probability density functions are log-concave, the adaptive rejection sampling algorithm can be used to build envelope functions for sampling. For others, the modified adaptive rejection sampling algorithm, the concave-convex adaptive rejection sampling algorithm, and the adaptive slice sampling algorithm can be used. This R package mainly provides four functions, rARS(), rMARS(), rCCARS(), and rASS(), which implement sampling based on the algorithms above.
SNPediaR provides some tools for downloading and parsing data from the SNPedia web site <http://www.snpedia.com>. The implemented functions allow users to import the wiki text available in SNPedia pages and to extract the most relevant information out of them. If some information in the downloaded pages is not automatically processed by the library functions, users can easily implement their own parsers to access it in an efficient way.
This package finds SNV/indel differences between two closely related bam files by pairwise comparison at each base position across the genomic region of interest. The difference is inferred by Fisher test and Euclidean distance, the input of which is the base count (A, T, G, C) at a given position and the read counts for indels that span no less than 2bp on both sides of the indel region.
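The two measures mentioned above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the two-sided 2x2 form of Fisher's exact test and the normalization of base counts to frequencies are assumptions, and the package may construct its tables and distances differently.

```python
from math import comb, sqrt


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (no more probable) than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def base_frequency_distance(counts_a, counts_b):
    """Euclidean distance between base frequencies (A, T, G, C) of two samples."""
    freq_a = [c / sum(counts_a) for c in counts_a]
    freq_b = [c / sum(counts_b) for c in counts_b]
    return sqrt(sum((x - y) ** 2 for x, y in zip(freq_a, freq_b)))


# Hypothetical position: reads supporting / not supporting an indel in each sample.
p_value = fisher_exact_2x2(3, 1, 1, 3)
# Hypothetical base counts (A, T, G, C) at one position in each sample.
distance = base_frequency_distance((10, 0, 0, 0), (0, 10, 0, 0))
```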
This package contains functions to perform Bayesian inference using posterior simulation for a number of statistical models. Most simulation is done in compiled C++ written in the Scythe Statistical Library. All models return coda mcmc objects that can then be summarized using the coda package. Some useful utility functions such as density functions, pseudo-random number generators for statistical distributions, a general purpose Metropolis sampling algorithm, and tools for visualization are provided.
This package facilitates the analysis of single-cell RNA-seq UMI matrices. It does this by computing partitions of a cell similarity graph into small homogeneous groups of cells, which are defined as metacells (MCs). The derived MCs are then used for building different representations of the data, allowing matrix or 2D graph visualization, forming a basis for analysis of cell types, subtypes, transcriptional gradients, cell-cycle variation, gene modules and their regulatory models and more.
Package to predict protein-protein interaction (PPI) networks in target organisms for which only little information about PPIs is available. Path2PPI predicts PPI networks based on sets of proteins which can belong to a certain pathway from well-established model organisms. It helps to combine and transfer information of a certain pathway or biological process from several reference organisms to one target organism. Path2PPI only depends on the sequence similarity of the involved proteins.
Large data files can be difficult to work with in R, where data generally resides in memory. This package encourages a style of programming where data is streamed from disk into R via a "producer" and through a series of "consumers" that typically reduce the original data to a manageable size. The package provides useful Producer and Consumer stream components for operations such as data input, sampling, indexing, and transformation; see package?Streamer for details.
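The producer/consumer style described here can be sketched with Python generators. This is an analogy rather than the package's API: `producer`, `parse_numbers`, `chunk_sums` and `run_stream` are hypothetical names, and an in-memory buffer stands in for a large file on disk.

```python
import io


def producer(handle, chunk_size):
    """Yield successive lists of up to chunk_size lines from a file-like object."""
    chunk = []
    for line in handle:
        chunk.append(line.rstrip("\n"))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk


def parse_numbers(chunks):
    """Consumer: transform each chunk of text lines into floats."""
    for chunk in chunks:
        yield [float(value) for value in chunk]


def chunk_sums(chunks):
    """Consumer: reduce each chunk to a single number."""
    for chunk in chunks:
        yield sum(chunk)


def run_stream():
    """Connect one producer to two consumers; only one chunk is held at a time."""
    data = io.StringIO("\n".join(str(i) for i in range(1, 11)))
    stream = chunk_sums(parse_numbers(producer(data, chunk_size=4)))
    return list(stream)


result = run_stream()
```

Because each stage pulls from the previous one lazily, memory use is bounded by the chunk size rather than by the size of the input.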
Uniquorn enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. Identification is therefore vital; in this package it is based on the locations (loci) of somatic and germline mutations or variations. The input format is vcf or vcf.gz, and each file has to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the vcf file).
This package provides a set of convenient functions for calculating sun-related information, including the sun's position (elevation and azimuth), and the times of sunrise, sunset, solar noon, and twilight for any given geographical location on Earth. These calculations are based on equations provided by the National Oceanic & Atmospheric Administration (NOAA) as described in "Astronomical Algorithms" by Jean Meeus (1991). It is intended as a resource for researchers and professionals working in fields such as climatology, biology, and renewable energy.
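To give a feel for this kind of calculation, the Python sketch below estimates the solar declination with a widely used low-accuracy Fourier-series approximation and derives the sun's elevation at solar noon from it. This is deliberately simpler than the Meeus-based equations the description refers to, it ignores atmospheric refraction, and the function names are hypothetical.

```python
import math


def solar_declination_deg(day_of_year, hour=12.0):
    """Approximate solar declination in degrees (low-accuracy series)."""
    # Fractional year in radians.
    gamma = 2.0 * math.pi / 365.0 * (day_of_year - 1 + (hour - 12.0) / 24.0)
    decl = (0.006918
            - 0.399912 * math.cos(gamma) + 0.070257 * math.sin(gamma)
            - 0.006758 * math.cos(2 * gamma) + 0.000907 * math.sin(2 * gamma)
            - 0.002697 * math.cos(3 * gamma) + 0.00148 * math.sin(3 * gamma))
    return math.degrees(decl)


def noon_elevation_deg(latitude_deg, day_of_year):
    """Geometric elevation of the sun at solar noon, without refraction."""
    return 90.0 - abs(latitude_deg - solar_declination_deg(day_of_year))


june_declination = solar_declination_deg(172)      # around the June solstice
december_declination = solar_declination_deg(355)  # around the December solstice
equator_noon_june = noon_elevation_deg(0.0, 172)
```

Near the solstices the declination should come out close to plus or minus 23.4 degrees, which is a quick sanity check on the coefficients.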
The RNA-sense tool compares RNA-seq time curves in two experimental conditions, e.g. wild-type and mutant, and works in three steps. At Step 1, it builds an expression profile for each transcript in one condition (e.g. wild-type) and tests if the transcript abundance grows or decays significantly. Dynamic transcripts are then sorted into non-overlapping groups (time profiles) by the time point of switch up or down. At Step 2, RNA-sense outputs the groups of differentially expressed transcripts, which are up- or downregulated in the mutant compared to the wild-type at each time point. At Step 3, correlations (Fisher's exact test) between the outputs of Step 1 (switch up- and switch down- time profile groups) and the outputs of Step 2 (differentially expressed transcript groups) are calculated. The results of the correlation analysis are printed as a two-dimensional color plot, with time profiles and differential expression groups on the y- and x-axis, respectively, which facilitates the biological interpretation of the data.
A slightly modified version of the rivertile layout generator for river.
Compared to rivertile, rivercarro adds:
Monocle layout, in which views take up all the usable area on the screen.
Gaps instead of padding around views or layout area.
Modify gap size at runtime.
Smart gaps: if there is only one view, gaps are disabled.
Limit the width of the usable area of the screen.
Per tag configurations.
Cycle through layouts.
Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, RNA, coding sequence (CDS), GFF, and metagenome retrieval from NCBI RefSeq, NCBI Genbank, ENSEMBL, and UniProt databases. Furthermore, an interface to the BioMart database allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as NCBI RefSeq, NCBI nr, NCBI nt, NCBI Genbank, etc., with a single command.
The missRows package implements the MI-MFA method to deal with missing individuals ('biological units') in multi-omics data integration. The MI-MFA method generates multiple imputed datasets from a Multiple Factor Analysis model; the results they yield are then combined into a single consensus solution. The package provides functions for estimating coordinates of individuals and variables, imputing missing individuals, and various diagnostic plots to inspect the pattern of missingness and visualize the uncertainty due to missing values.
The package contains functions to infer and visualize the cell cycle process using single-cell RNA-Seq data. It exploits the idea of transfer learning, projecting new data to a previously learned, biologically interpretable space. The tricycle package provides a pre-learned cell cycle space, which can be used to infer the cell cycle time of human and mouse single cell samples. In addition, it offers functions to visualize cell cycle time on different embeddings and functions to build new references.
Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.
This package provides portable tools to run system processes in the background. It can check if a background process is running; wait on a background process to finish; get the exit status of finished processes; kill background processes and their children; restart processes. It can read the standard output and error of the processes, using non-blocking connections. processx can poll a process for standard output or error, with a timeout. It can also poll several processes at once.
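The operations listed above (start, check, wait, read output, get exit status, kill) have close counterparts in Python's standard `subprocess` module, which the sketch below uses purely as an analogy. It does not show the processx API, and the helper names `run_and_collect` and `start_and_kill` are hypothetical.

```python
import subprocess
import sys


def run_and_collect(code, timeout=10.0):
    """Run a Python snippet in a child process; return (exit status, stdout, stderr)."""
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            text=True)
    # communicate() waits for the process, with a timeout, and drains both pipes.
    out, err = proc.communicate(timeout=timeout)
    return proc.returncode, out, err


def start_and_kill():
    """Start a long-running child, confirm it is alive, then kill it."""
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    was_running = proc.poll() is None  # poll() returns None while the child runs
    proc.kill()
    proc.wait(timeout=10)              # reap the child and record its exit status
    return was_running, proc.returncode


status, stdout_text, stderr_text = run_and_collect("print('hello')")
was_running, killed_status = start_and_kill()
```

The exit status of a killed process is platform dependent (a negative signal number on POSIX systems), so the sketch only relies on it being non-zero.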
This package provides a small collection of interesting and educational machine learning data sets which are used as examples in the mlr3 book Applied machine learning using mlr3 in R https://mlr3book.mlr-org.com, the use case gallery https://mlr3gallery.mlr-org.com, or in other examples. All data sets are properly preprocessed and ready to be analyzed by most machine learning algorithms. Data sets are automatically added to the dictionary of tasks if mlr3 is loaded.
This package is a Shiny app for interactively analyzing and visualizing Nanostring GeoMX Whole Transcriptome Atlas data. Users have the option of using a sample dataset to explore the app's functionality. Regions of interest (ROIs) can be filtered based on any user-provided metadata. Once two or more groups of interest are selected, all pairwise and ANOVA-like tests are automatically performed. Available outputs include PCA, Volcano plots, tables and heatmaps. Aesthetics of each output are highly customizable.
The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides:
low-level functionality meant to help the developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and
a framework that facilitates block processing of array-like objects (typically on-disk objects).
This package is an R package dedicated to the analysis of (multiplexed) 4C sequencing data. r-fourcseq provides a pipeline to detect specific interactions between DNA elements and identify differential interactions between conditions. The statistical analysis in R starts with individual bam files for each sample as inputs. To obtain these files, the package contains a Python script to demultiplex libraries and trim off primer sequences. With standard alignment software the required bam files can then be generated.
This package provides tools to compare k samples using the Anderson-Darling test, Kruskal-Wallis type tests with different rank score criteria, Steel's multiple comparison test, and the Jonckheere-Terpstra (JT) test. It computes asymptotic, simulated or (limited) exact P-values, all valid under randomization, with or without ties, or conditionally under random sampling from populations, given the observed tie pattern. Except for Steel's test and the JT test it also combines these tests across several blocks of samples.
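As a concrete example of a k-sample rank test, here is a minimal Python sketch of the classical Kruskal-Wallis H statistic with midranks and the usual tie correction. It computes only the statistic, not the asymptotic, simulated or exact P-values mentioned above, and the function names are hypothetical rather than taken from the package.

```python
def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tie groups.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction


h_statistic = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With three perfectly separated samples of size three, the rank sums are 6, 15 and 24, which gives H = 7.2.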
This package aims to analyse count-based methylation data on predefined genomic regions, such as those obtained by targeted sequencing, and thus to identify differentially methylated regions (DMRs) that are associated with phenotypes or traits. The method is built on a rich, flexible model that allows for the effects, on the methylation levels, of multiple covariates to vary smoothly along genomic regions. At the same time, this method also allows for sequencing errors and can adjust for variability in cell type mixture.