This package provides tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bezier lines with arrows complementing the ones in the grid package, and more.
This package provides functions to perform reproducible parallel foreach loops, using independent random streams as generated by L'Ecuyer's combined multiple-recursive generator. It makes it easy to convert standard %dopar% loops into fully reproducible loops, independently of the number of workers, the task scheduling strategy, or the chosen parallel environment and associated foreach backend.
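As a language-agnostic illustration of the idea (not the package's R API), the following Python sketch gives each task its own random stream, seeded from a base seed and the task index, so the results do not depend on the number of workers. The names `BASE_SEED`, `task` and `run` are hypothetical, and Python's default generator seeded per task merely stands in for L'Ecuyer's generator; it does not offer the same guarantees about stream independence.

```python
import random
from concurrent.futures import ThreadPoolExecutor

BASE_SEED = 1234  # assumed base seed, chosen only for illustration


def task(index):
    # Each task owns a private generator seeded from (base seed, task index),
    # so its draws do not depend on which worker runs it or in what order.
    rng = random.Random(f"{BASE_SEED}-{index}")
    return [rng.random() for _ in range(3)]


def run(n_tasks, n_workers):
    # pool.map returns results in task order, whatever the scheduling.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(task, range(n_tasks)))


# Same results with one worker or four.
print(run(8, 1) == run(8, 4))  # prints True
```

The key design choice is that randomness is tied to the task rather than to the worker, which is what makes the output invariant to the scheduling strategy.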
Machine Learning models are widely used and have various applications in classification or regression. Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance, but such black-box models usually lack interpretability. The DALEX package contains various explainers that help to understand the link between input variables and model output.
This package is an R implementation of the mCOPA package published by Wang et al. (2012). Oppar provides methods for Cancer Outlier Profile Analysis. Although initially developed to detect outlier genes in cancer studies, the methods presented in oppar can be used for outlier profile analysis in general. In addition, tools are provided for gene set enrichment and pathway analysis.
The package provides `rlang` data masks for the SummarizedExperiment class. This enables the evaluation of unquoted expressions in different contexts of the SummarizedExperiment object with optional access to other contexts. The goal for `plyxp` is for evaluation to feel like working with a data.frame object without ever needing to unwind to a rectangular data.frame.
This package provides a comprehensive library for date-time manipulations using a new family of orthogonal date-time classes (durations, time points, zoned-times, and calendars) that partition responsibilities so that the complexities of time zones are only considered when they are really needed. Capabilities include: date-time parsing, formatting, arithmetic, extraction and updating of components, and rounding.
sechm provides a simple interface between SummarizedExperiment objects and the ComplexHeatmap package. It enables plotting annotated heatmaps from SE objects, with easy access to rowData and colData columns, and implements a number of features to make the generation of heatmaps easier and more flexible. These functionalities used to be part of the SEtools package.
This package builds on the Epimods framework, which facilitates finding weighted subnetworks ("modules") on Illumina Infinium 27k arrays using the SpinGlass algorithm, as implemented in the igraph package. We have created a class of gene-centric annotations associated with p-values and effect sizes and scores from any researcher's prior statistical results to find functional modules.
The formr R package provides a few convenience functions that may be useful to the users of formr (formr.org), an online survey framework which heavily relies on R via openCPU. Some of the functions are for conveniently generating individual feedback graphics, some are just shorthands to make certain common operations in formr more palatable to R novices.
This package provides several cubic spline interpolation methods of H. Akima for irregular and regular gridded data, both for the bivariate case and univariate case. Linear interpolation of irregular gridded data is also covered. A bilinear interpolator for regular grids was also added for comparison with the bicubic interpolator on regular grids.
This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include:
detecting cell-type-specific or cross-cell-type differential signals;
tree-based differential analysis;
improving variable selection in reference-free deconvolution;
partial reference-free deconvolution with prior knowledge.
This package is designed to model the gene detection pattern of scRNA-seq data through a binary factor analysis model. The model allows the user to pass in a cell-level covariate matrix X and a gene-level covariate matrix Q to account for nuisance variance (e.g. batch effects), and it outputs a low-dimensional embedding matrix for downstream analysis.
The r-mhsmm package implements estimation and prediction methods for hidden Markov and semi-Markov models for multiple observation sequences. Such techniques are of interest when observed data is thought to be dependent on some unobserved (or hidden) state. The package is also suitable for equidistant time series data, including multivariate data and data with missing values. It allows user-defined emission distributions.
The prebs package aims at making RNA-sequencing (RNA-seq) data more comparable to microarray data. The comparability is achieved by summarizing sequencing-based expressions of probe regions using a modified version of the RMA algorithm. The pipeline takes mapped reads in BAM format as input and produces either gene expressions or original microarray probe set expressions as output.
Colour choice in information visualisation is important in order to avoid being misled by inherent bias in the colour palette used. This package provides access to the perceptually uniform and colour-blindness friendly palettes developed by Fabio Crameri and released under the "Scientific Colour-Maps" moniker. The package contains 24 different palettes and includes both diverging and sequential types.
This package provides resampling procedures to assess the stability of selected variables with additional finite sample error control for high-dimensional variable selection procedures such as Lasso or boosting. Both standard stability selection (Meinshausen & Bühlmann, 2010) and complementary pairs stability selection with improved error bounds (Shah & Samworth, 2013) are implemented. The package can be combined with arbitrary user-specified variable selection approaches.
This package contains various tools for working with and evaluating cross-validated area under the ROC curve (AUC) estimators. The primary functions of the package are ci.cvAUC and ci.pooled.cvAUC, which report cross-validated AUC and compute confidence intervals for cross-validated AUC estimates based on influence curves for i.i.d. and pooled repeated measures data, respectively.
This package provides methods to perform trajectory analysis based on a minimum spanning tree constructed from cluster centroids. It computes pseudotemporal cell orderings by mapping cells in each cluster (or new cells) to the closest edge in the tree. It uses linear modelling to identify differentially expressed genes along each path through the tree. Several plotting and interactive visualization functions are also implemented.
This package can do differential expression analysis of RNA-seq expression profiles with biological replication. It implements a range of statistical methodology based on the negative binomial distribution, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. It can also be applied to differential signal analysis of other types of genomic data that produce counts, including ChIP-seq, SAGE and CAGE.
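The first step described above, building a minimum spanning tree over cluster centroids, can be sketched as follows. This is a minimal illustration in Python, not the package's R interface: the helper names `centroid` and `minimum_spanning_tree` and the toy clusters are hypothetical, Euclidean distance is an assumption, and the later steps (mapping cells to edges, testing genes along paths) are omitted.

```python
import math


def centroid(points):
    # Mean of a list of equal-length coordinate tuples.
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def minimum_spanning_tree(centroids):
    # Prim's algorithm on the complete graph of Euclidean distances.
    # Returns edges (i, j), where i was already in the tree when j was added.
    in_tree = {0}
    edges = []
    while len(in_tree) < len(centroids):
        best = None
        for i in in_tree:
            for j in range(len(centroids)):
                if j in in_tree:
                    continue
                d = math.dist(centroids[i], centroids[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges


# Toy data: three clusters of two-dimensional "cells" (made up for illustration).
clusters = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(3.0, 0.0), (3.0, 2.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]
centroids = [centroid(c) for c in clusters]
tree = minimum_spanning_tree(centroids)
print(tree)  # prints [(0, 1), (1, 2)]
```

With the centroids lying on a line, the tree simply chains neighbouring clusters, which is the backbone along which a pseudotime ordering would then be defined.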
This package provides data sets and scripts to accompany Time Series Analysis and Its Applications: With R Examples (4th ed), by R.H. Shumway and D.S. Stoffer. Springer Texts in Statistics, 2017, https://doi.org/10.1007/978-3-319-52452-8, and Time Series: A Data Analysis Approach Using R. Chapman-Hall, 2019, https://doi.org/10.1201/9780429273285.
This package is a ggplot2 extension. It provides some utility functions that do not entirely fit within the grammar of graphics concept. The package extends ggplot2's facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. It also allows multiple colour and fill scales per plot and hosts a smaller collection of stats, geoms and axis guides.
The Round Robin Database Tool (RRDtool) is a system to store and display time-series data (e.g. network bandwidth, machine-room temperature, server load average). It stores the data in Round Robin Databases (RRDs), a very compact way that will not expand over time. RRDtool processes the extracted data to enforce a certain data density, allowing for useful graphical representation of data values.
This package provides robust normalization and difference calling procedures for ChIP-seq and similar data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.
This package uses multiple factor analysis to calculate individualized pathway-centric scores of deviation with respect to the sampled population based on multi-omic assays (e.g., RNA-seq, copy number alterations, methylation). Graphical and numerical outputs are provided to identify highly aberrant individuals for a particular pathway of interest, as well as the gene and omics drivers of aberrant multi-omic profiles.