This package provides a selection of various tools to extend a data analysis workflow based on the tidyverse packages. This includes high-level data frame editing methods (in the style of 'mutate'/'mutate_at'), some methods in the style of 'purrr' and 'forcats', lookup methods for dict-like lists, a generic method for lumping a data frame by a given count, various low-level methods for special treatment of NA values, 'python'-style tuple-assignment and 'truthy'/'falsy' checks, saving to PDF and PNG from a pipe and various small utilities.
This package provides a set of functions to select the optimal block-length for a dependent bootstrap (block-bootstrap). Includes the Hall, Horowitz, and Jing (1995) <doi:10.1093/biomet/82.3.561> subsampling-based cross-validation method, the Politis and White (2004) <doi:10.1081/ETC-120028836> Spectral Density Plug-in method, including the Patton, Politis, and White (2009) <doi:10.1080/07474930802459016> correction, and the Lahiri, Furukawa, and Lee (2007) <doi:10.1016/j.stamet.2006.08.002> nonparametric plug-in method, with a corresponding set of S3 plot methods.
This package creates an HTML widget which displays the results of searching for a pattern in files in a given folder. The results can be viewed in the RStudio viewer pane, included in an R Markdown document or in a Shiny application. It also provides a Shiny application that runs this widget and lets you navigate the files found by the search. Instead of creating an HTML widget, it is also possible to get the results of the search in a 'tibble'. The search is performed by the grep command-line utility.
Fits look-up tables by filling entries with the mean or median values of the observations that fall in partitions of the feature space. Partitions can be determined by the user of the package through the input argument feature.boundaries, and the dimensions of the feature space can be any combination of continuous and categorical features provided by the data set. A predict function directly fetches the corresponding entry value, and a default value is defined as the mean or median of all available observations. The table and other components are represented using the S4 class lookupTable.
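The look-up table idea described above can be sketched in a few lines. This is a minimal illustration in Python, not the package's R implementation: the function name `fit_lookup_table`, the argument layout, and the use of the mean (rather than the median) are assumptions made for the example.

```python
from bisect import bisect_right
from statistics import mean


def fit_lookup_table(rows, targets, boundaries):
    """Fit a look-up table of per-cell means (hypothetical helper).

    rows:       list of feature tuples, one per observation
    targets:    list of response values, same length as rows
    boundaries: one entry per feature; a sorted list of cut points for a
                continuous feature, or None for a categorical feature
    """
    def cell(row):
        # Continuous features map to the index of their interval,
        # categorical features are used as-is.
        return tuple(
            value if cuts is None else bisect_right(cuts, value)
            for value, cuts in zip(row, boundaries)
        )

    groups = {}
    for row, y in zip(rows, targets):
        groups.setdefault(cell(row), []).append(y)

    table = {key: mean(ys) for key, ys in groups.items()}
    default = mean(targets)  # fallback for cells with no observations

    def predict(row):
        return table.get(cell(row), default)

    return predict
```

Calling `fit_lookup_table(rows, targets, [[5.0], None])` would split the first feature at 5.0 and treat the second as categorical; a query that lands in an empty cell falls back to the overall mean.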
This package provides tools to generate random landscape graphs, evaluate species occurrence in dynamic landscapes, simulate future landscape occupation and evaluate range expansion when new empty patches are available (e.g. as a result of climate change). References: Mestre, F., Canovas, F., Pita, R., Mira, A., Beja, P. (2016) <doi:10.1016/j.envsoft.2016.03.007>; Mestre, F., Risk, B., Mira, A., Beja, P., Pita, R. (2017) <doi:10.1016/j.ecolmodel.2017.06.013>; Mestre, F., Pita, R., Mira, A., Beja, P. (2020) <doi:10.1186/s12898-019-0273-5>.
An implementation of the Likelihood Ratio Test (LRT) for testing that, in a (non)linear mixed effects model, the variances of a subset of the random effects are equal to zero. There is no restriction on the subset of variances that can be tested: for example, it is possible to test that all the variances are equal to zero. Note that the implemented test is asymptotic. This package should be used on model fits from packages 'nlme', 'lmer', and 'saemix'. Charlotte Baey and Estelle Kuhn (2019) <doi:10.18637/jss.v107.i06>.
Quick and straightforward visualization of read signal over genomic intervals is key for generating hypotheses from sequencing data sets (e.g. ChIP-seq, ATAC-seq, bisulfite/methyl-seq). Many tools both inside and outside of R and Bioconductor are available to explore these types of data, and they typically start with a bigWig or BAM file and end with some representation of the signal (e.g. heatmap). profileplyr leverages many Bioconductor tools to allow for both flexibility and additional functionality in workflows that end with visualization of the read signal.
This package implements the algorithm described in Trapnell, C. et al. (2010) <doi:10.1038/nbt.1621>. The function takes a read count matrix of RNA-Seq data, feature lengths which can be retrieved using the biomaRt package, and the mean fragment lengths which can be calculated using the CollectInsertSizeMetrics (Picard) tool. It then returns a matrix of FPKM values, normalised by library size and feature effective length. It also provides the user with a quick and reliable function to generate an FPKM heatmap plot of the highly variable features in an RNA-Seq dataset.
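As a rough illustration of the normalisation described above, the following Python sketch computes FPKM from a count matrix. It assumes the commonly used definition, in which the effective length is the feature length minus the mean fragment length plus one, and the library size is the column sum of the counts; the function name `fpkm` and the plain-list data layout are assumptions for the example and are not the package's R interface.

```python
def fpkm(counts, feature_lengths, mean_fragment_lengths):
    """Return an FPKM matrix (list of rows) from raw fragment counts.

    counts:                rows are features, columns are samples
    feature_lengths:       one length (bp) per feature
    mean_fragment_lengths: one mean fragment length (bp) per sample

    Assumes every effective length is positive.
    """
    n_samples = len(mean_fragment_lengths)
    # Library size: total fragments counted in each sample (column sums).
    library_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]

    result = []
    for row, length in zip(counts, feature_lengths):
        out = []
        for j, count in enumerate(row):
            # Effective length accounts for fragments that cannot start
            # near the end of the feature.
            effective_length = length - mean_fragment_lengths[j] + 1
            out.append(count * 1e9 / (effective_length * library_sizes[j]))
        result.append(out)
    return result
```

The factor of 1e9 combines the "per kilobase" (1e3) and "per million fragments" (1e6) scalings.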
This package provides the core framework for a discrete event system to implement a complete data-to-decisions, reproducible workflow. The core components facilitate the development of modular pieces, and enable the user to include additional functionality by running user-built modules. Includes conditional scheduling, restart after interruption, packaging of reusable modules, tools for developing arbitrary automated workflows, automated interweaving of modules of different temporal resolution, and tools for visualizing and understanding the within-project dependencies. The suggested package NLMR can be installed from the repository (<https://PredictiveEcology.r-universe.dev>).
From output files obtained from the software 'ModestR', the relative contribution of factors to explain species distribution is depicted using several plots. A global geographic raster file for each environmental variable may also be obtained with the mean relative contribution, considering all species present in each raster cell, of the factor to explain species distribution. Finally, for each variable it is also possible to compare the frequencies of any variable obtained in the cells where the species is present with the frequencies of the same variable in the cells of the extent.
Interactive adverse event (AE) volcano plot for monitoring clinical trial safety. This tool allows users to view the overall distribution of AEs in a clinical trial using standard (e.g. MedDRA preferred term) or custom (e.g. Gender) categories using a volcano plot similar to the proposal by Zink et al. (2013) <doi:10.1177/1740774513485311>. This tool provides a stand-alone shiny application and flexible shiny modules, allowing it to be used as part of a more robust safety monitoring framework like the Shiny app from the safetyGraphics R package.
MotifPeeker is used to compare and analyse datasets from epigenomic profiling methods with motif enrichment as the key benchmark. The package outputs an HTML report consisting of three sections: (1. General Metrics) Overview of peaks-related general metrics for the datasets (FRiP scores, peak widths and motif-summit distances). (2. Known Motif Enrichment Analysis) Statistics for the frequency of user-provided motifs enriched in the datasets. (3. De-Novo Motif Enrichment Analysis) Statistics for the frequency of de-novo discovered motifs enriched in the datasets and compared with known motifs.
This R package provides tools for building and running automated end-to-end analysis workflows for a wide range of next generation sequence (NGS) applications such as RNA-Seq, ChIP-Seq, VAR-Seq and Ribo-Seq. Important features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software, such as NGS aligners or peak/variant callers, on local computers or compute clusters. Efficient handling of complex sample sets and experimental designs is facilitated by a consistently implemented sample annotation infrastructure.
This package provides tools for testing, monitoring and dating structural changes in (linear) regression models. It features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
Takes user-provided baseline data from groups of randomised controlled data and assesses whether the observed distribution of baseline p-values, numbers of participants in each group, or categorical variables are consistent with the expected distribution, as an aid to the assessment of integrity concerns in published randomised controlled trials. References (citations in PubMed format in details of each function): Bolland MJ, Avenell A, Gamble GD, Grey A. (2016) <doi:10.1212/WNL.0000000000003387>. Bolland MJ, Gamble GD, Avenell A, Grey A, Lumley T. (2019) <doi:10.1016/j.jclinepi.2019.05.006>. Bolland MJ, Gamble GD, Avenell A, Grey A. (2019) <doi:10.1016/j.jclinepi.2019.03.001>. Bolland MJ, Gamble GD, Grey A, Avenell A. (2020) <doi:10.1111/anae.15165>. Bolland MJ, Gamble GD, Avenell A, Cooper DJ, Grey A. (2021) <doi:10.1016/j.jclinepi.2020.11.012>. Bolland MJ, Gamble GD, Avenell A, Grey A. (2021) <doi:10.1016/j.jclinepi.2021.05.002>. Bolland MJ, Gamble GD, Avenell A, Cooper DJ, Grey A. (2023) <doi:10.1016/j.jclinepi.2022.12.018>. Carlisle JB, Loadsman JA. (2017) <doi:10.1111/anae.13650>. Carlisle JB. (2017) <doi:10.1111/anae.13938>.
Includes functions to estimate production frontiers and make ideal output predictions in the Data Envelopment Analysis (DEA) context using both standard models from DEA and Free Disposal Hull (FDH) and boosting techniques, in particular EATBoosting (Guillen et al., 2023 <doi:10.1016/j.eswa.2022.119134>) and MARSBoosting. Moreover, the package includes code for estimating several technical efficiency measures using different models such as the input and output-oriented radial measures, the input and output-oriented Russell measures, the Directional Distance Function (DDF), the Weighted Additive Measure (WAM) and the Slacks-Based Measure (SBM).
This package provides a wrapper on top of the Domino Data Python SDK library. It lets you query and access Domino Data Sources directly from your R environment. Under the hood, Domino Data R SDK leverages the API provided by the Domino Data Python SDK, which must be installed as a prerequisite. Domino is a platform that makes it easy to run your code on scalable hardware, with integrated version control and collaboration features designed for analytical workflows. See <https://docs.dominodatalab.com/en/latest/api_guide/140b48/domino-data-api> for more information.
Several multivariate techniques from a biplot perspective. It is the translation (with many improvements) into R of the previous package developed in Matlab. The package contains some of the main developments of my team during the last 30 years together with some more standard techniques. The package includes: Classical Biplots, HJ-Biplot, Canonical Biplots, MANOVA Biplots, Correspondence Analysis, Canonical Correspondence Analysis, Canonical STATIS-ACT, Logistic Biplots for binary and ordinal data, Multidimensional Unfolding, External Biplots for Principal Coordinates Analysis or Multidimensional Scaling, among many others. References can be found in the help of each procedure.
This package provides a set of tools for basic tensor operators. A tensor in the context of data analysis is a multidimensional array. The tools in this package rely on using any discrete transformation (e.g. Fast Fourier Transform (FFT)). Standard tools included are the Eigenvalue decomposition of a tensor, the QR decomposition and LU decomposition. Other functionality includes the inverse of a tensor and the transpose of a symmetric tensor. Functionality in the package is outlined in Kernfeld, E., Kilmer, M., and Aeron, S. (2015) <doi:10.1016/j.laa.2015.07.021>.
Animalcules is an R package for utilizing up-to-date data analytics, visualization methods, and machine learning models to provide users an easy-to-use interactive microbiome analysis framework. It can be used as a standalone software package or users can explore their data with the accompanying interactive R Shiny application. Traditional microbiome analysis such as alpha/beta diversity and differential abundance analysis are enhanced, while new methods like biomarker identification are introduced by animalcules. Powerful interactive and dynamic figures generated by animalcules enable users to understand their data better and discover new insights.
This variant of the Racket BC (``before Chez'' or ``bytecode'') implementation is not recommended for general use. It uses CGC (a ``Conservative Garbage Collector''), which was succeeded as default in PLT Scheme version 370 (which translates to 3.7 in the current versioning scheme) by the 3M variant, which in turn was succeeded in version 8.0 by the Racket CS implementation.
Racket CGC is primarily used for bootstrapping Racket BC [3M]. It may also be used for embedding applications without the annotations needed in C code to use the 3M garbage collector.
The primary function makeCPMSampler() generates a sampler function which performs the correlated pseudo-marginal method of Deligiannidis, Doucet and Pitt (2017) <arXiv:1511.04992>. If the rho= argument of makeCPMSampler() is set to 0, then the generated sampler function performs the original pseudo-marginal method of Andrieu and Roberts (2009) <DOI:10.1214/07-AOS574>. The sampler function is constructed with the user's choice of prior, parameter proposal distribution, and the likelihood approximation scheme. Note that this algorithm is not automatically tuned: each one of these arguments must be carefully chosen.
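To make the role of rho concrete, here is a minimal Python sketch of a correlated pseudo-marginal sampler. It is not the package's R interface: the name `make_cpm_sampler`, its arguments, and the choice of standard normal auxiliary variables refreshed by an autoregressive step are assumptions for illustration, and the parameter proposal is assumed to be symmetric so that no proposal ratio appears in the acceptance step.

```python
import math
import random


def make_cpm_sampler(log_prior, log_lik_estimate, propose, n_aux, rho, rng):
    """Build a correlated pseudo-marginal sampler (hypothetical API).

    log_prior(theta)            -> log prior density
    log_lik_estimate(theta, u)  -> log of a likelihood estimate that is
                                   computed from auxiliary normals u
    propose(theta, rng)         -> proposed parameter (assumed symmetric)
    rho                         -> correlation in [0, 1) between old and
                                   new auxiliary variables; rho = 0 gives
                                   the ordinary pseudo-marginal method
    """
    def sample(theta0, n_iter):
        theta = theta0
        u = [rng.gauss(0.0, 1.0) for _ in range(n_aux)]
        log_post = log_prior(theta) + log_lik_estimate(theta, u)
        scale = math.sqrt(1.0 - rho * rho)
        chain = []
        for _ in range(n_iter):
            theta_new = propose(theta, rng)
            # Correlate the new auxiliary variables with the current ones
            # so the noise in the two likelihood estimates partly cancels.
            u_new = [rho * ui + scale * rng.gauss(0.0, 1.0) for ui in u]
            log_post_new = log_prior(theta_new) + log_lik_estimate(theta_new, u_new)
            # 1 - random() lies in (0, 1], so the logarithm is defined.
            if math.log(1.0 - rng.random()) < log_post_new - log_post:
                theta, u, log_post = theta_new, u_new, log_post_new
            chain.append(theta)
        return chain

    return sample
```

As the description notes, nothing here is tuned automatically: the prior, the proposal, the number of auxiliary variables and the likelihood estimator all have to be chosen by the user.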
This package implements a clustering algorithm similar to K-Means. It has two main advantages: (a) the estimator is resistant to outliers, meaning that its results remain correct when there are atypical values in the sample; and (b) the estimator is efficient: roughly speaking, if there are no outliers in the sample, the results will be similar to those obtained by the classic algorithm (K-Means). The clustering procedure is carried out by minimizing the overall robust scale, the so-called tau scale (see Gonzalez, Yohai and Zamar (2019) <arXiv:1906.08198>).
This package implements quantile smoothing. It contains a dataset used to produce human chromosomal ideograms for plotting purposes and a collection of arrays containing data for chromosome 14 of 3 colorectal tumors. The package provides functions for painting chromosomal icons, chromosomes or chromosomal ideograms, and other types of plots. Quantsmooth offers options like converting chromosomal ids to their numeric form, retrieving the human chromosomal length from NCBI data, retrieving regions of interest in a vector of intensities using quantile smoothing, determining cytoband position based on the location of the probe, and other useful tools.