This package can do non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of T- and F-statistics (including T-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with T-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted P-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.
Cell surface proteins form a major fraction of the druggable proteome and can be used for tissue-specific delivery of oligonucleotide/cell-based therapeutics. Alternatively spliced surface protein isoforms have been shown to differ in their subcellular localization and/or their transmembrane (TM) topology. Surface proteins are hydrophobic and remain difficult to study thereby necessitating the use of TM topology prediction methods such as TMHMM and Phobius. However, there exists a need for bioinformatic approaches to streamline batch processing of isoforms for comparing and visualizing topologies. To address this gap, we have developed an R package, surfaltr. It pairs inputted isoforms, either known alternatively spliced or novel, with their APPRIS annotated principal counterparts, predicts their TM topologies using TMHMM or Phobius, and generates a customizable graphical output. Further, surfaltr facilitates the prioritization of biologically diverse isoform pairs through the incorporation of three different ranking metrics and through protein alignment functions. Citations for programs mentioned here can be found in the vignette.
The phenomis package provides methods to perform post-processing (i.e. quality control and normalization) as well as univariate statistical analysis of single and multi-omics data sets. These methods include quality control metrics, signal drift and batch effect correction, intensity transformation, univariate hypothesis testing, but also clustering (as well as annotation of metabolomics data). The data are handled in the standard Bioconductor formats (i.e. SummarizedExperiment and MultiAssayExperiment for single and multi-omics datasets, respectively; the alternative ExpressionSet and MultiDataSet formats are also supported for convenience). As a result, all methods can be readily chained as workflows. The pipeline can be further enriched by multivariate analysis and feature selection, by using the ropls and biosigner packages, which support the same formats. Data can be conveniently imported from and exported to text files. Although the methods were initially targeted to metabolomics data, most of the methods can be applied to other types of omics data (e.g., transcriptomics, proteomics).
Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic feature associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.
This package provides Gaussian mixture models, k-means, mini-batch-kmeans, k-medoids and affinity propagation clustering with the option to plot, validate, predict (new data) and estimate the optimal number of clusters. The package takes advantage of RcppArmadillo to speed up the computationally intensive parts of the functions. For more information, see
"Clustering in an Object-Oriented Environment" by Anja Struyf, Mia Hubert, Peter Rousseeuw (1997), Journal of Statistical Software, https://doi.org/10.18637/jss.v001.i04;
"Web-scale k-means clustering" by D. Sculley (2010), ACM Digital Library, https://doi.org/10.1145/1772690.1772862;
"Armadillo: a template-based C++ library for linear algebra" by Sanderson et al (2016), The Journal of Open Source Software, https://doi.org/10.21105/joss.00026;
"Clustering by Passing Messages Between Data Points" by Brendan J. Frey and Delbert Dueck, Science 16 Feb 2007: Vol. 315, Issue 5814, pp. 972-976, https://doi.org/10.1126/science.1136800.
Seahtrue organizes oxygen consumption and extracellular acidification analysis data from experiments performed on an XF analyzer into structured nested tibbles.This allows for detailed processing of raw data and advanced data visualization and statistics. Seahtrue introduces an open and reproducible way to analyze these XF experiments. It uses file paths to .xlsx files. These .xlsx files are supplied by the userand are generated by the user in the Wave software from Agilent from the assay result files (.asyr). The .xlsx file contains different sheets of important data for the experiment; 1. Assay Information - Details about how the experiment was set up. 2. Rate Data - Information about the OCR and ECAR rates. 3. Raw Data - The original raw data collected during the experiment. 4. Calibration Data - Data related to calibrating the instrument. Seahtrue focuses on getting the specific data needed for analysis. Once this data is extracted, it is prepared for calculations through preprocessing. To make sure everything is accurate, both the initial data and the preprocessed data go through thorough checks.
This package provides meta-analysis methods that correct for publication bias and outcome reporting bias. Four methods and a visual tool are currently included in the package.
The p-uniform method as described in van Assen, van Aert, and Wicherts (2015) doi:10.1037/met0000025 can be used for estimating the average effect size, testing the null hypothesis of no effect, and testing for publication bias using only the statistically significant effect sizes of primary studies.
The p-uniform* method as described in van Aert and van Assen (2019) doi:10.31222/osf.io/zqjr9. This method is an extension of the p-uniform method that allows for estimation of the average effect size and the between-study variance in a meta-analysis, and uses both the statistically significant and nonsignificant effect sizes.
The hybrid method as described in van Aert and van Assen (2017) doi:10.3758/s13428-017-0967-6. The hybrid method is a meta-analysis method for combining an original study and replication and while taking into account statistical significance of the original study. The p-uniform and hybrid method are based on the statistical theory that the distribution of p-values is uniform conditional on the population effect size.
The fourth method in the package is the Snapshot Bayesian Hybrid Meta-Analysis Method as described in van Aert and van Assen (2018) doi:10.1371/journal.pone.0175302. This method computes posterior probabilities for four true effect sizes (no, small, medium, and large) based on an original study and replication while taking into account publication bias in the original study. The method can also be used for computing the required sample size of the replication akin to power analysis in null hypothesis significance testing.
The meta-plot is a visual tool for meta-analysis that provides information on the primary studies in the meta-analysis, the results of the meta-analysis, and characteristics of the research on the effect under study (van Assen and others, 2020).
Helper functions to apply the Correcting for Outcome Reporting Bias (CORB) method to correct for outcome reporting bias in a meta-analysis (van Aert & Wicherts, 2020).
Enables mapping of country level and gridded user datasets.
Reline is a pure Ruby alternative GNU Readline or Editline implementation.
Platform Design Info for The Manufacturer's Name Rhesus.
This package provides an R interface to functions of the SAMtools library.
This package provides a command line interface for running for Rack applications.
This package provides a command line interface for running for Rack applications.
Platform Design Info for The Manufacturer's Name RN_U34.
This package provides tools to convert R Markdown documents into a variety of formats.
Annotation data file for ratCHRLOC assembled using data from public data repositories.
This package provides a package containing an environment representing the Rhesus.cdf file.
This package provides a package containing an environment representing the RG_U34B.cdf file.
This package provides a package containing an environment representing the RG_U34A.cdf file.
This package provides a package containing an environment representing the RG_U34C.cdf file.
Run R CMD check from R programmatically, and capture the results of the individual checks.
Affymetrix Affymetrix RG_U34A Array annotation data (chip rgu34a) assembled using data from public repositories.
Generator of web pages which display interactive network/graph visualizations with D3js, jQuery and Raphael.
Affymetrix Affymetrix RG_U34B Array annotation data (chip rgu34b) assembled using data from public repositories.