Lineagespot is a framework written in R that aims to identify SARS-CoV-2 related mutations based on a single variant file or a list of variant files (i.e., variant calling format). The method can facilitate the detection of SARS-CoV-2 lineages in wastewater samples using next generation sequencing, and attempts to infer the potential distribution of the SARS-CoV-2 lineages.
MerfishData is an ExperimentHub package that serves publicly available datasets obtained with Multiplexed Error-Robust Fluorescence in situ Hybridization (MERFISH). MERFISH is a massively multiplexed single-molecule imaging technology capable of simultaneously measuring the copy number and spatial distribution of hundreds to tens of thousands of RNA species in individual cells. The scope of the package is to provide MERFISH data for benchmarking and analysis.
NetActivity enables the computation of gene set scores from previously trained sparsely-connected autoencoders. The package contains a function to prepare the data (`prepareSummarizedExperiment`) and a function to compute the gene set scores (`computeGeneSetScores`). The package `NetActivityData` contains different pre-trained models that can be applied directly to the data. Alternatively, users can use the package to compute gene set scores using custom models.
This package defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning), quantitative aggregation functions (median polish, robust summarisation, etc.), missing data imputation, data normalisation (quantiles, vsn, etc.), as well as miscellaneous helper functions that are used across high-level data structures within the R for Mass Spectrometry packages.
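As an illustration of one of the aggregation techniques named above, the following is a minimal sketch of Tukey's median polish in Python. It is not the package's implementation: the function name `median_polish`, its parameters, and the assumption of a complete table without missing values are all hypothetical choices made for this example.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Tukey's median polish on a complete (no missing values) 2-D table.

    Returns (overall, row_effects, col_effects, residuals) such that
    table[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [[float(x) for x in row] for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Row sweep: move each row's median out of the residuals.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            resid[i] = [x - row_med[i] for x in resid[i]]
        # Re-centre the column effects; their median goes into `overall`.
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift

        # Column sweep: move each column's median out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Re-centre the row effects; their median goes into `overall`.
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift

        # Stop once a full sweep no longer changes the residuals.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return overall, row_eff, col_eff, resid


# Example: a purely additive table (rows could be peptides, columns samples).
overall, rows, cols, resid = median_polish([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

In a quantitative aggregation setting, the overall value plus the column effects would typically serve as the summarised intensity per sample; that interpretation is an assumption about usage, not a statement about the package.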
This package generates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous or categorical data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses.
Presents a series of molecular and genetic routines in the R environment with the aim of assisting in analytical pipelines before and after the use of asreml or another library to perform analyses such as Genomic Selection or Genome-Wide Association Analyses. Methods and examples are described in Gezan, Oliveira, Galli, and Murray (2022) <https://asreml.kb.vsni.co.uk/wp-content/uploads/sites/3/ASRgenomics_Manual.pdf>.
Downloads wrangled Colombian socioeconomic, geospatial, population and climate data from DANE <https://www.dane.gov.co/> (National Administrative Department of Statistics) and IDEAM (Institute of Hydrology, Meteorology and Environmental Studies). It solves the problem of Colombian data being published across different web pages and sources by providing functions that allow the user to select the desired database and download it without going through the exhausting acquisition process.
Creativity research involves the need to score open-ended problems. Scoring is usually done by humans, but automatic scoring using AI is becoming increasingly accurate. This package provides a simple interface to the Open Scoring API <https://openscoring.du.edu/docs>, a leading creativity scoring technology by Organiscak et al. (2023) <doi:10.1016/j.tsc.2023.101356>. With it, you can score your own data directly from an R script.
This package provides seamless access to the QGIS (<https://qgis.org>) processing toolbox using the standalone qgis_process command-line utility. Both native and third-party (plugin) processing providers are supported. Besides references to data sources in files, common objects from sf, terra and stars are also supported. The native processing algorithms are documented by QGIS.org (2024) <https://docs.qgis.org/latest/en/docs/user_manual/processing_algs/>.
This package provides methods for managing under- and over-enrollment in Simon's Two-Stage Design through adaptive threshold adjustments and sample size recalibration. It also includes post-inference analysis tools to support clinical trial design and evaluation. The package is designed to enhance flexibility and accuracy in trial design, ensuring better outcomes in oncology and other clinical studies. Yunhe Liu, Haitao Pan (2024). Submitted.
Spatially-aware quality control (QC) software for both spot-level and artifact-level QC in spot-based spatial transcriptomics, such as 10x Visium. These methods calculate the local (nearest-neighbors) mean and variance of standard QC metrics (library size, unique genes, and mitochondrial percentage) to identify outlier spots and large technical artifacts. Scales linearly with the number of spots and is designed to be used with SpatialExperiment objects.
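The idea of comparing each spot against its nearest neighbors can be sketched as follows. This is only an illustration under stated assumptions, not the package's algorithm: the function names, the default `k=8`, and the z-score threshold of 3 are hypothetical, and the brute-force neighbor search here is quadratic in the number of spots rather than linear.

```python
from math import dist
from statistics import mean, pstdev


def local_z_scores(coords, values, k=8):
    """Z-score of each spot's QC metric relative to its k nearest neighbours."""
    scores = []
    for i, point in enumerate(coords):
        # Indices of all other spots, sorted by Euclidean distance (brute force).
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: dist(point, coords[j]))
        neighbours = [values[j] for j in others[:k]]
        mu = mean(neighbours)
        sd = pstdev(neighbours)
        # A constant neighbourhood gives no scale to compare against.
        scores.append((values[i] - mu) / sd if sd > 0 else 0.0)
    return scores


def flag_local_outliers(coords, values, k=8, threshold=3.0):
    """Indices of spots whose local z-score exceeds the threshold in magnitude."""
    z = local_z_scores(coords, values, k)
    return [i for i, score in enumerate(z) if abs(score) > threshold]


# Example: a 3 x 3 grid of spots where the centre has a very low library size.
grid = [(x, y) for y in range(3) for x in range(3)]
library_size = [100, 102, 98, 101, 10, 99, 103, 97, 100]
outliers = flag_local_outliers(grid, library_size)
```

Because the neighborhood statistics exclude the spot itself, a single aberrant spot stands out sharply against its neighbors while inflating the local variance of the spots around it, which keeps those from being flagged.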
Adaptive and Robust Transfer Learning (ART) is a flexible framework for transfer learning that integrates information from auxiliary data sources to improve model performance on primary tasks. It is designed to be robust against negative transfer by including the non-transfer model in the candidate pool, ensuring stable performance even when auxiliary datasets are less informative. See the paper, Wang, Wu, and Ye (2023) <doi:10.1002/sta4.582>.
Assignment of cell type labels to single-cell RNA sequencing (scRNA-seq) clusters is often a time-consuming process that involves manual inspection of the cluster marker genes complemented with a detailed literature search. This is especially challenging when unexpected or poorly described populations are present. The clustermole R package provides methods to query thousands of human and mouse cell identity markers sourced from a variety of databases.
This package provides a set of tools for evaluating clustering robustness using proportion of ambiguously clustered pairs (Senbabaoglu et al. (2014) <doi:10.1038/srep06207>), as well as similarity across methods and method stability using element-centric clustering comparison (Gates et al. (2019) <doi:10.1038/s41598-019-44892-y>). Additionally, this package enables stability-based parameter assessment for graph-based clustering pipelines typical in single-cell data analysis.
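To make the proportion of ambiguously clustered pairs (PAC) concrete, here is a small Python sketch. It assumes a simplified setting in which every clustering run labels all items (consensus clustering normally works on subsamples), and the function names and the 0.1 and 0.9 bounds are illustrative choices rather than the package's interface.

```python
from itertools import combinations


def consensus_values(label_runs):
    """Fraction of runs in which each pair of items shares a cluster."""
    n_items = len(label_runs[0])
    n_runs = len(label_runs)
    return [
        sum(run[i] == run[j] for run in label_runs) / n_runs
        for i, j in combinations(range(n_items), 2)
    ]


def pac(label_runs, lower=0.1, upper=0.9):
    """Share of item pairs whose consensus value lies in (lower, upper]."""
    values = consensus_values(label_runs)
    ambiguous = sum(lower < v <= upper for v in values)
    return ambiguous / len(values)


# Example: three clustering runs over four items; item 1 switches cluster once.
runs = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
score = pac(runs)
```

A PAC near 0 means almost every pair is either always or never clustered together, which indicates a stable partition; larger values indicate that many pairs change their co-membership between runs.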
Simplifies the process of creating essential visualizations in R, offering a range of plotting functions for common chart types like violin plots, pie charts, and histograms. With an intuitive interface, users can effortlessly customize colors, labels, and styles, making it an ideal tool for both beginners and experienced data analysts. Whether exploring datasets or producing quick visual summaries, this package provides a streamlined solution for fundamental graphics in R.
Predicts a categorical attribute from a set of numeric input predictor values. Only numeric predictors can be given as inputs, so the attribute to be predicted is selected from a dropdown that lists the categorical attributes. The k value can be chosen according to the accuracies provided. A handsontable is also provided to enter the input predictor values.
The provided package implements multiple contrast tests for functional data (Munko et al., 2023, <arXiv:2306.15259>). These procedures enable us to evaluate the overall hypothesis regarding equality, as well as specific hypotheses defined by contrasts. In particular, we can perform post hoc tests to examine particular comparisons of interest. Different experimental designs are supported, e.g., one-way and multi-way analysis of variance for functional data.
This package performs support vector analysis for data sets with survival outcomes. Three approaches are available in the package. The regression approach takes censoring into account when formulating the inequality constraints of the support vector problem. In the ranking approach, the inequality constraints set the objective to maximize the concordance index for comparable pairs of observations. The hybrid approach combines the regression and ranking constraints in the same model.
This package implements inverse and augmented inverse probability weighted estimators for common treatment effect parameters at an interim analysis with time-lagged outcome that may not be available for all enrolled subjects. Produces estimators, standard errors, and information that can be used to compute stopping boundaries using software that assumes that the estimators/test statistics have independent increments. Tsiatis, A. A. and Davidian, M., (2022) <arXiv:2204.10739>.
The cfToolsData package supplies the data for the cfTools package. It contains two pre-trained deep neural network (DNN) models for the cfSort function. Additionally, it includes the shape parameters of beta distribution characterizing methylation markers associated with four tumor types for the CancerDetector function, as well as the parameters characterizing methylation markers specific to 29 primary human tissue types for the cfDeconvolve function.
This package contains the data for the paper by L. David et al. in PNAS 2006 (PMID 16569694): 8 CEL files of Affymetrix genechips, an ExpressionSet object with the raw feature data, a probe annotation data structure for the chip and the yeast genome annotation (GFF file) that was used. In addition, some custom-written analysis functions are provided, as well as R scripts in the scripts directory.
Rank results by confident effect sizes, while maintaining False Discovery Rate and False Coverage-statement Rate control. Topconfects is an alternative presentation of TREAT results with improved usability, eliminating p-values and instead providing confidence bounds. The main application is differential gene expression analysis, providing genes ranked in order of confident log2 fold change, but it can be applied to any collection of effect sizes with associated standard errors.
This package provides tools for working with Type S (Sign) and Type M (Magnitude) errors, as proposed in Gelman and Tuerlinckx (2000) <doi:10.1007/s001800000040> and Gelman & Carlin (2014) <doi:10.1177/1745691614551642>. In addition to simply calculating the probability of Type S/M error, the package includes functions for calculating these errors across a variety of effect sizes for comparison, and recommended sample size given "tolerances" for Type S/M errors. To improve the speed of these calculations, closed-form solutions for the probability of a Type S/M error from Lu, Qiu, and Deng (2018) <doi:10.1111/bmsp.12132> are implemented. As of 1.0.0, this includes support only for simple research designs. See the package vignette for a fuller exposition on how Type S/M errors arise in research, and how to analyze them using the type of design analysis proposed in the above papers.
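For readers unfamiliar with these quantities, the sketch below computes power, the Type S error rate, and the Type M error (exaggeration ratio) in Python. It assumes a normally distributed estimate with a known standard error and a positive true effect, and it uses standard truncated-normal moments; the formulas may be presented differently in Lu, Qiu, and Deng (2018), and the function name `design_analysis` is hypothetical rather than part of the package.

```python
from statistics import NormalDist


def design_analysis(true_effect, std_error, alpha=0.05):
    """Power, Type S rate and Type M ratio for a two-sided z-test.

    Assumes estimate ~ Normal(true_effect, std_error**2) with true_effect > 0.
    """
    if true_effect <= 0 or std_error <= 0:
        raise ValueError("true_effect and std_error must be positive")

    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_effect / std_error  # true effect in standard-error units

    # Probability of a significant result with the correct / the wrong sign.
    p_sig_pos = 1 - std_normal.cdf(z_crit - shift)
    p_sig_neg = std_normal.cdf(-z_crit - shift)
    power = p_sig_pos + p_sig_neg

    # Type S: share of significant results that have the wrong sign.
    type_s = p_sig_neg / power

    # E[|Z|; significant] for Z ~ Normal(shift, 1), from truncated-normal means.
    expected_abs = (
        shift * p_sig_pos + std_normal.pdf(z_crit - shift)
        - shift * p_sig_neg + std_normal.pdf(-z_crit - shift)
    )
    # Type M: expected overestimation factor among significant results.
    type_m = expected_abs / power / shift

    return power, type_s, type_m


# Example: a small true effect measured with a large standard error.
power, type_s, type_m = design_analysis(true_effect=2.0, std_error=8.1)
```

With a small true effect relative to the standard error, power is low, a sizeable share of significant results carries the wrong sign, and significant estimates overstate the true effect several times over; as the ratio of effect to standard error grows, the Type S rate approaches 0 and the Type M ratio approaches 1.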
Tracking accrual in clinical trials is important for trial success. If accrual is too slow, the trial will take too long and be too expensive. If accrual is much faster than expected, time sensitive tasks such as the writing of statistical analysis plans might need to be rushed. accrualPlot provides functions to aid the tracking of accrual and predict when a trial will reach its intended sample size.
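One simple way to think about such a prediction is a constant-rate projection, sketched here in Python. This is a deliberately naive illustration and should not be read as the method accrualPlot uses; the function name `predict_completion` and its arguments are assumptions made for the example.

```python
from datetime import date, timedelta
from math import ceil


def predict_completion(enrollment_dates, target, as_of):
    """Project the date on which `target` participants will be enrolled.

    Assumes future accrual continues at the average daily rate observed
    between the first enrollment and `as_of`.
    """
    enrolled = [d for d in enrollment_dates if d <= as_of]
    if not enrolled:
        raise ValueError("no participants enrolled by the as_of date")

    remaining = target - len(enrolled)
    if remaining <= 0:
        return as_of  # target already reached

    elapsed_days = (as_of - min(enrolled)).days
    if elapsed_days <= 0:
        raise ValueError("need at least one day of accrual to estimate a rate")

    rate_per_day = len(enrolled) / elapsed_days
    return as_of + timedelta(days=ceil(remaining / rate_per_day))


# Example: one participant every two days, ten enrolled so far, target of 20.
dates = [date(2024, 1, 1) + timedelta(days=2 * i) for i in range(10)]
expected_end = predict_completion(dates, target=20, as_of=date(2024, 1, 21))
```

A projection like this ignores site start-up effects and changes in accrual speed over time, which is why dedicated tools typically weight recent accrual differently or report uncertainty around the predicted date.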