The tidyomics ecosystem is a set of packages for ’omic data analysis that work together in harmony; they share common data representations and API design, consistent with the tidyverse ecosystem. The tidyomics package is designed to make it easy to install and load core packages from the tidyomics ecosystem with a single command.
This package implements sampling, iteration, and input of FASTQ files. It includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.
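The read filtering and trimming described above can be illustrated with a minimal sketch. It is not the package's own API: the function names (`read_fastq`, `trim_tail`, `filter_reads`), the Phred+33 quality encoding, and the threshold of 20 are all assumptions chosen for illustration.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual

def phred_scores(qual, offset=33):
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(c) - offset for c in qual]

def trim_tail(seq, qual, threshold=20):
    """Remove trailing bases whose quality is below the threshold."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], qual[:end]

def filter_reads(records, min_mean=20.0):
    """Trim each read, then keep it only if its mean quality is high enough."""
    for name, seq, qual in records:
        seq, qual = trim_tail(seq, qual)
        if seq and sum(phred_scores(qual)) / len(qual) >= min_mean:
            yield name, seq, qual

# Three toy reads: good, uniformly poor, and good with a poor tail.
example = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n@r3\nACGTAC\n+\nIIII!!\n"
kept = list(filter_reads(read_fastq(io.StringIO(example))))
```

In this toy input, the second read is discarded entirely and the third loses its two low-quality trailing bases.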
A purely data-driven gene network, the weighted gene co-expression network (WGCN), can be constructed from expression profiles alone. Different layers in such networks may represent different time points, multiple conditions or various species. AMOUNTAIN aims to search for active modules in multi-layer WGCNs using a continuous optimization approach.
This package provides an R implementation of an extension of the BayeScan software for codominant markers, adding the option to group individual SNPs into pre-defined blocks. A typical application of this new approach is the identification of genomic regions, genes, or gene sets containing one or more SNPs that evolved under directional selection.
This package provides miscellaneous small tools and utilities. Many of them facilitate the work with matrices, e.g. inserting rows or columns, creating symmetric matrices, or checking for semidefiniteness. Other tools facilitate the work with regression models, e.g. extracting the standard errors, obtaining the number of (estimated) parameters, or calculating R-squared values.
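A rough sketch of the kind of helper described above, written with plain lists. The function names (`insert_row`, `symmetric_from_upper`, `r_squared`) are hypothetical and do not reproduce the package's interface.

```python
def insert_row(matrix, index, row):
    """Return a new matrix with `row` inserted before position `index`."""
    return matrix[:index] + [list(row)] + matrix[index:]

def symmetric_from_upper(matrix):
    """Build a symmetric matrix by mirroring the upper triangle."""
    n = len(matrix)
    return [[matrix[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]

def r_squared(observed, fitted):
    """R-squared as 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

with_row = insert_row([[1, 2], [5, 6]], 1, [3, 4])
mirrored = symmetric_from_upper([[1, 2], [9, 3]])  # the 9 below the diagonal is ignored
perfect_fit = r_squared([1, 2, 3], [1, 2, 3])
mean_only_fit = r_squared([1, 2, 3], [2, 2, 2])
```

A fit that predicts only the mean yields an R-squared of zero, while a perfect fit yields one.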
RLassoCox is a package that implements the RLasso-Cox model proposed by Wei Liu. The RLasso-Cox model integrates gene interaction information into the Lasso-Cox model for accurate survival prediction and survival biomarker discovery. It is based on the hypothesis that topologically important genes in the gene interaction network tend to have stable expression changes. The RLasso-Cox model uses random walk to evaluate the topological weight of genes, and then highlights topologically important genes to improve the generalization ability of the Lasso-Cox model. The RLasso-Cox model has the advantage of identifying small gene sets with high prognostic performance on independent datasets, which may play an important role in identifying robust survival biomarkers for various cancer types.
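The description above does not say which random-walk variant is used. As an illustration only, the sketch below assumes a PageRank-style walk with uniform restart on an undirected network; the function name, the restart probability of 0.15, and the toy gene labels are all hypothetical.

```python
def random_walk_weights(adjacency, restart=0.15, tol=1e-10, max_iter=1000):
    """Approximate the stationary distribution of a random walk with restart.

    `adjacency` maps each node to a list of its neighbours.
    """
    genes = sorted(adjacency)
    n = len(genes)
    weights = {g: 1.0 / n for g in genes}
    for _ in range(max_iter):
        # Every node receives the restart mass uniformly.
        updated = {g: restart / n for g in genes}
        for g in genes:
            neighbours = adjacency[g]
            if not neighbours:
                # Isolated node: spread its mass uniformly over all nodes.
                for h in genes:
                    updated[h] += (1 - restart) * weights[g] / n
                continue
            share = (1 - restart) * weights[g] / len(neighbours)
            for h in neighbours:
                updated[h] += share
        change = sum(abs(updated[g] - weights[g]) for g in genes)
        weights = updated
        if change < tol:
            break
    return weights

# Toy star-shaped network: one hub gene connected to three others.
network = {"hub": ["g1", "g2", "g3"], "g1": ["hub"], "g2": ["hub"], "g3": ["hub"]}
weights = random_walk_weights(network)
```

The hub receives the largest weight. In a scheme like the one described, such weights could then be used to emphasise topologically important genes in the penalised regression.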
This package provides tools for downloading hourly averages, daily maximums and minimums from each of the pollution, wind, and temperature measuring stations or geographic zones in the Mexico City metro area. The package also includes the locations of each of the stations and zones. See <http://aire.cdmx.gob.mx/> for more information.
Small toolbox for data analyses in environmental chemistry and ecotoxicology. Provides, for example, calibration() to calculate calibration curves and corresponding limits of detection (LODs) and limits of quantification (LOQs) according to German DIN 32645 (2008). texture() makes it easy to estimate soil particle size distributions from hydrometer measurements (ASTM D422-63, 2007).
Create local, regional, and global explanations for any machine learning model with forward marginal effects. You provide a model and data, and fmeffects computes feature effects. The package is based on the theory in: C. A. Scholbeck, G. Casalicchio, C. Molnar, B. Bischl, and C. Heumann (2022) <doi:10.48550/arXiv.2201.08837>.
This package provides a quick and easy way of plotting the columns of two matrices or data frames against each other using ggplot2. Although ggmatplot doesn't provide the same flexibility as ggplot2, it can be used as a workaround for having to wrangle wide format data into long format for plotting with ggplot2.
An extension of ggplot2 for creating complex genomic maps. It builds on the power of ggplot2 and tidyverse, adding new ggplot2-style geoms & positions and dplyr-style verbs to manipulate the underlying data. It implements a layout concept inspired by ggraph and introduces tracks to bring tidiness to the mess that is genomics data.
Geostatistical modelling facilities using SpatRaster and SpatVector objects are provided. Non-Gaussian models are fit using INLA, and Gaussian geostatistical models use Maximum Likelihood Estimation. For details see Brown (2015) <doi:10.18637/jss.v063.i12>. The RandomFields package is available at <https://www.wim.uni-mannheim.de/schlather/publications/software>.
This package implements bootstrap methods for linear regression models with errors following a time-varying process, focusing on approximating the distribution of the least-squares estimator for regression models with locally stationary errors. It enables the construction of bootstrap and classical confidence intervals for regression coefficients, leveraging intensive simulation studies and real data analysis.
Solves quadratic programming problems where the Hessian is represented as the product of two matrices. Thanks to Greg Hunt for helping to get this version back on CRAN. The methods in this package are described in: Ormerod, Wand and Koch (2008) "Penalised spline support vector classifiers: computational issues" <doi:10.1007/s00180-007-0102-8>.
Fits the Multiple Random Dot Product Graph Model and performs a test for whether two networks come from the same distribution. Both methods are proposed in Nielsen, A.M. and Witten, D. (2018) "The Multiple Random Dot Product Graph Model", arXiv preprint <arXiv:1811.12172> (Submitted to Journal of Computational and Graphical Statistics).
Implementation of the MarkerPen algorithm, short for marker gene detection via penalized principal component analysis, described in the paper by Qiu, Wang, Lei, and Roeder (2020, <doi:10.1101/2020.11.07.373043>). MarkerPen is a semi-supervised algorithm for detecting marker genes by combining prior marker information with bulk transcriptome data.
This package provides functions to simulate point prevalence studies (PPSs) of healthcare-associated infections (HAIs) and to convert prevalence to incidence in steady state setups. Companion package to the preprint Willrich et al., From prevalence to incidence - a new approach in the hospital setting <doi:10.1101/554725>, where the methods are explained in detail.
This package provides functions to compute the potential model as defined by Stewart (1941) <doi:10.1126/science.93.2404.89>. Several options are available to customize the model, such as the possibility to fine-tune the distance friction functions or to use custom distance matrices. Some computations are parallelized to improve their efficiency.
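As a hedged illustration of the model above: the potential at a location is the sum of the surrounding masses, each weighted by a decreasing function of distance. The sketch assumes an exponential friction function calibrated so that the weight falls to 0.5 at a chosen `span`. The parameterisation and the function names are assumptions, not the package's interface.

```python
import math

def exponential_friction(span, beta=2.0):
    """Return f(d) = exp(-alpha * d**beta), with alpha chosen so that f(span) = 0.5."""
    alpha = math.log(2) / span ** beta
    return lambda d: math.exp(-alpha * d ** beta)

def potential(targets, sources, friction):
    """Sum mass * friction(distance) over all sources, for each target point.

    `targets` holds (x, y) pairs; `sources` holds (x, y, mass) triples.
    """
    values = []
    for tx, ty in targets:
        total = 0.0
        for sx, sy, mass in sources:
            distance = math.hypot(tx - sx, ty - sy)
            total += mass * friction(distance)
        values.append(total)
    return values

# Two toy sources, 10 distance units apart, with masses 100 and 50.
sources = [(0.0, 0.0, 100.0), (10.0, 0.0, 50.0)]
friction = exponential_friction(span=5.0)
values = potential([(0.0, 0.0), (5.0, 0.0)], sources, friction)
```

Passing the friction function as an argument mirrors the idea of fine-tuning the distance friction. Replacing the Euclidean distance with a lookup into a precomputed matrix would correspond to using custom distance matrices.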
Figures rendered on graphics devices are usually rescaled to fit pre-determined device dimensions. plotscale implements the reverse: desired plot dimensions are specified and device dimensions are calculated to accommodate marginal material, giving consistent proportions for plot elements. Default methods support grid graphics such as lattice and ggplot. See "example('devsize')" and "vignette('plotscale')".
This package provides methods for assessing the performance of a prediction model with respect to identifying patient-level treatment benefit. All methods are applicable for continuous and binary outcomes, and for any type of statistical or machine-learning prediction model as long as it uses baseline covariates to predict outcomes under treatment and control.
Calculate parametric mortality and fertility models, following the packages BaSTA (Colchero, Jones and Rebke (2012) <doi:10.1111/j.2041-210X.2012.00186.x>) and BaFTA <https://github.com/fercol/BaFTA>, as well as summary statistics (e.g. ageing rates, life expectancy, lifespan equality), life tables and product limit estimators from census data.
This package provides functions to import survey results directly into R using the Qualtrics API. Qualtrics <https://www.qualtrics.com/about/> is an online survey and data collection software platform. See <https://api.qualtrics.com/> for more information about the Qualtrics API. This package is community-maintained and is not officially supported by Qualtrics.
Collection of functions to evaluate presence-absence models. It comprises functions to adjust discrimination statistics for the representativeness effect through case-weighting, along with functions for visualizing the outcomes. Originally outlined in: Jiménez-Valverde (2022) The uniform AUC: dealing with the representativeness effect in presence-absence models. Methods Ecol. Evol., 13, 1224-1236.
This package provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bwa. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Currently not supported on Windows machines.