Mass rollup for a Bill of Materials is an example of a class of computations in which elements are arranged in a tree structure and some property of each element is a computed function of the corresponding values of its child elements. Leaf elements, i.e., those with no children, have values assigned. In many cases, the combining function is simple arithmetic sum; in other cases (e.g., mass properties), the combiner may involve other information such as the geometric relationship between parent and child, or statistical relations such as root-sum-of-squares (RSS). This package implements a general function for such problems. It is adapted to specific recursive computations by functional programming techniques; the caller passes a function as the update parameter to rollup() (or, at a lower level, passes functions as the get, set, combine, and override parameters to update_prop()) at runtime to specify the desired operations. The implementation relies on graph-theoretic algorithms from the igraph package of Csárdi, et al. (2006 <doi:10.5281/zenodo.7682609>).
This package provides methods for interpreting CoDa (Compositional Data) regression models along the lines of "Pairwise share ratio interpretations of compositional regression models" (Dargel and Thomas-Agnan 2024) <doi:10.1016/j.csda.2024.107945>. The new methods include variation scenarios, elasticities, elasticity differences and share ratio elasticities. These tools are independent of log-ratio transformations and allow an interpretation in the original space of shares. CoDaImpact is designed to be used with the compositions package and its ecosystem.
This package provides a single function that supports the installation of all packages belonging to the dartRverse'. The dartRverse is a set of packages that work together to analyse SNP (single nuclear polymorphism) data. All packages aim to have a similar look and feel and are based on the same type of data structure ('genlight'), with additional metadata for loci and individuals (samples). For more information visit the GitHub pages <https://github.com/green-striped-gecko/dartRverse>.
The main function of this package allows numerical vector objects to be displayed with their values in vulgar fractional form. This is convenient if patterns can then be more easily detected. In some cases replacing the components of a numeric vector by a rational approximation can also be expected to remove some component of round-off error. The main functions form a re-implementation of the functions fractions and rational of the MASS package, but using a radically improved programming strategy.
Compare variables of interest between (potentially large numbers of) spatial interactions and meta-variables. Spatial variables are summarized using K, or other, functions, and projected for use in a modified random forest model. The model allows comparison of functional and non-functional variables to each other and to noise, giving statistical significance to the results. Included are preparation, modeling, and interpreting tools along with example datasets, as described in VanderDoes et al., (2023) <doi:10.1101/2023.07.18.549619>.
This package provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting an indel-coding method (only simple indel-coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes using user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks.
Implementation of Kmeans clustering algorithm and a supervised KNN (K Nearest Neighbors) learning method. It allows users to perform unsupervised clustering and supervised classification on their datasets. Additional features include data normalization, imputation of missing values, and the choice of distance metric. The package also provides functions to determine the optimal number of clusters for Kmeans and the best k-value for KNN: knn_Function(), find_Knn_best_k(), KMEANS_FUNCTION(), and find_Kmeans_best_k().
The field of immunology benefits from software that can predict which peptide sequences trigger an immune response. NetMHCIIpan is a such a tool: it predicts the binding strength of a short peptide to a Major Histocompatibility Complex class II (MHC-II) molecule. NetMHCIIpan can be used from a web server at <https://services.healthtech.dtu.dk/services/NetMHCIIpan-3.2/> or from the command-line, using a local installation. This package allows to call NetMHCIIpan from R.
Conduct internal validation of a clinical prediction model for a binary outcome. Produce bias corrected performance metrics (c-statistic, Brier score, calibration intercept/slope) via bootstrap (simple bootstrap, bootstrap optimism, .632 optimism) and cross-validation (CV optimism, CV average). Also includes functions to assess model stability via bootstrap resampling. See Steyerberg et al. (2001) <doi:10.1016/s0895-4356(01)00341-9>; Harrell (2015) <doi:10.1007/978-3-319-19425-7>; Riley and Collins (2023) <doi:10.1002/bimj.202200302>.
Given a list of substance compositions, a list of substances involved in a process, and a list of constraints in addition to mass conservation of elementary constituents, the package contains functions to build the substance composition matrix, to analyze the uniqueness of process stoichiometry, and to calculate stoichiometric coefficients if process stoichiometry is unique. (See Reichert, P. and Schuwirth, N., A generic framework for deriving process stoichiometry in enviromental models, Environmental Modelling and Software 25, 1241-1251, 2010 for more details.).
Analysis of multi environment data of plant breeding experiments following the analyses described in Malosetti, Ribaut, and van Eeuwijk (2013), <doi:10.3389/fphys.2013.00044>. One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris. Some functions have been created to be used in conjunction with the R package asreml for the ASReml software, which can be obtained upon purchase from VSN international (<https://vsni.co.uk/software/asreml-r/>).
This package provides a collection of functions related to novel methods for estimating R(t), created by the lab of Professor Laura White. Currently implemented methods include two-step Bayesian back-calculation and now-casting for line-list data with missing reporting delays, adapted in STAN from Li (2021) <doi:10.1371/journal.pcbi.1009210>, and calculation of time-varying reproduction number assuming a flux between various adjacent states, adapted into STAN from Zhou (2021) <doi:10.1371/journal.pcbi.1010434>.
This package unifies access to Statistal Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.
CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format allows to store in addition MS/MS spectra along with compound information. The package provides also a backend for Bioconductor's Spectra package and allows thus to match experimetal MS/MS spectra against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.
The Barnes benchmark dataset can be used to evaluate the algorithms for Illumina microarrays. It measured a titration series of two human tissues, blood and placenta, and includes six samples with the titration ratio of blood and placenta as 100:0, 95:5, 75:25, 50:50, 25:75 and 0:100. The samples were hybridized on HumanRef-8 BeadChip (Illumina, Inc) in duplicate. The data is loaded as an LumiBatch Object (see documents in the lumi package).
NxtIRFdata is a companion package for SpliceWiz, an interactive analysis and visualization tool for alternative splicing quantitation (including intron retention) for RNA-seq BAM files. NxtIRFdata contains Mappability files required for the generation of human and mouse references. NxtIRFdata also contains a synthetic genome reference and example BAM files used to demonstrate SpliceWiz's functionality. BAM files are based on 6 samples from the Leucegene dataset provided by NCBI Gene Expression Omnibus under accession number GSE67039.
This package provides tools for conducting Bayesian analyses and Bayesian model averaging (Kass and Raftery, 1995, <doi:10.1080/01621459.1995.10476572>, Hoeting et al., 1999, <doi:10.1214/ss/1009212519>). The package contains functions for creating a wide range of prior distribution objects, mixing posterior samples from JAGS and Stan models, plotting posterior distributions, and etc... The tools for working with prior distribution span from visualization, generating JAGS and bridgesampling syntax to basic functions such as rng, quantile, and distribution functions.
Given the hypothesis of a bi-modal distribution of cells for each marker, the algorithm constructs a binary tree, the nodes of which are subpopulations of cells. At each node, observed cells and markers are modeled by both a family of normal distributions and a family of bi-modal normal mixture distributions. Splitting is done according to a normalized difference of AIC between the two families. Method is detailed in: Commenges, Alkhassim, Gottardo, Hejblum & Thiebaut (2018) <doi: 10.1002/cyto.a.23601>.
Estimation of the parameters in a model for symmetric relational data (e.g., the above-diagonal part of a square matrix), using a model-based eigenvalue decomposition and regression. Missing data is accommodated, and a posterior mean for missing data is calculated under the assumption that the data are missing at random. The marginal distribution of the relational data can be arbitrary, and is fit with an ordered probit specification. See Hoff (2007) <arXiv:0711.1146> for details on the model.
Measure fairness metrics in one place for many models. Check how big is model's bias towards different races, sex, nationalities etc. Use measures such as Statistical Parity, Equal odds to detect the discrimination against unprivileged groups. Visualize the bias using heatmap, radar plot, biplot, bar chart (and more!). There are various pre-processing and post-processing bias mitigation algorithms implemented. Package also supports calculating fairness metrics for regression models. Find more details in (WiÅ niewski, Biecek (2021)) <arXiv:2104.00507>.
This package provides a toolbox for constructing potential landscapes for Ising networks. The parameters of the networks can be directly supplied by users or estimated by the IsingFit package by van Borkulo and Epskamp (2016) <https://CRAN.R-project.org/package=IsingFit> from empirical data. The Ising model's Boltzmann distribution is preserved for the potential landscape function. The landscape functions can be used for quantifying and visualizing the stability of network states, as well as visualizing the simulation process.
This package performs multilevel matches for data with cluster- level treatments and individual-level outcomes using a network optimization algorithm. Functions for checking balance at the cluster and individual levels are also provided, as are methods for permutation-inference-based outcome analysis. Details in Pimentel et al. (2018) <doi:10.1214/17-AOAS1118>. The optmatch package, which is useful for running many of the provided functions, may be downloaded from Github at <https://github.com/markmfredrickson/optmatch> if not available on CRAN.
This package provides a switch-case construct for R', as it is known from other programming languages. It allows to test multiple, similar conditions in an efficient, easy-to-read manner, so nested if-else constructs can be avoided. The switch-case construct is designed as an R function that allows to return values depending on which condition is met and lets the programmer flexibly decide whether or not to leave the switch-case construct after a case block has been executed.
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. CatsCradle allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.