This package provides methods for automatic calculation of gene scores from gene count tables, including a Z-score method that requires a table of samples being scored and a count table with control samples; a geometric mean method that does not rely on control samples; and a principal component-based method that summarizes gene expression using user-selected principal components. The Z-score and geometric mean approaches are described in Kim et al. (2018) <doi:10.1089/jir.2017.0127>.
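The two control-dependent and control-free scoring ideas above can be sketched as follows. This is a minimal illustration, not the package's code: it assumes the Z-score is the per-gene deviation from the control mean in units of the control standard deviation, averaged over genes, and the function names `zscore_gene_score` and `geometric_mean_gene_score` are hypothetical.

```python
import math
from statistics import mean, stdev


def zscore_gene_score(sample_counts, control_counts):
    """Average per-gene z-score of one sample against control samples.

    sample_counts:  {gene: count} for the sample being scored.
    control_counts: {gene: [counts across control samples]} (assumed layout).
    Needs at least two controls per gene and a non-zero control SD.
    """
    z_values = []
    for gene, count in sample_counts.items():
        controls = control_counts[gene]
        z_values.append((count - mean(controls)) / stdev(controls))
    return mean(z_values)


def geometric_mean_gene_score(sample_counts):
    """Geometric mean of the gene counts; no control samples required.

    All counts must be strictly positive for the logarithm to exist.
    """
    logs = [math.log(count) for count in sample_counts.values()]
    return math.exp(mean(logs))


sample = {"GENE_A": 12, "GENE_B": 30}
controls = {"GENE_A": [8, 10, 12], "GENE_B": [10, 20, 30]}
z_score = zscore_gene_score(sample, controls)
gm_score = geometric_mean_gene_score({"GENE_A": 4, "GENE_B": 9})
```

The exact normalization used by the package may differ; see the cited paper for the published definitions.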
Prototypes for construction of a Gaussian Stochastic Process emulator (GASP) of a computer model. This is done within the objective Bayesian implementation of the GASP. The package allows for construction of a linked GASP of the composite computer model. Computational implementation follows the mathematical exposition given in the publication: Ksenia N. Kyzyurova, James O. Berger, Robert L. Wolpert. Coupling computer models through linking their statistical emulators. SIAM/ASA Journal on Uncertainty Quantification, 6(3): 1151-1171 (2018) <doi:10.1137/17M1157702>.
Software to support the introductory *MOSAIC Calculus* textbook (<https://www.mosaic-web.org/MOSAIC-Calculus/>), one of many data- and modeling-oriented educational resources developed by Project MOSAIC (<https://www.mosaic-web.org/>). Provides symbolic and numerical differentiation and integration, as well as support for applied linear algebra (for data science), and differential equations/dynamics. Includes grammar-of-graphics-based functions for drawing vector fields, trajectories, etc. The software is suitable for general use, but intended mainly for teaching calculus.
Enable operationalized evaluation of disease outcomes in multiple sclerosis. 'MSoutcomes' requires longitudinally recorded clinical data structured in long format. The package is based on the research developed at Clinical Outcomes Research unit (CORe), University of Melbourne and Neuroimmunology Centre, Royal Melbourne Hospital. Kalincik et al. (2015) <doi:10.1093/brain/awv258>. Lorscheider et al. (2016) <doi:10.1093/brain/aww173>. Sharmin et al. (2022) <doi:10.1111/ene.15406>. Dzau et al. (2023) <doi:10.1136/jnnp-2023-331748>.
This package provides tools for monitoring progress during parallel processing. Lightweight package which acts as a wrapper around mclapply() and adds a progress bar to it in RStudio or Linux environments. Simply replace your original call to mclapply() with pmclapply(). A progress bar can also be displayed during parallelisation via the foreach package. Also included are functions to safely print messages (including error messages) from within parallelised code, which can be useful for debugging parallelised R code.
This package provides tools for checking that the output of an optimization algorithm is indeed at a local mode of the objective function. This is accomplished graphically by calculating all one-dimensional "projection plots" of the objective function, i.e., varying each input variable one at a time with all other elements of the potential solution being fixed. The numerical values in these plots can be readily extracted for the purpose of automated and systematic unit-testing of optimization routines.
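The projection idea above can be sketched numerically. This is a minimal illustration under stated assumptions, not the package's interface: "local mode" is taken to mean a local maximum, the grid is evenly spaced around the candidate, and the names `projection_values` and `looks_like_local_max` are hypothetical.

```python
def projection_values(f, x_opt, i, half_width=1.0, n=21):
    """Evaluate f along coordinate i, holding all other coordinates at x_opt."""
    grid = [x_opt[i] - half_width + 2.0 * half_width * k / (n - 1) for k in range(n)]
    values = []
    for g in grid:
        x = list(x_opt)  # copy so the candidate solution is never modified
        x[i] = g
        values.append(f(x))
    return grid, values


def looks_like_local_max(f, x_opt, half_width=1.0, n=21, tol=1e-12):
    """True if no grid point on any one-dimensional projection beats f(x_opt)."""
    f_opt = f(x_opt)
    for i in range(len(x_opt)):
        _, values = projection_values(f, x_opt, i, half_width, n)
        if max(values) > f_opt + tol:
            return False
    return True


def objective(x):
    # Concave toy objective with its maximum at (1, -2).
    return -((x[0] - 1.0) ** 2) - (x[1] + 2.0) ** 2


at_mode = looks_like_local_max(objective, [1.0, -2.0])
off_mode = looks_like_local_max(objective, [0.5, -2.0])
```

A check of this kind can only reject a candidate; passing it on a finite grid along the coordinate axes does not prove that the point is a local mode.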
This package provides a robust framework for analyzing the extent to which differential survival with respect to higher level trait variation is reducible to lower level variation. In addition to its primary test, it also provides functions for simulation-based power analysis, reading in common data set formats, and visualizing results. Temporarily contains an edited version of function hr.mcp() from package 'wild1', written by Glen Sargeant. For a tutorial, see: http://evolve.zoo.ox.ac.uk/Evolve/Perspectev.html.
This package provides methods for extracting various features from time series data. The features provided are those from Hyndman, Wang and Laptev (2013) <doi:10.1109/ICDMW.2015.104>, Kang, Hyndman and Smith-Miles (2017) <doi:10.1016/j.ijforecast.2016.09.004> and from Fulcher, Little and Jones (2013) <doi:10.1098/rsif.2013.0048>. Features include spectral entropy, autocorrelations, measures of the strength of seasonality and trend, and so on. Users can also define their own feature functions.
Mass rollup for a Bill of Materials is an example of a class of computations in which elements are arranged in a tree structure and some property of each element is a computed function of the corresponding values of its child elements. Leaf elements, i.e., those with no children, have values assigned. In many cases, the combining function is simple arithmetic sum; in other cases (e.g., mass properties), the combiner may involve other information such as the geometric relationship between parent and child, or statistical relations such as root-sum-of-squares (RSS). This package implements a general function for such problems. It is adapted to specific recursive computations by functional programming techniques; the caller passes a function as the update parameter to rollup() (or, at a lower level, passes functions as the get, set, combine, and override parameters to update_prop()) at runtime to specify the desired operations. The implementation relies on graph-theoretic algorithms from the igraph package of Csárdi et al. (2006) <doi:10.5281/zenodo.7682609>.
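The rollup pattern described above can be sketched in a few lines. This is a minimal illustration of the general idea, not the package's rollup() or update_prop() interface: the tree is assumed to be a plain parent-to-children mapping, and the names `rollup_tree` and `root_sum_of_squares` are hypothetical.

```python
import math


def rollup_tree(children, leaf_values, combine=sum):
    """Compute a value for every node from the values of its children.

    children:    {parent: [child, ...]} describing the tree (assumed layout).
    leaf_values: {leaf: value} for elements with no children.
    combine:     function taking a list of child values; passed in by the caller.
    """
    values = dict(leaf_values)

    def visit(node):
        if node not in values:
            kids = children.get(node, [])
            if not kids:
                raise KeyError(f"leaf {node!r} has no assigned value")
            values[node] = combine([visit(child) for child in kids])
        return values[node]

    for node in children:
        visit(node)
    return values


def root_sum_of_squares(xs):
    """Combiner for independent uncertainties."""
    return math.sqrt(sum(x * x for x in xs))


tree = {"assembly": ["frame", "wheels"], "wheels": ["front", "rear"]}
masses = rollup_tree(tree, {"frame": 10.0, "front": 2.0, "rear": 3.0})
sigmas = rollup_tree({"wheels": ["front", "rear"]}, {"front": 3.0, "rear": 4.0},
                     combine=root_sum_of_squares)
```

Passing the combiner as an argument mirrors the functional style the description mentions: the traversal stays the same while the caller decides how child values are merged.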
This package provides functions to compute a large number of electoral indicators, which fall into two groups. The first contains the indicators of electoral disproportionality, such as the Rae index, the Loosemore-Hanby index, etc. The second is intended to study the dimensions of the party system vote, through the indicators of electoral fragmentation, polarization, volatility, etc. Moreover, multiple seat allocation simulations can also be performed based on different allocation systems, such as the D'Hondt method, Sainte-Laguë, etc. Finally, some of these functions have been built so that, if the user wishes, the data provided by the Spanish Ministry of Home Office for different electoral processes held in Spain can be obtained automatically. Together, these tools allow users to carry out in-depth studies of the results of any type of electoral process. The methods are described in: Oñate, Pablo and Ocaña, Francisco A. (1999, ISBN:9788474762815); Ruiz Rodríguez, Leticia M. and Otero Felipe, Patricia (2011, ISBN:9788474766226).
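Two of the quantities named above, D'Hondt seat allocation and the Loosemore-Hanby index, can be sketched as follows. This is a minimal illustration, not the package's functions: the names `dhondt` and `loosemore_hanby` are hypothetical, ties are broken by dictionary order (real electoral rules specify their own tie-breaks), and the index is returned as a proportion rather than a percentage.

```python
def dhondt(votes, seats):
    """Allocate seats by repeatedly awarding one to the highest quotient.

    The quotient for a party is votes / (seats already won + 1).
    """
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


def loosemore_hanby(votes, allocation):
    """Half the sum of absolute differences between vote and seat shares."""
    total_votes = sum(votes.values())
    total_seats = sum(allocation.values())
    return 0.5 * sum(
        abs(votes[p] / total_votes - allocation[p] / total_seats) for p in votes
    )


votes = {"A": 100000, "B": 80000, "C": 30000, "D": 20000}
allocation = dhondt(votes, 8)
disproportionality = loosemore_hanby(votes, allocation)
```

With these example votes and eight seats, party D receives no seat, which is what pushes the disproportionality index above zero.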
CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format also allows storing MS/MS spectra along with compound information. The package also provides a backend for Bioconductor's Spectra package and thus allows matching experimental MS/MS spectra against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.
The Barnes benchmark dataset can be used to evaluate the algorithms for Illumina microarrays. It measured a titration series of two human tissues, blood and placenta, and includes six samples with the titration ratio of blood and placenta as 100:0, 95:5, 75:25, 50:50, 25:75 and 0:100. The samples were hybridized on HumanRef-8 BeadChip (Illumina, Inc) in duplicate. The data is loaded as a LumiBatch object (see documents in the lumi package).
NxtIRFdata is a companion package for SpliceWiz, an interactive analysis and visualization tool for alternative splicing quantitation (including intron retention) for RNA-seq BAM files. NxtIRFdata contains Mappability files required for the generation of human and mouse references. NxtIRFdata also contains a synthetic genome reference and example BAM files used to demonstrate SpliceWiz's functionality. BAM files are based on 6 samples from the Leucegene dataset provided by NCBI Gene Expression Omnibus under accession number GSE67039.
This package provides methods for interpreting CoDa (Compositional Data) regression models along the lines of "Pairwise share ratio interpretations of compositional regression models" (Dargel and Thomas-Agnan 2024) <doi:10.1016/j.csda.2024.107945>. The new methods include variation scenarios, elasticities, elasticity differences and share ratio elasticities. These tools are independent of log-ratio transformations and allow an interpretation in the original space of shares. CoDaImpact is designed to be used with the compositions package and its ecosystem.
This package provides a single function that supports the installation of all packages belonging to the 'dartRverse'. The dartRverse is a set of packages that work together to analyse SNP (single nucleotide polymorphism) data. All packages aim to have a similar look and feel and are based on the same type of data structure ('genlight'), with additional metadata for loci and individuals (samples). For more information visit the GitHub pages <https://github.com/green-striped-gecko/dartRverse>.
Compare variables of interest between (potentially large numbers of) spatial interactions and meta-variables. Spatial variables are summarized using K, or other, functions, and projected for use in a modified random forest model. The model allows comparison of functional and non-functional variables to each other and to noise, giving statistical significance to the results. Included are preparation, modeling, and interpreting tools along with example datasets, as described in VanderDoes et al. (2023) <doi:10.1101/2023.07.18.549619>.
The main function of this package allows numerical vector objects to be displayed with their values in vulgar fractional form. This is convenient if patterns can then be more easily detected. In some cases replacing the components of a numeric vector by a rational approximation can also be expected to remove some component of round-off error. The main functions form a re-implementation of the functions fractions and rational of the MASS package, but using a radically improved programming strategy.
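The effect of replacing floating-point values by rational approximations can be sketched with Python's standard library. This is an illustration of the idea only, not the package's algorithm: it relies on `Fraction.limit_denominator`, and the function name `to_fractions` and the default denominator bound are assumptions.

```python
from fractions import Fraction


def to_fractions(values, max_denominator=2000):
    """Replace each number by the closest fraction with a bounded denominator."""
    return [Fraction(v).limit_denominator(max_denominator) for v in values]


# 0.1 + 0.2 carries binary round-off; the bounded approximation recovers 3/10.
approximations = to_fractions([0.5, 0.3333333333, 0.1 + 0.2])
as_text = [str(frac) for frac in approximations]
```

Bounding the denominator is what discards the round-off: the exact binary value of `0.1 + 0.2` has a very large denominator, while the nearest small-denominator fraction is the intended 3/10.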
This package provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting an indel-coding method (only simple indel-coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes using user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks.
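Two of the operations listed above, absolute pairwise distances and collapsing identical sequences into haplotypes, can be sketched as follows. This is a minimal illustration and not the package's S4 interface: sequences are assumed to be equal-length aligned strings, every differing column (including a gap against a base) counts as one difference, and the function names are hypothetical. No indel-coding method is applied here.

```python
def pairwise_differences(seq_a, seq_b):
    """Count the aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def distance_matrix(sequences):
    """Absolute pairwise distances for {name: aligned sequence}."""
    names = list(sequences)
    return {
        (m, n): pairwise_differences(sequences[m], sequences[n])
        for m in names
        for n in names
    }


def collapse_haplotypes(sequences):
    """Group sample names that share an identical aligned sequence."""
    haplotypes = {}
    for name, seq in sequences.items():
        haplotypes.setdefault(seq, []).append(name)
    return haplotypes


alignment = {"s1": "ACGTAC", "s2": "ACGTAC", "s3": "ACGAAC", "s4": "AC-AAC"}
distances = distance_matrix(alignment)
haplotypes = collapse_haplotypes(alignment)
```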
Implementation of Kmeans clustering algorithm and a supervised KNN (K Nearest Neighbors) learning method. It allows users to perform unsupervised clustering and supervised classification on their datasets. Additional features include data normalization, imputation of missing values, and the choice of distance metric. The package also provides functions to determine the optimal number of clusters for Kmeans and the best k-value for KNN: knn_Function(), find_Knn_best_k(), KMEANS_FUNCTION(), and find_Kmeans_best_k().
The field of immunology benefits from software that can predict which peptide sequences trigger an immune response. NetMHCIIpan is such a tool: it predicts the binding strength of a short peptide to a Major Histocompatibility Complex class II (MHC-II) molecule. NetMHCIIpan can be used from a web server at <https://services.healthtech.dtu.dk/services/NetMHCIIpan-3.2/> or from the command line, using a local installation. This package allows calling NetMHCIIpan from R.
Conduct internal validation of a clinical prediction model for a binary outcome. Produce bias-corrected performance metrics (c-statistic, Brier score, calibration intercept/slope) via bootstrap (simple bootstrap, bootstrap optimism, .632 optimism) and cross-validation (CV optimism, CV average). Also includes functions to assess model stability via bootstrap resampling. See Steyerberg et al. (2001) <doi:10.1016/s0895-4356(01)00341-9>; Harrell (2015) <doi:10.1007/978-3-319-19425-7>; Riley and Collins (2023) <doi:10.1002/bimj.202200302>.
Given a list of substance compositions, a list of substances involved in a process, and a list of constraints in addition to mass conservation of elementary constituents, the package contains functions to build the substance composition matrix, to analyze the uniqueness of process stoichiometry, and to calculate stoichiometric coefficients if process stoichiometry is unique. (See Reichert, P. and Schuwirth, N., A generic framework for deriving process stoichiometry in environmental models, Environmental Modelling and Software 25, 1241-1251, 2010, for more details.)
Analysis of multi-environment data of plant breeding experiments following the analyses described in Malosetti, Ribaut, and van Eeuwijk (2013) <doi:10.3389/fphys.2013.00044>. One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris. Some functions have been created to be used in conjunction with the R package asreml for the ASReml software, which can be obtained upon purchase from VSN international (<https://vsni.co.uk/software/asreml-r/>).
This package provides a collection of functions related to novel methods for estimating R(t), created by the lab of Professor Laura White. Currently implemented methods include two-step Bayesian back-calculation and now-casting for line-list data with missing reporting delays, adapted in STAN from Li (2021) <doi:10.1371/journal.pcbi.1009210>, and calculation of time-varying reproduction number assuming a flux between various adjacent states, adapted into STAN from Zhou (2021) <doi:10.1371/journal.pcbi.1010434>.