Real-time quantitative polymerase chain reaction (qPCR) data by Guescini et al. (2008) <doi:10.1186/1471-2105-9-326> in tidy format. This package provides two data sets where the amplification efficiency has been modulated: either by changing the amplification mix concentration, or by increasing the concentration of IgG, a PCR inhibitor. Original raw data files: <https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-9-326/MediaObjects/12859_2008_2311_MOESM1_ESM.xls> and <https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-9-326/MediaObjects/12859_2008_2311_MOESM5_ESM.xls>.
Characterisation and calibration of single or multiple Ion Selective Electrodes (ISEs); activity estimation of experimental samples. Implements methods described in: Dillingham, P.W., Radu, T., Diamond, D., Radu, A. and McGraw, C.M. (2012) <doi:10.1002/elan.201100510>, Dillingham, P.W., Alsaedi, B.S.O. and McGraw, C.M. (2017) <doi:10.1109/ICSENS.2017.8233898>, Dillingham, P.W., Alsaedi, B.S.O., Radu, A., and McGraw, C.M. (2019) <doi:10.3390/s19204544>, and Dillingham, P.W., Alsaedi, B.S.O., Granados-Focil, S., Radu, A., and McGraw, C.M. (2020) <doi:10.1021/acssensors.9b02133>.
Analysis of musical scales (& modes, grooves, etc.) in the vein of Sherrill 2025 <https://collections.lib.utah.edu/ark:/87278/s6d2gr78>. The initials MCT in the package title refer to the article's title: "Modal Color Theory." Offers support for conventional musical pitch class set theory as developed by Forte (1973, ISBN: 9780300016109) and David Lewin (1987, ISBN: 9780300034936), as well as for the continuous geometries of Callender, Quinn, & Tymoczko (2008) <doi:10.1126/science.1153021>. Identifies structural properties of scales and calculates derived values (sign vector, color number, brightness ratio, etc.). Creates plots such as "brightness graphs" which visualize these properties.
The sample mean and standard deviation are two commonly used statistics in meta-analyses, but some trials report other summary statistics such as the median and quartiles. Researchers therefore need to transform this information back to the sample mean and standard deviation. This package implements the sample mean estimators by Luo et al. (2016) <arXiv:1505.05687>, the sample standard deviation estimators by Wan et al. (2014) <arXiv:1407.8038>, and the best linear unbiased estimators (BLUEs) of location and scale parameters by Yang et al. (2018, submitted), all based on summaries derived from sample quantiles in a meta-analysis.
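As a rough illustration of the kind of conversion involved, the sketch below handles the case where a trial reports the first quartile, median, third quartile, and sample size. The two formulas are written from memory of the Luo et al. and Wan et al. papers and should be checked against them before use; the function names are hypothetical and are not this package's API.

```python
from statistics import NormalDist


def estimate_mean(q1, median, q3, n):
    """Sample mean from quartiles and median (weighting attributed to Luo et al.)."""
    w = 0.7 + 0.39 / n  # assumed weight on the mid-quartile value
    return w * (q1 + q3) / 2 + (1 - w) * median


def estimate_sd(q1, q3, n):
    """Sample SD from the interquartile range (form attributed to Wan et al.)."""
    # Approximate expected position of the upper quartile in a normal sample of size n.
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    return (q3 - q1) / (2 * z)


# Quartiles of a standard normal distribution, reported for a trial of 100 units.
mean_hat = estimate_mean(-0.6745, 0.0, 0.6745, n=100)
sd_hat = estimate_sd(-0.6745, 0.6745, n=100)
print(mean_hat, sd_hat)
```

For symmetric quartiles around a median of zero, the mean estimate is zero and the SD estimate lands close to one, which is a quick sanity check on the formulas.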
The constructs used to study human psychology have many definitions and corresponding instructions for eliciting and coding qualitative data pertaining to construct content and for measuring the constructs. This plethora of definitions and instructions necessitates unequivocal reference to specific definitions and instructions in empirical and secondary research. This package implements a human- and machine-readable standard for specifying construct definitions and instructions for measurement and qualitative research based on YAML. This standard facilitates systematic unequivocal reference to specific construct definitions and corresponding instructions in a decentralized manner (i.e. without requiring central curation; Peters (2020) <doi:10.31234/osf.io/xebhn>).
This package implements variational Bayesian algorithms to perform scalable variable selection for sparse, high-dimensional linear and logistic regression models. Features include a novel prioritized updating scheme, which uses a preliminary estimator of the variational means during initialization to generate an updating order prioritizing large, more relevant, coefficients. Sparsity is induced via spike-and-slab priors with either Laplace or Gaussian slabs. By default, the heavier-tailed Laplace density is used. Formal derivations of the algorithms and asymptotic consistency results may be found in Kolyan Ray and Botond Szabo (JASA 2020) and Kolyan Ray, Botond Szabo, and Gabriel Clara (NeurIPS 2020).
Scalable implementation of generalized mixed models with highly optimized C++ implementation and integration with Genomic Data Structure (GDS) files. It is designed for single variant tests and set-based aggregate tests in large-scale Phenome-wide Association Studies (PheWAS) with millions of variants and samples, controlling for sample structure and case-control imbalance. The implementation is based on the SAIGE R package (v0.45, Zhou et al. 2018 and Zhou et al. 2020), and it is extended to include the state-of-the-art ACAT-O set-based tests. Benchmarks show that SAIGEgds is significantly faster than the SAIGE R package.
While gene signatures are frequently used to predict phenotypes (e.g. predict prognosis of cancer patients), it is not always clear how optimal or meaningful they are (cf. the paper by David Venet, Jacques E. Dumont, and Vincent Detours, "Most Random Gene Expression Signatures Are Significantly Associated with Breast Cancer Outcome"). Based on suggestions in that paper, SigCheck accepts a data set (as an ExpressionSet) and a gene signature, and compares its performance on survival and/or classification tasks against a) random gene signatures of the same length; b) known, related and unrelated gene signatures; and c) permuted data and/or metadata.
The caroline R library contains dozens of functions useful for: database migration (dbWriteTable2), database style joins & aggregation (nerge, groupBy, & bestBy), data structure conversion (nv, tab2df), legend table making (sstable & leghead), automatic legend positioning for scatter and box plots (), plot annotation (labsegs & mvlabs), data visualization (pies, sparge, confound.grid & raPlot), character string manipulation (m & pad), file I/O (write.delim), batch scripting, data exploration, and more. The package's greatest contributions lie in the database style merge, aggregation and interface functions as well as in its extensive use and propagation of row, column and vector names in most functions.
This package provides diagnostic graphic tools for GLMs, beta-binomial regression models (estimated by the VGAM package), beta regression models (estimated by the betareg package) and negative binomial regression models (estimated by the MASS package). Since most of the functions implemented in glmxdiag already exist in other packages, the aim is to provide the user with a single set of functions that work on almost all of the regression models listed above. Details about some of the implemented functions can be found in Brown (1992) <doi:10.2307/2347617>, Dunn and Smyth (1996) <doi:10.2307/1390802>, O'Hara Hines and Carter (1993) <doi:10.2307/2347405>, and Wang (1985) <doi:10.2307/1269708>.
With the deprecation of mocking capabilities shipped with testthat as of edition 3, it is left to third-party packages to replace this functionality, which in some test scenarios is essential in order to run unit tests in limited environments (such as no Internet connection). Mocking in this setting means temporarily substituting a function with a stub that acts in some sense like the original function (for example by serving an HTTP response that has been cached as a file). The only exported function with_mock() is modeled after the eponymous testthat function with the intention of providing a drop-in replacement.
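The idea of temporarily substituting a function with a stub can be sketched in Python with the standard library's unittest.mock. This is only an analogy for the technique, not the R with_mock() interface; the `http` namespace, `page_title`, and `cached_download` are hypothetical names invented for the example.

```python
from types import SimpleNamespace
from unittest import mock


def _download(url):
    # Stands in for a real network call that fails in a limited environment.
    raise ConnectionError("no network in this environment")


# Hypothetical stand-in for a package that exposes a download function.
http = SimpleNamespace(download=_download)


def page_title(url):
    """Code under test: depends on http.download at call time."""
    body = http.download(url)
    return body.split("<title>")[1].split("</title>")[0]


def cached_download(url):
    # Stub: serves a response as if it had been cached in a file.
    return "<html><title>Cached page</title></html>"


# The stub replaces the original only inside the with-block.
with mock.patch.object(http, "download", cached_download):
    title_under_mock = page_title("https://example.org")

# On exit the original function is back in place.
restored = http.download is _download
print(title_under_mock, restored)
```

Scoping the substitution to a block keeps the stub from leaking into other tests, which is the property a drop-in mocking helper needs to preserve.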
This package provides functions for making run charts [Anhoej, Olesen (2014) <doi:10.1371/journal.pone.0113825>] and basic Shewhart control charts [Mohammed, Worthington, Woodall (2008) <doi:10.1136/qshc.2004.012047>] for measure and count data. The main function, qic(), creates run and control charts and has a simple interface with a rich set of options to control data analysis and plotting, including options for automatic data aggregation by subgroups, easy analysis of before-and-after data, exclusion of one or more data points from analysis, and splitting charts into sequential time periods. Missing values and empty subgroups are handled gracefully.
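To give a feel for what a run chart analysis checks, the sketch below counts the longest run on one side of the median and the number of median crossings, then compares them with limits that depend on the number of useful observations. The limit formulas are an assumption based on my reading of the run chart rules in Anhoej and Olesen (2014); the function names are hypothetical and this is not the qic() implementation.

```python
import math
from statistics import median


def binom_quantile(p, size, prob=0.5):
    """Smallest k with P(X <= k) >= p for X ~ Binomial(size, prob)."""
    cumulative = 0.0
    for k in range(size + 1):
        cumulative += math.comb(size, k) * prob**k * (1 - prob) ** (size - k)
        if cumulative >= p:
            return k
    return size


def run_chart_signals(values):
    """Longest run and crossings around the median, with assumed limits."""
    centre = median(values)
    # Points exactly on the median are not counted as useful observations.
    sides = [v > centre for v in values if v != centre]
    n_useful = len(sides)  # the sketch assumes at least two useful points
    longest = current = 1
    crossings = 0
    for previous, this in zip(sides, sides[1:]):
        if this == previous:
            current += 1
            longest = max(longest, current)
        else:
            crossings += 1
            current = 1
    longest_max = round(math.log2(n_useful) + 3)        # assumed upper limit
    crossings_min = binom_quantile(0.05, n_useful - 1)  # assumed lower limit
    return {
        "longest_run": longest,
        "longest_run_max": longest_max,
        "crossings": crossings,
        "crossings_min": crossings_min,
        "signal": longest > longest_max or crossings < crossings_min,
    }


# Ten low values followed by ten high values: a clear shift in level.
series = [5, 7, 6, 8, 4, 9, 5, 7, 6, 8, 12, 13, 14, 12, 15, 13, 14, 16, 12, 15]
result = run_chart_signals(series)
print(result)
```

A series with a sustained shift produces one long run and very few crossings, so both rules flag non-random variation.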
Recently, regularized variable selection has emerged as a powerful tool to identify and dissect gene-environment interactions. Nevertheless, in longitudinal studies with high dimensional genetic factors, regularization methods for G×E interactions have not been systematically developed. In this package, we provide the implementation of sparse group variable selection, based on both the quadratic inference function (QIF) and generalized estimating equation (GEE), to accommodate the bi-level selection for longitudinal G×E studies with high dimensional genomic features. Alternative methods conducting only the group or individual level selection have also been included. The core modules of the package have been developed in C++.
Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface, published in Meindl and Templ (2019) <doi:10.3390/a12090191>, that gives access to various methods of this package.
This package provides utilities to create and use lenses to simplify data manipulation. Lenses are composable getter/setter pairs that provide a functional approach to manipulating deeply nested data structures, e.g., elements within list columns in data frames. The implementation is based on the earlier lenses R package <https://github.com/cfhammill/lenses>, which was inspired by the Haskell lens package by Kmett (2012) <https://github.com/ekmett/lens>, one of the most widely referenced implementations of lenses. For additional background and history on the theory of lenses, see the lens package wiki: <https://github.com/ekmett/lens/wiki/History-of-Lenses>.
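The notion of a lens as a composable getter/setter pair can be illustrated with a small Python sketch. It is a minimal model of the concept under the assumption of immutable updates on nested dictionaries; the `Lens` class and the `key` helper are hypothetical and do not mirror the R package's functions.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    getter: Callable[[Any], Any]       # focus on a part of a structure
    setter: Callable[[Any, Any], Any]  # return a copy with that part replaced

    def then(self, inner):
        """Compose: focus with self first, then with inner."""
        return Lens(
            getter=lambda s: inner.getter(self.getter(s)),
            setter=lambda s, v: self.setter(s, inner.setter(self.getter(s), v)),
        )

    def over(self, structure, func):
        """Apply func to the focused part and return the updated copy."""
        return self.setter(structure, func(self.getter(structure)))


def key(name):
    """Lens focusing on one key of a dict, copying instead of mutating."""
    return Lens(
        getter=lambda d: d[name],
        setter=lambda d, v: {**d, name: v},
    )


record = {"user": {"address": {"city": "Oslo"}}}
city = key("user").then(key("address")).then(key("city"))

current_city = city.getter(record)
moved = city.setter(record, "Bergen")
shouted = city.over(record, str.upper)
print(current_city, moved, shouted)
```

Because each setter returns a new structure, the original record is left unchanged, which is what makes composed lenses convenient for deeply nested data.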
Normalizes a data matrix `data` by raking (using the RAS method by Bacharach, see references) the Nrows by Ncols matrix such that the row means and column means equal 1. The result is a normalized data matrix `K=RAS`, a product of row multipliers `R` and column multipliers `S` with the original matrix `A`. Missing information needs to be presented as `NA` values and not as zero values, because CONSTANd is able to ignore missing values when calculating the mean. Using CONSTANd normalization allows for the direct comparison of values between samples within the same and even across different CONSTANd-normalized data matrices.
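The raking step can be sketched as alternating row and column rescaling until all row and column means equal 1, with missing values skipped when computing means. This is a minimal pure-Python illustration under assumed defaults (`max_iter`, `tol`, and the row-based stopping rule are choices made for the example); it is not the CONSTANd implementation and uses NaN in place of R's `NA`.

```python
import math


def nan_mean(values):
    """Mean of the non-missing entries (missing values are NaN)."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present)


def rake_to_unit_means(matrix, max_iter=200, tol=1e-9):
    """Alternately rescale rows and columns until their means are close to 1."""
    k = [list(row) for row in matrix]  # work on a copy
    n_cols = len(k[0])
    for _ in range(max_iter):
        # Row step: divide each row by its mean (NaN entries stay NaN).
        for row in k:
            row_mean = nan_mean(row)
            for j in range(n_cols):
                row[j] /= row_mean
        # Column step: divide each column by its mean.
        for j in range(n_cols):
            col_mean = nan_mean([row[j] for row in k])
            for row in k:
                row[j] /= col_mean
        # Columns are exact after the column step, so check the rows.
        if max(abs(nan_mean(row) - 1.0) for row in k) < tol:
            break
    return k


data = [[1.0, 2.0, 3.0], [4.0, math.nan, 6.0], [7.0, 8.0, 10.0]]
normalized = rake_to_unit_means(data)
row_means = [nan_mean(row) for row in normalized]
col_means = [nan_mean([row[j] for row in normalized]) for j in range(3)]
print(row_means, col_means)
```

Coding a missing measurement as zero would instead pull the corresponding row and column means down, which is why the description insists on `NA` rather than zero.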
This is a supportive data package for the software package gage. However, the data supplied here are also useful for gene set or pathway analysis or microarray data analysis in general. In this package, we provide two demo microarray datasets: GSE16873 (a breast cancer dataset from GEO) and BMP6 (originally published as a demo dataset for GAGE, also registered as GSE13604 in GEO). This package also includes commonly used gene set data based on KEGG pathways and GO terms for major research species, including human, mouse, rat and budding yeast. Mapping data between common gene IDs for budding yeast are also included.
The DImodels package is suitable for analysing data from biodiversity and ecosystem function studies using the Diversity-Interactions (DI) modelling approach introduced by Kirwan et al. (2009) <doi:10.1890/08-1684.1>. Suitable data will contain proportions for each species and a community-level response variable, and may also include additional factors, such as blocks or treatments. The package can perform data manipulation tasks, such as computing pairwise interactions (the DI_data() function), can perform an automated model selection process (the autoDI() function) and has the flexibility to fit a wide range of user-defined DI models (the DI() function).
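As a small illustration of what computing pairwise interactions means here, the sketch below forms the product of proportions for every pair of species in one community, which is the kind of interaction regressor a Diversity-Interactions model uses. It is a Python analogy with a hypothetical function name and made-up proportions, not the DI_data() interface.

```python
from itertools import combinations


def pairwise_interactions(proportions):
    """Product of proportions for every unordered pair of species."""
    return {
        f"{a}:{b}": proportions[a] * proportions[b]
        for a, b in combinations(proportions, 2)
    }


# One community with three species; proportions sum to 1 (illustrative values).
community = {"sp1": 0.5, "sp2": 0.3, "sp3": 0.2}
interactions = pairwise_interactions(community)

# The sum of all pairwise products serves as a single average-interaction term.
average_interaction_term = sum(interactions.values())
print(interactions, average_interaction_term)
```

A monoculture has all pairwise products equal to zero, so these terms only contribute to the predicted response in mixtures.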
This tree-based method deals with high dimensional longitudinal data with correlated features through the use of a piecewise random effect model. FREE tree also exploits the network structure of the features, by first clustering them using Weighted Gene Co-expression Network Analysis ('WGCNA'). It then conducts a screening step within each cluster of features and a selecting step among the surviving features, which provides a relatively unbiased way to do feature selection. By using dominant principal components as regression variables at each leaf and the original features as splitting variables at splitting nodes, FREE tree delivers easily interpretable results while improving computational efficiency.
In streaming data analysis, it is crucial to detect significant shifts in the data distribution or the accuracy of predictive models over time, a phenomenon known as concept drift. The package aims to identify when concept drift occurs and provide methodologies for adapting models in non-stationary environments. It offers a range of state-of-the-art techniques for detecting concept drift and maintaining model performance. Additionally, the package provides tools for adapting models in response to these changes, ensuring continuous and accurate predictions in dynamic contexts. Methods for concept drift detection are described in Tavares (2022) <doi:10.1007/s12530-021-09415-z>.
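One classic way to detect drift in a model's accuracy is to monitor its running error rate and flag a drift when the rate rises well above the best level seen so far. The sketch below follows the general shape of the Drift Detection Method (DDM); whether this detector is among those the package provides is an assumption, and the function name, `min_samples`, and `drift_level` are illustrative choices.

```python
import math


def first_drift_index(errors, min_samples=30, drift_level=3.0):
    """Index of the first detected drift in a 0/1 error stream, or None."""
    n = 0
    mistakes = 0
    best = math.inf      # lowest value of p + s seen so far
    p_min = s_min = 0.0  # error rate and its std. dev. at that point
    for index, err in enumerate(errors):
        n += 1
        mistakes += err
        p = mistakes / n                # running error rate
        s = math.sqrt(p * (1 - p) / n)  # binomial standard deviation
        if n < min_samples:
            continue                    # wait for a stable estimate
        if p + s < best:
            best, p_min, s_min = p + s, p, s
        elif p + s >= p_min + drift_level * s_min:
            return index
    return None


# Stable phase: one error in every ten predictions; then the model always fails.
stable = [1 if (i + 1) % 10 == 0 else 0 for i in range(200)]
stream = stable + [1] * 50
drift_at = first_drift_index(stream)
no_drift = first_drift_index(stable)
print(drift_at, no_drift)
```

Once a drift is flagged, a typical response is to retrain or reset the model on recent data, which is the adaptation step the description refers to.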
Provides functions to calculate two important indexes, IBR (Integrated Biomarker Response) and IBRv2 (Integrated Biological Response version 2). It also calculates the standardized values of enzyme activity for each index and includes a graphing function that draws radar plots, which are well suited to visualizing this type of data. References: Beliaeff, B., & Burgeot, T. (2002) <https://pubmed.ncbi.nlm.nih.gov/12069320/>; Sanchez, W., Burgeot, T., & Porcher, J.-M. (2013) <doi:10.1007/s11356-012-1359-1>; Devin, S., Burgeot, T., Giambérini, L., Minguez, L., & Pain-Devin, S. (2014) <doi:10.1007/s11356-013-2169-9>; Minato N. (2022) <https://minato.sip21c.org/msb/>.
Natural strata can be used in observational studies to balance the distributions of many covariates across any number of treatment groups and any number of comparisons. These strata have proportional amounts of units within each stratum across the treatments, allowing for simple interpretation and aggregation across strata. Within each stratum, the units are chosen using randomized rounding of a linear program that balances many covariates. To solve the linear program, the Gurobi commercial optimization software is recommended, but not required. The gurobi R package can be installed following the instructions at <https://www.gurobi.com/documentation/9.1/refman/ins_the_r_package.html>.
Implementation of popular mortality models using the rstan package, which provides the R interface to the Stan C++ library for Bayesian estimation. The package supports well-known models proposed in the actuarial and demographic literature including the Lee-Carter (1992) <doi:10.1080/01621459.1992.10475265> and the Cairns-Blake-Dowd (2006) <doi:10.1111/j.1539-6975.2006.00195.x> models. With a simple call, the user inputs deaths and exposures and the package outputs the MCMC simulations for each parameter, the log likelihoods and predictions. Moreover, the package includes tools for model selection and Bayesian model averaging by leave-future-out validation.
This program implements a universal estimation approach that accommodates multi-category variables and multiple effect scales, addressing the deficiencies of existing approaches when dealing with non-binary exposures and complex models. Estimation via bootstrapping simultaneously provides results of causal mediation on the risk difference (RD), odds ratio (OR) and risk ratio (RR) scales, with tests of the differences between effects. The estimation is also applicable to many other settings, e.g., moderated mediation, inconsistent covariates, panel data, etc. Its high flexibility and compatibility make it applicable to any type of model, meeting the needs of current empirical research.