This package provides a data-driven test for the assumptions of quantile normalization using raw data such as objects that inherit eSets (e.g. ExpressionSet, MethylSet). Group level information about each sample (such as Tumor / Normal status) must also be provided because the test assesses if there are global differences in the distributions between the user-defined groups.
Sending functions to remote processes can be wasteful of resources because they carry their environments with them. With this package, it is easy to create functions that are isolated from their environment. These isolated functions, also called crates, print to the console with their total size and can be easily tested locally before being sent to a remote.
This package provides functions for Arps decline-curve analysis on oil and gas data. It includes exponential, hyperbolic, harmonic, and hyperbolic-to-exponential models, as well as each of these with initial curtailment or a period of linear rate buildup. Functions are included for computing rate, cumulative production, instantaneous decline, EUR, and time to economic limit, and for performing least-squares best fits.
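As an illustration of the decline models named above, here is a minimal Python sketch of the textbook Arps rate and cumulative-production formulas. The function names `arps_rate` and `arps_cumulative` and their parameters are hypothetical, chosen for this example, and are not the package's API; units are assumed to be consistent (for example, rate per year with decline per year).

```python
import math


def arps_rate(qi, Di, b, t):
    """Rate at time t for initial rate qi, initial decline Di, exponent b.

    b == 0 gives the exponential model, b == 1 the harmonic model,
    and 0 < b < 1 the hyperbolic model.
    """
    if b == 0:
        return qi * math.exp(-Di * t)
    return qi / (1.0 + b * Di * t) ** (1.0 / b)


def arps_cumulative(qi, Di, b, t):
    """Cumulative production from time 0 to t (integral of arps_rate)."""
    if b == 0:
        return qi / Di * (1.0 - math.exp(-Di * t))
    if b == 1:
        return qi / Di * math.log(1.0 + Di * t)
    return qi / ((1.0 - b) * Di) * (1.0 - (1.0 + b * Di * t) ** (1.0 - 1.0 / b))
```

The three branches are kept separate because the general hyperbolic expression is undefined at b = 0 and its cumulative form is undefined at b = 1.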
This package provides a Bayesian model for examining the association between environmental mixtures and all taxa measured in a hierarchical microbiome dataset in a single integrated analysis. Compared with analyzing the associations of environmental mixtures with each taxon individually, BaHZING controls Type 1 error rates and provides more stable effect estimates when dealing with small sample sizes.
This package provides a framework for the replicable removal of personally identifiable data (PID) in data sets. The package implements a suite of methods to suit different data types based on the suggestions of Garfinkel (2015) <doi:10.6028/NIST.IR.8053> and the ICO "Guidelines on Anonymization" (2012) <https://ico.org.uk/media/1061/anonymisation-code.pdf>.
Designing experimental plans that involve both discrete and continuous factors with general parametric statistical models using the ForLion algorithm and EW ForLion algorithm. The algorithms will search for locally optimal designs and EW optimal designs under the D-criterion. Reference: Huang, Y., Li, K., Mandal, A., & Yang, J. (2024) <doi:10.1007/s11222-024-10465-x>.
This package provides tools and methods to apply the model Geospatial Regression Equation for European Nutrient losses (GREEN); Grizzetti et al. (2005) <doi:10.1016/j.jhydrol.2004.07.036>; Grizzetti et al. (2008); Grizzetti et al. (2012) <doi:10.1111/j.1365-2486.2011.02576.x>; Grizzetti et al. (2021) <doi:10.1016/j.gloenvcha.2021.102281>.
Develops a General Equilibrium (GE) Model, which estimates key variables such as wages, the number of residents and workers, the price of floor space, and its distribution between commercial and residential use, as in Ahlfeldt et al. (2015) <doi:10.3982/ECTA10876>. By doing so, the model helps users understand the economic influence of different urban policies.
This package provides a comprehensive R interface to access data from the Kraken cryptocurrency exchange REST API <https://docs.kraken.com/api/>. It allows users to retrieve various market data, such as asset information, trading pairs, and price data. The package is designed to facilitate efficient data access for analysis, strategy development, and monitoring of cryptocurrency market trends.
Four measures of linkage disequilibrium are provided: the usual r^2 measure, the r^2_S measure (r^2 corrected by the structure of the sample), the r^2_V measure (r^2 corrected by the relatedness of genotyped individuals), and the r^2_VS measure (r^2 corrected by both the relatedness of genotyped individuals and the structure of the sample).
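For orientation, the usual r^2 measure between two loci is commonly computed as the squared Pearson correlation of their genotype vectors. The Python sketch below assumes genotypes coded as allele counts (0, 1, 2); the function name `r_squared` is hypothetical and the corrected measures (r^2_S, r^2_V, r^2_VS) are not covered here.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors.

    x and y are equal-length sequences of allele counts, one entry per
    individual. Raises ValueError if either locus is monomorphic,
    because the correlation is undefined in that case.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return cov * cov / (var_x * var_y)
```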
This package provides functions to fit finite mixtures of scale mixtures of skew-normal (FM-SMSN) distributions, as detailed in Prates, Lachos and Cabral (2013) <doi:10.18637/jss.v054.i12>, Cabral, Lachos and Prates (2012) <doi:10.1016/j.csda.2011.06.026> and Basso, Lachos, Cabral and Ghosh (2010) <doi:10.1016/j.csda.2009.09.031>.
An R wrapper for pulling data from the National Public Transport Access Nodes ('NaPTAN') API (<https://www.api.gov.uk/dft/national-public-transport-access-nodes-naptan-api/#national-public-transport-access-nodes-naptan-api>). This allows users to download NaPTAN transport information for the full dataset, by ATCO region code, or by name of region.
An implementation of Simultaneous Truth and Performance Level Estimation (STAPLE) <doi:10.1109/TMI.2004.828354>. This method is used when there are multiple raters for an object, typically an image, and this method fuses these ratings into one rating. It uses an expectation-maximization method to estimate this rating and the individual specificity/sensitivity for each rater.
This package provides various themes, palettes, and other functions that are used to customise ggplots to look like they were made in GraphPad Prism. The Prism-look is achieved with theme_prism() and scale_fill|colour_prism(), axes can be changed with custom guides like guide_prism_minor(), and significance indicators added with add_pvalue().
This package helps with quality checks, visualizations and analysis of mass spectrometry data from proteomics experiments. The package is developed, tested and used at the Functional Genomics Center Zurich, where it is used mainly for prototyping, teaching, and having fun with proteomics data. It can also be used for data analysis of small-scale data sets.
R is a language and environment for statistical computing and graphics. It provides a variety of statistical techniques, such as linear and nonlinear modeling, classical statistical tests, time-series analysis, classification and clustering. It also provides robust support for producing publication-quality data plots. A large number of third-party packages are available, greatly increasing its breadth and scope.
Various tools for handling fuzzy measures, calculating Shapley value and interaction index, Choquet and Sugeno integrals, as well as fitting fuzzy measures to empirical data are provided. Construction of fuzzy measures from empirical data is done by solving a linear programming problem using the lpsolve package, whose C source, adapted to the R environment, is included. The basic theory of fuzzy measures is described in the manual in the Doc folder of this package. Please refer to the following: [1] <https://personal-sites.deakin.edu.au/~gleb/fmtools.html> [2] G. Beliakov, H. Bustince, T. Calvo, 'A Practical Guide to Averaging', Springer, (2016, ISBN: 978-3-319-24753-3). [3] G. Beliakov, S. James, J-Z. Wu, 'Discrete Fuzzy Measures', Springer, (2020, ISBN: 978-3-030-15305-2).
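To make the Choquet integral mentioned above concrete, here is a minimal Python sketch of the standard discrete definition for non-negative inputs. The function name `choquet_integral` and the representation of the fuzzy measure as a dictionary keyed by `frozenset` of criterion indices are assumptions for this example, not the package's interface.

```python
def choquet_integral(x, mu):
    """Discrete Choquet integral of non-negative inputs x w.r.t. mu.

    x  : sequence of non-negative values, one per criterion index.
    mu : dict mapping frozenset of indices to a measure value, with
         mu[frozenset(all indices)] == 1; only the nested coalitions
         visited by the sort order need to be present.
    """
    # Visit criteria from smallest to largest input value.
    order = sorted(range(len(x)), key=lambda i: x[i])
    total = 0.0
    previous = 0.0
    for k, i in enumerate(order):
        # Coalition of all criteria whose value is at least x[i].
        coalition = frozenset(order[k:])
        total += (x[i] - previous) * mu[coalition]
        previous = x[i]
    return total
```

With an additive measure the result reduces to a weighted mean, and with a measure that is zero on every proper subset it reduces to the minimum of the inputs.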
Receiver Operating Characteristic (ROC)-guided survival trees and ensemble algorithms are implemented, providing a unified framework for tree-structured analysis with censored survival outcomes. A time-invariant partition scheme on the survivor population was considered to incorporate time-dependent covariates. Motivated by ideas of randomized tests, generalized time-dependent ROC curves were used to evaluate the performance of survival trees and establish the optimality of the target hazard/survival function. The optimality of the target hazard function motivates us to use a weighted average of the time-dependent area under the curve (AUC) on a set of time points to evaluate the prediction performance of survival trees and to guide splitting and pruning. A detailed description of the implemented methods can be found in Sun et al. (2019) <arXiv:1809.05627>.
R is a language and environment for statistical computing and graphics. It provides a variety of statistical techniques, such as linear and nonlinear modeling, classical statistical tests, time-series analysis, classification and clustering. It also provides robust support for producing publication-quality data plots. A large number of third-party packages are available, greatly increasing its breadth and scope.
The GNU Privacy Guard is a complete implementation of the OpenPGP standard. It is used to encrypt and sign data and communication. It features powerful key management and the ability to access public key servers. It includes several libraries: libassuan (IPC between GnuPG components), libgpg-error (centralized GnuPG error values), and libksba (working with X.509 certificates and CMS data).
This package provides a collection of Japanese text processing tools for filling Japanese iteration marks, Japanese character type conversions, segmentation by phrase, and text normalization which is based on rules for the Sudachi morphological analyzer and the NEologd (Neologism dictionary for MeCab). These features are specific to Japanese and are not implemented in ICU (International Components for Unicode).
This wrapper package for mgcv makes it easier to create high-performing Generalized Additive Models (GAMs). With its central function autogam(), by entering just a dataset and the name of the outcome column as inputs, AutoGAM tries to automate the procedure of configuring a highly accurate GAM which performs at reasonably high speed, even for large datasets.
This package provides a fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the a la carte embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.
Package to fit diffusion-based IRT models to response and response time data. Models are fit using marginal maximum likelihood. Parameter restrictions (fixed value and equality constraints) are possible. In addition, factor scores (person drift rate and person boundary separation) can be estimated. Model fit assessment tools are also available. The traditional diffusion model can be estimated as well.