Statistical framework for comparing sets of trees using hypothesis testing methods. Designed for transmission trees, phylogenetic trees, and directed acyclic graphs (DAGs), the package implements chi-squared tests to compare edge frequencies between sets and PERMANOVA to analyse topological dissimilarities with customisable distance metrics, following Anderson (2001) <doi:10.1111/j.1442-9993.2001.01070.pp.x>.
Software to aid in modeling and analyzing mass-spectrometry-based proteome melting data. Quantitative data is imported and normalized and thermal behavior is modeled at the protein level. Methods exist for normalization, modeling, visualization, and export of results. For a general introduction to MS-based thermal profiling, see Savitski et al. (2014) <doi:10.1126/science.1255784>.
The mFilter package implements several time series filters useful for smoothing and extracting trend and cyclical components of a time series. The routines are commonly used in economics and finance, however they should also be interest to other areas. Currently, Christiano-Fitzgerald, Baxter-King, Hodrick-Prescott, Butterworth, and trigonometric regression filters are included in the package.
This package performs network meta-analysis using integrated nested Laplace approximations ('INLA') which is described in Guenhan, Held, and Friede (2018) <doi:10.1002/jrsm.1285>. Includes methods to assess the heterogeneity and inconsistency in the network. Contains more than ten different network meta-analysis dataset. INLA package can be obtained from <https://www.r-inla.org>.
Uniform random samples from simple manifolds, sometimes with noise, are commonly used to test topological data analytic (TDA) tools. This package includes samplers powered by two techniques: analytic volume-preserving parameterizations, as employed by Arvo (1995) <doi:10.1145/218380.218500>, and rejection sampling, as employed by Diaconis, Holmes, and Shahshahani (2013) <doi:10.1214/12-IMSCOLL1006>.
Plots ternary diagrams (simplex plots / Gibbs triangles) and Holdridge life zone plots <doi:10.1126/science.105.2727.367> using the standard graphics functions. Allows custom annotation, interpolating, contouring and scaling of plotting region. Includes a Shiny user interface for point-and-click ternary plotting. An alternative to ggtern', which uses the ggplot2 family of plotting functions.
This package helps with quality checks, visualizations and analysis of mass spectrometry data, coming from proteomics experiments. The package is developed, tested and used at the Functional Genomics Center Zurich, where it is used mainly for prototyping, teaching, and having fun with proteomics data. But it can also be used to do data analysis for small scale data sets.
This package provides various themes, palettes, and other functions that are used to customise ggplots to look like they were made in GraphPad Prism. The Prism-look is achieved with theme_prism() and scale_fill|colour_prism(), axes can be changed with custom guides like guide_prism_minor(), and significance indicators added with add_pvalue().
R is a language and environment for statistical computing and graphics. It provides a variety of statistical techniques, such as linear and nonlinear modeling, classical statistical tests, time-series analysis, classification and clustering. It also provides robust support for producing publication-quality data plots. A large amount of 3rd-party packages are available, greatly increasing its breadth and scope.
COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets based on prior knowledge of signaling, metabolic, and gene regulatory networks. It estimated the activities of transcrption factors and kinases and finds a network-level causal reasoning. Thereby, COSMOS provides mechanistic hypotheses for experimental observations across mulit-omics datasets.
Crumblr enables analysis of count ratio data using precision weighted linear (mixed) models. It uses an asymptotic normal approximation of the variance following the centered log ration transform (CLR) that is widely used in compositional data analysis. Crumblr provides a fast, flexible alternative to GLMs and GLMM's while retaining high power and controlling the false positive rate.
High-throughput sequencing experiments followed by differential expression analysis is a widely used approach to detect genomic biomarkers. A fundamental step in differential expression analysis is to model the association between gene counts and covariates of interest. NBAMSeq a flexible statistical model based on the generalized additive model and allows for information sharing across genes in variance estimation.
Filtering of lowly expressed features (e.g. genes) is a common step before performing statistical analysis, but an arbitrary threshold is generally chosen. SeqGate implements a method that rationalize this step by the analysis of the distibution of counts in replicate samples. The gate is the threshold above which sequenced features can be considered as confidently quantified.
This package provides functions for Arps decline-curve analysis on oil and gas data. Includes exponential, hyperbolic, harmonic, and hyperbolic-to-exponential models as well as the preceding with initial curtailment or a period of linear rate buildup. Functions included for computing rate, cumulative production, instantaneous decline, EUR, time to economic limit, and performing least-squares best fits.
This package provides a framework for the replicable removal of personally identifiable data (PID) in data sets. The package implements a suite of methods to suit different data types based on the suggestions of Garfinkel (2015) <doi:10.6028/NIST.IR.8053> and the ICO "Guidelines on Anonymization" (2012) <https://ico.org.uk/media/1061/anonymisation-code.pdf>.
This package provides tools and methods to apply the model Geospatial Regression Equation for European Nutrient losses (GREEN); Grizzetti et al. (2005) <doi:10.1016/j.jhydrol.2004.07.036>; Grizzetti et al. (2008); Grizzetti et al. (2012) <doi:10.1111/j.1365-2486.2011.02576.x>; Grizzetti et al. (2021) <doi:10.1016/j.gloenvcha.2021.102281>.
Develops a General Equilibrium (GE) Model, which estimates key variables such as wages, the number of residents and workers, the prices of the floor space, and its distribution between commercial and residential use, as in Ahlfeldt et al., (2015) <doi:10.3982/ECTA10876>. By doing so, the model allows understanding the economic influence of different urban policies.
This package provides a comprehensive R interface to access data from the Kraken cryptocurrency exchange REST API <https://docs.kraken.com/api/>. It allows users to retrieve various market data, such as asset information, trading pairs, and price data. The package is designed to facilitate efficient data access for analysis, strategy development, and monitoring of cryptocurrency market trends.
Four measures of linkage disequilibrium are provided: the usual r^2 measure, the r^2_S measure (r^2 corrected by the structure sample), the r^2_V (r^2 corrected by the relatedness of genotyped individuals), the r^2_VS measure (r^2 corrected by both the relatedness of genotyped individuals and the structure of the sample).
This package provides functions to fit finite mixture of scale mixture of skew-normal (FM-SMSN) distributions, details in Prates, Lachos and Cabral (2013) <doi: 10.18637/jss.v054.i12>, Cabral, Lachos and Prates (2012) <doi:10.1016/j.csda.2011.06.026> and Basso, Lachos, Cabral and Ghosh (2010) <doi:10.1016/j.csda.2009.09.031>.
An R wrapper for pulling data from the National Public Transport Access Nodes ('NaPTAN') API (<https://www.api.gov.uk/dft/national-public-transport-access-nodes-naptan-api/#national-public-transport-access-nodes-naptan-api>). This allows users to download NaPTAN transport information, for the full dataset, by ATCO region code, or by name of region.
Computes effective population size (Ne) and the Ne/N ratio for stage-structured populations using the matrix population model framework of Yonezawa (2000) <doi:10.1111/j.0014-3820.2000.tb01244.x>. Functions are provided for sexually reproducing, clonally reproducing, and mixed (sexual + clonal) populations. Includes sensitivity and elasticity analyses for Ne/N with respect to vital rates.
An implementation of Simultaneous Truth and Performance Level Estimation (STAPLE) <doi:10.1109/TMI.2004.828354>. This method is used when there are multiple raters for an object, typically an image, and this method fuses these ratings into one rating. It uses an expectation-maximization method to estimate this rating and the individual specificity/sensitivity for each rater.
This dataset was collected using a new four-arm within-study comparison design. The study aimed to examine the impact of a mathematics training intervention and a vocabulary study session on post-test scores in mathematics and vocabulary, respectively. The innovative four-arm within-study comparison design facilitates both experimental and quasi-experimental identification of average causal effects.