Statistical analysis methods for environmental data are implemented. There is a particular focus on robust methods, and on methods for compositional data. In addition, larger data sets from geochemistry are provided. The statistical methods are described in Reimann, Filzmoser, Garrett, Dutter (2008, ISBN:978-0-470-98581-6).
Helper functions to easily add functionality to functions. The package can give functions lazy evaluation, allowing you to save and update their arguments before and after each function call. You can also set a temporary working directory within functions and wrap console messages around other functions.
Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.
Harman is a PCA and constrained optimisation based technique that maximises the removal of batch effects from datasets, with the constraint that the probability of overcorrection (i.e. removing genuine biological signal along with batch noise) is kept to a fraction which is set by the end-user.
Human Phenotype Ontology (HPO) was developed to create a consistent description of gene products with disease perspectives, and is essential for supporting functional genomics in a disease context. Accurate disease descriptions can reveal new relationships between genes and disease, and new functions for previously uncharacterized genes and alleles.
This package performs automatic or interactive quality control on FCS data acquired using flow cytometry instruments. By evaluating three different properties (flow rate, signal acquisition, and dynamic range), the quality control enables the detection and removal of anomalies.
Sleuth is a program for differential analysis of RNA-Seq data. It makes use of quantification uncertainty estimates obtained via Kallisto for accurate differential analysis of isoforms or genes, allows testing in the context of experiments with complex designs, and supports interactive exploratory data analysis via sleuth live.
This package provides a set of tools for inspecting and understanding R data structures inspired by str(). It includes ast() for visualizing abstract syntax trees, ref() for showing shared references, cst() for showing call stack trees, and obj_size() for computing object sizes.
Colored terminal output on terminals that support ANSI color and highlight codes. It also works in Emacs ESS. ANSI color support is automatically detected. Colors and highlighting can be combined and nested. New styles can also be created easily. This package was inspired by the "chalk" JavaScript project.
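To illustrate how combined ANSI styles work in general, here is a minimal Python sketch that wraps text in standard SGR escape codes. The `style` helper and the `CODES` table are hypothetical names chosen for this example; they are not the package's API, and the sketch performs no terminal capability detection.

```python
# Standard ANSI SGR codes; the table and helper names are illustrative only.
CODES = {"bold": 1, "underline": 4, "red": 31, "green": 32, "blue": 34}
RESET = "\x1b[0m"


def style(text, *names):
    """Wrap text in the escape sequences for each named style, then reset."""
    prefix = "".join(f"\x1b[{CODES[name]}m" for name in names)
    return f"{prefix}{text}{RESET}"


if __name__ == "__main__":
    # Combine a color with a highlight style.
    print(style("warning", "red", "bold"))
```

A real implementation would also check whether the output stream supports ANSI codes before emitting them.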
The ggplot2 package is an excellent and flexible package for elegant data visualization in R. However, the default plots require some formatting before they can be submitted for publication. The ggpubr package provides easy-to-use functions for creating and customizing ggplot2-based publication-ready plots.
This package enables the translation of ggplot2 graphs to an interactive web-based version and/or the creation of custom web-based visualizations directly from R. Once uploaded to a plotly account, plotly graphs (and the data behind them) can be viewed and modified in a web browser.
This package provides efficient tools to compute the proximity between rows or columns of large matrices. Functions are optimised for large sparse matrices using the Armadillo and Intel TBB libraries. Among several built-in similarity/distance measures, computation of correlation, cosine similarity and Euclidean distance is particularly fast.
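As a rough illustration of why sparse storage helps with these measures, the following Python sketch computes cosine similarity and Euclidean distance between two sparse rows. It assumes each row is stored as a dict mapping column index to nonzero value; that representation and the function names are assumptions made for this example, not the package's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rows given as {column: value} dicts."""
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero rows
    # Iterate over the shorter row so only shared nonzero columns cost work.
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    dot = sum(v * long_.get(k, 0.0) for k, v in short.items())
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance; only columns nonzero in either row contribute."""
    keys = a.keys() | b.keys()
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))
```

Both functions touch only the stored nonzero entries, so the cost scales with the number of nonzeros rather than with the full row length.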
ZeroMQ is a well-known library for high-performance asynchronous messaging in scalable, distributed applications. This package provides high level R wrapper functions to easily utilize ZeroMQ. The main focus is on interactive client/server programming frameworks. A few wrapper functions compatible with rzmq are also provided.
Validates estimates of (conditional) average treatment effects obtained using observational data by a) making it easy to obtain and visualize estimates derived using a large variety of methods (G-computation, inverse propensity score weighting, etc.), and b) ensuring that estimates are easily compared to a gold standard (i.e., estimates derived from randomized controlled trials). RCTrep offers a generic protocol for treatment effect validation based on four simple steps, namely, set-selection, estimation, diagnosis, and validation. RCTrep provides a simple dashboard to review the obtained results. The validation approach is introduced by Shen, L., Geleijnse, G. and Kaptein, M. (2023) <doi:10.21203/rs.3.rs-2559287/v2>.
This package provides tools for robust regression model fitting using the RANSAC (Random Sample Consensus) algorithm. RANSAC is an iterative method to estimate parameters of a model from a dataset that contains outliers. This package allows fitting both linear (lm) and nonlinear (nls) models using RANSAC, helping users obtain more reliable models in the presence of noisy or corrupted data. The methods are particularly useful in contexts where traditional least squares regression fails due to the influence of outliers. Implementations include support for performance metrics such as RMSE, MAE, and R² based on the inlier subset. For further details, see Fischler and Bolles (1981) <doi:10.1145/358669.358692>.
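The RANSAC idea described above can be sketched for a straight-line model as follows. This is a minimal Python illustration under stated assumptions, not the package's code: the function names, the two-point minimal sample, the absolute-residual inlier test, and the default `n_iter` and `threshold` values are all choices made for this example.

```python
import math
import random


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        return None  # all x equal: slope is undefined
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, n_iter=200, threshold=1.0, seed=0):
    """Return (slope, intercept, inliers, rmse) from a basic RANSAC loop."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        # 1. Fit a candidate model to a random minimal sample (two points).
        model = fit_line(rng.sample(points, 2))
        if model is None:
            continue
        slope, intercept = model
        # 2. Collect the points whose residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        # 3. Keep the largest consensus set seen so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    # 4. Refit on the consensus set and report RMSE over the inliers only.
    slope, intercept = fit_line(best_inliers)
    sq_err = sum((y - (slope * x + intercept)) ** 2 for x, y in best_inliers)
    rmse = math.sqrt(sq_err / len(best_inliers))
    return slope, intercept, best_inliers, rmse
```

Computing the error metric on the inlier subset mirrors the description above; the threshold controls how far a point may lie from the candidate line and still count as an inlier.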
This package provides functions for estimating the attributable burden of disease due to risk factors. The posterior simulation is performed using arm::sim as described in Gelman, Hill (2012) <doi:10.1017/CBO9780511790942> and the attributable burden method is based on Nielsen, Krause, Molbak <doi:10.1111/irv.12564>.
Obtain network structures from animal GPS telemetry observations and statistically analyse them to assess their adequacy for social network analysis. Methods include pre-network data permutations, bootstrapping techniques to obtain confidence intervals for global and node-level network metrics, and correlation and regression analysis of the local network metrics.
Implementation of two-dimensional (2D) correlation analysis based on the Fourier-transformation approach described by Isao Noda (I. Noda (1993) <DOI:10.1366/0003702934067694>). Additionally, there are two plot functions for the resulting correlation matrix: the first one creates colored 2D plots, while the second one generates 3D plots.
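For orientation, the synchronous and asynchronous correlation matrices of generalized 2D correlation analysis can be sketched in a few lines of Python. Note the swap: this sketch uses the Hilbert-Noda transformation matrix form rather than the Fourier-transformation approach the package implements, and the function name and list-of-lists input layout are assumptions made for this example.

```python
import math


def two_d_correlation(spectra):
    """Return (synchronous, asynchronous) matrices for m spectra of length n.

    spectra: list of m rows (one per perturbation step), each with n values.
    """
    m, n = len(spectra), len(spectra[0])
    # Dynamic spectra: subtract the mean spectrum from every row.
    means = [sum(row[c] for row in spectra) / m for c in range(n)]
    dyn = [[row[c] - means[c] for c in range(n)] for row in spectra]
    # Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere.
    hn = [[0.0 if j == k else 1.0 / (math.pi * (k - j)) for k in range(m)]
          for j in range(m)]
    # Apply the Hilbert-Noda matrix along the perturbation axis.
    trans = [[sum(hn[j][k] * dyn[k][c] for k in range(m)) for c in range(n)]
             for j in range(m)]
    sync = [[sum(dyn[t][a] * dyn[t][b] for t in range(m)) / (m - 1)
             for b in range(n)] for a in range(n)]
    asyn = [[sum(dyn[t][a] * trans[t][b] for t in range(m)) / (m - 1)
             for b in range(n)] for a in range(n)]
    return sync, asyn
```

The synchronous matrix is symmetric and the asynchronous matrix is antisymmetric, which is a quick sanity check on any implementation.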
This package provides equations commonly used in clinical pharmacokinetics and clinical pharmacology, such as equations for dose individualization, compartmental pharmacokinetics, drug exposure, anthropometric calculations, clinical chemistry, and conversion of common clinical parameters. Where possible and relevant, it provides multiple published and peer-reviewed equations within the respective R function.
Feed longitudinal data into a Bayesian Latent Factor Model to obtain a low-rank representation. Parameters are estimated using a Hamiltonian Monte Carlo algorithm with STAN. See G. Weinrott, B. Fontez, N. Hilgert and S. Holmes, "Bayesian Latent Factor Model for Functional Data Analysis", Actes des JdS 2016.
Estimates RxC (R by C) vote transfer matrices (ecological contingency tables) from aggregate data by simultaneously minimizing Euclidean row-standardized unit-to-global distances. Acknowledgements: The authors wish to thank Generalitat Valenciana, Consellería de Educación, Cultura, Universidades y Empleo (grant CIAICO/2023/031) for supporting this research.
This package provides tools for modelling electric vehicle charging sessions into generic groups with similar connection patterns called "user profiles", using Gaussian Mixture Models clustering. The clustering and profiling methodology is described in Cañigueral and Meléndez (2021, ISSN:0142-0615) <doi:10.1016/j.ijepes.2021.107195>.
Given the location of your students' submissions and a test file, the function runs each .R file and evaluates the results from all the given tests. Results are returned in a data frame that has a row for each student and a column for each test.
The jscore() function in the package calculates the J-Score metric between two clustering assignments. The score is designed to address some problems with existing common metrics, such as the problem of matching. The details of the J-Score are described in Ahmadinejad and Liu (2021) <arXiv:2109.01306>.