This package provides a variety of association tests for microbiome data analysis, including Quasi-Conditional Association Tests (QCAT) described in Tang Z.-Z. et al. (2017) <doi:10.1093/bioinformatics/btw804> and Zero-Inflated Generalized Dirichlet Multinomial (ZIGDM) tests described in Tang Z.-Z. & Chen G. (2017, submitted).
Enables the user to perform the following: 1. Roll n dice (roll()). 2. Toss n coins (toss()). 3. Play the game of Rock, Paper, Scissors. 4. Choose n cards from a pack of 52 playing cards (Joker optional).
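The three random draws listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the package's implementation: the Python functions `roll` and `toss` merely borrow the names mentioned in the description, and `draw_cards`, the card labels, and the `rng` parameter are assumptions made for this example.

```python
import random

SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

def roll(n, rng=random):
    """Roll n six-sided dice and return the faces as a list."""
    return [rng.randint(1, 6) for _ in range(n)]

def toss(n, rng=random):
    """Toss n coins and return 'H' or 'T' for each."""
    return [rng.choice(["H", "T"]) for _ in range(n)]

def draw_cards(n, joker=False, rng=random):
    """Draw n distinct cards from a 52-card pack (53 if a Joker is included)."""
    deck = [f"{rank} of {suit}" for suit in SUITS for rank in RANKS]
    if joker:
        deck.append("Joker")  # hypothetical label for the optional Joker
    return rng.sample(deck, n)  # sampling without replacement

# A seeded generator makes the draws reproducible.
rng = random.Random(42)
dice = roll(3, rng)
coins = toss(4, rng)
hand = draw_cards(5, joker=True, rng=rng)
```

Passing a seeded `random.Random` instance keeps the sketch reproducible without touching the global random state.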
Estimation methods for optimal treatment regimes under three different criteria, namely marginal quantile, marginal mean, and mean absolute difference. For the first two criteria, both one-stage and two-stage estimation methods are implemented. A doubly robust estimator for estimating the quantile-optimal treatment regime is also included.
This package implements named semaphores from the Boost C++ library <https://www.boost.org/> for interprocess communication. Multiple R sessions on the same host can block (with optional timeout) on a semaphore until it becomes positive, then atomically decrement it and unblock. Any session can increment the semaphore.
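The blocking behaviour described above (wait with an optional timeout until the count is positive, then decrement) can be illustrated with Python's standard library. This is an in-process analogue only: `threading.Semaphore` is not a named, cross-process semaphore, and the helper names `wait_then_decrement` and `increment` are assumptions made for this sketch, not the package's API.

```python
import threading

# Starts at zero, so any waiter blocks until someone increments it.
sem = threading.Semaphore(0)

def wait_then_decrement(timeout_s):
    """Block until the semaphore is positive or the timeout expires.

    Returns True if the semaphore was decremented, False on timeout.
    """
    return sem.acquire(timeout=timeout_s)

def increment():
    """Raise the semaphore by one, releasing at most one waiter."""
    sem.release()

# Nobody has incremented yet, so this wait times out.
timed_out_result = wait_then_decrement(0.05)

# A second thread waits while the main thread increments.
results = []
waiter = threading.Thread(target=lambda: results.append(wait_then_decrement(2.0)))
waiter.start()
increment()
waiter.join()
```

The waiter consumes the single increment, so the count is back at zero afterwards and a further non-blocking acquire would fail.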
Transformation of sea currents to connectivity data. Two files of horizontal and vertical current flows are transformed into connectivity data in the form of sfnetwork, shapefile, edge list and adjacency matrix. An application example is shown in Nagkoulis et al. (2025) <doi:10.1016/j.dib.2024.111268>.
Simulate age-structured populations that vary in space and time and explore the efficacy of a range of built-in or user-defined sampling protocols to reproduce the population parameters of the known population. (See Regular et al. (2020) <doi:10.1371/journal.pone.0232822> for more details).
Delta Method implementation to estimate standard errors with known asymptotic properties within the tidyverse workflow. The Delta Method is a statistical tool that approximates an estimator's behaviour using a Taylor Expansion. For a comprehensive explanation, please refer to Chapter 3 of van der Vaart (1998, ISBN: 9780511802256).
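As a minimal sketch of the first-order Delta Method in the univariate case, the standard error of g(estimate) is approximated by |g'(estimate)| times the standard error of the estimate. The function name `delta_method_se`, the central-difference derivative, and the sample values are assumptions made for this illustration; they are not taken from the package.

```python
import math
import statistics

def delta_method_se(g, estimate, se_estimate, h=1e-6):
    """First-order delta-method standard error of g(estimate).

    Uses a central finite difference for g'(estimate); an analytic
    derivative would work equally well.
    """
    derivative = (g(estimate + h) - g(estimate - h)) / (2.0 * h)
    return abs(derivative) * se_estimate

# Illustrative data: standard error of the log of a sample mean.
sample = [2.1, 2.9, 3.4, 2.7, 3.1, 2.5, 3.0, 2.8]
mean = statistics.mean(sample)
se_mean = statistics.stdev(sample) / math.sqrt(len(sample))

se_log_mean = delta_method_se(math.log, mean, se_mean)
```

For g = log the derivative is 1/mean, so the result should agree with `se_mean / mean` up to the finite-difference error.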
In this package, a Hidden Semi-Markov Model (HSMM) and a homogeneous segmentation model are designed and implemented for segmenting genomic data, with the aim of assisting in transcript detection using high-throughput technologies such as RNA-seq or tiling arrays, and in copy number analysis using aCGH or sequencing.
This package provides tools to create pretty tables for HTML documents and other formats. Functions are provided to let users create tables and modify and format their content. It extends the officer package and can be used within R markdown documents when rendering to HTML and to Word documents.
This package contains a number of common astronomy conversion routines, particularly for the HMS and degrees schemes, which can be fiddly to convert between en masse due to the textual nature of the former. It allows users to coordinate match datasets quickly. It also contains functions for various cosmological calculations.
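The HMS-to-degrees conversion mentioned above follows from one hour of right ascension equalling 15 degrees. The sketch below assumes colon-separated strings such as "12:30:00"; the function names and the separator default are choices made for this example, not the package's interface.

```python
def hms_to_degrees(hms, sep=":"):
    """Convert an 'H:M:S' right-ascension string to decimal degrees."""
    hours, minutes, seconds = (float(part) for part in hms.split(sep))
    # 24 hours span 360 degrees, so one hour is 15 degrees.
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)

def hms_list_to_degrees(values, sep=":"):
    """Convert many HMS strings at once."""
    return [hms_to_degrees(value, sep) for value in values]

degrees = hms_list_to_degrees(["00:00:00", "06:00:00", "12:30:00"])
```

Because the input is text, the parsing step is where most of the fiddliness lies; the arithmetic itself is a single weighted sum.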
spacetime provides classes and methods for spatio-temporal data, including space-time regular lattices, sparse lattices, irregular data, and trajectories; utility functions for plotting data as map sequences (lattice or animation) or multiple time series; methods for spatial and temporal matching or aggregation, retrieving coordinates, print, summary, etc.
This package provides plotting functions for posterior analysis, model checking, and MCMC diagnostics. The package is designed not only to provide convenient functionality for users, but also a common set of functions that can be easily used by developers working on a variety of R packages for Bayesian modeling.
This Python module enables remote procedure calls, clustering, and distributed computing. For this purpose, it makes use of object-proxying, a technique that employs Python's dynamic nature to overcome the physical boundaries between processes and computers, so that remote objects can be manipulated as if they were local.
Constraint optimization, or constraint programming, is the name given to identifying feasible solutions out of a very large set of candidates, where the problem can be modeled in terms of arbitrary constraints. MiniZinc is a free and open-source constraint modeling language. Constraint satisfaction and discrete optimization problems can be formulated in a high-level modeling language. Models are compiled into an intermediate representation that is understood by a wide range of solvers. MiniZinc itself provides several solvers, for instance GeCode. R users can use the package to solve constraint programming problems without using MiniZinc directly, modify existing MiniZinc models and also create their own models.
The analysis of different aspects of biodiversity requires specific algorithms. For example, in regionalisation analyses, the high frequency of ties and zero values in dissimilarity matrices produced by Beta-diversity turnover produces hierarchical cluster dendrograms whose topology and bootstrap supports are affected by the order of rows in the original matrix. Moreover, visualisation of biogeographical regionalisation can be facilitated by a combination of hierarchical clustering and multi-dimensional scaling. The recluster package provides robust techniques to visualise and analyse patterns of biodiversity and to improve occurrence data for cryptic taxa. Other functions related to recluster (e.g. the biodecrypt family) are currently available on GitHub at <https://github.com/leondap/recluster>.
This package provides tools for large, sparse optimal matching of treated units and control units in observational studies. Provisions are made for refined covariate balance constraints, which include fine and near-fine balance as special cases. Matches are optimal in the sense that they are computed as solutions to network optimization problems rather than by greedy algorithms. See Pimentel et al. (2015) <doi:10.1080/01621459.2014.997879> and Pimentel (2016), Obs. Studies 2(1):4-23. The rrelaxiv package, which provides an alternative solver for the underlying network flow problems, carries an academic license and is not available on CRAN, but may be downloaded from GitHub at <https://github.com/josherrickson/rrelaxiv/>.
Automatically do statistical exploration. Create formulas using tidyselect syntax, and then determine cross-validated model accuracy and variable contributions using glm and xgboost. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.
This package provides a collection of simple simulation datasets designed for generating Nonlinear Dimension Reduction representations with techniques such as t-distributed Stochastic Neighbor Embedding and Uniform Manifold Approximation and Projection. These datasets serve as a valuable resource for understanding the reliability of Nonlinear Dimension Reduction representations in various contexts.
Calculate the distance between single-arm observational studies using covariate information to remove heterogeneity in Network Meta-Analysis (NMA) of randomized clinical trials. Facilitate the inclusion of observational data in NMA, enhancing the comprehensiveness and robustness of comparative effectiveness research. See Schmitz (2018) <doi:10.1186/s12874-018-0509-7>.
Filter CpGs based on Intra-class Correlation Coefficients (ICCs) when replicates are available. ICCs are calculated by fitting linear mixed effects models to all samples including the un-replicated samples. Including the large number of un-replicated samples improves ICC estimates dramatically. The method accommodates any replicate design.
Data manipulation for Coupled Model Intercomparison Project, Phase-6 (CMIP6) hydroclimatic data. The files are archived in the Federated Research Data Repository (FRDR) (Rajulapati et al., 2024, <doi:10.20383/103.0829>). The data set is described in Abdelmoaty et al. (2025, <doi:10.1038/s41597-025-04396-z>).
Model-based methods for the detection of disease clusters using GLMs, GLMMs and zero-inflated models. These methods are described in V. Gómez-Rubio et al. (2019) <doi:10.18637/jss.v090.i14> and V. Gómez-Rubio et al. (2018) <doi:10.1007/978-3-030-01584-8_1>.
This package creates discretised versions of continuous distribution functions by mapping continuous values to an underlying discrete grid, based on a (uniform) frequency of discretisation, a valid discretisation point, and an integration range. For a review of discretisation methods, see Chakraborty (2015) <doi:10.1186/s40488-015-0028-6>.
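One way to picture the discretisation described above is to place grid points at a uniform spacing around a chosen anchor point and give each grid point the probability mass of the cell around it. The sketch below is an illustration under stated assumptions, not the package's algorithm: it assumes each cell is centred on its grid point, uses a normal distribution as the example, and the names `discretise`, `step`, `anchor`, `lower`, and `upper` are hypothetical.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def discretise(cdf, step, anchor, lower, upper):
    """Map a continuous distribution onto the grid anchor + k * step.

    Each grid point inside [lower, upper] receives the mass of the cell
    [x - step/2, x + step/2) (an assumption made for this sketch).
    """
    first = math.ceil((lower - anchor) / step)
    last = math.floor((upper - anchor) / step)
    pmf = {}
    for k in range(first, last + 1):
        x = anchor + k * step
        pmf[x] = cdf(x + step / 2.0) - cdf(x - step / 2.0)
    return pmf

# Standard normal on a grid of width 0.5 anchored at 0, between -4 and 4.
pmf = discretise(normal_cdf, step=0.5, anchor=0.0, lower=-4.0, upper=4.0)
total_mass = sum(pmf.values())
```

The masses sum to slightly less than one because the tails beyond the outermost cells are dropped; whether to renormalise or fold them into the end cells is a design choice left open here.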
This package implements the Edwards (1997) <doi:10.1002/j.1551-8833.1997.tb08229.x> Langmuir-based semi-empirical coagulation model, which predicts the concentration of organic carbon remaining in water after treatment with an Al- or Fe-based coagulant. Data and methods are provided to optimise empirical coefficients.