This package provides a consistent set of functions for enriching and analyzing sovereign-level economic data. Economists, data scientists, and financial professionals can use the package to add standardized identifiers, demographic and macroeconomic indicators, and derived metrics such as gross domestic product per capita or government expenditure shares.
Stores small spatial datasets used to teach basic spatial analysis concepts. Datasets are based on the GeoDa software workbook and data site <https://geodacenter.github.io/data-and-lab/> developed by Luc Anselin and team at the University of Chicago. Datasets are stored as sf objects.
Given a high-dimensional dataset, typically a cytometry dataset, and a subset of its datapoints, this algorithm outputs a hyperrectangle such that the datapoints within the hyperrectangle best correspond to the specified subset. In essence, this allows the output of clustering algorithms to be converted into gating strategies.
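To make the idea of matching a hyperrectangle to a subset concrete, here is a minimal Python sketch. It is not the package's optimization procedure: it takes the naive axis-aligned bounding box of the target subset and scores it with an F1 measure. The function names (bounding_box, inside, f1_score) and the choice of F1 as the score are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Box = List[Tuple[float, float]]


def bounding_box(points: Sequence[Sequence[float]]) -> Box:
    """Smallest axis-aligned hyperrectangle containing all given points."""
    n_dims = len(points[0])
    return [
        (min(p[d] for p in points), max(p[d] for p in points))
        for d in range(n_dims)
    ]


def inside(point: Sequence[float], box: Box) -> bool:
    """True if the point lies within the box on every dimension."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, box))


def f1_score(data, is_target, box: Box) -> float:
    """F1 of 'inside the box' as a predictor of subset membership."""
    tp = sum(1 for p, t in zip(data, is_target) if t and inside(p, box))
    fp = sum(1 for p, t in zip(data, is_target) if not t and inside(p, box))
    fn = sum(1 for p, t in zip(data, is_target) if t and not inside(p, box))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: three target points plus two others, one of which falls in the box.
data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (1.5, 0.5)]
is_target = [True, True, True, False, False]
box = bounding_box([p for p, t in zip(data, is_target) if t])
score = f1_score(data, is_target, box)
```

A real gating procedure would typically shrink or expand the box boundaries to trade precision against recall; the bounding box above only maximizes recall.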
Quick indexation of any type of vector or of any combination of those. Indexation turns a vector into an integer vector going from 1 to the number of unique elements. Indexes are important building blocks for many algorithms. The method is described at <https://github.com/lrberge/indexthis/>.
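As an illustration of what indexation means, the following Python sketch maps each element of a vector to an integer between 1 and the number of unique elements. It is not the package's implementation: the names to_index and to_index_multi are hypothetical here, and numbering the values in order of first appearance is an assumption of this sketch.

```python
from typing import Dict, Hashable, Iterable, List


def to_index(values: Iterable[Hashable]) -> List[int]:
    """Map each value to an integer in 1..k, where k is the number of
    unique values. Values are numbered by order of first appearance."""
    seen: Dict[Hashable, int] = {}
    index: List[int] = []
    for value in values:
        if value not in seen:
            seen[value] = len(seen) + 1
        index.append(seen[value])
    return index


def to_index_multi(*vectors: Iterable[Hashable]) -> List[int]:
    """Index a combination of vectors by treating each row as one tuple."""
    return to_index(zip(*vectors))


single = to_index(["b", "a", "b", "c"])
combined = to_index_multi([1, 1, 2, 1], ["x", "y", "x", "x"])
```

The dictionary lookup keeps the sketch linear in the length of the input, which is the property that makes indexes useful as building blocks for grouping and joining.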
Compute several variations of the Implicit Association Test (IAT) scores, including the D scores (Greenwald, Nosek, Banaji, 2003, <doi:10.1037/0022-3514.85.2.197>) and the new scores that were developed using robust statistics (Richetin, Costantini, Perugini, and Schonbrodt, 2015, <doi:10.1371/journal.pone.0129601>).
Clustering or classification of longitudinal data based on a mixture of multivariate t or Gaussian distributions with a Cholesky-decomposed covariance structure. Details in McNicholas and Murphy (2010) <doi:10.1002/cjs.10047> and McNicholas and Subedi (2012) <doi:10.1016/j.jspi.2011.11.026>.
This package provides a variety of association tests for microbiome data analysis including Quasi-Conditional Association Tests (QCAT) described in Tang Z.-Z. et al. (2017) <doi:10.1093/bioinformatics/btw804> and Zero-Inflated Generalized Dirichlet Multinomial (ZIGDM) tests described in Tang Z.-Z. & Chen G. (2017, submitted).
This package provides a suite of functions to work with data from the National Institutes of Health Brain Development Cohorts Data Hub. The package provides tools to create, clean, process, and filter datasets and associated metadata. These utilities are intended to simplify reproducible data-preparation for future research.
Enables the user to perform the following: 1. Roll n dice (roll()). 2. Toss n coins (toss()). 3. Play the game of Rock, Paper, Scissors. 4. Choose n cards from a pack of 52 playing cards (Joker optional).
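The operations listed above can be sketched in Python as follows. This is an illustrative analogue, not the package's code: the names roll and toss are borrowed from the description, while draw_cards, the parameters, and the choice of adding two Jokers are assumptions.

```python
import random
from typing import List

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]


def roll(n: int = 1, sides: int = 6, rng=random) -> List[int]:
    """Roll n dice, each with the given number of sides."""
    return [rng.randint(1, sides) for _ in range(n)]


def toss(n: int = 1, rng=random) -> List[str]:
    """Toss n fair coins; 'H' is heads and 'T' is tails."""
    return [rng.choice(["H", "T"]) for _ in range(n)]


def draw_cards(n: int = 1, joker: bool = False, rng=random) -> List[str]:
    """Draw n distinct cards from a 52-card pack.
    Assumption: joker=True adds two Jokers to the pack."""
    deck = [f"{rank} of {suit}" for suit in SUITS for rank in RANKS]
    if joker:
        deck += ["Joker 1", "Joker 2"]
    return rng.sample(deck, n)


# A seeded generator makes the draws reproducible.
rng = random.Random(42)
dice = roll(3, rng=rng)
coins = toss(5, rng=rng)
hand = draw_cards(5, rng=rng)
```

Passing a seeded random.Random instance, rather than relying on the module-level generator, is a design choice that keeps simulations repeatable.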
Estimation methods for optimal treatment regimes under three different criteria, namely marginal quantile, marginal mean, and mean absolute difference. For the first two criteria, both one-stage and two-stage estimation methods are implemented. A doubly robust estimator for estimating the quantile-optimal treatment regime is also included.
Simulate age-structured populations that vary in space and time and explore the efficacy of a range of built-in or user-defined sampling protocols to reproduce the population parameters of the known population. (See Regular et al. (2020) <doi:10.1371/journal.pone.0232822> for more details).
Transformation of sea currents to connectivity data. Two files of horizontal and vertical current flows are transformed into connectivity data in the form of sfnetwork, shapefile, edge list and adjacency matrix. An application example is shown in Nagkoulis et al. (2025) <doi:10.1016/j.dib.2024.111268>.
This package implements named semaphores from the boost C++ library <https://www.boost.org/> for interprocess communication. Multiple R sessions on the same host can block (with optional timeout) on a semaphore until it becomes positive, then atomically decrement it and unblock. Any session can increment the semaphore.
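The blocking behaviour described above (wait with an optional timeout until the count is positive, then atomically decrement; any party may increment) can be illustrated with Python's standard threading.Semaphore. This is only an in-process stand-in chosen for a self-contained example: it is not a named, interprocess semaphore and it does not use the boost library.

```python
import threading

# Counting semaphore starting at zero, so the first acquire must block.
sem = threading.Semaphore(0)
results = []


def waiter(timeout_seconds: float) -> None:
    """Block until the semaphore is positive or the timeout expires.
    acquire() returns True on success and False on timeout."""
    acquired = sem.acquire(timeout=timeout_seconds)
    results.append(acquired)


worker = threading.Thread(target=waiter, args=(5.0,))
worker.start()

# Incrementing the semaphore unblocks the waiting thread, which
# atomically decrements the count back to zero.
sem.release()
worker.join()

# The count is zero again, so a short wait now times out.
timed_out = not sem.acquire(timeout=0.05)
```

With a named semaphore the waiter and the releaser would be separate processes that look up the same semaphore by name; the wait, decrement, and increment semantics are the same.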
Delta Method implementation to estimate standard errors with known asymptotic properties within the tidyverse workflow. The Delta Method is a statistical tool that approximates an estimator's behaviour using a Taylor Expansion. For a comprehensive explanation, please refer to Chapter 3 of van der Vaart (1998, ISBN: 9780511802256).
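As a small worked illustration of the first-order Delta Method, the Python sketch below approximates the standard error of g(sample mean) by |g'(mean)| * sqrt(s^2 / n), which follows from a first-order Taylor expansion of g around the mean. It covers only the univariate case with independent observations; the function name delta_method_se and the example data are hypothetical and are not part of the package.

```python
import math
import statistics
from typing import Callable, Sequence


def delta_method_se(sample: Sequence[float],
                    g_prime: Callable[[float], float]) -> float:
    """First-order Delta Method standard error of g(sample mean).

    Uses Var(g(mean)) ~ g'(mean)^2 * s^2 / n, where s^2 is the
    unbiased sample variance and n is the sample size."""
    n = len(sample)
    mean = statistics.fmean(sample)
    variance = statistics.variance(sample)
    return abs(g_prime(mean)) * math.sqrt(variance / n)


# Example: standard error of log(mean), where g(m) = log(m), g'(m) = 1/m.
sample = [2.0, 4.0, 6.0, 8.0]
se_log_mean = delta_method_se(sample, lambda m: 1.0 / m)
```

The derivative is supplied by the caller to keep the sketch dependency-free; a fuller implementation might obtain it by automatic or numerical differentiation.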
This package provides a unified R6-based interface for various machine learning models with automatic interface detection, consistent cross-validation, model interpretations via numerical derivatives, and visualization. Supports both regression and classification tasks with any model function that follows R's standard modeling conventions (formula or matrix interface).
Minirhizotrons are widely used to observe and explore roots and their growth. This package provides the means to stitch images and divide them into depth layers. Please note that this R package was developed alongside the following manuscript: Stitching root scans and extracting depth layer information -- a workflow and practical examples, S. Kersting, L. Knüver, and M. Fischer. The manuscript is currently in preparation and should be cited as soon as it is available. This project was supported by the project ArtIGROW, which is a part of the WIR!-Alliance ArtIFARM - Artificial Intelligence in Farming funded by the German Federal Ministry of Research, Technology and Space (No. 03WIR4805).
This package provides tools for large, sparse optimal matching of treated units and control units in observational studies. Provisions are made for refined covariate balance constraints, which include fine and near-fine balance as special cases. Matches are optimal in the sense that they are computed as solutions to network optimization problems rather than by greedy algorithms. See Pimentel et al. (2015) <doi:10.1080/01621459.2014.997879> and Pimentel (2016), Obs. Studies 2(1):4-23. The rrelaxiv package, which provides an alternative solver for the underlying network flow problems, carries an academic license and is not available on CRAN, but may be downloaded from GitHub at <https://github.com/josherrickson/rrelaxiv/>.
The analysis of different aspects of biodiversity requires specific algorithms. For example, in regionalisation analyses, the high frequency of ties and zero values in dissimilarity matrices produced by Beta-diversity turnover produces hierarchical cluster dendrograms whose topology and bootstrap supports are affected by the order of rows in the original matrix. Moreover, visualisation of biogeographical regionalisation can be facilitated by a combination of hierarchical clustering and multi-dimensional scaling. The recluster package provides robust techniques to visualise and analyse patterns of biodiversity and to improve occurrence data for cryptic taxa. Other functions related to recluster (e.g. the biodecrypt family) are currently available on GitHub at <https://github.com/leondap/recluster>.
This package provides software and data for the book "An Introduction to the Bootstrap" by B. Efron and R. Tibshirani, 1993, Chapman and Hall. This package is primarily provided for projects already based on it, and for support of the book. New projects should preferentially use the recommended package "boot".
This library implements unicode-casemap, the simple, non-locale-sensitive Unicode collation algorithm described in RFC 5051. Proper Unicode collation can be done using text-icu, but that is a big dependency that depends on a large C library, and rfc5051 might be better for some purposes.
The ptools (power tools) library extends Ruby's core File class with many additional methods modelled after common POSIX tools, such as File.which for finding executables, File.tail to print the last lines of a file, File.wc to count words, and so on.
An implementation of the American Society for Testing and Materials (ASTM) Standard E691 for interlaboratory testing procedures, designed for cross-platform genomic measurements. Given three (3) or more genomic platforms or laboratory protocols, this package provides interlaboratory testing procedures giving per-locus comparisons for sensitivity and precision between platforms.
Automatically do statistical exploration. Create formulas using tidyselect syntax, and then determine cross-validated model accuracy and variable contributions using glm and xgboost. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.
Filter CpGs based on Intra-class Correlation Coefficients (ICCs) when replicates are available. ICCs are calculated by fitting linear mixed effects models to all samples including the un-replicated samples. Including the large number of un-replicated samples improves ICC estimates dramatically. The method accommodates any replicate design.