Collection of functions for fitting and interpreting distributed lag interaction models (DLIM). A DLIM regresses a scalar outcome on repeated measures of exposure and allows for modification by a continuous variable. Includes a dlim() function for fitting, predict() function for inference, and plotting functions for visualization. Details on methodology are described in Demateis et al. (2024) <doi:10.1002/env.2843>.
Computes empirical Bayes confidence estimators and confidence intervals in a normal means model. The intervals are robust in the sense that they achieve correct coverage regardless of the distribution of the means. If the means are treated as fixed, the intervals have an average coverage guarantee. The implementation is based on Armstrong, Kolesár and Plagborg-Møller (2020) <arXiv:2004.03448>.
This package provides a tool which allows users to create and evaluate ensembles of species distribution model (SDM) predictions. Functionality is offered through R functions or a GUI (R Shiny app). This tool can assist users in identifying spatial uncertainties and making informed conservation and management decisions. The package is further described in Woodman et al (2019) <doi:10.1111/2041-210X.13283>.
This package implements the statistic FAVA, an Fst-based Assessment of Variability across vectors of relative Abundances, as well as a suite of helper functions which enable the visualization and statistical analysis of relative abundance data. The FAVA R package accompanies the paper, â Quantifying compositional variability in microbial communities with FAVAâ by Morrison, Xue, and Rosenberg (2025) <doi:10.1073/pnas.2413211122>.
This package provides a statistical hypothesis test for conditional independence. Given residuals from a sufficiently powerful regression, it tests whether the covariance of the residuals is vanishing. It can be applied to both discretely-observed functional data and multivariate data. Details of the method can be found in Anton Rask Lundborg, Rajen D. Shah and Jonas Peters (2022) <doi:10.1111/rssb.12544>.
Addresses the log of zero by developing a new family of estimators called iterated Ordinary Least Squares. This family nests standard approaches such as log-linear and Poisson regressions, offers several computational advantages, and corresponds to the correct way to perform the popular log(Y + 1) transformation. For more details about how to use it, see the notebook at: <https://www.davidbenatia.com/>.
This package implements the efficient algorithm by Ortmann and Brandes (2017) <doi:10.1007/s41109-017-0027-2> to compute the orbit-aware frequency distribution of induced and non-induced quads, i.e. subgraphs of size four. Given an edge matrix, data frame, or a graph object (e.g., igraph'), the orbit-aware counts are computed respective each of the edges and nodes.
This package provides tools for performing disproportionality analysis using the information component, proportional reporting rate and the reporting odds ratio. The anticipated use is passing data to the da() function, which executes the disproportionality analysis. See Norén et al (2011) <doi:10.1177/0962280211403604> and Montastruc et al (2011) <doi:10.1111/j.1365-2125.2011.04037.x> for further details.
This package provides a wrapper around the generic coordinate transformation software PROJ that transforms coordinates from one coordinate reference system ('CRS') to another. This includes cartographic projections as well as geodetic transformations. The intention is for this package to be used by user-packages such as reproj', and that the older PROJ.4 and version 5 pathways be provided by the proj4 package.
This package provides tooling to group dates by a variety of periods including: yearly, monthly, by second, by week of the month, and more. The groups are defined in such a way that they also represent the distance between dates in terms of the period. This extracts valuable information that can be used in further calculations that rely on a specific temporal spacing between observations.
Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. This R package provides an extensible computational environment for creating and analyzing cluster ensembles, with basic data structures for representing partitions and hierarchies, and facilities for computing on them, including methods for measuring proximity and obtaining consensus and secondary clusterings.
Duplicated publication data (pre-processed and formatted) for entity resolution. This data set contains a total of 1879 records. The following variables are included in the data set: id, title, book title, authors, address, date, year, editor, journal, volume, pages, publisher, institution, type, tech, note. The data set has a respective gold data set that provides information on which records match based on id.
Google's Compact Language Detector 3 is a neural network model for language identification and the successor of cld2 (available from CRAN). The algorithm is still experimental and takes a novel approach to language detection with different properties and outcomes. It can be useful to combine this with the Bayesian classifier results from cld2'. See <https://github.com/google/cld3#readme> for more information.
Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134).
This package provides a robust identification of differential binding sites method for analyzing ChIP-seq (Chromatin Immunoprecipitation Sequencing) comparing two samples that considers an ensemble of finite mixture models combined with a local false discovery rate (fdr) allowing for flexible modeling of data. Methods for Differential Identification using Mixture Ensemble (DIME) is described in: Taslim et al., (2011) <doi:10.1093/bioinformatics/btr165>.
Set of functions for Data Envelopment Analysis, including classical, fuzzy, cross-efficiency, bootstrapping, and Malmquist models. See: Banker, R.; Charnes, A.; Cooper, W.W. (1984). <doi:10.1287/mnsc.30.9.1078>, Charnes, A.; Cooper, W.W.; Rhodes, E. (1978). <doi:10.1016/0377-2217(78)90138-8> and Charnes, A.; Cooper, W.W.; Rhodes, E. (1981). <doi:10.1287/mnsc.27.6.668>.
The goal of dndR is to provide a suite of Dungeons & Dragons related functions. This package is meant to be useful both to players and Dungeon Masters (DMs). Some functions apply to many tabletop role-playing games (e.g., dice rolling), but others are focused on Fifth Edition (a.k.a. "5e") and where possible both the 2014 and 2024 versions are supported.
Regular and non-regular Fractional Factorial 2-level designs can be created. Furthermore, analysis tools for Fractional Factorial designs with 2-level factors are offered (main effects and interaction plots for all factors simultaneously, cube plot for looking at the simultaneous effects of three factors, full or half normal plot, alias structure in a more readable format than with the built-in function alias).
This package provides useful functions which are needed for bioinformatic analysis such as calculating linear principal components from numeric data and Single-nucleotide polymorphism (SNP) dataset, calculating fixation index (Fst) using Hudson method, creating scatter plots in 3 views, handling with PLINK binary file format, detecting rough structures and outliers using unsupervised clustering, and calculating matrix multiplication in the faster way for big data.
Fit multilevel manifest or latent time-series models, including popular Dynamic Structural Equation Models (DSEM). The models can be set up and modified with user-friendly functions and are fit to the data using Stan for Bayesian inference. Path models and formulas for user-defined models can be easily created with functions using knitr'. Asparouhov, Hamaker, & Muthen (2018) <doi:10.1080/10705511.2017.1406803>.
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
An implementation of the "Design Analysis" proposed by Gelman and Carlin (2014) <doi:10.1177/1745691614551642>. It combines the evaluation of Power-Analysis with other inferential-risks as Type-M error (i.e. Magnitude) and Type-S error (i.e. Sign). See also Altoè et al. (2020) <doi:10.3389/fpsyg.2019.02893> and Bertoldo et al. (2020) <doi:10.31234/osf.io/q9f86>.
This package provides a PEP, or Portable Encapsulated Project, is a dataset that subscribes to the PEP structure for organizing metadata. It is written using a simple YAML + CSV format, it is your one-stop solution to metadata management across data analysis environments. This package reads this standardized project configuration structure into R. Described in Sheffield et al. (2021) <doi:10.1093/gigascience/giab077>.
Estimates power, minimum detectable effect size (MDES) and sample size requirements. The context is multilevel randomized experiments with multiple outcomes. The estimation takes into account the use of multiple testing procedures. Development of this package was supported by a grant from the Institute of Education Sciences (R305D170030). For a full package description, including a detailed technical appendix, see <doi:10.18637/jss.v108.i06>.