This package provides a toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure (genind class), allele counts by population (genpop), and genome-wide SNP data (genlight). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
This package provides a comprehensive toolbox for analysing Spatial Point Patterns. It is focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. It also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. It supports spatial covariate data such as pixel images and contains over 2000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference.
UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.
This package provides a comprehensive system for selecting variables and weighting data to match the specifications of the American National Election Studies. The package includes methods for identifying discrepant variables, raking data, and assessing the effects of the raking algorithm. It also allows automated re-raking if target variables fall outside identified bounds and allows greater user specification than other available raking algorithms. A variety of simple weighted statistics that were previously in this package (version .55 and earlier) have been moved to the package weights.
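The raking step mentioned in the weighting description above can be illustrated with a minimal sketch of iterative proportional fitting. This is not the package's implementation: the function name `rake`, its parameters, and the sample data are hypothetical, and the sketch omits variable selection, weight capping, and re-raking.

```python
def rake(rows, targets, max_iter=100, tol=1e-8):
    """Basic raking (iterative proportional fitting); illustrative only.

    rows:    list of dicts mapping variable name -> category (hypothetical layout)
    targets: {variable: {category: target share}}, shares summing to 1 per variable
    Returns one weight per row so weighted margins approach the targets.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total of each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Rescale each row so this variable's margin matches its target.
            for i, row in enumerate(rows):
                factor = shares[row[var]] * total / current[row[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all adjustment factors are close to 1
            break
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted proportion of rows whose `var` equals `category`."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Made-up respondents, for illustration only.
sample = [
    {"sex": "M", "age": "young"},
    {"sex": "M", "age": "old"},
    {"sex": "F", "age": "young"},
    {"sex": "F", "age": "old"},
    {"sex": "M", "age": "young"},
]
goal = {"sex": {"M": 0.5, "F": 0.5}, "age": {"young": 0.4, "old": 0.6}}
w = rake(sample, goal)
```

Each pass rescales the weights one variable at a time, so the sum of the weights stays fixed while the margins move toward their targets.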
We extend existing gene enrichment tests to perform adverse event enrichment analysis. Unlike continuous gene expression data, adverse event data are counts and therefore contain many zeros and ties. We propose two enrichment tests. One is a modified Fisher's exact test based on pre-selected significant adverse events, while the other is based on a modified Kolmogorov-Smirnov statistic. We add covariate adjustment to improve the analysis. See "Adverse event enrichment tests using VAERS", Shuoran Li and Lili Zhao (2020) <arXiv:2007.02266>.
This package contains data and functions that can be used to make actuarial life tables. Each function adds a column to the input dataset for each intermediate calculation between mortality rate and life expectancy. Users can run any of our functions to complete the life table up to that step, or run lifetable() to output a full life table that can be customized to remove optional columns. Methods for creating life tables are as described in Zedstatistics (2021) <https://www.youtube.com/watch?v=Dfe59glNXAQ>.
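The chain of intermediate columns between mortality rate and life expectancy described above can be sketched as follows. This uses the standard actuarial definitions, not the package's own code: the function name `life_table`, the column names, the radix of 100,000, and the assumption that deaths are spread evenly within each year are all illustrative.

```python
def life_table(qx, radix=100_000):
    """Build a simple period life table from yearly death probabilities.

    qx: list where qx[age] is the probability of dying between age and age + 1
    Columns (standard actuarial notation, assumed here):
      lx = survivors at exact age x, dx = deaths in [x, x + 1),
      Lx = person-years lived in [x, x + 1), Tx = person-years lived after x,
      ex = remaining life expectancy at age x.
    """
    rows = []
    lx = float(radix)
    for age, q in enumerate(qx):
        dx = lx * q
        Lx = lx - dx / 2  # assumes deaths occur uniformly within the year
        rows.append({"age": age, "qx": q, "lx": lx, "dx": dx, "Lx": Lx})
        lx -= dx
    # Tx accumulates Lx from the oldest age downwards.
    Tx = 0.0
    for row in reversed(rows):
        Tx += row["Lx"]
        row["Tx"] = Tx
        row["ex"] = Tx / row["lx"] if row["lx"] > 0 else 0.0
    return rows


# Toy input: half the cohort dies in year 0, the rest in year 1.
table = life_table([0.5, 1.0], radix=100)
```

In this toy input the cohort lives 75 person-years in the first year and 25 in the second, so life expectancy at age 0 is 1.0 year.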
Power calculations are a critical component of any research study to determine the minimum sample size necessary to detect differences between multiple groups. Researchers often work with data taking the form of proportions that can be modeled with a beta distribution. Here we present an R package, BetaPASS, that performs power and sample size calculations for data following a beta distribution with comparative nonparametric output. This package allows flexibility with multiple options for link functions to fit the data and graphing functionality for visual comparisons.
An implementation of double generalized linear model (DGLM) building with variable selection procedures and handling of interaction terms and other complex situations. We also provide a method of handling convergence issues within the dglm() function. The package offers a simulation function for generating simulated data for testing purposes and utilizes the forward stepwise variable selection procedure in model-building. It also provides a new custom bootstrap function for mean and standard deviation estimation and functions for building crossplots and squareplots from a data set.
This package creates sophisticated models of training data and validates the models with an independent test set, cross validation, or Out Of Bag (OOB) predictions on the training data. It creates graphs and tables of the model validation results and applies these models to GIS .img files of predictors to create detailed prediction surfaces. It handles large predictor files for map making by reading the .img files in chunks and writing the predictions for each chunk to a .txt file before reading the next chunk.
Provides simple functions to (i) compute a class of multi-functionality measures for a single ecosystem for given function weights, and (ii) decompose gamma multi-functionality for pairs of ecosystems and K ecosystems (K can be greater than 2) into a within-ecosystem component (alpha multi-functionality) and an among-ecosystem component (beta multi-functionality). In each case, the correlation between functions can be corrected for. Based on biodiversity and ecosystem function data, this software also facilitates graphics for assessing biodiversity-ecosystem functioning relationships across scales.
Due to RStudio's status as open source software, we believe it will be used frequently for future data analysis by users who lack formal training or experience with R. NMVANOVA (Novice Model Variation ANOVA) is a streamlined variation of experimental design functions that allows novice RStudio users to perform different model variations of one-way analysis of variance without downloading multiple libraries or packages. Users can easily manipulate the data block and the needed inputs, so that they only have to plug in the four designed variables/values.
Helps you determine the analysis window to use when analyzing densely-sampled time-series data, such as EEG data, using permutation testing (Maris & Oostenveld, 2007) <doi:10.1016/j.jneumeth.2007.03.024>. These permutation tests can help identify the timepoints where significance of an effect begins and ends, and the results can be plotted in various types of heatmap for reporting. Mixed-effects models are supported using an implementation of the approach by Lee & Braun (2012) <doi:10.1111/j.1541-0420.2011.01675.x>.
Efficient algorithms for fully Bayesian estimation of stochastic volatility (SV) models with and without asymmetry (leverage) via Markov chain Monte Carlo (MCMC) methods. Methodological details are given in Kastner and Frühwirth-Schnatter (2014) <doi:10.1016/j.csda.2013.01.002> and Hosszejni and Kastner (2019) <doi:10.1007/978-3-030-30611-3_8>; the most common use cases are described in Hosszejni and Kastner (2021) <doi:10.18637/jss.v100.i12> and Kastner (2016) <doi:10.18637/jss.v069.i05> and the package examples.
Supports eigenvalue block-averaging p-values (Foldnes, Grønneberg, 2018) <doi:10.1080/10705511.2017.1373021>, penalized eigenvalue block-averaging p-values (Foldnes, Moss, Grønneberg, 2024) <doi:10.1080/10705511.2024.2372028>, penalized regression p-values (Foldnes, Moss, Grønneberg, 2024) <doi:10.1080/10705511.2024.2372028>, as well as traditional p-values such as Satorra-Bentler. All p-values can be calculated using unbiased or biased gamma estimates (Du, Bentler, 2022) <doi:10.1080/10705511.2022.2063870> and two choices of chi square statistics.
This package provides some basic routines for simulating a clinical trial. The primary intent is to provide some tools to generate trial simulations for trials with time to event outcomes. Piecewise exponential failure rates and piecewise constant enrollment rates are the underlying mechanism used to simulate a broad range of scenarios such as those presented in Lin et al. (2020) <doi:10.1080/19466315.2019.1697738>. However, the basic generation of data is done using pipes to allow maximum flexibility for users to meet different needs.
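The piecewise exponential failure mechanism named in the trial simulation description above can be sketched by inverting the cumulative hazard. This is a generic illustration, not the package's API: the function name `sample_piecewise_exponential`, its arguments, and the example rates are hypothetical.

```python
import math
import random


def sample_piecewise_exponential(durations, rates, rng):
    """Draw one event time under a piecewise constant hazard.

    durations: length of each interval (the last one may be math.inf)
    rates:     hazard rate within each interval (0 means no events there)
    rng:       a random.Random instance
    Returns math.inf if the event never occurs within the listed intervals.
    """
    # An Exp(1) draw is the cumulative hazard at which the event happens.
    target = -math.log(1.0 - rng.random())
    elapsed = 0.0
    for duration, rate in zip(durations, rates):
        if rate > 0:
            capacity = rate * duration  # cumulative hazard this interval holds
            if target <= capacity:
                return elapsed + target / rate
            target -= capacity
        elapsed += duration
    return math.inf


# Hypothetical scenario: no events for 2 time units, then hazard 1.0 forever.
rng = random.Random(1)
delayed = [
    sample_piecewise_exponential([2.0, math.inf], [0.0, 1.0], rng)
    for _ in range(1000)
]
```

Enrollment times under piecewise constant enrollment rates could be generated the same way, by treating the gaps between arrivals as draws from the same kind of piecewise process.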
Makes it easy to deal with multiple cross-tables in data exploration, by creating them, manipulating them, and adding color helpers to highlight important information (differences from totals, comparisons between lines or columns, contributions to variance, confidence intervals, odds ratios, etc.). All functions are pipe-friendly and render data frames which can be easily manipulated. At the same time, time-consuming operations are done with data.table to go faster with big data frames. Tables can be exported with formats and colors to Excel, plot and HTML.
This package provides a test to understand the stability of the underlying stochastic data. It helps users understand whether the random variable under consideration is stationary or non-stationary without any manual interpretation of the results. It also checks all the prerequisites and assumptions underlying the unit root test statistics, and if the data is found to be non-stationary in all 4 lags, the function diagnoses the input data and returns an optimised solution for it.
This package performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
Artificial Bee Colony (ABC) is a relatively recent optimization algorithm, defined by Dervis Karaboga in 2005 and motivated by the intelligent behavior of honey bees. It is as simple as the Particle Swarm Optimization (PSO) and Differential Evolution (DE) algorithms, and uses only common control parameters such as colony size and maximum cycle number. The r-abcoptim package implements the Artificial Bee Colony optimization algorithm <http://mf.erciyes.edu.tr/abc/pub/tr06_2005.pdf>. This version is a work-in-progress and is written in R code.
This package provides a set of tools to facilitate package development and make R a more user-friendly place. It is intended mostly for developers (or anyone who writes/shares functions). It provides a simple, powerful and flexible way to check the arguments passed to functions. The developer can easily describe the type of argument needed. If the user provides a wrong argument, then an informative error message is prompted with the requested type and the problem clearly stated--saving the user a lot of time in debugging.
Causal Distillation Tree (CDT) is a novel machine learning method for estimating interpretable subgroups with heterogeneous treatment effects. CDT allows researchers to fit any machine learning model (or metalearner) to estimate heterogeneous treatment effects for each individual, and then "distills" these predicted heterogeneous treatment effects into interpretable subgroups by fitting an ordinary decision tree to predict the previously-estimated heterogeneous treatment effects. This package provides tools to estimate causal distillation trees (CDT), as detailed in Huang, Tang, and Kenney (2025) <doi:10.48550/arXiv.2502.07275>.
Management and analysis of camera trap wildlife data through an integrated workflow. Provides functions for image/video organization, metadata extraction, and species/individual identification. Creates detection histories for occupancy and spatial capture-recapture analyses, with support for multi-season studies. Includes tools for fitting community occupancy models in JAGS and NIMBLE, and an interactive dashboard for survey data visualization and analysis. Features visualization of species distributions and activity patterns, plus export capabilities for GIS and reports. Emphasizes automation and reproducibility while maintaining flexibility for different study designs.
Computationally efficient tools for comparing all pairs of profiles in a DNA database. The expectation and covariance of the summary statistic are implemented for fast computing. Routines for estimating proportions of closely related individuals are available. The use of wildcards (also called F-designation) is implemented. Dedicated functions ease plotting the results. See Tvedebrink et al. (2012) <doi:10.1016/j.fsigen.2011.08.001>. Compute the distribution of the numbers of alleles in DNA mixtures. See Tvedebrink (2013) <doi:10.1016/j.fsigss.2013.10.142>.
Simulates demic diffusion building on models previously developed for the expansion of Neolithic and other food-producing economies during the Holocene (Fort et al. (2012) <doi:10.7183/0002-7316.77.2.203>, Souza et al. (2021) <doi:10.1098/rsif.2021.0499>). Growth and emigration are modelled as density-dependent processes using logistic growth and an asymptotic threshold model. Environmental and terrain layers, which can change over time, affect carrying capacity, growth and mobility. Multiple centres of origin with their respective starting times can be specified.