This package provides implementations of statistical methods for random objects lying in various metric spaces, which are not necessarily linear spaces. The core of this package is Fréchet regression for random objects with Euclidean predictors, which allows one to perform regression analysis for non-Euclidean responses under some mild conditions. Examples include distributions in 2-Wasserstein space, covariance matrices endowed with the power metric (with the Frobenius metric as a special case) or with the Cholesky and log-Cholesky metrics, and spherical data. References: Petersen, A., & Müller, H.-G. (2019) <doi:10.1214/17-AOS1624>.
Spatio-temporal locations of an animal are computed from annotated data with a hidden Markov model via a particle filter algorithm. The package is relatively robust to varying degrees of shading. The hidden Markov model is described in Movement Ecology - Rakhimberdiev et al. (2015) <doi:10.1186/s40462-015-0062-5>, a general package description is given in Methods in Ecology and Evolution - Rakhimberdiev et al. (2017) <doi:10.1111/2041-210X.12765>, and package accuracy is assessed in the Journal of Avian Biology - Rakhimberdiev et al. (2016) <doi:10.1111/jav.00891>.
This package provides a mutual information estimator based on the k-nearest neighbor method proposed by A. Kraskov et al. (2004) <doi:10.1103/PhysRevE.69.066138> to measure general dependence. The time complexity of our estimator is only quadratic in the sample size, which is faster than other statistics. In addition, an implementation of a mutual information based independence test is provided for analyzing multivariate data in Euclidean space (T. B. Berrett et al. (2019) <doi:10.1093/biomet/asz024>); furthermore, we extend it to tackle datasets in metric spaces.
Plot density and distribution functions with automatic selection of suitable regions. Numerically invert (compute quantiles) distribution functions. Simulate real and complex numbers from distributions of their magnitude and arguments. Optionally, the magnitudes and/or arguments may be fixed in almost arbitrary ways. Create polynomials from roots given in Cartesian or polar form. Small programming utilities: check if an object is identical to NA, count positional arguments in a call, set intersection of more than two sets, check if an argument is unnamed, compute the graph of S4 classes in packages.
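The numerical inversion of a distribution function mentioned above can be illustrated with a minimal Python sketch. It assumes a continuous, increasing CDF and uses plain bisection; the names `normal_cdf` and `quantile` are hypothetical and are not taken from the package, whose actual method may differ.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def quantile(cdf, p, lo=-1.0, hi=1.0, tol=1e-10):
    """Invert a continuous, increasing CDF at probability p by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Widen the bracket until it contains the quantile.
    while cdf(lo) > p:
        lo *= 2.0
    while cdf(hi) < p:
        hi *= 2.0
    # Halve the bracket until it is narrower than the tolerance.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


median = quantile(normal_cdf, 0.5)
upper = quantile(normal_cdf, 0.975)
```

Bisection is slow compared with Newton-type methods, but it needs only the CDF itself and cannot leave the bracket, which makes it a reasonable default when nothing else is known about the distribution.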
Efficiently manage and process data from oTree experiments. Import oTree data and clean them by using functions that handle messy data, dropouts, and other problematic cases. Create IDs, calculate the time, transfer variables between app data frames, and delete sensitive information. Review your experimental data prior to running the experiment and automatically generate a detailed summary of the variables used in your oTree code. Information on oTree is found in Chen, D. L., Schonger, M., & Wickens, C. (2016) <doi:10.1016/j.jbef.2015.12.001>.
Generates a file containing the main scientific references, prepared to be automatically inserted into an academic paper. The articles in the list are chosen from the main references generated by the function principal_lister() of the package bibliorefer. The generated file contains the list of metadata of the principal references in BibTeX format. Massimo Aria, Corrado Cuccurullo. (2017) <doi:10.1016/j.joi.2017.08.007>. Caibo Zhou, Wenyan Song. (2021) <doi:10.1016/j.jclepro.2021.126943>. Hamid Derviş. (2019) <doi:10.5530/jscires.8.3.32>.
It provides multiple functions that are useful for ecological research and teaching statistics to ecologists. It is based on data analysis courses offered at the Instituto de Ecología AC (INECOL). For references and published evidence see Manrique-Ascencio et al. (2024) <doi:10.1111/gcb.17282>, Manrique-Ascencio et al. (2024) <doi:10.1111/plb.13683>, Ruiz-Guerra et al. (2017) <doi:10.17129/botsci.812>, Juarez-Fragoso et al. (2024) <doi:10.1007/s10980-024-01809-z>, Papaqui-Bello et al. (2024) <doi:10.13102/sociobiology.v71i2.10503>.
This is designed for use with an arbitrary set of equations with an arbitrary set of unknowns. The user selects "fixed" values for enough unknowns to leave as many variables as there are equations, which in most cases means the system is properly defined and a unique solution exists. The function, the fixed values, and initial values for the remaining unknowns are fed to a nonlinear backsolver. The original version of "TK!Solver", now a product of Universal Technical Systems (<https://www.uts.com>), was the inspiration for this function.
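The backsolving idea can be sketched in Python under stated assumptions: the equations are written as residuals that should equal zero, and the free unknowns are found by Newton iteration with a forward-difference Jacobian. All names here (`solve_linear`, `backsolve`, `equations`) and the example system are hypothetical; the package's own solver is not described in the text and may work differently.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular Jacobian")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def backsolve(residuals, fixed, guess, tol=1e-10, max_iter=50, h=1e-7):
    """Newton iteration on the free unknowns; `fixed` values stay constant."""
    names = list(guess)
    x = [float(guess[k]) for k in names]
    for _ in range(max_iter):
        f0 = residuals({**fixed, **dict(zip(names, x))})
        if max(abs(v) for v in f0) < tol:
            break
        # Forward-difference Jacobian, one column per free unknown.
        jac = [[0.0] * len(x) for _ in f0]
        for j in range(len(x)):
            xp = x[:]
            xp[j] += h
            fj = residuals({**fixed, **dict(zip(names, xp))})
            for i in range(len(f0)):
                jac[i][j] = (fj[i] - f0[i]) / h
        step = solve_linear(jac, [-v for v in f0])
        x = [xi + si for xi, si in zip(x, step)]
    return dict(zip(names, x))


def equations(v):
    # Hypothetical system: two equations in four unknowns,
    # x + y = s and x * y = p, written as residuals.
    return [v["x"] + v["y"] - v["s"], v["x"] * v["y"] - v["p"]]


# Fix s and p, solve for x and y.
forward = backsolve(equations, fixed={"s": 5.0, "p": 6.0},
                    guess={"x": 1.0, "y": 4.0})
# Same equations, but now fix x and y and solve for s and p.
backward = backsolve(equations, fixed={"x": 2.0, "y": 3.0},
                     guess={"s": 0.0, "p": 0.0})
```

The same `equations` function serves both calls; only the choice of which unknowns are fixed changes, which is the point of a backsolver. Which root Newton's method reaches depends on the initial guess, so the starting values matter when the system has several solutions.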
This package provides tools to generate HTML interfaces for adaptive and non-adaptive tests using the shiny package (Chalmers (2016) <doi:10.18637/jss.v071.i05>). Suitable for applying unidimensional and multidimensional computerized adaptive tests (CAT) using item response theory methodology and for creating simple questionnaire forms to collect response data directly in R. Additionally, optimal test designs (e.g., "shadow testing") are supported for tests that contain a large number of item selection constraints. Finally, the package contains tools useful for performing Monte Carlo simulations for studying test item banks.
Model-based clustering using the multivariate multiple scaled t (MST) and multivariate multiple scaled contaminated normal (MSCN) distributions. The MST is an extension of the multivariate Student-t distribution to include flexible tail behaviors, Forbes, F. & Wraith, D. (2014) <doi:10.1007/s11222-013-9414-4>. The MSCN represents a heavy-tailed generalization of the multivariate normal (MN) distribution to model elliptically contoured scatters in the presence of mild outliers (also referred to as "bad" points) and automatically detect bad points, Punzo, A. & Tortora, C. (2021) <doi:10.1177/1471082X19890935>.
This package provides a number of functions to facilitate the handling and production of reports using time series data. The package was developed to be understandable for beginners, so some functions aim to transform processes that would be complex into functions with a few lines. The main advantage of using the metools package is the ease of producing reports and working with time series using a few lines of code, so the code is clean and easy to understand/maintain. Learn more about the metools at <https://metoolsr.wordpress.com>.
This package provides functions for downloading, calibrating, and analyzing atmospheric isotope data bundled into the eddy covariance data products of the National Ecological Observatory Network (NEON) <https://www.neonscience.org>. Calibration tools are provided for carbon and water isotope products. Carbon isotope calibration details are found in Fiorella et al. (2021) <doi:10.1029/2020JG005862>, and the readme file at <https://github.com/lanl/NEONiso>. Tools for calibrating water isotope products have been added as of 0.6.0, but have known deficiencies and should be considered experimental and unsupported.
Calculates the periodogram of a time series, maximum-likelihood fits an Ornstein-Uhlenbeck state space (OUSS) null model and evaluates the statistical significance of periodogram peaks against the OUSS null hypothesis. The OUSS is a parsimonious model for stochastically fluctuating variables with linear stabilizing forces, subject to uncorrelated measurement errors. Contrary to the classical white noise null model for detecting cyclicity, the OUSS model can account for temporal correlations typically occurring in ecological and geological time series. Citation: Louca, Stilianos and Doebeli, Michael (2015) <doi:10.1890/14-0126.1>.
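The first step described above, computing a periodogram, can be sketched in Python as follows. This assumes an evenly sampled series and uses a plain discrete Fourier transform; it does not fit the OUSS null model or assess significance, and the function name `periodogram` and the test signal are hypothetical rather than taken from the package.

```python
import cmath
import math


def periodogram(series):
    """Return (frequencies, powers) for an evenly sampled series.

    Frequencies are in cycles per sampling step, from 1/n up to 1/2.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    freqs, powers = [], []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier coefficient at frequency k / n.
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centered))
        freqs.append(k / n)
        powers.append(abs(coef) ** 2 / n)
    return freqs, powers


# Hypothetical signal: a sine wave with a period of 8 sampling steps.
signal = [math.sin(2 * math.pi * t / 8) for t in range(64)]
freqs, powers = periodogram(signal)
peak = freqs[max(range(len(powers)), key=powers.__getitem__)]
```

For a noisy series, a large peak alone says little. The significance of the peak depends on the null model it is compared against, which is where a correlated-noise null such as the OUSS differs from white noise.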
This package provides functions for landscape analysis and data retrieval. The package allows users to download environmental variables from global datasets (e.g., WorldClim, ESA WorldCover, Nighttime Lights), and to compute spatial and landscape metrics using a hexagonal grid system based on the H3 spatial index. It is useful for ecological modeling, biodiversity studies, and spatial data processing in landscape ecology. Fick and Hijmans (2017) <doi:10.1002/joc.5086>. Zanaga et al. (2022) <doi:10.5281/zenodo.7254221>. Uber Technologies Inc. (2022) "H3: Hexagonal hierarchical spatial index".
Inspired by space-time regressions often performed to assess the expansion of the Neolithic from the Near East to Europe (Pinhasi et al. 2005 <doi:10.1371/journal.pbio.0030410>). Test for significant correlations between the (earliest) radiocarbon dates of archaeological sites and their respective distances from a hypothetical center of origin. Both ordinary least squares (OLS) and reduced major axis (RMA) methods are supported (Russell et al. 2014 <doi:10.1371/journal.pone.0087854>). It is also possible to iterate over many sites to identify the most likely origin.
This package provides a collection of convenient functions for common statistical computations, which are not directly provided by R's base or stats packages. This package aims at providing, first, shortcuts for statistical measures, which otherwise could only be calculated with additional effort. Second, these shortcut functions are generic, and can be applied not only to vectors, but also to other objects as well. The focus of most functions lies on summary statistics or fit measures for regression models, including generalized linear models, mixed effects models and Bayesian models.
Many relevant applications in the environmental and socioeconomic sciences use areal data, such as biodiversity checklists, agricultural statistics, or socioeconomic surveys. For applications that surpass the spatial, temporal or thematic scope of any single data source, data must be integrated from several heterogeneous sources. Inconsistent concepts, definitions, or messy data tables make this a tedious and error-prone process. arealDB tackles those problems and helps the user to build a harmonised database of areal data. Read the paper at Ehrmann, Seppelt & Meyer (2020) <doi:10.1016/j.envsoft.2020.104799>.
This package provides functions to simulate data sets from hierarchical ecological models, including all the simulations described in the two volume publication Applied Hierarchical Modeling in Ecology: Analysis of distribution, abundance and species richness in R and BUGS by Marc Kéry and Andy Royle: volume 1 (2016, ISBN: 978-0-12-801378-6) and volume 2 (2021, ISBN: 978-0-12-809585-0), <https://www.mbr-pwrc.usgs.gov/pubanalysis/keryroylebook/>. It also has all the utility functions and data sets needed to replicate the analyses shown in the books.
Bayesian kernel machine regression (from the bkmr package) is a Bayesian semi-parametric generalized linear model approach under identity and probit links. There are a number of functions in this package that extend Bayesian kernel machine regression fits to allow multiple-chain inference and diagnostics, which leverage functions from the future, rstan, and coda packages. Reference: Bobb, J. F., Henn, B. C., Valeri, L., & Coull, B. A. (2018). Statistical software for analyzing the health effects of multiple concurrent exposures via Bayesian kernel machine regression <doi:10.1186/s12940-018-0413-y>.
Clusters longitudinal trajectories over time (can be unequally spaced, unequal length time series and/or partially overlapping series) on a common time axis. Performs k-means clustering on a single continuous variable measured over time, where each mean is defined by a thin plate spline fit to all points in a cluster. Distance is MSE across trajectory points to cluster spline. Provides graphs of derived cluster splines, silhouette plots, and Adjusted Rand Index evaluations of the number of clusters. Scales well to large data with multicore parallelism available to speed computation.
This package implements a modern, unified estimation strategy for common mediation estimands (natural effects, organic effects, interventional effects, and recanting twins) in combination with modified treatment policies as described in Liu, Williams, Rudolph, and Díaz (2024) <doi:10.48550/arXiv.2408.14620>. Estimation makes use of recent advancements in Riesz-learning to estimate a set of required nuisance parameters with deep learning. The result is the capability to estimate mediation effects with binary, categorical, continuous, or multivariate exposures with high-dimensional mediators and mediator-outcome confounders using machine learning.
Quantitative characterization of the health impacts associated with exposure to chemical mixtures has received considerable attention in current environmental and epidemiological studies. The CompMix package allows practitioners to estimate the health impacts of exposure to chemical mixtures through various statistical approaches, including Lasso, Elastic net, Bayesian kernel machine regression (BKMR), hierNet, Quantile g-computation, Weighted quantile sum (WQS) and Random forest. Hao W, Cathey A, Aung M, Boss J, Meeker J, Mukherjee B. (2024) "Statistical methods for chemical mixtures: a practitioners guide" <doi:10.1101/2024.03.03.24303677>.
Estimation of distributed lag models (DLMs) based on a Bayesian additive regression trees framework. Includes several extensions of DLMs: treed DLMs and distributed lag mixture models (Mork and Wilson, 2023) <doi:10.1111/biom.13568>; treed distributed lag nonlinear models (Mork and Wilson, 2022) <doi:10.1093/biostatistics/kxaa051>; heterogeneous DLMs (Mork et al., 2024) <doi:10.1080/01621459.2023.2258595>; monotone DLMs (Mork and Wilson, 2024) <doi:10.1214/23-BA1412>. The package also includes visualization tools and a shiny interface to check model convergence and to help interpret results.
This package performs fragment analysis using genetic data coming from capillary electrophoresis machines. The inputs are files with the FSA extension, which stands for FASTA-type file, and .txt files from the Beckman CEQ 8000 system; both contain DNA fragment intensities read by the machinery. In addition to visualization, it performs automatic scoring of SSRs (Simple Sequence Repeats; a type of genetic marker very common across the genome) and other types of PCR (Polymerase Chain Reaction) markers in biparental populations such as F1, F2, BC (backcross), and diversity panels (collections of genetic diversity).