An R package for multiple imputation using chained random forests. The implemented methods can handle missing data in mixed types of variables by using prediction-based or node-based conditional distributions constructed from random forests. For prediction-based imputation, continuous variables can be imputed using either the empirical distribution of out-of-bag prediction errors or a normality assumption for the prediction errors, and categorical variables can be imputed using predicted probabilities. For node-based imputation, methods based on the conditional distribution formed by the predicting nodes of random forests and on the proximity measures of random forests are provided. More details of the statistical methods can be found in Hong et al. (2020) <arXiv:2004.14823>.
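A minimal sketch of how imputation might be invoked; the wrapper name imp.rfemp() and its arguments are assumptions inferred from the description above, not a confirmed API, so check the package documentation:

    library(RfEmpImp)
    # airquality ships with R and contains missing values in Ozone/Solar.R
    data(airquality)
    # prediction-based imputation with empirical out-of-bag errors
    # (function and argument names are assumed)
    imp <- imp.rfemp(airquality, num.imp = 5, max.iter = 5)
    # 'imp' would hold the chained multiply-imputed datasets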
Researchers commonly need to summarize scientific information, a process known as 'evidence synthesis'. The first stage of a synthesis process (such as a systematic review or meta-analysis) is to download a list of references from academic search engines such as 'Web of Knowledge' or 'Scopus'. The traditional approach to systematic review is then to sort these data manually, first by locating and removing duplicated entries, and then by viewing titles and abstracts (in that order) to screen out irrelevant content. revtools provides interfaces for each of these tasks. An alternative approach, however, is to draw on tools from machine learning to visualise patterns in the corpus. In this case, you can use revtools to render ordinations of text drawn from article titles, keywords and abstracts, and interactively select or exclude individual references, words or topics.
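As an illustration of this workflow, a screening session might look like the sketch below; the function names follow our reading of the revtools documentation and should be verified:

    library(revtools)
    refs <- read_bibliography("search_results.bib")   # import references
    dups <- find_duplicates(refs)                     # locate duplicated entries
    refs_unique <- extract_unique_references(refs, matches = dups)
    screen_abstracts(refs_unique)   # interactive title/abstract screening
    screen_topics(refs_unique)      # ordination-based topic view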
This package provides a quick method for visualizing non-aggregated line-list or aggregated census data stratified by age and one or two categorical variables (e.g. gender and health status) with any number of values. It returns a ggplot object, allowing the user to further customize the output. This package is part of the R4Epis project <https://r4epis.netlify.app/>.
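For instance, a pyramid stratified by gender might be drawn as below; age_pyramid() and its bare-name arguments reflect our reading of the package and should be checked against the documentation:

    library(apyramid)
    library(ggplot2)
    # 'linelist' is a hypothetical data frame with factor columns
    # age_group and gender
    plt <- age_pyramid(linelist, age_group = age_group, split_by = gender)
    plt + labs(title = "Cases by age group and gender")  # ggplot2 customization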
This package provides ABSORBER, a variable selection method using B-Splines in multivariate nOnparametric Regression models Based on partial dErivatives Regularization. It implements a novel variable selection method in a nonlinear multivariate model using B-splines. For further details we refer the reader to the paper Savino, M. E. and Lévy-Leduc, C. (2024), <https://hal.science/hal-04434820>.
It performs All-Resolutions Inference (ARI) on functional Magnetic Resonance Image (fMRI) data. As a main feature, it estimates lower bounds for the proportion of active voxels in a set of clusters as, for example, given by a cluster-wise analysis. The method is described in Rosenblatt, Finos, Weeda, Solari, and Goeman (2018) <doi:10.1016/j.neuroimage.2018.07.060>.
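A minimal sketch, assuming the package's main entry point is ARI() taking a voxel-wise p-value map, a cluster label map and a brain mask (names as we recall them from the package documentation):

    library(ARIbrain)
    # pmap: array of voxel-wise p-values; cluster_map: integer cluster labels;
    # brain_mask: logical array of in-brain voxels (all assumed pre-loaded)
    res <- ARI(Pmap = pmap, clusters = cluster_map, mask = brain_mask)
    res   # per-cluster lower bounds on the proportion of active voxels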
This package provides an alternative approach to aoristic analyses for archaeological datasets by fitting Bayesian parametric growth models and non-parametric random-walk Intrinsic Conditional Autoregressive (ICAR) models on time frequency data (Crema (2024) <doi:10.1111/arcm.12984>). It handles event typo-chronology based timespans defined by start/end dates as well as more complex user-provided vectors of probabilities.
Autosimilarity curves, standardization of spatial extent, dissimilarity indexes that overweight rare species, phylogenetic and functional (pairwise and multisample) dissimilarity indexes and nestedness for phylogenetic, functional and other diversity metrics. The methods for phylogenetic and functional nestedness are described in Melo, Cianciaruso and Almeida-Neto (2014) <doi:10.1111/2041-210X.12185>. This package is intended as a complement to available packages, particularly 'vegan'.
It allows learning the structure of univariate time series, parameter learning, and forecasting. Implements a model of Dynamic Bayesian Networks with temporal windows, with collections of linear regressors for Gaussian nodes, based on the introductory texts of Korb and Nicholson (2010) <doi:10.1201/b10391> and Nagarajan, Scutari and Lèbre (2013) <doi:10.1007/978-1-4614-6446-4>.
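The workflow might look as follows; the function names (learn_dbn_struc(), fold_dt(), fit_dbn_params(), forecast_ts()) and the '_t_0' variable suffix are based on our recollection of the dbnR API and should be verified:

    library(dbnR)
    size <- 3                                 # temporal window (number of slices)
    net  <- learn_dbn_struc(train_dt, size)   # structure learning
    f_dt <- fold_dt(train_dt, size)           # unroll the series to the window
    fit  <- fit_dbn_params(net, f_dt)         # linear-regressor Gaussian params
    pred <- forecast_ts(fold_dt(test_dt, size), fit, obj_vars = "y_t_0")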
Extends the functionality of other plotting packages (notably 'ggplot2') to facilitate the plotting of data over long time intervals, including, but not limited to, geological, evolutionary, and ecological data. The primary goal of deeptime is to enable users to add highly customizable timescales to their visualizations. Other functions are also included to assist with other areas of deep time visualization.
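For example, coord_geo() adds a geological timescale to a 'ggplot2' plot; this mirrors the package's basic documented usage:

    library(ggplot2)
    library(deeptime)
    # toy data: ages in millions of years (Ma) and a measured value
    df <- data.frame(age = seq(500, 0, by = -50), value = rnorm(11))
    ggplot(df, aes(x = age, y = value)) +
      geom_line() +
      scale_x_reverse() +                           # deep time runs right to left
      coord_geo(xlim = c(541, 0), dat = "periods")  # period-level timescale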
Estimates power by simulation for multivariate abundance data to be used for sample size estimates. Multivariate equivalence testing by simulation from a Gaussian copula model. The package also provides functions for parameterising multivariate effect sizes and simulating multivariate abundance data jointly. The discrete Gaussian copula approach is described in Popovic et al. (2018) <doi:10.1016/j.jmva.2017.12.002>.
Augments the eiCompare package's Racially Polarized Voting (RPV) functionality to streamline analyses and visualizations used to support voting rights and redistricting litigation. The package implements methods described in Barreto, M., Collingwood, L., Garcia-Rios, S., & Oskooii, K. A. (2022). "Estimating Candidate Support in Voting Rights Act Cases: Comparing Iterative EI and EI-R×C Methods" <doi:10.1177/0049124119852394>.
Univariate and multivariate methods for compositional data analysis, based on logratios. The package implements the approach in the book Compositional Data Analysis in Practice by Michael Greenacre (2018), where emphasis is placed on simple pairwise logratios. Logratios can be selected so that they account for a maximum percentage of the logratio variance. Various multivariate analyses of logratios are included in the package.
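A brief sketch of computing and analysing logratios; the function names (CLR(), PCA(), STEP()) follow our recollection of the package and should be verified:

    library(easyCODA)
    data(cups)            # compositional dataset shipped with the package
    clr <- CLR(cups)      # centred logratios
    pca <- PCA(clr)       # logratio principal component analysis
    # stepwise selection of pairwise logratios explaining maximum variance
    sel <- STEP(cups)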
Computes the expectation of the number of transmissions and receptions considering a Hop-by-Hop transport model with a limited number of retransmissions per packet. It provides the theoretical results shown in Palma et al. (2016) <DOI:10.1109/TLA.2016.7555237> and also estimated values based on Monte Carlo simulations. It is also possible to consider random data and ACK probabilities.
Kernel-based Tweedie compound Poisson gamma model using high-dimensional predictors for the analysis of zero-inflated response variables. The package features built-in estimation, prediction and cross-validation tools and supports a choice of different kernel functions. For more details, please see Yi Lian, Archer Yi Yang, Boxiang Wang, Peng Shi & Robert William Platt (2023) <doi:10.1080/00401706.2022.2156615>.
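Usage might proceed as below; ktd_cv(), ktd_estimate(), ktd_predict() and the result field names reflect our reading of the package and are assumptions:

    library(ktweedie)
    library(kernlab)   # for the rbfdot() kernel
    # x: high-dimensional predictor matrix; y: zero-inflated response (pre-loaded)
    cv <- ktd_cv(x = x, y = y, kern = rbfdot(sigma = 0.1), lambda = 10^(-5:1))
    best_lam <- cv$Best_lambda                      # field name assumed
    fit <- ktd_estimate(x = x, y = y, kern = rbfdot(sigma = 0.1),
                        lambda1 = best_lam)
    pred <- ktd_predict(fit, newdata = x)$prediction  # field name assumed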
This package implements a local likelihood estimator for the dependence parameter in bivariate conditional copula models. Copula family and local likelihood bandwidth parameters are selected by leave-one-out cross-validation. The models are implemented in 'TMB', meaning that the local score function is efficiently calculated via automatic differentiation (AD), such that quasi-Newton algorithms may be used for parameter estimation.
Extends the functionality of the tourr package with an interactive graphical user interface. The interactivity allows users to effortlessly refine their tourr results by manual intervention, which allows for the integration of expert knowledge and aids the interpretation of results. For more information on tourr see Wickham et al. (2011) <doi:10.18637/jss.v040.i02> or <https://github.com/ggobi/tourr>.
This package provides the facility to calculate non-isotropic accumulated cost surfaces, least-cost paths, least-cost corridors, and least-cost networks using a number of human-movement-related cost functions that can be selected by the user. It requires only a Digital Terrain Model, a start location and (optionally) destination locations. See Alberti (2019) <doi:10.1016/j.softx.2019.100331>.
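A minimal example of computing a cost surface and least-cost path; the argument names follow the movecost() documentation as we recall it:

    library(movecost)
    # dtm: RasterLayer digital terrain model; origin/destin: SpatialPointsDataFrame
    # (all assumed pre-loaded)
    res <- movecost(dtm = dtm, origin = origin, destin = destin,
                    funct = "t")  # "t": Tobler's hiking function (assumed code)
    # 'res' would contain the accumulated cost surface and least-cost path(s)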
This package provides a metadata structure for clinical data analysis and reporting based on Analysis Data Model (ADaM) datasets. The package simplifies clinical analysis and reporting tool development by defining standardized inputs, outputs, and workflows. It can be used to create analysis and reporting planning grids, mock tables, and validated analysis and reporting results based on consistent inputs.
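A schematic of building such a metadata object; meta_adam(), plan() and define_plan() follow the package's documented workflow as we recall it, so treat the sketch as an assumption:

    library(metalite)
    # adsl (population) and adae (observation) are assumed ADaM datasets
    meta <- meta_adam(population = adsl, observation = adae) |>
      define_plan(plan = plan(
        analysis = "ae_summary", population = "apat",
        observation = c("wk12", "wk24"), parameter = "any;rel;ser"
      ))
    # further define_*() calls and meta_build() would complete the structure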
For single tensor data, any matrix factorization method can be specified for the matricised tensor in each dimension by Multi-way Component Analysis (MWCA). An original extension of MWCA is also implemented to specify and decompose multiple matrices and tensors simultaneously (CoupledMWCA). See the reference section of the GitHub README.md <https://github.com/rikenbit/mwTensor> for details of the methods.
Implementation of Sequential BATTing (bootstrapping and aggregating of thresholds from trees) for developing threshold-based multivariate (prognostic/predictive) biomarker signatures. Variable selection is automatically built in. Final signatures are returned with interaction plots for predictive signatures. Cross-validation performance evaluation and testing dataset results are also output. Detailed algorithms are described in Huang et al. (2017) <doi:10.1002/sim.7236>.
This package provides functions for the analysis of time series using copula models. The package is based on methodology described in the following references: McNeil, A.J. (2021) <doi:10.3390/risks9010014>; Bladt, M., & McNeil, A.J. (2021) <doi:10.1016/j.ecosta.2021.07.004>; Bladt, M., & McNeil, A.J. (2022) <doi:10.1515/demo-2022-0105>.
Calculates one-sample unbiased central moment estimates and two-sample pooled estimates up to 6th order, including estimates of powers and products of central moments. Provides the machinery for obtaining unbiased central moment estimators beyond 6th order by generating expressions for expectations of raw sample moments and their powers and products. The methods are described in Gerlovina and Hubbard (2019) <doi:10.1080/25742558.2019.1701917>.
The implemented R6 class SCM aims to simplify working with structural causal models. The missing data mechanism can be defined as a part of the structural model. The class contains methods for 1) defining a structural causal model via functions, text or conditional probability tables, 2) printing basic information on the model, 3) plotting the graph for the model using the packages 'igraph' or 'qgraph', 4) simulating data from the model, 5) applying an intervention, 6) checking the identifiability of a query using the R packages 'causaleffect' and 'dosearch', 7) defining the missing data mechanism, 8) simulating incomplete data from the model according to the specified missing data mechanism and 9) checking the identifiability in a missing data problem using the R package 'dosearch'. In addition, there are functions for running experiments and doing counterfactual inference using simulation.
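A small sketch of defining and simulating from an SCM; the constructor arguments (uflist, vflist) and the simulate() method are based on our recollection of the package and should be checked:

    library(R6causal)
    backdoor <- SCM$new("backdoor",
      uflist = list(uz = function(n) runif(n),
                    ux = function(n) runif(n),
                    uy = function(n) runif(n)),
      vflist = list(z = function(uz) as.numeric(uz < 0.4),
                    x = function(ux, z) as.numeric(ux < 0.2 + 0.5 * z),
                    y = function(uy, z, x) as.numeric(uy < 0.1 + 0.4 * z + 0.4 * x)))
    backdoor$simulate(1000)   # simulated data then available in backdoor$simdata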
Flexible framework for ecological restoration planning. It aims to identify priority areas for restoration efforts using optimization algorithms (based on Justeau-Allaire et al. 2021 <doi:10.1111/1365-2664.13803>). Priority areas can be identified by maximizing landscape indices, such as the effective mesh size (Jaeger 2000 <doi:10.1023/A:1008129329289>), or the integral index of connectivity (Pascual-Hortal & Saura 2006 <doi:10.1007/s10980-006-0013-z>). Additionally, constraints can be used to ensure that priority areas exhibit particular characteristics (e.g., ensure that particular places are not selected for restoration, ensure that priority areas form a single contiguous network). Furthermore, multiple near-optimal solutions can be generated to explore multiple options in restoration planning. The package leverages the Choco-solver software to perform optimization using constraint programming (CP) techniques (<https://choco-solver.org/>).
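A sketch of a basic problem formulation; the function names follow the restoptr documentation as we recall them and should be verified:

    library(restoptr)
    # habitat: binary raster of existing habitat (assumed pre-loaded)
    problem <- restopt_problem(existing_habitat = habitat,
                               aggregation_factor = 16,
                               habitat_threshold = 0.7) |>
      set_max_mesh_objective() |>                   # maximize effective mesh size
      add_restorable_constraint(min_restore = 90, max_restore = 110)
    solution <- solve(problem)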