This package provides a computationally efficient way of fitting weighted linear fixed effects estimators for causal inference with various weighting schemes. Weighted linear fixed effects estimators can be used to estimate average treatment effects under different identification strategies, including stratified randomized experiments, matching and stratification for observational studies, first differencing, and difference-in-differences. The package implements methods described in Imai and Kim (2017), "When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?", available at <https://imai.fas.harvard.edu/research/FEmatch.html>.
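A minimal usage sketch, assuming the main fitting function is wfe() with the argument names shown (treat, unit.index, time.index, method, qoi); verify against the package documentation:

    ## Sketch only: argument names are assumptions based on typical usage.
    library(wfe)
    fit <- wfe(y ~ treat + x1 + x2, data = panel_data,
               treat = "treat", unit.index = "id", time.index = "year",
               method = "unit",   # unit fixed effects
               qoi = "ate")       # quantity of interest: average treatment effect
    summary(fit)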
Calculates a number of valuation adjustments, including CVA, DVA, FBA, FCA, MVA and KVA. A two-way margin agreement has been implemented. For the KVA calculation, four regulatory frameworks are supported: CEM, (simplified) SA-CCR, OEM and IMM. The probability of default is implied through the credit spread curve. The package supports an exposure calculation based on SA-CCR, which covers several trade types, and a simulated-path exposure which is currently available only for Interest Rate Swaps. The latest regulatory capital charge methodologies, including BA-CVA and SA-CVA, have been implemented.
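As an illustration of the kind of quantity involved (not the package's API), a unilateral CVA can be computed by hand from a discretised expected-exposure profile, with default probabilities implied from a flat hazard rate; all inputs below are hypothetical:

    ## CVA = (1 - R) * sum_i EE(t_i) * DF(t_i) * dPD(t_i)
    t      <- seq(0.5, 5, by = 0.5)             # exposure dates (years)
    EE     <- 1e6 * exp(-0.2 * t)               # expected exposure profile
    DF     <- exp(-0.02 * t)                    # discount factors, flat 2% rate
    lambda <- 0.015                             # hazard rate implied from credit spreads
    dPD    <- diff(c(0, 1 - exp(-lambda * t)))  # marginal default probabilities
    R      <- 0.4                               # recovery rate
    CVA    <- (1 - R) * sum(EE * DF * dPD)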
Analyze count time series with excess zeros. Two types of statistical models are supported: Markov regression and state-space models. They are also known as observation-driven and parameter-driven models, respectively, in the time series literature. The functions used for Markov regression or observation-driven models can also be used to fit ordinary regression models with independent data under the zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) assumption. The package also contains miscellaneous functions to compute the density, distribution, and quantile functions of, and generate random numbers from, the ZIP and ZINB distributions.
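For intuition, the ZIP density can be written out by hand; the package's own d/p/q/r function names (e.g. dzip(), rzip()) are an assumption to check in its documentation:

    ## ZIP: P(X = 0) = omega + (1 - omega) * exp(-lambda);
    ##      P(X = k) = (1 - omega) * Pois(k; lambda) for k > 0.
    dzip_manual <- function(x, lambda, omega) {
      ifelse(x == 0,
             omega + (1 - omega) * dpois(0, lambda),
             (1 - omega) * dpois(x, lambda))
    }
    dzip_manual(0:5, lambda = 2, omega = 0.3)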
Set of functions to calculate Benthic Biotic Indices (BBI) from composition data, obtained either from morphotaxonomic inventories or from sequencing data. Based on reference ecological weights publicly available for a set of commonly used marine biotic indices, such as AMBI (A Marine Biotic Index, Borja et al., 2000) <doi:10.1016/S0025-326X(00)00061-8>, NSI (Norwegian Sensitivity Index), and ISI (Indicator Species Index) (Rygg 2013, <ISBN:978-82-577-6210-0>). It provides the ecological quality status of the samples based on each BBI as well as the normalized Ecological Quality Ratio.
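For example, the AMBI of a sample follows directly from the percentages of fauna in the five ecological groups (Borja et al., 2000); a hand computation (not the package's API) with hypothetical percentages looks like:

    ## AMBI = (0*%EGI + 1.5*%EGII + 3*%EGIII + 4.5*%EGIV + 6*%EGV) / 100
    eg <- c(EGI = 55, EGII = 20, EGIII = 15, EGIV = 7, EGV = 3)
    ambi <- sum(c(0, 1.5, 3, 4.5, 6) * eg) / 100
    ambi  # lower values indicate a less disturbed benthic community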
Estimate different types of cluster robust standard errors (CR0, CR1, CR2) with degrees-of-freedom adjustments. Standard errors are computed based on Liang and Zeger (1986) <doi:10.1093/biomet/73.1.13> and Bell and McCaffrey (2002) <https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2002002/article/9058-eng.pdf?st=NxMjN1YZ>. Functions used in Huang and Li <doi:10.3758/s13428-021-01627-0>, Huang, Wiedermann, and Zhang <doi:10.1080/00273171.2022.2077290>, and Huang, Zhang, and Li (forthcoming, Journal of Research on Educational Effectiveness).
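The CR0 and CR1 estimators can be illustrated by hand; this sketches the method, not the package's own function names:

    ## Cluster-robust sandwich: (X'X)^{-1} [sum_g X_g' u_g u_g' X_g] (X'X)^{-1}
    cr_se <- function(model, cluster) {
      X <- model.matrix(model)
      u <- residuals(model)
      bread <- solve(crossprod(X))
      meat <- Reduce(`+`, lapply(split(seq_along(cluster), cluster), function(idx) {
        sg <- crossprod(X[idx, , drop = FALSE], u[idx])  # cluster score X_g' u_g
        tcrossprod(sg)
      }))
      G <- length(unique(cluster)); n <- nrow(X); k <- ncol(X)
      cr0 <- bread %*% meat %*% bread                    # CR0 (Liang-Zeger)
      cr1 <- cr0 * (G / (G - 1)) * ((n - 1) / (n - k))   # CR1 small-sample adjustment
      sqrt(diag(cr1))
    }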
Create stunning network experiences powered by the G6 JavaScript graph visualisation library <https://g6.antv.antgroup.com/en>. In shiny mode, modify your graph directly from the server function to dynamically interact with nodes and edges. Choose your favorite layout among 20 options. 15 behaviors are available, such as interactive edge creation, collapse-expand, and brush select. 17 plugins improve the user experience, such as a mini-map, toolbars, and grid lines. Customise the look and feel of your graph with comprehensive options for nodes, edges, and more.
An implementation of the Ordered Forest estimator as developed in Lechner & Okasa (2019) <arXiv:1907.02436>. The Ordered Forest flexibly estimates the conditional probabilities of models with ordered categorical outcomes (so-called ordered choice models). In addition to common machine learning algorithms, the orf package provides functions for estimating marginal effects, together with statistical inference, and thus provides output similar to that of standard econometric models for ordered choice. The core forest algorithm relies on the fast C++ forest implementation from the ranger package (Wright & Ziegler, 2017) <arXiv:1508.04409>.
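A typical workflow, based on the package's documented interface (treat argument details as assumptions):

    library(orf)
    ## Y: ordered outcome coded 1, 2, ..., J; X: numeric covariate matrix
    fit  <- orf(X, Y)                   # estimate conditional choice probabilities
    pred <- predict(fit, newdata = X)   # predicted class probabilities
    me   <- margins(fit)                # marginal effects with statistical inference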
The pnd package calculates numerical derivatives through finite-difference approximations, with parallel capabilities and optimal step-size selection to improve accuracy. These functions facilitate efficient computation of derivatives, gradients, Jacobians, and Hessians, allowing more evaluations to reduce both mathematical and machine errors. Designed for compatibility with the numDeriv package, which has not received updates in several years, it introduces advanced features such as computing derivatives of arbitrary order, improving the accuracy of Hessian approximations by avoiding repeated differencing, and parallelising slow functions on Windows, Mac, and Linux.
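A sketch assuming pnd mirrors numDeriv's interface with capitalised function names (Grad(), Hessian()); check the package documentation for the exact signatures:

    library(pnd)
    f <- function(x) sum(sin(x))
    g <- Grad(f, x = c(1, 2, 3))      # gradient with data-driven step sizes
    h <- Hessian(f, x = c(1, 2, 3))   # Hessian, avoiding repeated differencing
    ## For comparison, the numDeriv equivalent: numDeriv::grad(f, c(1, 2, 3))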
This package provides tools for Topological Data Analysis. The package focuses on statistical analysis of persistent homology and density clustering. For that, this package provides an R interface for the efficient algorithms of the C++ libraries GUDHI <https://project.inria.fr/gudhi/software/>, Dionysus <https://www.mrzv.org/software/dionysus/>, and PHAT <https://bitbucket.org/phat-code/phat/>. This package also implements methods from Fasy et al. (2014) <doi:10.1214/14-AOS1252> and Chazal et al. (2015) <doi:10.20382/jocg.v6i2a8> for analyzing the statistical significance of persistent homology features.
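For example, computing the persistence diagram of a noisy circle with the GUDHI backend:

    library(TDA)
    theta <- runif(100, 0, 2 * pi)
    X <- cbind(cos(theta), sin(theta)) + matrix(rnorm(200, sd = 0.05), ncol = 2)
    dg <- ripsDiag(X, maxdimension = 1, maxscale = 2, library = "GUDHI")
    plot(dg[["diagram"]])  # one long-lived 1-cycle corresponds to the circle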
Plant ecologists often need data on plant species "traits", which are scattered among various databases: TR8 contains a set of tools that automatically retrieve some of those functional trait data for plant species from publicly available databases (the Ecological Flora of the British Isles, the LEDA traitbase, Ellenberg values for the Italian Flora, mycorrhizal intensity databases, BROT, PLANTS, and the Jepson Flora Project). The TR8 name, inspired by "car plate" jokes, was chosen because it both recalls the main object of the package and is extremely short to type.
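A minimal retrieval sketch, assuming the main entry point is tr8() taking a vector of species names (see the package help for the exact arguments):

    library(TR8)
    my_traits <- tr8(species_list = c("Abies alba", "Fagus sylvatica"))
    my_traits  # retrieved trait values, one row per species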
This package provides methods for nonparametric modeling of binomial and multinomial success probabilities via the Beta Kernel Process and its extension, the Dirichlet Kernel Process. Supports model fitting, predictive inference with uncertainty quantification, posterior simulation, and visualization in one- and two-dimensional input spaces. The package implements multiple kernel functions (Gaussian, Matern 5/2, and Matern 3/2), and performs hyperparameter optimization using multi-start gradient-based search. Applications include spatial statistics, probabilistic classification, and Bayesian experimental design. For more details, see MacKenzie, Trafalis, and Barker (2014) <doi:10.1002/sam.11241>.
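For reference, the Matern 5/2 kernel mentioned above can be written out by hand; this shows the covariance formula, not the package's API:

    ## k(d) = (1 + sqrt(5) d / l + 5 d^2 / (3 l^2)) * exp(-sqrt(5) d / l)
    matern52 <- function(d, lengthscale) {
      s <- sqrt(5) * d / lengthscale
      (1 + s + s^2 / 3) * exp(-s)
    }
    x <- seq(0, 1, length.out = 5)
    K <- matern52(abs(outer(x, x, "-")), lengthscale = 0.5)  # 5 x 5 kernel matrix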
This package provides methods for learning causal relationships among a set of foreground variables X based on signals from a (potentially much larger) set of background variables Z, which are known non-descendants of X. The confounder blanket learner (CBL) uses sparse regression techniques to simultaneously perform many conditional independence tests, with complementary pairs stability selection to guarantee finite sample error control. CBL is sound and complete with respect to a so-called "lazy oracle", and works with both linear and nonlinear systems. For details, see Watson & Silva (2022) <arXiv:2205.05715>.
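A minimal sketch of the expected call pattern; the signature is an assumption to verify against the package documentation:

    library(cbl)
    ## X: n x d matrix of foreground variables; Z: n x p matrix of background
    ## variables known to be non-descendants of X
    g <- cbl(x = X, z = Z)  # inferred causal relationships among the columns of X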
This package contains different algorithms and construction methods for optimal Latin hypercube designs (LHDs) with flexible sizes. The package is comprehensive in that it can generate maximin distance LHDs, maximum projection LHDs, and orthogonal and nearly orthogonal LHDs. Detailed comparisons and a summary of all the algorithms and construction methods in this package can be found in Hongzhi Wang, Qian Xiao and Abhyuday Mandal (2021) <doi:10.48550/arXiv.2010.09154>. This package is particularly useful in the area of Design and Analysis of Experiments (DAE), and more specifically in the design of computer experiments.
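The defining property of an LHD can be seen in a hand-rolled random construction (the package's optimised constructors improve on this baseline):

    ## Each column is a random permutation of 1..n, so every one-dimensional
    ## projection hits each of the n levels exactly once.
    random_lhd <- function(n, k) sapply(seq_len(k), function(j) sample.int(n))
    random_lhd(n = 8, k = 3)  # an 8-run, 3-factor Latin hypercube design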
Estimation of generalised additive P-spline regression models using the separation of overlapping precision matrices (SOP) method. Estimation is based on the equivalence between P-splines and linear mixed models, and variance/smoothing parameters are estimated via restricted maximum likelihood (REML). The package enables users to estimate P-spline models with overlapping penalties. Based on the work described in Rodriguez-Alvarez et al. (2015) <doi:10.1007/s11222-014-9464-2>, Rodriguez-Alvarez et al. (2019) <doi:10.1007/s11222-018-9818-2>, and Eilers and Marx (1996) <doi:10.1214/ss/1038425655>.
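A fitting sketch, assuming the main function is sop() with smooth terms declared via f(); treat the syntax as an assumption and consult the package help:

    library(SOP)
    fit <- sop(formula = y ~ f(x, nseg = 20), data = dat)  # REML-based smoothing
    summary(fit)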
The Truncated Factor Model is a statistical model designed to handle specific data structures in data analysis. This R package focuses on the Sparse Online Principal Component Estimation method, which estimates quantities such as the loading matrix and the specific variance matrix for truncated data, thereby better explaining the relationship between common factors and the original variables. Additionally, the package provides other estimation approaches for comparison with the Sparse Online Principal Component Estimation method. The underlying methodology is described in the referenced work (2023) <doi:10.1007/s00180-022-01270-z>.
This package provides an implementation of efficient approximate leave-one-out (LOO) cross-validation for Bayesian models fit using Markov chain Monte Carlo, as described in <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
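Typical usage takes a pointwise log-likelihood matrix (posterior draws by observations) and returns PSIS-LOO estimates:

    library(loo)
    ## log_lik1, log_lik2: S x N log-likelihood matrices from two fitted models
    loo1 <- loo(log_lik1)
    loo2 <- loo(log_lik2)
    print(loo1)              # elpd_loo with standard error and Pareto-k diagnostics
    loo_compare(loo1, loo2)  # difference in expected log predictive density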
Parametric time warping aligns patterns, i.e., it aims to put corresponding features at the same locations. The algorithm searches for an optimal polynomial describing the warping. It is possible to align one sample to a reference, several samples to the same reference, or several samples to several references. One can choose between calculating individual warpings, or one global warping for a set of samples and one reference. Two optimization criteria are implemented: RMS error and WCC. Both warping of peak profiles and of peak lists are supported.
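For example, aligning one sample to a reference with a quadratic warping function (the input vectors here are hypothetical):

    library(ptw)
    ## reference_signal, sample_signal: numeric vectors of peak profiles
    aligned <- ptw(ref = reference_signal, samp = sample_signal,
                   init.coef = c(0, 1, 0))  # start from the identity warp
    plot(aligned)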
Fits boundary line models to datasets as proposed by Webb (1972) <doi:10.1080/00221589.1972.11514472> and makes statistical inferences about their parameters. Provides additional tools for testing datasets for evidence of boundary presence and for selecting initial starting values for model optimization prior to fitting the boundary line models. It also includes tools for conducting post-hoc analyses such as predicting boundary values and identifying the most limiting factor (Miti, Milne, Giller, and Lark (2024) <doi:10.1016/j.fcr.2024.109365>). This ensures a comprehensive analysis for datasets that exhibit upper boundary structures.
Diffusion Weighted Imaging (DWI) is a Magnetic Resonance Imaging modality that measures the diffusion of water in tissues such as the human brain. The package contains R functions to process diffusion-weighted data. The functionality includes diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), modeling for high angular resolution diffusion weighted imaging (HARDI) using Q-ball reconstruction and tensor mixture models, several methods for structural adaptive smoothing including POAS and msPOAS, and streamline fiber tracking for tensor and tensor mixture models. The package provides functionality to manipulate and visualize results in 2D and 3D.
Goodness-of-fit tests for selection of r in the r-largest order statistics (GEVr) model. Goodness-of-fit tests for threshold selection in the Generalized Pareto distribution (GPD). Random number generation and density functions for the GEVr distribution. Profile likelihood for return level estimation using the GEVr and Generalized Pareto distributions. P-value adjustments for sequential, multiple testing error control. Non-stationary fitting of GEVr and GPD. Bader, B., Yan, J. & Zhang, X. (2016) <doi:10.1007/s11222-016-9697-3>. Bader, B., Yan, J. & Zhang, X. (2018) <doi:10.1214/17-AOAS1092>.
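A sketch of the documented workflow; function and argument names are assumptions to verify in the package help:

    library(eva)
    fit_gpd  <- gpdFit(losses, threshold = quantile(losses, 0.95))  # GPD exceedances
    fit_gevr <- gevrFit(block_maxima)  # matrix of r-largest order statistics per block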
For multiscale analysis, this package carries out the ensemble patch transform, its visualization, and multiscale decomposition. The detailed procedure is described in Kim et al. (2020) and Oh and Kim (2020): D. Kim, G. Choi, H.-S. Oh, Ensemble patch transformation: a flexible framework for decomposition and filtering of signal, EURASIP Journal on Advances in Signal Processing 30 (2020) 1-27 <doi:10.1186/s13634-020-00690-7>; H.-S. Oh, D. Kim, Image decomposition by bidimensional ensemble patch transform, Pattern Recognition Letters 135 (2020) 173-179 <doi:10.1016/j.patrec.2020.03.029>.
Estimation of Rosenthal's fail safe number including confidence intervals. The relevant papers are the following. Konstantinos C. Fragkos, Michail Tsagris and Christos C. Frangos (2014). "Publication Bias in Meta-Analysis: Confidence Intervals for Rosenthal's Fail-Safe Number". International Scholarly Research Notices, Volume 2014. <doi:10.1155/2014/825383>. Konstantinos C. Fragkos, Michail Tsagris and Christos C. Frangos (2017). "Exploring the distribution for the estimator of Rosenthal's fail-safe number of unpublished studies in meta-analysis". Communications in Statistics-Theory and Methods, 46(11):5672--5684. <doi:10.1080/03610926.2015.1109664>.
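The quantity itself is simple to compute by hand; the package's contribution is the confidence intervals around it. A hand computation with hypothetical study-level z-scores:

    ## Rosenthal's fail-safe number: N_fs = (sum z_i)^2 / z_alpha^2 - k,
    ## with z_alpha = qnorm(0.95) for a one-sided alpha of 0.05.
    z <- c(2.1, 1.8, 2.5, 1.3, 2.9)
    k <- length(z)
    nfs <- sum(z)^2 / qnorm(0.95)^2 - k
    nfs  # null studies needed to overturn the pooled significance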
This package provides a comprehensive suite for assessing multivariate normality using six statistical tests (Mardia, Henze-Zirkler, Henze-Wagner, Royston, Doornik-Hansen, Energy). Also includes univariate diagnostics, bivariate density visualization, robust outlier detection, power transformations (e.g., Box-Cox, Yeo-Johnson), and imputation strategies ("mean", "median", "mice") for handling missing data. Bootstrap resampling is supported for selected tests to improve p-value accuracy in small samples. Diagnostic plots are available via both ggplot2 and interactive plotly visualizations. See Korkmaz et al. (2014) <https://journal.r-project.org/archive/2014-2/korkmaz-goksuluk-zararsiz.pdf>.
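A typical call; the mvnTest argument name follows older releases of the package, so newer versions may differ:

    library(MVN)
    res <- mvn(data = iris[, 1:4], mvnTest = "hz")  # Henze-Zirkler test
    res$multivariateNormality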
This package provides advanced algorithms for analyzing point cloud data in forestry applications. Key features include fast voxelization of large datasets and segmentation of point clouds into forest floor, understorey, canopy, and wood components. The package enables efficient processing of large-scale forest point cloud data, offering insights into forest structure, connectivity, and fire risk assessment. Input is point cloud data in .xyz format. For more details, see Ferrara & Arrizza (2025) <https://hdl.handle.net/20.500.14243/533471>. For single tree segmentation details, see Ferrara et al. (2018) <doi:10.1016/j.agrformet.2018.04.008>.