This package provides a computationally efficient way of fitting weighted linear fixed effects estimators for causal inference with various weighting schemes. Weighted linear fixed effects estimators can be used to estimate the average treatment effects under different identification strategies. This includes stratified randomized experiments, matching and stratification for observational studies, first differencing, and difference-in-differences. The package implements methods described in Imai and Kim (2017) "When should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?", available at <https://imai.fas.harvard.edu/research/FEmatch.html>.
Calculates a number of valuation adjustments including CVA, DVA, FBA, FCA, MVA and KVA. A two-way margin agreement has been implemented. For the KVA calculation four regulatory frameworks are supported: CEM, (simplified) SA-CCR, OEM and IMM. The probability of default is implied through the credit spreads curve. The package supports an exposure calculation based on SA-CCR which includes several trade types and a simulated path which is currently available only for Interest Rate Swaps. The latest regulatory capital charge methodologies have been implementing including BA-CVA & SA-CVA.
Set of functions to calculate Benthic Biotic Indices from composition data, obtained whether from morphotaxonomic inventories or sequencing data. Based on reference ecological weights publicly available for a set of commonly used marine biotic indices, such as AMBI (A Marine Biotic Index, Borja et al., 2000) <doi:10.1016/S0025-326X(00)00061-8> NSI (Norwegian Sensitivity Index) and ISI (Indicator Species Index) (Rygg 2013, <ISBN:978-82-577-6210-0>). It provides the ecological quality status of the samples based on each BBI as well as the normalized Ecological Quality Ratio.
Estimate different types of cluster robust standard errors (CR0, CR1, CR2) with degrees of freedom adjustments. Standard errors are computed based on Liang and Zeger (1986) <doi:10.1093/biomet/73.1.13> and Bell and McCaffrey <https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2002002/article/9058-eng.pdf?st=NxMjN1YZ>. Functions used in Huang and Li <doi:10.3758/s13428-021-01627-0>, Huang, Wiedermann', and Zhang <doi:10.1080/00273171.2022.2077290>, and Huang, Zhang', and Li (forthcoming: Journal of Research on Educational Effectiveness).
Create stunning network experiences powered by the G6 graph visualisation engine JavaScript library <https://g6.antv.antgroup.com/en>. In shiny mode, modify your graph directly from the server function to dynamically interact with nodes and edges. Select your favorite layout among 20 choices. 15 behaviors are available such as interactive edge creation, collapse-expand and brush select. 17 plugins designed to improve the user experience such as a mini-map, toolbars and grid lines. Customise the look and feel of your graph with comprehensive options for nodes, edges and more.
Calibrate and apply multivariate bias correction algorithms for climate model simulations of multiple climate variables. Three methods described by Cannon (2016) <doi:10.1175/JCLI-D-15-0679.1> and Cannon (2018) <doi:10.1007/s00382-017-3580-6> are implemented -- (i) MBC Pearson correlation (MBCp), (ii) MBC rank correlation (MBCr), and (iii) MBC N-dimensional PDF transform (MBCn) -- as is the Rank Resampling for Distributions and Dependences (R2D2) method. An additional multivariate rescaling method based on the linear Monge-Kantorovich map for Gaussian optimal transport of dependence structure is also included.
Numerical derivatives through finite-difference approximations can be calculated using the pnd package with parallel capabilities and optimal step-size selection to improve accuracy. These functions facilitate efficient computation of derivatives, gradients, Jacobians, and Hessians, allowing for more evaluations to reduce the mathematical and machine errors. Designed for compatibility with the numDeriv package, which has not received updates in several years, it introduces advanced features such as computing derivatives of arbitrary order, improving the accuracy of Hessian approximations by avoiding repeated differencing, and parallelising slow functions on Windows, Mac, and Linux.
Plant ecologists often need to collect "traits" data about plant species which are often scattered among various databases: TR8 contains a set of tools which take care of automatically retrieving some of those functional traits data for plant species from publicly available databases (The Ecological Flora of the British Isles, LEDA traitbase, Ellenberg values for Italian Flora, Mycorrhizal intensity databases, BROT, PLANTS, Jepson Flora Project). The TR8 name, inspired by "car plates" jokes, was chosen since it both reminds of the main object of the package and is extremely short to type.
This package provides tools for Topological Data Analysis. The package focuses on statistical analysis of persistent homology and density clustering. For that, this package provides an R interface for the efficient algorithms of the C++ libraries GUDHI <https://project.inria.fr/gudhi/software/>, Dionysus <https://www.mrzv.org/software/dionysus/>, and PHAT <https://bitbucket.org/phat-code/phat/>. This package also implements methods from Fasy et al. (2014) <doi:10.1214/14-AOS1252> and Chazal et al. (2015) <doi:10.20382/jocg.v6i2a8> for analyzing the statistical significance of persistent homology features.
This package provides an implementation of efficient approximate leave-one-out (LOO) cross-validation for Bayesian models fit using Markov chain Monte Carlo, as described in doi:10.1007/s11222-016-9696-4. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
Parametric time warping aligns patterns. It aims to put corresponding features at the same locations. The algorithm searches for an optimal polynomial describing the warping. It is possible to align one sample to a reference, several samples to the same reference, or several samples to several references. One can choose between calculating individual warpings, or one global warping for a set of samples and one reference. Two optimization criteria are implemented: RMS error and WCC. Both warping of peak profiles and of peak lists are supported.
This package provides methods for learning causal relationships among a set of foreground variables X based on signals from a (potentially much larger) set of background variables Z, which are known non-descendants of X. The confounder blanket learner (CBL) uses sparse regression techniques to simultaneously perform many conditional independence tests, with complementary pairs stability selection to guarantee finite sample error control. CBL is sound and complete with respect to a so-called "lazy oracle", and works with both linear and nonlinear systems. For details, see Watson & Silva (2022) <arXiv:2205.05715>.
This package contains different algorithms and construction methods for optimal Latin hypercube designs (LHDs) with flexible sizes. Our package is comprehensive since it is capable of generating maximin distance LHDs, maximum projection LHDs, and orthogonal and nearly orthogonal LHDs. Detailed comparisons and summary of all the algorithms and construction methods in this package can be found at Hongzhi Wang, Qian Xiao and Abhyuday Mandal (2021) <doi:10.48550/arXiv.2010.09154>. This package is particularly useful in the area of Design and Analysis of Experiments (DAE). More specifically, design of computer experiments.
Generalised additive P-spline regression models estimation using the separation of overlapping precision matrices (SOP) method. Estimation is based on the equivalence between P-splines and linear mixed models, and variance/smoothing parameters are estimated based on restricted maximum likelihood (REML). The package enables users to estimate P-spline models with overlapping penalties. Based on the work described in Rodriguez-Alvarez et al. (2015) <doi:10.1007/s11222-014-9464-2>; Rodriguez-Alvarez et al. (2019) <doi:10.1007/s11222-018-9818-2>, and Eilers and Marx (1996) <doi:10.1214/ss/1038425655>.
Fits boundary line models to datasets as proposed by Webb (1972) <doi:10.1080/00221589.1972.11514472> and makes statistical inferences about their parameters. Provides additional tools for testing datasets for evidence of boundary presence and selecting initial starting values for model optimization prior to fitting the boundary line models. It also includes tools for conducting post-hoc analyses such as predicting boundary values and identifying the most limiting factor (Miti, Milne, Giller, Lark (2024) <doi:10.1016/j.fcr.2024.109365>). This ensures a comprehensive analysis for datasets that exhibit upper boundary structures.
Diffusion Weighted Imaging (DWI) is a Magnetic Resonance Imaging modality, that measures diffusion of water in tissues like the human brain. The package contains R-functions to process diffusion-weighted data. The functionality includes diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), modeling for high angular resolution diffusion weighted imaging (HARDI) using Q-ball-reconstruction and tensor mixture models, several methods for structural adaptive smoothing including POAS and msPOAS, and a streamline fiber tracking for tensor and tensor mixture models. The package provides functionality to manipulate and visualize results in 2D and 3D.
Goodness-of-fit tests for selection of r in the r-largest order statistics (GEVr) model. Goodness-of-fit tests for threshold selection in the Generalized Pareto distribution (GPD). Random number generation and density functions for the GEVr distribution. Profile likelihood for return level estimation using the GEVr and Generalized Pareto distributions. P-value adjustments for sequential, multiple testing error control. Non-stationary fitting of GEVr and GPD. Bader, B., Yan, J. & Zhang, X. (2016) <doi:10.1007/s11222-016-9697-3>. Bader, B., Yan, J. & Zhang, X. (2018) <doi:10.1214/17-AOAS1092>.
For multiscale analysis, this package carries out ensemble patch transform, its visualization and multiscale decomposition. The detailed procedure is described in Kim et al. (2020), and Oh and Kim (2020). D. Kim, G. Choi, H.-S. Oh, Ensemble patch transformation: a flexible framework for decomposition and filtering of signal, EURASIP Journal on Advances in Signal Processing 30 (2020) 1-27 <doi:10.1186/s13634-020-00690-7>. H.-S. Oh, D. Kim, Image decomposition by bidimensional ensemble patch transform, Pattern Recognition Letters 135 (2020) 173-179 <doi:10.1016/j.patrec.2020.03.029>.
Estimation of Rosenthal's fail safe number including confidence intervals. The relevant papers are the following. Konstantinos C. Fragkos, Michail Tsagris and Christos C. Frangos (2014). "Publication Bias in Meta-Analysis: Confidence Intervals for Rosenthal's Fail-Safe Number". International Scholarly Research Notices, Volume 2014. <doi:10.1155/2014/825383>. Konstantinos C. Fragkos, Michail Tsagris and Christos C. Frangos (2017). "Exploring the distribution for the estimator of Rosenthal's fail-safe number of unpublished studies in meta-analysis". Communications in Statistics-Theory and Methods, 46(11):5672--5684. <doi:10.1080/03610926.2015.1109664>.
This tool computes the probability of detection (POD) curve and the limit of detection (LOD), i.e. the number of copies of the target DNA sequence required to ensure a 95 % probability of detection (LOD95). Other quantiles of the LOD can be specified. This is a reimplementation of the mathematical-statistical modelling of the validation of qualitative polymerase chain reaction (PCR) methods within a single laboratory as provided by the commercial tool PROLab <http://quodata.de/>. The modelling itself has been described by Uhlig et al. (2015) <doi:10.1007/s00769-015-1112-9>.
The overall performance of soil ecosystem services and productivity greatly relies on soil health, making it a crucial indicator. The evaluation of soil physical, chemical, and biological parameters is necessary to determine the overall soil quality index. In our package, three commonly used methods, including linear scoring, regression-based, and principal component-based soil quality indexing, are employed to calculate the soil quality index. This package has been developed using concept of Bastida et al. (2008) and Doran and Parkin (1994) <doi:10.1016/j.geoderma.2008.08.007> <doi:10.2136/sssaspecpub35.c1>.
This package provides an interface to a large number of classification and regression techniques. These techniques include machine-readable parameter descriptions. There is also an experimental extension for survival analysis, clustering and general, example-specific cost-sensitive learning. Also included:
Generic resampling, including cross-validation, bootstrapping and subsampling;
Hyperparameter tuning with modern optimization techniques, for single- and multi-objective problems;
Filter and wrapper methods for feature selection;
Extension of basic learners with additional operations common in machine learning, also allowing for easy nested resampling.
Most operations can be parallelized.
Package ACV (short for Affine Cross-Validation) offers an improved time-series cross-validation loss estimator which utilizes both in-sample and out-of-sample forecasting performance via a carefully constructed affine weighting scheme. Under the assumption of stationarity, the estimator is the best linear unbiased estimator of the out-of-sample loss. Besides that, the package also offers improved versions of Diebold-Mariano and Ibragimov-Muller tests of equal predictive ability which deliver more power relative to their conventional counterparts. For more information, see the accompanying article Stanek (2021) <doi:10.2139/ssrn.3996166>.
This package provides a comprehensive suite for assessing multivariate normality using six statistical tests (Mardia, Henzeâ Zirkler, Henzeâ Wagner, Royston, Doornikâ Hansen, Energy). Also includes univariate diagnostics, bivariate density visualization, robust outlier detection, power transformations (e.g., Boxâ Cox, Yeoâ Johnson), and imputation strategies ("mean", "median", "mice") for handling missing data. Bootstrap resampling is supported for selected tests to improve p-value accuracy in small samples. Diagnostic plots are available via both ggplot2 and interactive plotly visualizations. See Korkmaz et al. (2014) <https://journal.r-project.org/articles/RJ-2014-031/RJ-2014-031.pdf>.