This package is a port of the S+ "Robust Library". It provides methods for robust statistics, notably for robust regression and robust multivariate analysis.
Robustness -- eXperimental
', eXtraneous
', or eXtraordinary
Functionality for Robust Statistics. Hence methods which are not well established, often related to methods in package robustbase'. Amazingly, BACON()
', originally by Billor, Hadi, and Velleman (2000) <doi:10.1016/S0167-9473(99)00101-2> has become established in places. The "barrow wheel" `rbwheel()
` is from Stahel and Mächler (2009) <doi:10.1111/j.1467-9868.2009.00706.x>.
Computationally efficient tool for performing variable selection and obtaining robust estimates, which implements robust variable selection procedure proposed by Wang, X., Jiang, Y., Wang, S., Zhang, H. (2013) <doi:10.1080/01621459.2013.766613>. Users can enjoy the near optimal, consistent, and oracle properties of the procedures.
Inference for the treatment effect with possibly invalid instrumental variables via TSHT('Guo et al. (2016) <arXiv:1603.05224>
) and SearchingSampling('Guo
(2021) <arXiv:2104.06911>
), which are effective for both low- and high-dimensional covariates and instrumental variables; test of endogeneity in high dimensions ('Guo et al. (2016) <arXiv:1609.06713>
).
Robust methods for high-dimensional data, in particular linear model selection techniques based on least angle regression and sparse regression. Specifically, the package implements robust least angle regression (Khan, Van Aelst & Zamar, 2007; <doi:10.1198/016214507000000950>), (robust) groupwise least angle regression (Alfons, Croux & Gelper, 2016; <doi:10.1016/j.csda.2015.02.007>), and sparse least trimmed squares regression (Alfons, Croux & Gelper, 2013; <doi:10.1214/12-AOAS575>).
Robust mixture discriminant analysis (RMDA), proposed in Bouveyron & Girard, 2009 <doi:10.1016/j.patcog.2009.03.027>, allows to build a robust supervised classifier from learning data with label noise. The idea of the proposed method is to confront an unsupervised modeling of the data with the supervised information carried by the labels of the learning data in order to detect inconsistencies. The method is able afterward to build a robust classifier taking into account the detected inconsistencies into the labels.
Robust tests (RW and RF) are provided for testing the equality of two long-tailed symmetric (LTS) means when the variances are unknown and arbitrary. RW test is a robust version of Welch's two sample t test and the RF is a robust fiducial based test. The RW and RF tests are proposed using the adaptive modified maximum likelihood (AMML) estimators derived by Tiku and Surucu (2009) <doi:10.1016/j.spl.2008.12.001> and Donmez (2010) <https://open.metu.edu.tr/bitstream/handle/11511/19440/index.pdf>.
Outliers virtually exist in any datasets of any application field. To avoid the impact of outliers, we need to use robust estimators. Classical estimators of multivariate mean and covariance matrix are the sample mean and the sample covariance matrix. Outliers will affect the sample mean and the sample covariance matrix, and thus they will affect the classical factor analysis which depends on the classical estimators (Pison, G., Rousseeuw, P.J., Filzmoser, P. and Croux, C. (2003) <doi:10.1016/S0047-259X(02)00007-6>). So it is necessary to use the robust estimators of the sample mean and the sample covariance matrix. There are several robust estimators in the literature: Minimum Covariance Determinant estimator, Orthogonalized Gnanadesikan-Kettenring, Minimum Volume Ellipsoid, M, S, and Stahel-Donoho. The most direct way to make multivariate analysis more robust is to replace the sample mean and the sample covariance matrix of the classical estimators to robust estimators (Maronna, R.A., Martin, D. and Yohai, V. (2006) <doi:10.1002/0470010940>) (Todorov, V. and Filzmoser, P. (2009) <doi:10.18637/jss.v032.i03>), which is our choice of robust factor analysis. We created an object oriented solution for robust factor analysis based on new S4 classes.
Linear regression functions using Huber and bisquare psi functions. Optimal weights are calculated using IRLS algorithm.
Collection of methods for robust covariance and (sparse) precision matrix estimation based on Loh and Tan (2018) <doi:10.1214/18-EJS1427>.
R functions for the computation of the truncated maximum likelihood and the robust accelerated failure time regression for gaussian and log-Weibull case.
This package implements the Robust Scoring Equations estimator to fit linear mixed effects models robustly. Robustness is achieved by modification of the scoring equations combined with the Design Adaptive Scale approach.
Testing homogeneity for generalized exponential tilt model. This package includes a collection of functions for (1) implementing methods for testing homogeneity for generalized exponential tilt model; and (2) implementing existing methods under comparison.
This provides a robust estimator for stochastic frontier models, employing the Minimum Density Power Divergence Estimator (MDPDE) for enhanced robustness against outliers. Additionally, it includes a function to recommend the optimal tuning parameter, alpha, which controls the robustness of the MDPDE. The methods implemented in this package are based on Song et al. (2017) <doi:10.1016/j.csda.2016.08.005>.
Data sets are often corrupted by outliers. When data are multivariate outliers can be classified as case-wise or cell-wise. The latters are particularly challenge to handle. We implement a robust estimation procedure for Seemingly Unrelated Regression Models which is able to cope well with both type of outliers. Giovanni Saraceno, Fatemah Alqallaf, Claudio Agostinelli (2021) <doi:10.48550/arXiv.2107.00975>
.
Implementation of corrected two-sample tests. A corrected version of the Pearson and Kendall correlation tests, the Mann-Whitney (Wilcoxon) rank sum test, the Wilcoxon signed rank test and a variance test are implemented. The package also proposes a test for the median and an independence test between two continuous variables of Kolmogorov-Smirnov's type. All these corrected tests are asymptotically calibrated in the sense that the probability of rejection under the null hypothesis is asymptotically equal to the level of the test. See <doi:10.48550/arXiv.2211.08784>
for more details on the statistical tests.
This package provides a collection of functions to compute the Rao-Stirling diversity index (Porter and Rafols, 2009) <DOI:10.1007/s11192-008-2197-2> and its extension to acknowledge missing data (i.e., uncategorized references) by calculating its interval of uncertainty using mathematical optimization as proposed in Calatrava et al. (2016) <DOI:10.1007/s11192-016-1842-4>. The Rao-Stirling diversity index is a well-established bibliometric indicator to measure the interdisciplinarity of scientific publications. Apart from the obligatory dataset of publications with their respective references and a taxonomy of disciplines that categorizes references as well as a measure of similarity between the disciplines, the Rao-Stirling diversity index requires a complete categorization of all references of a publication into disciplines. Thus, it fails for a incomplete categorization; in this case, the robust extension has to be used, which encodes the uncertainty caused by missing bibliographic data as an uncertainty interval. Classification / ACM - 2012: Information systems ~ Similarity measures, Theory of computation ~ Quadratic programming, Applied computing ~ Digital libraries and archives.
This package analyzes data with robust methods such as regression methodology including model selections and multivariate statistics.
Bayesian robust fitting of linear mixed effects models through weighted likelihood equations and approximate Bayesian computation as proposed by Ruli et al. (2017) <arXiv:1706.01752>
.
This package implements two-sample tests for paired data with missing values (Fong, Huang, Lemos and McElrath
2018, Biostatics, <doi:10.1093/biostatistics/kxx039>) and modified Wilcoxon-Mann-Whitney two sample location test, also known as the Fligner-Policello test.
Robust parameter estimation and prediction of Gaussian stochastic process emulators. It allows for robust parameter estimation and prediction using Gaussian stochastic process emulator. It also implements the parallel partial Gaussian stochastic process emulator for computer model with massive outputs See the reference: Mengyang Gu and Jim Berger, 2016, Annals of Applied Statistics; Mengyang Gu, Xiaojing Wang and Jim Berger, 2018, Annals of Statistics.
Robust inference methods for fixed-effect and random-effects models of meta-analysis are implementable. The robust methods are developed using the density power divergence that is a robust estimating criterion developed in machine learning theory, and can effectively circumvent biases and misleading results caused by influential outliers. The density power divergence is originally introduced by Basu et al. (1998) <doi:10.1093/biomet/85.3.549>, and the meta-analysis methods are developed by Noma et al. (2022) <forthcoming>.
An implementation of easy tools for outlier robust inference in two-stage least squares (2SLS) models. The user specifies a reference distribution against which observations are classified as outliers or not. After removing the outliers, adjusted standard errors are automatically provided. Furthermore, several statistical tests for the false outlier detection rate can be calculated. The outlier removing algorithm can be iterated a fixed number of times or until the procedure converges. The algorithms and robust inference are described in more detail in Jiao (2019) <https://drive.google.com/file/d/1qPxDJnLlzLqdk94X9wwVASptf1MPpI2w/view>
.
This package provides functions for fitting a linear regression model with ARIMA errors using a filtered tau-estimate. The methodology is described in Maronna et al (2017, ISBN:9781119214687).