Based on different statistical definitions of discrimination, several methods have been proposed to detect and mitigate social inequality in machine learning models. This package provides an alternative approach to fairness treatment in predictive models. The ROC method implemented in this package is described by Kamiran, Karim and Zhang (2012) <https://ieeexplore.ieee.org/document/6413831/>.
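As a rough illustration of the reject option idea behind this kind of post-processing (function and variable names below are illustrative, not the package's API): predictions whose posterior probabilities fall close to the decision boundary are relabelled in favour of the protected group.

    # Illustrative sketch of reject-option-style relabelling (not the package API).
    # Predictions the model is unsure about (max(p, 1 - p) <= theta) are relabelled:
    # members of the protected group receive the favourable label, others do not.
    reject_option_relabel <- function(prob, protected, theta = 0.6) {
      pred <- ifelse(prob >= 0.5, 1, 0)            # default decision rule
      critical <- pmax(prob, 1 - prob) <= theta    # uncertain predictions
      pred[critical &  protected] <- 1             # favour the protected group
      pred[critical & !protected] <- 0             # withhold the favourable label otherwise
      pred
    }

    set.seed(1)
    prob <- runif(10)                     # hypothetical posterior probabilities
    protected <- rbinom(10, 1, 0.5) == 1  # hypothetical group membership
    reject_option_relabel(prob, protected)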
Preprocess numeric data matrices into desired transformed representations. Standardization, Unitization, Cubitization and adaptive intervals are offered.
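Two of these transformations can be sketched in base R as follows (the package's own formulas may differ; "unitization" is taken here as centring on the mean and dividing by the range, one common convention):

    # Minimal base R sketch (not the package's functions).
    X <- matrix(rnorm(20), nrow = 5)

    # Standardization: centre each column on its mean, scale by its standard deviation.
    X_std <- scale(X, center = TRUE, scale = TRUE)

    # Unitization (one common convention): centre on the mean and divide by the range,
    # so every column spans an interval of length one.
    X_unit <- apply(X, 2, function(x) (x - mean(x)) / diff(range(x)))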
Multi-state models are essential tools in longitudinal data analysis. One primary goal of these models is the estimation of transition probabilities, a critical metric for predicting clinical prognosis across various stages of diseases or medical conditions. Traditionally, inference in multi-state models relies on the Aalen-Johansen (AJ) estimator, which is consistent under the Markov assumption. However, in many practical applications, the Markovian nature of the process is often not guaranteed, limiting the applicability of the AJ estimator in more complex scenarios. This package extends the landmark Aalen-Johansen estimator (Putter and Spitoni (2018) <doi:10.1177/0962280216674497>) by incorporating presmoothing techniques described by Soutinho, Meira-Machado and Oliveira (2020) <doi:10.1080/03610918.2020.1762895>, offering a robust alternative for estimating transition probabilities in non-Markovian multi-state models with multiple states and potential reversible transitions.
This package provides a library of core pre-processing and normalization routines.
Consider a linear predictive regression setting with a potentially large set of candidate predictors. This package is concerned with detecting the presence of out-of-sample predictability based on the out-of-sample mean squared error comparisons given in Gonzalo and Pitarakis (2023) <doi:10.1016/j.ijforecast.2023.10.005>.
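The core idea of an out-of-sample MSE comparison can be illustrated with a simple recursive forecasting loop (a sketch of the general principle only, not the test statistic of Gonzalo and Pitarakis (2023); all names and the simulated data are illustrative):

    # Compare recursive out-of-sample forecasts from a benchmark (mean-only) model
    # and a model augmented with a candidate predictor.
    set.seed(123)
    n <- 200
    x <- rnorm(n)                       # candidate predictor
    y <- 0.3 * c(0, x[-n]) + rnorm(n)   # target driven by lagged x
    split <- 100                        # end of the initial estimation window

    e0 <- e1 <- numeric(n - split)
    for (t in split:(n - 1)) {
      fit0 <- lm(y[2:t] ~ 1)                     # benchmark: constant mean
      fit1 <- lm(y[2:t] ~ x[1:(t - 1)])          # augmented: lagged predictor
      e0[t - split + 1] <- y[t + 1] - coef(fit0)[1]
      e1[t - split + 1] <- y[t + 1] - (coef(fit1)[1] + coef(fit1)[2] * x[t])
    }

    mse_diff <- mean(e0^2) - mean(e1^2)  # > 0 suggests out-of-sample predictability
    mse_diff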
An implementation of the reliability estimation methods described by Bosnic and Kononenko (2008) <doi:10.1007/s10489-007-0084-9>, which allow you to estimate the reliability of an individual prediction made by your model and prediction function. The package also lets you run a correlation test to determine which reliability estimate is the most accurate for your model.
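One of the simplest reliability estimates of this kind is the variance of predictions across bootstrap models (a sketch of the general idea only; the paper's estimators and the package's interface may differ, and the function below is hypothetical):

    # Reliability of a single prediction as the variance of predictions from
    # models fitted on bootstrap resamples (a "bagged variance" style estimate).
    bagged_variance <- function(data, newdata, B = 50) {
      preds <- replicate(B, {
        boot <- data[sample(nrow(data), replace = TRUE), ]
        fit  <- lm(y ~ x, data = boot)
        predict(fit, newdata = newdata)
      })
      var(preds)   # higher variance -> less reliable prediction
    }

    set.seed(42)
    d <- data.frame(x = rnorm(100))
    d$y <- 2 * d$x + rnorm(100)
    bagged_variance(d, newdata = data.frame(x = 1.5))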
This package provides a set of functions useful when evaluating the results of presence-absence models. The package includes functions for calculating threshold-dependent measures such as confusion matrices, percent correctly classified (PCC), sensitivity, specificity, and Kappa, and produces plots of each measure as the threshold is varied. It can calculate the optimal threshold according to a choice of optimization criteria. It also includes functions to plot the threshold-independent ROC curve along with the associated AUC (area under the curve).
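For reference, the threshold-dependent measures can be computed directly from observed presences/absences and predicted probabilities (a base R sketch with made-up data, independent of the package's functions):

    obs  <- c(1, 0, 1, 1, 0, 0, 1, 0)          # observed presence/absence (hypothetical)
    prob <- c(0.9, 0.4, 0.7, 0.2, 0.1, 0.6, 0.8, 0.3)
    threshold <- 0.5
    pred <- as.integer(prob >= threshold)

    cm <- table(observed = obs, predicted = pred)   # confusion matrix
    sensitivity <- cm["1", "1"] / sum(cm["1", ])    # true positives / observed presences
    specificity <- cm["0", "0"] / sum(cm["0", ])    # true negatives / observed absences
    pcc <- sum(diag(cm)) / sum(cm)                  # proportion correctly classified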
This package provides a selection of tools that make it easier to place elements onto a (base R) plot exactly where you want them. It allows users to identify points and distances on a plot in terms of inches, pixels, margin lines, data units, and proportions of the plotting space, all more simply than by manipulating par().
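Base R itself exposes the coordinate conversions that this kind of helper builds on; for example, grconvertX() and grconvertY() map proportions of the plotting region ("npc") to data units (a plain base R example, not this package's interface):

    plot(1:10, rnorm(10))

    # Place a label 10% in from the left edge and 90% up the plot region,
    # regardless of the data ranges, by converting from "npc" to user coordinates.
    text(x = grconvertX(0.10, from = "npc", to = "user"),
         y = grconvertY(0.90, from = "npc", to = "user"),
         labels = "top-left label", adj = 0)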
A common problem faced by journal reviewers and authors is whether the results of a replication study are consistent with the original published study. One solution to this problem is to examine the effect size from the original study and generate the range of effect sizes that could reasonably be obtained (due to random sampling) in a replication attempt, i.e., to calculate a prediction interval. This package has functions that calculate the prediction interval for the correlation (i.e., r), the standardized mean difference (i.e., d-value), and the mean.
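For the correlation, one widely used construction transforms r to Fisher's z, combines the sampling variability of the original and replication studies, and back-transforms the limits (a sketch of that standard approach; the package's exact formulas may differ, and the function name is illustrative):

    # Prediction interval for a replication correlation via Fisher's z.
    pi_r <- function(r_orig, n_orig, n_rep, level = 0.95) {
      z    <- atanh(r_orig)                         # Fisher z transform of the original r
      se   <- sqrt(1 / (n_orig - 3) + 1 / (n_rep - 3))
      crit <- qnorm(1 - (1 - level) / 2)
      tanh(c(lower = z - crit * se, upper = z + crit * se))
    }

    pi_r(r_orig = 0.40, n_orig = 80, n_rep = 120)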
Dynamically generate headers or R code within Rmd files to prevent a proliferation of Rmd files for similar reports. Also embed an external HTML document within an rmarkdown-rendered HTML document.
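As a rough illustration of the general pattern (generic knitr and htmltools calls, not necessarily this package's functions, and "extra_section.html" is a hypothetical file): headers can be generated programmatically inside a chunk with results = "asis", and an external HTML file can be embedded with htmltools::includeHTML().

    # Inside an Rmd chunk with results = "asis": expand a header template per region.
    regions <- c("North", "South")
    for (r in regions) {
      cat(knitr::knit_expand(text = "\n## Sales report for {{region}}\n", region = r))
    }

    # Embed an external HTML file in the rendered document.
    htmltools::includeHTML("extra_section.html")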
In this record linkage package, data preprocessing covers a wide range of datasets, standardizing variable names through a set of synonyms. This facilitates seamless data integration and analysis across datasets. Users retain the flexibility to modify variable names, but the system only permits changes that do not compromise data consistency or the essential meaning of a variable.
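The idea of synonym-based standardization can be illustrated with a simple lookup table that maps common variable-name variants to a canonical name (an illustrative base R sketch with hypothetical names, not the package's interface):

    # Map synonym column names to canonical names before linkage.
    synonyms <- c(surname = "last_name", lastname = "last_name",
                  dob = "birth_date", birthdate = "birth_date")

    standardize_names <- function(df) {
      hits <- names(df) %in% names(synonyms)
      names(df)[hits] <- synonyms[names(df)[hits]]
      df
    }

    records <- data.frame(surname = c("Silva", "Costa"),
                          dob = c("1980-01-01", "1975-06-30"))
    names(standardize_names(records))   # "last_name" "birth_date"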