This package implements multiple performance measures for supervised learning. It includes over 40 measures for regression and classification. Additionally, meta information about the performance measures can be queried, e.g. what the best and worst possible performance scores are.
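A minimal sketch of both uses, assuming access through mlr3's msr() shorthand (the direct mlr3measures functions operate on factor vectors):

    library(mlr3)

    # Look up a measure and query its meta information.
    m <- msr("classif.acc")
    m$range     # best and worst possible scores, here [0, 1]
    m$minimize  # FALSE: higher accuracy is better

    # Measures can also be applied directly to vectors via mlr3measures.
    mlr3measures::acc(
      truth    = factor(c("a", "b", "a")),
      response = factor(c("a", "b", "b"))
    )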
mlr3learners extends mlr3 and mlr3proba with interfaces to essential machine learning packages on CRAN. This includes, but is not limited to: (penalized) linear and logistic regression, linear and quadratic discriminant analysis, k-nearest neighbors, naive Bayes, support vector machines, and gradient boosting.
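For example, attaching the package registers the additional learners in the mlr3 dictionary, after which they behave like any built-in learner (a sketch using the support vector machine interface):

    library(mlr3)
    library(mlr3learners)

    task    <- tsk("sonar")        # example binary classification task
    learner <- lrn("classif.svm")  # SVM interface provided by mlr3learners

    learner$train(task)
    learner$predict(task)$score(msr("classif.acc"))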
Integrates fairness auditing and bias mitigation methods for the mlr3 ecosystem. This includes fairness metrics, reporting tools, visualizations, and bias mitigation techniques such as "Reweighing" described in Kamiran and Calders (2012) <doi:10.1007/s10115-011-0463-8> and "Equalized Odds" described in Hardt et al. (2016) <https://papers.nips.cc/paper/2016/file/9d2682367c3935defcb1f9e247a97c0d-Paper.pdf>. Integration with mlr3 allows for auditing of ML models as well as convenient joint tuning of machine learning algorithms and debiasing methods.
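A sketch of the intended workflow, assuming the "adult_train" example task, the "fairness.tpr" measure key, and the "reweighing_wts" PipeOp from the package documentation (keys may differ between versions):

    library(mlr3)
    library(mlr3fairness)
    library(mlr3pipelines)

    # Example task shipped with mlr3fairness; the protected attribute
    # is stored in the "pta" column role.
    task <- tsk("adult_train")

    # Audit: compare true positive rates across protected groups.
    learner <- lrn("classif.rpart")
    learner$train(task)
    learner$predict(task)$score(msr("fairness.tpr"), task = task)

    # Mitigate: prepend the "Reweighing" PipeOp to the learner.
    graph <- po("reweighing_wts") %>>% lrn("classif.rpart")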
This package implements methods for post-hoc analysis and visualisation of benchmark experiments, for mlr3 and beyond.
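A minimal sketch, assuming the as_benchmark_aggr() converter and the autoplot types described in the package documentation:

    library(mlr3)
    library(mlr3benchmark)
    library(ggplot2)  # for the autoplot generic

    # Run a small benchmark experiment with mlr3 ...
    design <- benchmark_grid(
      tasks       = tsks(c("sonar", "spam")),
      learners    = lrns(c("classif.rpart", "classif.featureless")),
      resamplings = rsmp("cv", folds = 3)
    )
    bmr <- benchmark(design)

    # ... then convert it for post-hoc analysis.
    ba <- as_benchmark_aggr(bmr, measures = msr("classif.ce"))
    ba$friedman_test()           # global test for performance differences
    autoplot(ba, type = "mean")  # aggregated performance plot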
Extends the mlr3 package with a connector to the package 'batchtools'. This allows running large-scale benchmark experiments on high-performance computing clusters managed by a batch scheduler.
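A sketch of the round trip from mlr3 design to cluster jobs and back (registry path and seed are placeholders):

    library(mlr3)
    library(mlr3batchmark)
    library(batchtools)

    # Define the experiments with mlr3 ...
    design <- benchmark_grid(tsk("sonar"), lrn("classif.rpart"),
                             rsmp("cv", folds = 3))

    # ... and map them onto a batchtools registry for the scheduler.
    reg <- makeExperimentRegistry(file.dir = "registry", seed = 1)
    batchmark(design, reg = reg)
    submitJobs(reg = reg)

    # After the jobs finish, collect everything into a BenchmarkResult.
    bmr <- reduceResultsBatchmark(reg = reg)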
This package implements the Successive Halving and Hyperband optimization algorithms for the mlr3 ecosystem. The implementation in mlr3hyperband features improved scheduling and parallelizes the evaluation of configurations. The package includes tuners for hyperparameter optimization in mlr3tuning and optimizers for black-box optimization in bbotk.
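A minimal tuning sketch, assuming the documented pattern of tagging one hyperparameter as the "budget" that Hyperband increases over its brackets:

    library(mlr3)
    library(mlr3learners)
    library(mlr3tuning)
    library(mlr3hyperband)
    library(paradox)

    # The number of boosting rounds serves as the budget parameter.
    learner <- lrn("classif.xgboost",
      nrounds = to_tune(p_int(1, 128, tags = "budget")),
      eta     = to_tune(1e-4, 1, logscale = TRUE)
    )

    instance <- tune(
      tuner      = tnr("hyperband", eta = 2),
      task       = tsk("sonar"),
      learner    = learner,
      resampling = rsmp("cv", folds = 3),
      measures   = msr("classif.ce")
    )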
mlr3pipelines enriches mlr3 with a diverse set of pipelining operators (PipeOps) that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as mlr3 Learners and can therefore be resampled, benchmarked, and tuned.
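For example, a preprocessing pipeline composed with the %>>% operator and wrapped as a Learner (a minimal sketch):

    library(mlr3)
    library(mlr3pipelines)

    # Impute missing values, scale features, then fit a decision tree.
    graph <- po("imputemean") %>>% po("scale") %>>% lrn("classif.rpart")

    # A Graph wrapped as a Learner behaves like any other mlr3 Learner.
    graph_learner <- as_learner(graph)
    rr <- resample(tsk("pima"), graph_learner, rsmp("cv", folds = 3))
    rr$aggregate(msr("classif.ce"))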
A supervised learning algorithm inputs a train set and outputs a prediction function, which can be used on a test set. If each data point belongs to a subset (such as geographic region, year, etc.), then how do we know if subsets are similar enough that we can get accurate predictions on one subset after training on Other subsets? And how do we know if training on All subsets would improve prediction accuracy, relative to training on the Same subset? SOAK, Same/Other/All K-fold cross-validation <doi:10.48550/arXiv.2410.08643>, can be used to answer these questions by fixing a test subset, training models on Same/Other/All subsets, and then comparing test error rates (Same versus Other and Same versus All). The package also provides code for estimating how many train samples are required to get accurate predictions on a test set.
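A hedged sketch of how this might look, assuming the SOAK resampling is registered as "same_other_sizes_cv" and that loading the package enables a "subset" column role, as described in its documentation (the task and subset column below are hypothetical stand-ins):

    library(mlr3)
    library(mlr3resampling)

    # Declare which column defines the subsets (e.g. geographic region).
    task <- tsk("german_credit")
    task$set_col_roles("job", add_to = "subset")

    # SOAK: for each test subset, train on Same/Other/All subsets
    # and compare the resulting test error rates.
    soak <- rsmp("same_other_sizes_cv")
    rr <- resample(task, lrn("classif.rpart"), soak)
    rr$score(msr("classif.ce"))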
An implementation of the Super Learner prediction algorithm from van der Laan, Polley, and Hubbard (2007) <doi:10.2202/1544-6115.1309> using the mlr3 framework.
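The call below is a hedged sketch of the advertised interface, assuming a main entry point mlr3superlearner() with data, target, library, and outcome_type arguments as in the package README (argument names may differ between versions):

    library(mlr3superlearner)
    library(mlr3learners)  # provides the candidate learners named below

    # Stack a small library of candidate learners; the ensemble weights
    # are estimated by cross-validation as in van der Laan et al. (2007).
    fit <- mlr3superlearner(
      data         = mtcars,
      target       = "mpg",
      library      = c("mean", "glm", "ranger"),
      outcome_type = "continuous"
    )
    predict(fit, mtcars)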
This package is a collection of search spaces for hyperparameter optimization in the mlr3 ecosystem. It features ready-to-use search spaces for many popular machine learning algorithms. The search spaces are from scientific articles and work for a wide range of data sets.
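A minimal sketch, assuming the lts() shorthand and the "classif.ranger.default" search space key from the package documentation:

    library(mlr3)
    library(mlr3learners)
    library(mlr3tuningspaces)

    # Retrieve a published search space and attach it to a learner.
    search_space <- lts("classif.ranger.default")
    learner <- search_space$get_learner()  # ranger with tuning ranges set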
Extends the mlr3 machine learning framework with spatio-temporal resampling methods to account for the presence of spatiotemporal autocorrelation (STAC) in predictor variables. STAC may cause highly biased performance estimates in cross-validation if ignored. A JSS article is available at <doi:10.18637/jss.v111.i07>.
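A minimal sketch, assuming the "ecuador" example task and the "spcv_coords" resampling key from the package documentation:

    library(mlr3)
    library(mlr3spatiotempcv)

    # Example spatial task shipped with the package (landslides in Ecuador).
    task <- tsk("ecuador")

    # Spatial CV: folds are formed by clustering observation coordinates,
    # so train and test sets are spatially separated.
    rr <- resample(task, lrn("classif.rpart"), rsmp("spcv_coords", folds = 4))
    rr$aggregate(msr("classif.ce"))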