This package contains data sets and utilities from Project MOSAIC used to teach mathematics, statistics, computation and modeling. Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.
The tictoc package provides the timing functions tic and toc that can be nested. It provides an alternative to system.time() with a different syntax similar to that in another well-known software package. tic and toc are easy to use, and are especially useful when timing several sections in more than a few lines of code.
This package implements two methods of estimating runs scored in a softball scenario: (1) theoretical expectation using discrete Markov chains and (2) empirical distribution using multinomial random simulation. Scores are based on player-specific input probabilities (out, single, double, triple, walk, and homerun). Optional inputs include probability of attempting a steal, probability of succeeding in an attempted steal, and an indicator of whether a player is "fast" (e.g. the player could stretch home). These probabilities may be calculated from common player statistics that are publicly available on team webpages. Scores are evaluated based on a nine-player lineup and may be used to compare lineups, evaluate base scenarios, and compare the offensive potential of individual players. Manuscript forthcoming. See Bukiet & Harold (1997) <doi:10.1287/opre.45.1.14> for implementation of discrete Markov chains.
Collection of methods for rating matrix completion, which is a statistical framework for recommender systems. Another relevant application is the imputation of rating-scale survey data in the social and behavioral sciences. Note that matrix completion and imputation are synonymous terms used in different streams of the literature. The main functionality implements robust matrix completion for discrete rating-scale data with a low-rank constraint on a latent continuous matrix (Archimbaud, Alfons, and Wilms (2025) <doi:10.48550/arXiv.2412.20802>). In addition, the package provides wrapper functions for softImpute (Mazumder, Hastie, and Tibshirani, 2010, <https://www.jmlr.org/papers/v11/mazumder10a.html>; Hastie, Mazumder, Lee, Zadeh, 2015, <https://www.jmlr.org/papers/v16/hastie15a.html>) for easy tuning of the regularization parameter, as well as benchmark methods such as median imputation and mode imputation.
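The benchmark methods mentioned above can be sketched as follows. This is an illustrative Python version, not the package's code: it assumes imputation is done column by column (per item), that missing ratings are encoded as `None`, and the function name `impute_columns` is hypothetical.

```python
from statistics import median, mode


def impute_columns(ratings, how="median"):
    """Fill None entries of each column with that column's median or mode.

    Assumes every column has at least one observed rating.
    """
    fill = median if how == "median" else mode
    completed = [row[:] for row in ratings]  # copy so the input is untouched
    for j in range(len(ratings[0])):
        observed = [row[j] for row in ratings if row[j] is not None]
        value = fill(observed)
        for row in completed:
            if row[j] is None:
                row[j] = value
    return completed


# Rows are respondents, columns are rated items; None marks a missing rating.
ratings = [
    [5, None, 3],
    [4, 2, None],
    [5, 2, 1],
    [None, 4, 1],
]
by_median = impute_columns(ratings, "median")
by_mode = impute_columns(ratings, "mode")
```

Such baselines ignore any low-rank structure in the rating matrix, which is why they serve only as reference points for the matrix completion methods.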
The functions in this package compute robust estimators by minimizing a kernel-based distance known as MMD (Maximum Mean Discrepancy) between the sample and a statistical model. Recent works proved that these estimators enjoy a universal consistency property, and are extremely robust to outliers. Various optimization algorithms are implemented: stochastic gradient is available for most models, but the package also allows gradient descent in a few models for which an exact formula is available for the gradient. In terms of distribution fit, a large number of continuous and discrete distributions are available: Gaussian, exponential, uniform, gamma, Poisson, geometric, etc. In terms of regression, the models available are: linear, logistic, gamma, beta and Poisson. Alquier, P. and Gerber, M. (2024) <doi:10.1093/biomet/asad031> Cherief-Abdellatif, B.-E. and Alquier, P. (2022) <doi:10.3150/21-BEJ1338>.
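For readers unfamiliar with MMD, a rough sketch of the distance itself may help. The Python code below is not taken from the package: it computes the biased (V-statistic) estimate of squared MMD between two one-dimensional samples, assuming a Gaussian kernel with a fixed bandwidth; the function names and the bandwidth default are assumptions made for illustration. The package minimizes such a distance between the sample and a parametric model, which this sketch does not attempt.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two real numbers."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


sample_a = [0.1, -0.3, 0.2, 0.0, 0.4]
sample_b = [2.1, 1.8, 2.4, 2.0, 1.9]
same = mmd_squared(sample_a, sample_a)      # identical samples
different = mmd_squared(sample_a, sample_b)  # well-separated samples
```

Because the kernel is bounded, a single extreme observation can change each kernel average only by a bounded amount, which gives some intuition for the robustness to outliers noted above.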
ATPOL is a rectangular grid system used for botanical studies in Poland. The ATPOL grid was developed at the Institute of Botany, Jagiellonian University, Krakow, Poland in the 1970s. Since then it has been widely used to represent the distribution of plants in Poland. atpolR provides functions to translate geographic coordinates to the grid and vice versa. It also allows creating a choreograph map.
Allows the user to easily manage the removal and installation of R packages. It offers many functions to display packages installed on specific dates and to remove them if needed. The user is always prompted to confirm the action when running the removal functions. It also provides functions that install GitHub starred R packages, whether or not they are available on CRAN.
Compile inline C code and easily call with automatically generated wrapper functions. By allowing user-defined headers and compilation flags (preprocessor, compiler and linking flags) the user can configure optimization options and linking to third party libraries. Multiple functions may be defined in a single block of code - which may be defined in a string or a path to a source file.
Model-free selection of covariates under unconfoundedness for situations where the parameter of interest is an average causal effect. This package is based on model-free backward elimination algorithms proposed in de Luna, Waernbaum and Richardson (2011). Marginal co-ordinate hypothesis testing is used in situations where all covariates are continuous while kernel-based smoothing appropriate for mixed data is used otherwise.
This package provides a function to query and extract data from the US Energy Information Administration ('EIA') API V2 <https://www.eia.gov/opendata/>. The EIA API provides a variety of information, in a time series format, about the energy sector in the US. The API is open, free, and requires an access key and registration at <https://www.eia.gov/opendata/>.
Systematic fit of hundreds of theoretical univariate distributions to empirical data via maximum likelihood estimation. Fits are reported and summarized by a data.frame, a CSV file or a shiny app (the latter with additional features such as visual representation of fits). All output formats provide assessment of goodness-of-fit by the following methods: Kolmogorov-Smirnov test, Shapiro-Wilk test, Anderson-Darling test.
Allows running gretl (<http://gretl.sourceforge.net/index.html>) programs from R, R Markdown and Quarto. gretl (Gnu Regression, Econometrics and Time-series Library) is statistical software for econometric analysis. This package not only integrates gretl and R but also serves as a gretl knit-engine for the knitr package. Write all your gretl commands in R or R Markdown chunks.
Using overlap grouped-lasso penalties, gamsel selects whether a term in a gam is nonzero, linear, or a non-linear spline (up to a specified max df per variable). It fits the entire regularization path on a grid of values for the overall penalty lambda, both for gaussian and binomial families. See <doi:10.48550/arXiv.1506.03850> for more details.
Simulation, estimation and testing for geopolitical volatility (GEOVOL) based on the global common volatility model of Engle and Campos-Martins (2023) <doi:10.1016/j.jfineco.2022.09.009>. GEOVOL is modelled as a latent multiplicative volatility factor with heterogeneous factor loadings. Estimation is carried out as a maximization-maximization procedure, where GEOVOL and the GEOVOL loadings are estimated iteratively until convergence.
Aligning multiple visualisations by utilising generalised orthogonal Procrustes analysis (GPA) before combining coordinates into a single biplot display as described in Nienkemper-Swanepoel, le Roux and Lubbe (2023) <doi:10.1080/03610918.2021.1914089>. This is mainly suitable to combine visualisations constructed from multiple imputations, however, it can be generalised to combine variations of visualisations from the same datasets (i.e. resamples).
GitHub apps provide a powerful way to manage fine grained programmatic access to specific git repositories, without having to create dummy users, and which are safer than a personal access token for automated tasks. This package extends the gh package to let you authenticate and interact with GitHub <https://docs.github.com/en/rest/overview> in R as an app.
This package provides a key-value store data structure. The keys are integers and the values can be any R object. This is like a list but indexed by a set of integers, not necessarily contiguous and possibly negative. The implementation uses an R6 class. These containers are not faster than lists but their usage can be more convenient in certain situations.
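A container of this kind can be sketched in a few lines. The Python class below is only an analogue of the idea, not the package's R6 class: the class name `IntKeyStore` and its methods `set`, `get` and `keys` are hypothetical, and a plain dictionary is assumed as the backing storage.

```python
class IntKeyStore:
    """Key-value store whose keys must be integers (possibly negative, non-contiguous)."""

    def __init__(self):
        self._items = {}

    def set(self, key, value):
        # bool is a subclass of int in Python, so reject it explicitly
        if not isinstance(key, int) or isinstance(key, bool):
            raise TypeError("keys must be integers")
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def keys(self):
        """Return the keys in increasing order."""
        return sorted(self._items)


store = IntKeyStore()
store.set(-3, "negative key")
store.set(10, [1, 2, 3])
store.set(0, {"any": "object"})
```

Unlike a positional list, nothing is allocated for the gaps between -3, 0 and 10, and a missing key simply returns the default.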
SQL back-end to dplyr for Apache Impala, the massively parallel processing query engine for Apache Hadoop. Impala enables low-latency SQL queries on data stored in the Hadoop Distributed File System (HDFS), Apache HBase, Apache Kudu, Amazon Simple Storage Service (S3), Microsoft Azure Data Lake Store (ADLS), and Dell EMC Isilon. See <https://impala.apache.org> for more information about Impala.
Computes and decomposes Gini, Bonferroni and Zenga 2007 point and synthetic concentration indexes. Three decompositions are supported: by sources, by subpopulations, and by sources and subpopulations jointly. References: Zenga M. M. (2007) <doi:10.1400/209575>, Zenga M. (2015) <doi:10.1400/246627>, Zenga M., Valli I. (2017) <doi:10.26350/999999_000005>, Zenga M., Valli I. (2018) <doi:10.26350/999999_000011>.
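As a point of reference, the synthetic Gini index of a set of non-negative values can be computed as below. This is a generic textbook formula written in Python, not the package's code; the function name `gini` is hypothetical, and the package may use a different normalisation (for example dividing by n - 1 rather than n). The Bonferroni and Zenga indexes and the decompositions are not covered by this sketch.

```python
def gini(values):
    """Gini concentration index: sum_i (2i - n - 1) * x_(i) / (n * sum(x)),
    where x_(1) <= ... <= x_(n) are the sorted non-negative values."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


equal_shares = gini([1, 1, 1, 1])      # no concentration
one_holds_all = gini([0, 0, 0, 1])     # maximal concentration for n = 4
```

With this normalisation the index ranges from 0 (everyone holds the same amount) to (n - 1) / n (one unit holds everything).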
The goal of LCMSQA is to make it easy to check the quality of liquid chromatography/mass spectrometry (LC/MS) experiments using a shiny application. This package provides interactive data visualizations for quality control (QC) samples, including total ion current chromatogram (TIC), base peak chromatogram (BPC), mass spectrum, extracted ion chromatogram (XIC), and feature detection results from internal standards or known metabolites.
This package provides a set of utility functions for analysing and modelling data from continuous report short-term memory experiments using either the 2-component mixture model of Zhang and Luck (2008) <doi:10.1038/nature06860> or the 3-component mixture model of Bays et al. (2009) <doi:10.1167/9.10.7>. Users are also able to simulate from these models.
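Simulating from the 2-component model can be sketched as follows. This Python code is an illustration, not the package's implementation: it assumes response errors in radians, a von Mises distribution centred on the target for remembered items, and a uniform distribution for guesses; the function name `simulate_errors` and the parameter names `p_mem` and `kappa` are hypothetical.

```python
import math
import random


def simulate_errors(n, p_mem, kappa, rng):
    """Draw n response errors (radians, in [-pi, pi)) from a two-component mixture:
    with probability p_mem a von Mises error centred on the target (concentration
    kappa), otherwise a uniform random guess around the circle."""
    errors = []
    for _ in range(n):
        if rng.random() < p_mem:
            angle = rng.vonmisesvariate(0.0, kappa)   # value in [0, 2*pi)
        else:
            angle = rng.uniform(0.0, 2.0 * math.pi)   # pure guess
        # wrap so that an error of 0 means a perfect report
        errors.append((angle + math.pi) % (2.0 * math.pi) - math.pi)
    return errors


rng = random.Random(42)  # seeded for reproducibility
errors = simulate_errors(500, p_mem=0.7, kappa=8.0, rng=rng)
```

The 3-component model adds a further component for responses centred on non-target items, which would need the non-target locations as extra input.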
This package provides an HTML widget rendering the Monaco editor. The Monaco editor is the code editor which powers VS Code. It is particularly well developed for JavaScript. In addition to the built-in features of the Monaco editor, the widget can prettify multiple languages, show the HTML rendering of Markdown code, and display and resize SVG images.
Contains the function to apply the MARMoT balancing technique discussed in Silan, Boccuzzo, Arpino (2021) <DOI:10.1002/sim.9192> and Silan, Belloni, Boccuzzo (2023) <DOI:10.1007/s10260-023-00695-0>. It also contains a function for computing Deloof's approximation of the average rank (including a parallelized version) and a function to compute the Absolute Standardized Bias.
Inbreeding-purging analysis of pedigreed populations, including the computation of the inbreeding coefficient, partial, ancestral and purged inbreeding coefficients, and measures of the opportunity of purging related to the individual reduction of inbreeding load. In addition, functions to calculate the effective population size and other parameters relevant to population genetics are included. See López-Cortegano E. (2021) <doi:10.1093/bioinformatics/btab599>.