Enter the query into the form above. To look for a specific version of a package, use the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items per page. Pagination information (such as the number of pages) is returned
in the response headers.
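As a minimal sketch of the request above, the following Python snippet builds the query URL from the three documented parameters. The host is not given here, so `BASE_URL` is a placeholder assumption, and `build_search_url` is a hypothetical helper name. The names of the pagination headers are not stated either, so the sketch stops at building the URL and does not read them.

```python
from urllib.parse import urlencode

# Assumption: placeholder host. Replace it with the address of the site you are using.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Return the GET URL for /api/packages with the documented parameters."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # urlencode percent-encodes characters that are not safe in a query string.
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


print(build_search_url("hello"))
```

The resulting URL can be passed to any HTTP client. The pagination details would then be read from the response headers.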
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Fast distributed/parallel estimation for multinomial logistic regression via Poisson factorization and the gamlr package. For details see: Taddy (2015, AoAS), Distributed Multinomial Regression, <doi:10.48550/arXiv.1311.6139>.
Create shareable data sets from raw data files that contain protected elements. Relying on master crosswalk files that list restricted variables, package functions warn users about possible violations of data usage agreements and prevent writing protected elements.
This package implements the doubly robust distribution balancing weighting proposed by Katsumata (2024) <doi:10.1017/psrm.2024.23>, which improves the augmented inverse probability weighting (AIPW) by estimating propensity scores with estimating equations suitable for the pre-specified parameter of interest (e.g., the average treatment effects or the average treatment effects on the treated) and estimating outcome models with the estimated inverse probability weights. It also implements the covariate balancing propensity score proposed by Imai and Ratkovic (2014) <doi:10.1111/rssb.12027> and the entropy balancing weighting proposed by Hainmueller (2012) <doi:10.1093/pan/mpr025>, both of which use covariate balancing conditions in propensity score estimation. The point estimate of the parameter of interest and its uncertainty, as well as the coefficients for propensity score estimation and outcome regression, are produced using M-estimation. The same functions can be used to estimate average outcomes in missing outcome cases.
Several quality measurements for investigating the performance of dimensionality reduction methods are provided here. In addition, a new quality measurement called Gabriel classification error is made accessible, which was published in Thrun, M. C., Märte, J., & Stier, Q.: "Analyzing Quality Measurements for Dimensionality Reduction" (2023), Machine Learning and Knowledge Extraction (MAKE), <DOI:10.3390/make5030056>.
A metapackage that brings together a curated collection of R packages containing domain-specific datasets. It includes time series data, educational metrics, crime records, medical datasets, and oncology research data. Designed to provide researchers, analysts, educators, and data scientists with centralized access to structured and well-documented datasets, this metapackage facilitates reproducible research, data exploration, and teaching applications across a wide range of domains. Included packages:
- timeSeriesDataSets: Time series data from economics, finance, energy, and healthcare.
- educationR: Datasets related to education, learning outcomes, and school metrics.
- crimedatasets: Datasets on global and local crime and criminal behavior.
- MedDataSets: Datasets related to medicine, public health, treatments, and clinical trials.
- OncoDataSets: Datasets focused on cancer research, survival, genetics, and biomarkers.
Quality control and formatting tools developed for the Copernicus Data Rescue Service. The package includes functions to handle the Station Exchange Format (SEF), various statistical tests for climate data at daily and sub-daily resolution, as well as functions to plot the data. For more information and documentation see <https://datarescue.climate.copernicus.eu/st_data-quality-control>.
An easy-to-use yet powerful system for plotting grouped data effect sizes. Various types of effect size can be estimated, then plotted together with a representation of the original data. Select from many possible data representations (box plots, violin plots, raw data points, etc.), and combine as desired. Durga plots are implemented in base R, so are compatible with base R methods for combining plots, such as layout(). See Khan & McLean (2023) <doi:10.1101/2023.02.06.526960>.
Data whitening is a widely used preprocessing step to remove correlation structure, since statistical models often assume independence. Here we use a probabilistic model of the observed data to apply a whitening transformation. This Gaussian Inverse Wishart Empirical Bayes model substantially reduces computational complexity and regularizes the eigenvalues of the sample covariance matrix to improve out-of-sample performance.
Automatic generation of finite state machine models of dynamic decision-making that both have strong predictive power and are interpretable in human terms. We use an efficient model representation and a genetic algorithm-based estimation process to generate simple deterministic approximations that explain most of the structure of complex stochastic processes. We have applied the software to empirical data, and demonstrated its ability to recover known data-generating processes by simulating data with agent-based models and correctly deriving the underlying decision models for multiple agent models and degrees of stochasticity.
An R interface to the codediff JavaScript library (a copy of which is included in the package; see <https://github.com/danvk/codediff.js> for information). Allows for visualization of the difference between two files, usually text files or R scripts, in a browser.
The DWD provides gridded radar data for Germany in binary format. dwdradar reads these files and enables a fast conversion into numerical format.
This package provides methods for analyzing the dispersion of tabular datasets with batched and ordered samples. Based on convex hull or integrated covariance Mahalanobis, several indicators are implemented for inter- and intra-batch dispersion analysis. It is designed to facilitate robust statistical assessment of data variability, supporting applications in exploratory data analysis and quality control, for datasets such as those found in metabolomics studies. For more details see Salanon (2024) <doi:10.1016/j.chemolab.2024.105148> and Salanon (2025) <doi:10.1101/2025.08.01.668073>.
An implementation of common higher-order functions with syntactic sugar for anonymous functions. Also provides a link to dplyr and data.table for common transformations on data frames, to work around non-standard evaluation by default.
Implementation of DetMCD, a new algorithm for robust and deterministic estimation of location and scatter. The benefits of robust and deterministic estimation are explained in Hubert, Rousseeuw and Verdonck (2012) <doi:10.1080/10618600.2012.672100>.
Computes the first stage GMM estimate of a dynamic linear model with p lags of the dependent variables.
Datasets and functions that can be used for data analysis practice, homework and projects in data science courses and workshops. 26 datasets are available for case studies in data visualization, statistical inference, modeling, linear regression, data wrangling and machine learning.
Access diverse ggplot2-compatible color palettes for simplified data visualization.
This package provides a concise check of the format of one or multiple input arguments (data type, length or value). Since multiple input arguments can be tested simultaneously, a lengthy list of checks at the beginning of your function can be avoided, thereby enhancing the readability and maintainability of your code.
Find, visualize and explore patterns of differential taxa in vegetation data (namely in a phytosociological table), using the Differential Value (DiffVal). Patterns are searched through mathematical optimization algorithms. Ultimately, Total Differential Value (TDV) optimization aims at obtaining classifications of vegetation data based on differential taxa, as in the traditional geobotanical approach (Monteiro-Henriques 2025, <doi:10.3897/VCS.140466>). The Gurobi optimizer, as well as the R package gurobi, can be installed from <https://www.gurobi.com/products/gurobi-optimizer/>. The useful vignette Gurobi Installation Guide, from package prioritizr, can be found here: <https://prioritizr.net/articles/gurobi_installation_guide.html>.
Transform newswire and earnings call transcripts obtained as PDFs from Nexis Uni into R data frames. Various newswires and FairDisclosure earnings call formats are supported. Further, users can apply several pre-defined dictionaries to the data, based on Graffin et al. (2016) <doi:10.5465/amj.2013.0288> and Gamache et al. (2015) <doi:10.5465/amj.2013.0377>.
Dynamic slicing is a method designed for dependency detection between a categorical variable and a continuous variable. It can be applied to non-parametric hypothesis testing and gene set enrichment analysis.
Efficient Global Optimization (EGO) algorithm as described in "Roustant et al. (2012)" <doi:10.18637/jss.v051.i01> and adaptations for problems with noise ("Picheny and Ginsbourger, 2012") <doi:10.1016/j.csda.2013.03.018>, parallel infill, and problems with constraints.
Efficiently create dummies of all factors and character vectors in a data frame. Support is included for learning the categories on one data set (e.g., a training set) and deploying them on another (e.g., a test set).
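The idea of learning categories on one data set and deploying them on another can be illustrated with a small Python sketch. This is not the package's R interface: `learn_categories` and `make_dummies` are hypothetical helpers written only for this illustration. Giving an unseen category all-zero indicators is an assumption about one reasonable behavior, not a statement about what the package does.

```python
def learn_categories(rows, columns):
    """Record the sorted distinct values of each categorical column."""
    return {col: sorted({row[col] for row in rows}) for col in columns}


def make_dummies(rows, categories):
    """Replace each categorical column with 0/1 indicators for the learned levels."""
    out = []
    for row in rows:
        # Keep non-categorical columns unchanged.
        new_row = {k: v for k, v in row.items() if k not in categories}
        for col, levels in categories.items():
            for level in levels:
                # Assumption: a level unseen during learning gets all zeros.
                new_row[f"{col}_{level}"] = 1 if row[col] == level else 0
        out.append(new_row)
    return out


train = [{"color": "red", "x": 1.0}, {"color": "blue", "x": 2.0}]
test = [{"color": "red", "x": 3.0}, {"color": "green", "x": 4.0}]

categories = learn_categories(train, ["color"])  # learned on the training set only
print(make_dummies(test, categories))
```

Because the indicator columns are fixed by the training set, the test set ends up with the same columns in the same order, even when it lacks a level or contains a new one.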
Implementation of the double/debiased machine learning framework of Chernozhukov et al. (2018) <doi:10.1111/ectj.12097> for partially linear regression models, partially linear instrumental variable regression models, interactive regression models and interactive instrumental variable regression models. DoubleML allows estimation of the nuisance parts in these models by machine learning methods and computation of the Neyman orthogonal score functions. DoubleML is built on top of mlr3 and the mlr3 ecosystem. The object-oriented implementation of DoubleML, based on the R6 package, is very flexible. More information is available in the publication in the Journal of Statistical Software: <doi:10.18637/jss.v108.i03>.