Enter the query into the form above. You can look for specific version of a package by using @ symbol like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is a page number and limit is a number of items on a single page. Pagination information (such as a number of pages and etc) is returned
in response headers.
If you'd like to join our channel webring send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Fast and easy computation of Euclidean Minimum Spanning Trees (EMST) from data, relying on the R API for mlpack - the C++ Machine Learning Library (Curtin et. al., 2013). emstreeR uses the Dual-Tree Boruvka (March, Ram, Gray, 2010, <doi:10.1145/1835804.1835882>), which is theoretically and empirically the fastest algorithm for computing an EMST. This package also provides functions and an S3 method for readily visualizing Minimum Spanning Trees (MST) using either the style of the base', scatterplot3d', or ggplot2 libraries; and functions to export the MST output to shapefiles.
This package performs a compact genetic algorithm search to reduce errors-in-variables bias in linear regression. The algorithm estimates the regression parameters with lower biases and higher variances but mean-square errors (MSEs) are reduced.
An ensemble method for the statistical detection of a rare class in two-class classification problems. The method uses an ensemble of classifiers where the constituent models of the ensemble use disjoint subsets (phalanxes) of explanatory variables. We provide an implementation of the phalanx-formation algorithm. Please see Tomal et al. (2015) <doi:10.1214/14-AOAS778>, Tomal et al. (2016) <doi:10.1021/acs.jcim.5b00663>, and Tomal et al. (2019) <arXiv:1706.06971> for more details.
Analysis of experimental results and automatic report generation in both interactive HTML and LaTeX. This package ships with a rich interface for data modeling and built in functions for the rapid application of statistical tests and generation of common plots and tables with publish-ready quality.
An implementation of 1) the tail pairwise dependence matrix (TPDM) as described in Jiang & Cooley (2020) <doi:10.1175/JCLI-D-19-0413.1> 2) the extremal pattern index (EPI) as described in Szemkus & Friederichs ('Spatial patterns and indices for heatwave and droughts over Europe using a decomposition of extremal dependency'; submitted to ASCMO 2023).
Provide an optimal histogram, in the sense of probability density estimation and features detection, by means of multiscale variational inference. In other words, the resulting histogram servers as an optimal density estimator, and meanwhile recovers the features, such as increases or modes, with both false positive and false negative controls. Moreover, it provides a parsimonious representation in terms of the number of blocks, which simplifies data interpretation. The only assumption for the method is that data points are independent and identically distributed, so it applies to fairly general situations, including continuous distributions, discrete distributions, and mixtures of both. For details see Li, Munk, Sieling and Walther (2016) <arXiv:1612.07216>.
This package provides a variety of methods are provided to estimate and visualize distributional differences in terms of effect sizes. Particular emphasis is upon evaluating differences between two or more distributions across the entire scale, rather than at a single point (e.g., differences in means). For example, Probability-Probability (PP) plots display the difference between two or more distributions, matched by their empirical CDFs (see Ho and Reardon, 2012; <doi:10.3102/1076998611411918>), allowing for examinations of where on the scale distributional differences are largest or smallest. The area under the PP curve (AUC) is an effect-size metric, corresponding to the probability that a randomly selected observation from the x-axis distribution will have a higher value than a randomly selected observation from the y-axis distribution. Binned effect size plots are also available, in which the distributions are split into bins (set by the user) and separate effect sizes (Cohen's d) are produced for each bin - again providing a means to evaluate the consistency (or lack thereof) of the difference between two or more distributions at different points on the scale. Evaluation of empirical CDFs is also provided, with built-in arguments for providing annotations to help evaluate distributional differences at specific points (e.g., semi-transparent shading). All function take a consistent argument structure. Calculation of specific effect sizes is also possible. The following effect sizes are estimable: (a) Cohen's d, (b) Hedges g, (c) percentage above a cut, (d) transformed (normalized) percentage above a cut, (e) area under the PP curve, and (f) the V statistic (see Ho, 2009; <doi:10.3102/1076998609332755>), which essentially transforms the area under the curve to standard deviation units. By default, effect sizes are calculated for all possible pairwise comparisons, but a reference group (distribution) can be specified.
Reads water network simulation data in Epanet text-based .inp and .rpt formats into R. Also reads results from Epanet-msx'. Provides basic summary information and plots. The README file has a quick introduction. See <http://www2.epa.gov/water-research/epanet> for more information on the Epanet software for modeling hydraulic and water quality behavior of water piping systems.
This package performs frequentist inference for the extremal index of a stationary time series. Two types of methodology are used. One type is based on a model that relates the distribution of block maxima to the marginal distribution of series and leads to the semiparametric maxima estimators described in Northrop (2015) <doi:10.1007/s10687-015-0221-5> and Berghaus and Bucher (2018) <doi:10.1214/17-AOS1621>. Sliding block maxima are used to increase precision of estimation. A graphical block size diagnostic is provided. The other type of methodology uses a model for the distribution of threshold inter-exceedance times (Ferro and Segers (2003) <doi:10.1111/1467-9868.00401>). Three versions of this type of approach are provided: the iterated weight least squares approach of Suveges (2007) <doi:10.1007/s10687-007-0034-2>, the K-gaps model of Suveges and Davison (2010) <doi:10.1214/09-AOAS292> and a similar approach of Holesovsky and Fusek (2020) <doi:10.1007/s10687-020-00374-3> that we refer to as D-gaps. For the K-gaps and D-gaps models this package allows missing values in the data, can accommodate independent subsets of data, such as monthly or seasonal time series from different years, and can incorporate information from right-censored inter-exceedance times. Graphical diagnostics for the threshold level and the respective tuning parameters K and D are provided.
Analysis and visualization of similarities between epilepsy ontologies based on text mining results by comparing ranked lists of co-occurring drug terms in the BioASQ corpus. The ranked result lists of neurological drug terms co-occurring with terms from the epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS undergo further analysis. The source data to create the ranked lists of drug names is produced using the text mining workflows described in Mueller, Bernd and Hagelstein, Alexandra (2016) <doi:10.4126/FRL01-006408558>, Mueller, Bernd et al. (2017) <doi:10.1007/978-3-319-58694-6_22>, Mueller, Bernd and Rebholz-Schuhmann, Dietrich (2020) <doi:10.1007/978-3-030-43887-6_52>, and Mueller, Bernd et al. (2022) <doi:10.1186/s13326-021-00258-w>.
Structure mining from XGBoost and LightGBM models. Key functionalities of this package cover: visualisation of tree-based ensembles models, identification of interactions, measuring of variable importance, measuring of interaction importance, explanation of single prediction with break down plots (based on xgboostExplainer and iBreakDown packages). To download the LightGBM use the following link: <https://github.com/Microsoft/LightGBM>. EIX is a part of the DrWhy.AI universe.
Exploratory principal component analysis for large-scale dataset, including sparse principal component analysis and sparse matrix approximation.
Misc functions programmed by Eduard Szöcs. Provides read_regnie() to read gridded precipitation data from German Weather Service (DWD, see <http://www.dwd.de/> for more information).
Presents a statistical method that uses a recursive algorithm for signal extraction. The method handles a non-parametric estimation for the correlation of the errors. See "Krivobokova", "Serra", "Rosales" and "Klockmann" (2021) <arXiv:1812.06948> for details.
This is a (somewhat bizarre) collection of functions written to do various sorts of statistical election audits. There are also functions to generate simulated voting data, including methods to simulation different types of voting errors which allow for simulations for checking the characteristics of these methods.
By overloading the R help() function, this package allows users to use "docstring" style comments within their own defined functions. The package also provides additional functions to mimic the R basic example() function and the prototyping of packages.
This package provides a non-parametric framework based on estimation statistics principle. Its main purpose is to infer orders of empirical distributions from different categories based on a probability of finding a value in one distribution that is greater than an expectation of another distribution. Given a set of ordered-pair of real-category values the framework is capable of 1) inferring orders of domination of categories and representing orders in the form of a graph; 2) estimating magnitude of difference between a pair of categories in forms of mean-difference confidence intervals; and 3) visualizing domination orders and magnitudes of difference of categories. The publication of this package is at Chainarong Amornbunchornvej, Navaporn Surasvadi, Anon Plangprasopchok, and Suttipong Thajchayapong (2020) <doi:10.1016/j.heliyon.2020.e05435>.
The epilogi variable selection algorithm is implemented for the case of continuous response and predictor variables. The relevant paper is: Lakiotaki K., Papadovasilakis Z., Lagani V., Fafalios S., Charonyktakis P., Tsagris M. and Tsamardinos I. (2023). "Automated machine learning for Genome Wide Association Studies". Bioinformatics, 39(9): btad545. <doi:10.1093/bioinformatics/btad545>.
Modular implementation of the Differential Evolution algorithm for experimenting with different types of operators.
Utilities for managing egocentrically sampled network data and a wrapper around the ergm package to facilitate ERGM inference and simulation from such data. See Krivitsky and Morris (2017) <doi:10.1214/16-AOAS1010>.
This package provides a comprehensive collection of datasets related to education, covering topics such as student performance, learning methods, test scores, absenteeism, and other educational metrics. This package serves as a resource for educational researchers, data analysts, and statisticians to explore and analyze data in the field of education.
This package provides statistical methods for estimating bivariate dependency (correlation) from marginal summary statistics across multiple studies. The package supports three modules: (1) bivariate correlation estimation for binary outcomes, (2) bivariate correlation estimation for continuous outcomes, and (3) estimation of component-wise means and variances under a conditional two-component Gaussian mixture model for a continuous variable stratified by a binary class label. These methods enable privacy-preserving joint estimation when individual-level data are unavailable. The approaches are detailed in Shang, Tsao, and Zhang (2025a) <doi:10.48550/arXiv.2505.03995> and Shang, Tsao, and Zhang (2025b) <doi:10.48550/arXiv.2508.02057>.
This package contains utilities and functions for the cleaning, processing and management of patient level public health data for surveillance and analysis held by the UK Health Security Agency, UKHSA.
This package contains all data sets for Exam PA: Predictive Analytics at <https://exampa.net/>.