Enter your query in the form above. To look for a specific version of a package, use the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items per page. Pagination information (such as the number of pages) is returned in the response headers.
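A minimal sketch of calling this endpoint from Python, using only the standard library. The host is not stated on this page, so BASE_URL below is a hypothetical placeholder to replace with the real one. The helper names are illustrative, and the response body format and header names are not assumed: the body is returned as raw text and all headers are returned as-is.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder: the page does not name the host, so substitute the real one.
BASE_URL = "https://example.invalid"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Build the GET /api/packages URL from the three documented parameters."""
    # urlencode percent-encodes special characters, e.g. "@" becomes "%40".
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


def fetch_packages(search, page=1, limit=20, base_url=BASE_URL):
    """Fetch one page of results and return (body_text, headers).

    Pagination information is documented as living in the response headers,
    so the headers are returned alongside the body for the caller to inspect.
    """
    with urlopen(build_search_url(search, page, limit, base_url)) as response:
        headers = dict(response.headers.items())
        body_text = response.read().decode("utf-8")
    return body_text, headers
```

Calling build_search_url("hello") reproduces the path and query string shown above. Whether the API accepts the @ version syntax used in the form is an assumption to verify against the live service.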
If you'd like your channel to be included in our search, send a patch to ~whereiseveryone/toys@lists.sr.ht that adds your channel as an entry in channels.scm.
Fits latent Dirichlet allocation (LDA), supervised topic models, and multilevel supervised topic models for text data with multiple outcome variables. Core estimation routines are implemented in C++ using the Rcpp ecosystem. For topic models, see Blei et al. (2003) <https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf>. For supervised topic models, see Blei and McAuliffe (2007) <https://papers.nips.cc/paper_files/paper/2007/hash/d56b9fc4b0f1be8871f5e1c40c0067e7-Abstract.html>.
Statistical tests for validating multispecies coalescent gene tree simulators, using pairwise distances and rooted triple counts. See Allman ES, Baños HD, Rhodes JA 2023. Testing multispecies coalescent simulators using summary statistics, IEEE/ACM Trans Comput Biol Bioinformat, 20(2):1613-1618. <doi:10.1109/TCBB.2022.3177956>.
Convenient wrapper functions for the analysis of matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) spectra data in order to select only representative spectra (also called cherry-pick). The package covers the preprocessing and dereplication steps (based on Strejcek, Smrhova, Junkova and Uhlik (2018) <doi:10.3389/fmicb.2018.01294>) needed to cluster MALDI-TOF spectra before the final cherry-picking step. It enables the easy exclusion of spectra and/or clusters to accommodate complex cherry-picking strategies. Alternatively, cherry-picking using taxonomic identification MALDI-TOF data is made easy with functions to import inconsistently formatted reports.
Using this package, one can determine the minimum sample size required for the mean squared error of the sample mean, taken as an estimator of the population mean of a distribution, to fall below a pre-determined epsilon. In other words, it helps the user find the smallest sample size that attains a pre-fixed precision level for the difference between the sample mean and the population mean.
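The idea can be illustrated with a short Python sketch. It assumes independent, identically distributed observations with a known finite variance, in which case the mean squared error of the sample mean is variance / n. The function name is hypothetical and this is not the package's interface; the package may treat specific distributions differently.

```python
import math


def min_sample_size(variance, epsilon):
    """Smallest n with variance / n < epsilon (MSE of the sample mean, i.i.d. case)."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # variance / n < epsilon  <=>  n > variance / epsilon, so take the next
    # integer strictly above the ratio; at least one observation is always needed.
    return max(1, math.floor(variance / epsilon) + 1)
```

For example, with variance 4 and epsilon 0.5 the ratio is exactly 8, so 8 observations give an error equal to epsilon and 9 are needed to fall strictly below it. Ratios that are not exactly representable in floating point may need extra care at the boundary.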
Evaluate whether a microbiome sample is a mixture of two samples, by fitting a model for the number of read counts as a function of single nucleotide polymorphism (SNP) allele and the genotypes of two potential source samples. Lobo et al. (2021) <doi:10.1093/g3journal/jkab308>.
Given a set of data points, a clustering is defined as a partition of the points into disjoint sets, so that no two sets in the partition share an element. This package provides 25 methods that act somewhat like a distance or metric, measuring the similarity of two clusterings (or partitions). For a more detailed description, see Meila, M. (2005) <doi:10.1145/1102351.1102424>.
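To make "similarity of two clusterings" concrete, here is a small Python sketch of the Rand index, a classical measure that counts the fraction of point pairs on which two clusterings agree. It is shown only as an illustration of the concept; it is not claimed to be one of the package's 25 methods, and the function name is hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs that both clusterings treat the same way.

    labels_a[i] and labels_b[i] are the cluster labels of point i under the
    two clusterings. A pair agrees if it is grouped together in both or
    separated in both.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("at least two points are required")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

The index depends only on the partition, not on the label names: rand_index([0, 0, 1, 1], [1, 1, 0, 0]) is 1.0 because both label vectors describe the same two groups.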
Allows the user to create graphs with multiple layers. The user can also modify the layers, the nodes, and the edges. The graph can also be visualized. Zaynab Hammoud and Frank Kramer (2018) <doi:10.3390/genes9110519>. More about multilayered graphs and their usage can be found in our review paper: Zaynab Hammoud and Frank Kramer (2020) <doi:10.1186/s41044-020-00046-0>.
The main function MMEst() performs (Restricted) Maximum Likelihood estimation in variance component mixed models using a Min-Max (MM) algorithm (Laporte, F., Charcosset, A. & Mary-Huard, T. (2022) <doi:10.1371/journal.pcbi.1009659>).
Implementation of commonly used p-value-based and parametric multiple testing procedures (computation of adjusted p-values and simultaneous confidence intervals) and parallel gatekeeping procedures based on the methodology presented in the book "Multiple Testing Problems in Pharmaceutical Statistics" (edited by Alex Dmitrienko, Ajit C. Tamhane and Frank Bretz) published by Chapman and Hall/CRC Press 2009.
This package provides a comprehensive collection of linkage methods for agglomerative hierarchical clustering on a matrix of proximity data (distances or similarities), returning a multifurcated dendrogram or multidendrogram. Multidendrograms can group more than two clusters when ties in proximity data occur, and therefore they do not depend on the order of the input data. Descriptive measures to analyze the resulting dendrogram are additionally provided. <doi:10.18637/jss.v114.i02>.
This package provides a suite of utility functions providing functionality commonly needed for production level projects such as logging, error handling, cache management and date-time parsing. Functions for date-time parsing and formatting require that time zones be specified explicitly, avoiding a common source of error when working with environmental time series.
Simplifies Brazilian names phonetically using a custom metaphoneBR algorithm that preserves ending vowels. Useful for name matching that preserves the gender information generally carried by ending vowels in Portuguese. Mation (2025) <doi:10.6082/uchicago.15104>.
Computes the Multidimensional Poverty Index (MPI) using the Alkire-Foster method. Given N individuals, each with D indicators of deprivation, the package computes the MPI value to represent the degree of poverty in a population. The inputs are 1) an N by D matrix whose element (i,j) indicates whether individual i is deprived in indicator j (1 is deprived and 0 is not deprived), and 2) the deprivation threshold. The main output is the MPI value, which ranges between zero and one. The MPI value approaches one if almost all people are deprived in all indicators, and it approaches zero if almost no people are deprived in any indicator. Please see Alkire S., Chatterjee, M., Conconi, A., Seth, S. and Ana Vaz (2014) <doi:10.35648/20.500.12413/11781/ii039> for the Alkire-Foster methodology.
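The calculation can be sketched in a few lines of Python. This sketch assumes equal indicator weights and a threshold expressed as the fraction of indicators in which a person must be deprived to count as poor; the package may use different weights or a count-based threshold. The function name is hypothetical and this is not the package's interface.

```python
def alkire_foster_mpi(deprivations, threshold):
    """Adjusted headcount ratio (MPI) under equal indicator weights.

    deprivations: N rows of D entries, 1 if the person is deprived in that
                  indicator and 0 otherwise.
    threshold:    fraction of indicators (0 < threshold <= 1) at or above
                  which a person is identified as poor (an assumption here).
    """
    if not deprivations or not deprivations[0]:
        raise ValueError("need at least one person and one indicator")
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    n_people = len(deprivations)
    n_indicators = len(deprivations[0])
    censored_total = 0.0
    for row in deprivations:
        score = sum(row) / n_indicators  # share of indicators deprived
        if score >= threshold:           # only poor people contribute
            censored_total += score
    # Mean censored score = (share of poor people) x (their average intensity).
    return censored_total / n_people
```

With four people whose deprivation shares are 1, 0.5, 0 and 0.25 and a threshold of 0.5, only the first two are identified as poor, giving (1 + 0.5) / 4 = 0.375.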
In omics data association studies, it is common to apply p-value corrections to control false significance. Beyond p-value corrections, the E-value has recently been studied as a way to facilitate multiple testing correction, based on V. Vovk and R. Wang (2021) <doi:10.1214/20-AOS2020>. This package provides E-value calculation for DNA methylation data and RNA-seq data. Currently, five data formats are supported: DNA methylation levels using DMR detection tools (BiSeq, DMRfinder, MethylKit, Metilene and other DNA methylation tools) and RNA-seq data. The relevant references are listed below: Katja Hebestreit and Hans-Ulrich Klein (2022) <doi:10.18129/B9.bioc.BiSeq>; Altuna Akalin et al. (2012) <doi:10.18129/B9.bioc.methylKit>.
An ensemble meta-prediction framework to integrate multiple regression models into a current study. Gu, T., Taylor, J.M.G. and Mukherjee, B. (2020) <arXiv:2010.09971>. The package provides a meta-analysis framework along with two weighted estimators, built as ensembles of empirical Bayes estimators, which combine the estimates from the different external models. The proposed framework is flexible and robust in that (i) it is capable of incorporating external models that use a slightly different set of covariates; (ii) it is able to identify the most relevant external information and diminish the influence of information that is less compatible with the internal data; and (iii) it nicely balances the bias-variance trade-off while preserving the most efficiency gain. The proposed estimators are more efficient than the naive analysis of the internal data and other naive combinations of external estimators.
Used for general multiple mediation analysis. The analysis method is described in Yu and Li (2022) (ISBN: 9780367365479) "Statistical Methods for Mediation, Confounding and Moderation Analysis Using R and SAS", published by Chapman and Hall/CRC; and Yu et al. (2017) <DOI:10.1016/j.sste.2017.02.001> "Exploring racial disparity in obesity: a mediation analysis considering geo-coded environmental factors", published in Spatial and Spatio-temporal Epidemiology, 21, 13-23.
Various utilities for the Multiplicative Multinomial distribution.
Researchers often need to calculate body-size growth rates for individuals that do not have associated age data. These growth rates are based on mark-recapture data where an individual was captured and measured at time 1, then recaptured and measured at time 2. The sizes at each time and the amount of time between captures can be used to calculate growth rates. MRgrowth follows the approach in Edmonds et al. (2021) <doi:10.1371/journal.pone.0259978> and provides functions to calculate growth using three formulas: Fabens' reformulation of the von Bertalanffy formula, the Gompertz formula, and a logistic formula.
Create beautiful and customizable tables to summarize several statistical models side-by-side. Draw coefficient plots, multi-level cross-tabs, dataset summaries, balance tables (a.k.a. "Table 1s"), and correlation matrices. This package supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, PDF, PowerPoint, Excel, RTF, JPG, or PNG. Tables can easily be embedded in Rmarkdown or knitr dynamic documents. Details can be found in Arel-Bundock (2022) <doi:10.18637/jss.v103.i01>.
Electronic health records (EHR) linked with biorepositories are a powerful platform for translational studies. A major bottleneck exists in the ability to phenotype patients accurately and efficiently. Towards that end, we developed an automated high-throughput phenotyping method integrating International Classification of Diseases (ICD) codes and narrative data extracted using natural language processing (NLP). Specifically, our proposed method, called MAP (Map Automated Phenotyping algorithm), fits an ensemble of latent mixture models on aggregated ICD and NLP counts along with healthcare utilization. The MAP algorithm yields a predicted probability of phenotype for each patient and a threshold for classifying subjects with phenotype yes/no (See Katherine P. Liao, et al. (2019) <doi:10.1093/jamia/ocz066>.).
This package provides a comprehensive toolkit for missing person identification combining genetic and non-genetic evidence within a Bayesian framework. Computes likelihood ratios (LRs) for DNA profiles, biological sex, age, hair color, and birthdate evidence. Provides decision analysis tools including optimal LR thresholds, error rate calculations, and ROC curve visualization. Includes interactive Shiny applications for exploring evidence combinations. For methodological details see Marsico et al. (2023) <doi:10.1016/j.fsigen.2023.102891> and Marsico, Vigeland et al. (2021) <doi:10.1016/j.fsigen.2021.102519>.
This package provides tools for calculating I-Scores, a simple way to measure how successful minor political parties are at influencing the major parties in their environment. I-Scores are designed to be a more comprehensive measurement of minor party success than vote share and legislative seats won, the current standard measurements, which do not reflect the strategies that most minor parties employ. The procedure leverages the Manifesto Project's NLP model to identify the issue areas that sentences discuss, see Burst et al. (2024) <doi:10.25522/manifesto.manifestoberta.56topics.context.2024.1.1>, and the Wordfish algorithm to estimate the relative positions that platforms take on those issue areas, see Slapin and Proksch (2008) <doi:10.1111/j.1540-5907.2008.00338.x>.
This package implements model-robust standardization for cluster-randomized trials (CRTs). Provides functions that standardize user-specified regression models to estimate marginal treatment effects. The targets include the cluster-average and individual-average treatment effects, with utilities for variance estimation and example simulation datasets. Methods are described in Li, Tong, Fang, Cheng, Kahan, and Wang (2025) <doi:10.1002/sim.70270>.
This package performs meaningful subgrouping in a meta-analysis. This is a two-step process: first, use the iterative grouping functions (e.g., mgbin(), mgcont()) to partition studies into statistically homogeneous clusters based on their effect size data. Second, use the meaning() function to analyze these new subgroups and understand their composition based on study-level characteristics (e.g., country, setting). This approach helps to uncover hidden structures in meta-analytic data and provide a deeper interpretation of heterogeneity.