Enter your query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the total number of pages) is returned
in response headers.
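For example, here is a minimal sketch of calling this endpoint from R with the httr package (the base URL below is a placeholder for this site's address, which is not spelled out above):

    library(httr)

    base_url <- "https://example.org"  # placeholder: substitute this site's address

    resp <- GET(paste0(base_url, "/api/packages"),
                query = list(search = "hello", page = 1, limit = 20))

    content(resp)   # parsed response body with the matching packages
    headers(resp)   # pagination information arrives here, in the response headers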
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent response and covariates are mismatched and observed intermittently within subjects. Kernel weighted estimating equations are used for generalized linear models with either time-invariant or time-dependent coefficients. Cao, H., Li, J., and Fine, J. P. (2016) <doi:10.1214/16-EJS1141>. Cao, H., Zeng, D., and Fine, J. P. (2015) <doi:10.1111/rssb.12086>.
An iterative implementation of a recursive binary partitioning algorithm to measure pairwise dependence with a modular design that allows user specification of the splitting logic and stop criteria. Helper functions provide suggested versions of both and support visualization and the computation of summary statistics on final binnings. For a thorough discussion and demonstration of the algorithm, see Salahub and Oldford (2025) <doi:10.1002/sam.70042>.
Automatic fixed rank kriging for (irregularly located) spatial data using a class of basis functions with multi-resolution features and ordered in terms of their resolutions. The model parameters are estimated by maximum likelihood (ML) and the number of basis functions is determined by Akaike's information criterion (AIC). For spatial data with either one realization or independent replicates, the ML estimates and AIC are efficiently computed using their closed-form expressions when no missing value occurs. Details regarding the basis function construction, parameter estimation, and AIC calculation can be found in Tzeng and Huang (2018) <doi:10.1080/00401706.2017.1345701>. For data with missing values, the ML estimates are obtained using the expectation-maximization algorithm. Apart from the number of basis functions, there are no other tuning parameters, making the method fully automatic. Users can also include a stationary structure in the spatial covariance, which utilizes the LatticeKrig package.
This package provides tools for assessing and selecting auxiliary variables using LASSO. The package includes functions for variable selection and diagnostics, facilitating survey calibration analysis with emphasis on robust auxiliary vector selection. For more details see Tibshirani (1996) <doi:10.1111/j.2517-6161.1996.tb02080.x> and Caughrey and Hartman (2017) <doi:10.2139/ssrn.3494436>.
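As a generic illustration of the LASSO selection step cited above (using glmnet rather than this package's own interface, whose function names are not given here):

    library(glmnet)

    # Toy design matrix of candidate auxiliary variables and a survey outcome
    set.seed(1)
    X <- matrix(rnorm(200 * 10), 200, 10)
    y <- X[, 1] - 2 * X[, 3] + rnorm(200)

    cv <- cv.glmnet(X, y, alpha = 1)  # alpha = 1 gives the LASSO penalty
    coef(cv, s = "lambda.1se")        # nonzero rows = selected auxiliary variables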
Collect your data on digital marketing campaigns from Amazon Ads using the Windsor.ai API <https://windsor.ai/api-fields/>.
This package provides a lightweight, dependency-free toolbox for pre-processing XY data from experimental methods (i.e. any signal that can be measured along a continuous variable). It provides methods for baseline estimation and correction, smoothing, normalization, integration and peak detection. Baseline correction methods include polynomial fitting as described in Lieber and Mahadevan-Jansen (2003) <doi:10.1366/000370203322554518>, the Rolling Ball algorithm after Kneen and Annegarn (1996) <doi:10.1016/0168-583X(95)00908-6>, the SNIP algorithm after Ryan et al. (1988) <doi:10.1016/0168-583X(88)90063-8>, 4S Peak Filling after Liland (2015) <doi:10.1016/j.mex.2015.02.009> and more.
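As a sketch of the general idea behind iterative polynomial baseline fitting (in the spirit of the Lieber and Mahadevan-Jansen reference; a generic illustration, not this package's own interface):

    # Fit a polynomial, clip the signal down to the fit, and repeat, so the
    # polynomial is pulled towards the baseline rather than the peaks.
    poly_baseline <- function(x, y, degree = 4, iters = 20) {
      yi <- y
      for (i in seq_len(iters)) {
        base <- fitted(lm(yi ~ poly(x, degree)))
        yi <- pmin(yi, base)
      }
      base
    }

    corrected <- function(x, y) y - poly_baseline(x, y)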
Causal discovery in linear structural equation models (Schultheiss and Bühlmann (2023) <doi:10.1093/biomet/asad008>) and vector autoregressive models (Schultheiss, Ulmer, and Bühlmann (2025) <doi:10.1515/jci-2024-0011>), with explicit error control for false discoveries, at least asymptotically.
This package provides adaptive direct sparse regression for high-dimensional multimodal data with heterogeneous missing patterns and measurement errors. AdapDISCOM extends the DISCOM framework with modality-specific adaptive weighting to handle varying data structures and error magnitudes across blocks. The method supports flexible block configurations (any K blocks) and includes robust variants for heavy-tailed distributions (AdapDISCOM-Huber) and fast implementations for large-scale applications (Fast-AdapDISCOM). Designed for realistic multimodal scenarios where different data sources exhibit distinct missing data patterns and contamination levels. Diakité et al. (2025) <doi:10.48550/arXiv.2508.00120>.
Circadian rhythms are biological rhythms that oscillate with a period of about 24 hours; they have been observed in multiple physiological processes including core body temperature, hormone secretion, heart rate, blood pressure, and many others. Measuring circadian rhythm with wearables rests on the principle that movement increases during wake periods and decreases during sleep periods, and this approach has been shown to be reliable and valid. This package can be used to extract nonparametric circadian metrics like intradaily variability (IV), interdaily stability (IS), and relative amplitude (RA), as well as the coefficients of the parametric cosinor model and extended cosinor model. Details can be found in Junrui Di et al. (2019) <doi:10.1007/s12561-019-09236-4>.
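The nonparametric metrics have compact standard definitions; the base-R sketch below is written from those definitions for hourly data and is a generic illustration, not this package's own interface:

    # x: hourly activity counts over whole days (length a multiple of 24)

    IV <- function(x) {             # intradaily variability
      n <- length(x)
      n * sum(diff(x)^2) / ((n - 1) * sum((x - mean(x))^2))
    }

    IS <- function(x) {             # interdaily stability
      n <- length(x)
      hourly <- tapply(x, rep(1:24, length.out = n), mean)
      n * sum((hourly - mean(x))^2) / (24 * sum((x - mean(x))^2))
    }

    RA <- function(x) {             # relative amplitude, (M10 - L5) / (M10 + L5)
      prof <- tapply(x, rep(1:24, length.out = length(x)), mean)
      wrap <- c(prof, prof)         # wrap the average profile around midnight
      m10 <- max(sapply(1:24, function(h) mean(wrap[h:(h + 9)])))
      l5  <- min(sapply(1:24, function(h) mean(wrap[h:(h + 4)])))
      (m10 - l5) / (m10 + l5)
    }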
This package implements a framework for creating boxplots where the whisker lengths are determined by formal multiple testing procedures, making them adaptive to sample size and data characteristics. The function bh_boxplot() generates boxplots that control the False Discovery Rate (FDR) via the Benjamini-Hochberg procedure, and the function holm_boxplot() generates boxplots that control the Family-Wise Error Rate (FWER) via the Holm procedure. The methods are based on the framework in Gang, Lin, and Tong (2025) <doi:10.48550/arXiv.2510.20259>.
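Usage is presumably as simple as passing a numeric vector to either of the two functions named above; the sketch below assumes that one-argument form, since the full signatures are not given here:

    x <- c(rnorm(100), rnorm(5, mean = 6))  # mostly clean data plus a few outliers

    bh_boxplot(x)    # whiskers chosen to control the FDR (Benjamini-Hochberg)
    holm_boxplot(x)  # whiskers chosen to control the FWER (Holm)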
We developed a lightweight machine learning tool for RNA profiling of acute lymphoblastic leukemia (ALL); however, it can be used for any problem where multiple classes need to be identified from multi-dimensional data. The methodology is described in Makinen V-P, Rehn J, Breen J, Yeung D, White DL (2022) Multi-cohort transcriptomic subtyping of B-cell acute lymphoblastic leukemia, International Journal of Molecular Sciences 23:4574, <doi:10.3390/ijms23094574>. The classifier contains optimized mean profiles of the classes (centroids) as observed in the training data, and new samples are matched to these centroids using the shortest Euclidean distance. Centroids derived from a dataset of 1,598 ALL patients are included, but users can train the models with their own data as well. The output includes both numerical and visual presentations of the classification results. Samples with mixed features from multiple classes or atypical values are also identified.
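The matching step described above is plain nearest-centroid classification, which can be sketched in a few lines (a generic illustration, not this tool's own interface):

    # centroids: one row per class, columns = genes; x: one sample's profile
    classify_nc <- function(x, centroids) {
      d2 <- rowSums(sweep(centroids, 2, x)^2)   # squared Euclidean distances
      rownames(centroids)[which.min(d2)]        # class with the closest centroid
    }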
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (the aqp, soilDB, sharpshootR, and soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Loss reserving generally focuses on identifying a single model that can generate superior predictive performance. However, different loss reserving models specialise in capturing different aspects of loss data. This is recognised in practice in the sense that results from different models are often considered, and sometimes combined. For instance, actuaries may take a weighted average of the prediction outcomes from various loss reserving models, often based on subjective assessments. This package allows for the use of a systematic framework to objectively combine (i.e. ensemble) multiple stochastic loss reserving models such that the strengths offered by different models can be utilised effectively. Our framework is developed in Avanzi et al. (2023). Firstly, our criteria for model combination consider the full distributional properties of the ensemble and not just the central estimate, which is of particular importance in the reserving context. Secondly, our framework is tailored to the features inherent in reserving data. These include, for instance, accident, development, calendar, and claim maturity effects. Crucially, the relative importance and scarcity of data across accident periods renders the problem distinct from the traditional ensemble techniques in statistical learning. Our framework is illustrated with a complex synthetic dataset. In the results, the optimised ensemble outperforms both (i) traditional model selection strategies, and (ii) an equally weighted ensemble. In particular, the improvement occurs not only in central estimates but also in relevant quantiles, such as the 75th percentile of reserves (typically of interest to both insurers and regulators). Reference: Avanzi B, Li Y, Wong B, Xian A (2023) "Ensemble distributional forecasting for insurance loss reserving" <doi:10.48550/arXiv.2206.08541>.
Analysis of dyadic network and relational data using additive and multiplicative effects (AME) models. The basic model includes regression terms, the covariance structure of the social relations model (Warner, Kenny and Stoto (1979) <DOI:10.1037/0022-3514.37.10.1742>, Wong (1982) <DOI:10.2307/2287296>), and multiplicative factor models (Hoff (2009) <DOI:10.1007/s10588-008-9040-4>). Several different link functions accommodate different relational data structures, including binary/network data, normal relational data, zero-inflated positive outcomes using a tobit model, ordinal relational data and data from fixed-rank nomination schemes. Several of these link functions are discussed in Hoff, Fosdick, Volfovsky and Stovel (2013) <DOI:10.1017/nws.2013.17>. Development of this software was supported in part by NIH grant R01HD067509.
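This description matches the amen package; assuming that, a minimal fit might look as follows (argument names are recalled from memory and may differ across versions):

    library(amen)

    n <- 30
    Y <- matrix(rbinom(n * n, 1, 0.2), n, n)  # a binary directed network
    diag(Y) <- NA                             # no self-ties

    fit <- ame(Y, family = "bin", R = 2)  # AME with rank-2 multiplicative effects
    summary(fit)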
Computes various stability parameters from Additive Main Effects and Multiplicative Interaction (AMMI) analysis results such as Modified AMMI Stability Value (MASV), Sums of the Absolute Value of the Interaction Principal Component Scores (SIPC), Sum Across Environments of Genotype-Environment Interaction Modelled by AMMI (AMGE), Sum Across Environments of Absolute Value of Genotype-Environment Interaction Modelled by AMMI (AV_(AMGE)), AMMI Stability Index (ASI), Modified ASI (MASI), AMMI Based Stability Parameter (ASTAB), Annicchiarico's D Parameter (DA), Zhang's D Parameter (DZ), Averages of the Squared Eigenvector Values (EV), Stability Measure Based on Fitted AMMI Model (FA), Absolute Value of the Relative Contribution of IPCs to the Interaction (Za). Further calculates the Simultaneous Selection Index for Yield and Stability from the computed stability parameters. See the vignette for complete list of citations for the methods implemented.
This package provides a pipeable, transparent implementation of areal weighted interpolation with support for interpolating multiple variables in a single function call. These tools provide a full-featured workflow for validation and estimation that fits into both modern data management (e.g. tidyverse) and spatial data (e.g. sf) frameworks.
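The estimator underneath areal weighted interpolation is simple: for an extensive (count-like) variable, each target polygon receives source values weighted by shared area over source area. A bare sf sketch of that step, for illustration only (the package itself wraps this in a pipeable, multi-variable interface):

    library(sf)

    # source, target: sf polygon layers in a shared projected CRS;
    # var: name of an extensive variable on source; tid: target id column
    aw_extensive <- function(source, target, var, tid) {
      source$.src_area <- as.numeric(st_area(source))
      pieces <- st_intersection(source, target)
      w <- as.numeric(st_area(pieces)) / pieces$.src_area
      tapply(pieces[[var]] * w, pieces[[tid]], sum)
    }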
Increasingly powerful techniques for high-throughput sequencing open the possibility to comprehensively characterize microbial communities, including rare species. However, a still unresolved issue is the substantial error rate in the experimental process generating these sequences. To overcome these limitations we propose an approach where each sample is split and the same amplification and sequencing protocol is applied to both halves. This procedure should allow likely PCR and sequencing artifacts, as well as true rare species, to be detected by comparing the results of the two parts. The AmpliconDuo package (amplicon duo from here on refers to the two amplicon data sets of a split sample) is intended to help interpret the obtained read frequency distribution across split samples, and to filter out false positive reads.
Browse through a continuously updated list of existing RStudio addins and install/uninstall their corresponding packages.
Interface to the Azure Machine Learning Software Development Kit ('SDK'). Data scientists can use the SDK to train, deploy, automate, and manage machine learning models on the Azure Machine Learning service. To learn more about Azure Machine Learning visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.
Estimate the AUC using a variety of methods: (1) frequentist nonparametric methods based on the Mann-Whitney statistic or kernel methods; (2) frequentist parametric methods using the likelihood ratio test based on higher-order asymptotic results, the signed log-likelihood ratio test, the Wald test, or the approximate t solution to the Behrens-Fisher problem; (3) Bayesian parametric MCMC methods.
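Method (1) has a closed form worth seeing: the AUC is the proportion of (case, control) pairs in which the case scores higher, with ties counted as one half. In base R (a generic illustration, not this package's interface):

    auc_mw <- function(cases, controls) {
      mean(outer(cases, controls, ">")) + 0.5 * mean(outer(cases, controls, "=="))
    }

    auc_mw(c(2.1, 3.4, 1.9, 4.2), c(1.0, 2.2, 0.7))
    # equivalently: wilcox.test(cases, controls)$statistic / (n_cases * n_controls)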
Computes asymmetric LD measures (ALD) for multi-allelic genetic data. These measures are identical to the correlation measure (r) for bi-allelic data.
Plots simulation results of clinical trials. Its main feature is allowing users to simultaneously investigate the impact of several simulation input dimensions through dynamic filtering of the simulation results. A more detailed description of the app can be found in Meyer et al. <DOI:10.1016/j.softx.2023.101347> or in the vignettes on GitHub.
Automated data quality auditing using unsupervised machine learning. Provides AI-driven anomaly detection for data quality assessment, primarily designed for Electronic Health Records (EHR) data, with benchmarking capabilities for validation and publication. Methods are based on isolation forests (Liu et al. (2008) <doi:10.1109/ICDM.2008.17>) and the local outlier factor (Breunig et al. (2000) <doi:10.1145/342009.335388>).
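Breunig et al. (2000) is the local outlier factor; as a generic illustration of that detector on numeric features (using the dbscan package, not necessarily this package's own interface):

    library(dbscan)

    X <- scale(iris[, 1:4])        # stand-in for numeric record features
    scores <- lof(X, minPts = 10)  # LOF well above 1 flags locally sparse points
    head(order(scores, decreasing = TRUE))  # most anomalous rows first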
Interface to Altair <https://altair-viz.github.io>, which itself is a Python interface to Vega-Lite <https://vega.github.io/vega-lite/>. This package uses the Reticulate framework <https://rstudio.github.io/reticulate/> to manage the interface between R and Python.
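Assuming a reticulate-managed Python environment with Altair installed, chart building mirrors the Python API through the alt proxy object (a sketch based on that API, not a guaranteed snippet):

    library(altair)  # needs Python with the altair package available to reticulate

    chart <- alt$Chart(mtcars)$
      mark_point()$
      encode(x = "wt:Q", y = "mpg:Q", color = "cyl:N")

    chart  # renders as a Vega-Lite htmlwidget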