Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items per page. Pagination information (such as the number of pages) is returned in the response headers.
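As a rough illustration, the request URL for this method could be assembled as shown below. The host `https://example.org` is a placeholder and `build_search_url` is a hypothetical helper; neither is part of the service. Only the path and the three query parameters come from the description above.

```python
from urllib.parse import urlencode

# Placeholder host (an assumption): substitute the address of the site you query.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Return the GET URL for /api/packages with the three documented parameters."""
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


if __name__ == "__main__":
    # A plain query, then a versioned query; urlencode escapes the @ sign.
    print(build_search_url("hello"))
    print(build_search_url("gcc@10", page=2))
```

Because pagination details are returned in the response headers, a client would read them from the headers of the HTTP response rather than from the body. The header names are not given here, so they are left out of the sketch.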
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides a basic, clear implementation of tree-based gradient boosting designed to illustrate the core operation of boosting models. Tuning parameters (such as stochastic subsampling, modified learning rate, or regularization) are not implemented. The only adjustable parameter is the number of training rounds. If you are looking for a high-performance boosting implementation with tuning parameters, consider the xgboost package.
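The core loop of boosting described above can be sketched as follows. This is not the package's code: it is a minimal illustration that assumes squared-error loss, a single numeric feature, and one-split trees (stumps). All function names are hypothetical. As in the description, the number of rounds is the only adjustable parameter.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    thresholds = sorted(set(xs))[:-1]  # keeps both sides of every split non-empty
    if not thresholds:
        raise ValueError("need at least two distinct feature values")
    best = None
    for threshold in thresholds:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps


def predict(model, x):
    base, stumps = model
    return base + sum(predict_stump(s, x) for s in stumps)


if __name__ == "__main__":
    model = boost([1, 2, 3, 4], [1, 1, 3, 3], n_rounds=1)
    print([predict(model, x) for x in [1, 2, 3, 4]])
```

Each round adds the full stump prediction to the running estimate, which matches a setup without a modified learning rate.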
Motifs within biological sequences play a significant role. This package uses user-defined thresholds (window size and similarity) to create consensus segments, or motifs, through dynamic-programming local alignment with gaps, and it calculates the frequency of each identified motif, offering a detailed view of their prevalence within the dataset. It allows for thorough exploration and understanding of sequence patterns and their biological importance.
This package provides helper utilities for creating functions that compute statistics.
Individual gene expression patterns are encoded into a series of eigenvector patterns (WGCNA package). Using the framework of linear model-based differential expression comparisons (limma package), time-course expression patterns for genes in different conditions are compared and analyzed for significant pattern changes. For reference, see: Greenham K, Sartor RC, Zorich S, Lou P, Mockler TC and McClung CR. eLife. 2020 Sep 30;9(4). <doi:10.7554/eLife.58993>.
Detects and filters damaged cells in single-cell RNA sequencing (scRNA-seq) data using a novel approach inspired by DoubletFinder. Damage is detected by measuring the extent to which cells deviate from artificially damaged profiles of themselves, simulated through the probabilistic escape of cytoplasmic RNA. As output, a damage score ranging from 0 to 1 is given for each cell providing an intuitive scale for filtering that is standardised across cell types, samples, and experiments.
This package provides several datasets used throughout the book "Sampling and Data Analysis Using R: Theory and Practice" by Islam (2025, ISBN:978-984-35-8644-5). The datasets support teaching and learning of statistical concepts such as sampling methods, descriptive analysis, estimation and basic data handling. These curated data objects allow instructors, students and researchers to reproduce examples, practice data manipulation and perform hands-on analysis using R.
Density surface modelling of line transect data. A Generalized Additive Model-based approach is used to calculate spatially-explicit estimates of animal abundance from distance sampling (also presence/absence and strip transect) data. Several utility functions are provided for model checking, plotting and variance estimation.
Mechanistically models and predicts the phenology (macro-phases) of 10 crop plants, trained on a large dataset spanning 80 years derived from the German weather service (DWD) <https://opendata.dwd.de/>. It can be applied for remote sensing purposes and can dynamically check the best subset of available covariates for the given dataset and crop.
Facilitates the analysis of SNP (single nucleotide polymorphism) and silicodart (presence/absence) data. dartR.popgen provides a suite of functions to analyse such data in a population genetics context. It provides several functions to calculate population genetic metrics and to study population structure. Quite a few functions need additional software to be able to run (gl.run.structure(), gl.blast(), gl.LDNe()). The help pages describe in detail how to download and link this software so that the functions can run it. dartR.popgen is part of the dartRverse suite of packages. Gruber et al. (2018) <doi:10.1111/1755-0998.12745>. Mijangos et al. (2022) <doi:10.1111/2041-210X.13918>.
Implementations of the multiple testing procedures for discrete tests described in the paper Döhler, Durand and Roquain (2018) "New FDR bounds for discrete and heterogeneous tests" <doi:10.1214/18-EJS1441>. The main procedures of the paper (HSU and HSD), their adaptive counterparts (AHSU and AHSD), and the HBR variant are available and are coded to take as input the results of a test procedure from the package DiscreteTests, or a set of observed p-values and their discrete support under their nulls. A shortcut function to obtain such p-values and supports is also provided, along with a wrapper that applies discrete procedures directly to data.
An efficient and convenient set of functions to perform differential network estimation through the use of alternating direction method of multipliers optimization with a variety of loss functions.
When visualising changes between two values over time, a strict linear interpolation can look jarring and unnatural. By applying a non-linear easing to the transition, the motion between values can appear smoother and more natural. This package includes functions for applying such non-linear easings to colors and numeric values, and is useful where smooth animated movement and transitions are desired.
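To illustrate the idea of non-linear easing, here is a small sketch that is independent of the package's own functions. It uses the widely used cubic ease-in-out curve and applies it to numeric values only. The names `cubic_in_out` and `tween` are hypothetical.

```python
def cubic_in_out(t):
    """Cubic ease-in-out on [0, 1]: slow start, fast middle, slow end."""
    if t < 0.5:
        return 4 * t ** 3
    return 1 - ((-2 * t + 2) ** 3) / 2


def linear(t):
    """Strict linear interpolation, for comparison."""
    return t


def tween(start, end, n_frames, ease=cubic_in_out):
    """Return n_frames values moving from start to end along the easing curve."""
    if n_frames < 2:
        raise ValueError("n_frames must be at least 2")
    return [start + (end - start) * ease(i / (n_frames - 1))
            for i in range(n_frames)]


if __name__ == "__main__":
    print(tween(0, 10, 5, ease=linear))  # evenly spaced steps
    print(tween(0, 10, 5))               # small steps at both ends, large in the middle
```

Both sequences start and end at the same values; only the spacing of the intermediate frames differs, which is what makes the eased motion look less abrupt.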
The Demographic Table in R combines contingency tables for categorical variables with means and standard deviations for continuous variables. The t-test, chi-square test and Fisher's exact test are used to calculate p-values between two groups. Standardized mean differences are computed with 95% confidence intervals, and the table can be written to a document file.
Fit a mixture of Discrete Laplace distributions using plain numerical optimisation. This package has similar applications to the disclapmix package, which uses an EM algorithm.
An easy-to-use yet powerful system for plotting grouped data effect sizes. Various types of effect size can be estimated, then plotted together with a representation of the original data. Select from many possible data representations (box plots, violin plots, raw data points etc.), and combine as desired. Durga plots are implemented in base R, so are compatible with base R methods for combining plots, such as layout(). See Khan & McLean (2023) <doi:10.1101/2023.02.06.526960>.
This package provides curated early warning data on landslides in Sri Lanka during the Ditwah storm. It includes a structured, machine-readable tidy dataset and is developed for education and research purposes.
This package provides a system for combining two diagnostic tests using various approaches that include statistical and machine-learning-based methodologies. These approaches are divided into four groups: linear combination methods, non-linear combination methods, mathematical operators, and machine learning algorithms. See the <https://biotools.erciyes.edu.tr/dtComb/> website for more information, documentation, and examples.
This package provides a simple syntax to change the default values for function arguments, whether they are in packages or defined locally.
Efficiently creates, manipulates, and subsets "dist" objects, commonly used in cluster analysis. Designed to minimise unnecessary conversions and computational overhead while enabling seamless interaction with distance matrices.
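The reason conversions can be avoided is that a "dist" object already holds every pairwise distance in one flat vector. The sketch below shows that storage idea in general terms; it is not the package's implementation. It assumes the column-wise lower-triangle layout used by R's dist objects, translated to 0-based indices, and all function names are hypothetical.

```python
from math import dist as euclidean  # available in Python 3.8+


def condensed_distances(points):
    """Store each pairwise distance once, in lower-triangle column order."""
    n = len(points)
    return [euclidean(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)]


def condensed_index(i, j, n):
    """Map a pair of 0-based observation indices to a position in the flat vector."""
    if i == j:
        raise ValueError("the diagonal is not stored")
    if i > j:
        i, j = j, i
    return n * i - i * (i + 1) // 2 + (j - i - 1)


def lookup(distances, i, j, n):
    """Read one distance without building the full n-by-n matrix."""
    return 0.0 if i == j else distances[condensed_index(i, j, n)]


if __name__ == "__main__":
    pts = [(0, 0), (3, 4), (6, 8)]
    d = condensed_distances(pts)
    print(d)
    print(lookup(d, 2, 0, len(pts)))
```

For n observations the flat vector holds n(n-1)/2 values, roughly half the memory of a full square matrix.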
An RStudio addin for teaching and learning data manipulation using the dplyr package. You can learn each step of data manipulation by clicking your mouse, without coding. You can get the resulting data (as a tibble) and the code for data manipulation.
Statistical methods and related graphical representations for the Desirability of Outcome Ranking (DOOR) methodology. DOOR is a paradigm for the design, analysis, and interpretation of clinical trials and other research studies based on patient-centric benefit-risk evaluation. The package provides functions for generating summary statistics from individual-level or summary-level datasets, conducting DOOR probability-based inference, and visualizing the results. For more details of the DOOR methodology, see Hamasaki and Evans (2025) <doi:10.1201/9781003390855>. For more explanation of the statistical methods and the graphics, see the technical document and user manual of the DOOR Shiny apps at <https://methods.bsc.gwu.edu>.
Generates DNA sequences based on Markov model techniques for matched sequences. This can be generalized to several sequences. The sequences (taxa) are then arranged in an evolutionary tree (phylogenetic tree) depicting how taxa diverge from their common ancestors. The package provides tests and estimation methods for the parameters of different models. Standard phylogenetic methods assume stationarity, homogeneity and reversibility for the Markov processes, and often impose further restrictions on the parameters.
Gives access to data visualisation methods that are relevant from the data scientist's point of view. The flagship idea of DataVisualizations is the mirrored density plot (MD-plot) for either classified or non-classified multivariate data, published in Thrun, M.C. et al.: "Analyzing the Fine Structure of Distributions" (2020), PLoS ONE, <DOI:10.1371/journal.pone.0238835>. The MD-plot outperforms the box-and-whisker diagram (box plot), the violin plot, the bean plot and the geom_violin plot of ggplot2. Furthermore, a collection of various visualization methods for univariate data is provided. In the case of exploratory data analysis, DataVisualizations makes it possible to inspect the distribution of each feature of a dataset visually through a combination of four methods. One of these methods is the Pareto density estimation (PDE) of the probability density function (pdf). Additionally, visualizations of the distribution of distances using PDE, the scatter-density plot using PDE for two variables, as well as the Shepard density plot and the Bland-Altman plot are presented here. For classified high-dimensional data, a number of visualizations are described, such as the heat map and the silhouette plot. A political map of the world or Germany can be visualized with the additional information defined by a classification of countries or regions. Extending the political map further, a simple function for a choropleth map is available, which is useful for measurements across a geographic area. For categorical features, pie charts, slope charts and fan plots, improved by the ABC analysis, become usable. More detailed explanations are found in the book by Thrun, M.C.: "Projection-Based Clustering through Self-Organization and Swarm Intelligence" (2018) <DOI:10.1007/978-3-658-20540-9>.
Creates and refines data nuggets. Data nuggets reduce a large dataset into a small collection of nuggets of data, each containing a center (location), weight (importance), and scale (variability) parameter. Data nugget centers are created by choosing observations in the dataset which are as equally spaced apart as possible. Data nugget weights are created by counting the number of observations closest to a given data nugget center. We then say the data nugget contains these observations, and the data nugget center is recalculated as the mean of these observations. Data nugget scales are created by calculating the trace of the covariance matrix of the observations contained within a data nugget divided by the dimension of the dataset. Data nuggets are refined by splitting data nuggets which have scales or shapes (defined as the ratio of the two largest eigenvalues of the covariance matrix of the observations contained within the data nugget). Reference papers: [1] Beavers, T. E., Cheng, G., Duan, Y., Cabrera, J., Lubomirski, M., Amaratunga, D., & Teigler, J. E. (2024). Data Nuggets: A Method for Reducing Big Data While Preserving Data Structure. Journal of Computational and Graphical Statistics, 1-21. [2] Cherasia, K. E., Cabrera, J., Fernholz, L. T., & Fernholz, R. (2022). Data Nuggets in Supervised Learning. In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler (pp. 429-449). Cham: Springer International Publishing.
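The weight, center, and scale definitions above can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the initial centers are taken as given (the equally-spaced selection step and the refinement step are not shown), the covariance uses the sample (n - 1) denominator, a nugget with a single observation gets scale 0, and all function names are hypothetical.

```python
from math import dist as euclidean  # available in Python 3.8+


def assign_to_centers(data, centers):
    """Group observations by the index of their closest initial center."""
    groups = {k: [] for k in range(len(centers))}
    for obs in data:
        nearest = min(range(len(centers)), key=lambda k: euclidean(obs, centers[k]))
        groups[nearest].append(obs)
    return groups


def nugget_summary(observations):
    """Compute center (mean), weight (count) and scale (trace of covariance / dimension)."""
    n = len(observations)
    dim = len(observations[0])
    center = [sum(obs[k] for obs in observations) / n for k in range(dim)]
    if n > 1:
        # The trace of the covariance matrix is the sum of per-dimension variances.
        variances = [sum((obs[k] - center[k]) ** 2 for obs in observations) / (n - 1)
                     for k in range(dim)]
    else:
        variances = [0.0] * dim  # assumption: no spread for a single observation
    return {"center": center, "weight": n, "scale": sum(variances) / dim}


if __name__ == "__main__":
    data = [(0, 0), (2, 0), (10, 10)]
    groups = assign_to_centers(data, centers=[(0, 0), (10, 10)])
    for members in groups.values():
        if members:
            print(nugget_summary(members))
```

Because only the trace is needed for the scale, the full covariance matrix never has to be formed; the shape measure used for refinement would require its eigenvalues and is left out here.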