All the methods in this package generate a vector of uniform order statistics using a beta distribution and use an inverse cumulative distribution function for some distribution to give a vector of random order statistic variables for some distribution. This is much more efficient than using a loop since it is directly sampling from the order statistic distribution.
This package provides access to granular sub-national income data from the MCC-PIK Database Of Sub-national Economic Output (DOSE). The package downloads and processes the data from its open repository on Zenodo (<https://zenodo.org/records/13773040>). Functions are provided to fetch data at multiple geographic levels, match coordinates to administrative regions, and access associated geometries.
This package provides functions for obtaining p-values (for hypothesis tests), confidence intervals, and multivariate confidence sets. In particular, the method is compatible with differentially private dataset, as long as the privacy mechanism is known. For more details, see Awan and Wang (2024), "Simulation-based, Finite-sample Inference for Privatized Data", <doi:10.48550/arXiv.2303.05328>.
The employment of the Wavelet decomposition technique proves to be highly advantageous in the modelling of noisy time series data. Wavelet decomposition technique using the "haar" algorithm has been incorporated to formulate a hybrid Wavelet KNN (K-Nearest Neighbour) model for time series forecasting, as proposed by Anjoy and Paul (2017) <DOI:10.1007/s00521-017-3289-9>.
The genetic algorithm is designed to optimize wind farms of any shape. It requires a predefined amount of turbines, a unified rotor radius and an average wind speed value for each incoming wind direction. A terrain effect model can be included that downloads an SRTM elevation model and loads a Corine Land Cover raster to approximate surface roughness.
Extremely fast hashing of R objects using xxHash'. R objects are hashed via the standard serialization mechanism in R. Raw byte vectors and strings can be handled directly for compatibility with hashes created on other systems. This implementation is a wrapper around the xxHash C library which is available from <https://github.com/Cyan4973/xxHash>.
MSstats package provide tools for preprocessing, summarization and differential analysis of mass spectrometry (MS) proteomics data. Recently, some MS protocols enable acquisition of data sets that result in larger than memory quantitative data. MSstats functions are not able to process such data. MSstatsBig package provides additional converter functions that enable processing larger than memory data sets.
This package provides a tool that contains trained deep learning models for predicting effector proteins. deepredeff has been trained to identify effector proteins using a set of known experimentally validated effectors from either bacteria, fungi, or oomycetes. Documentation is available via several vignettes, and the paper by Kristianingsih and MacLean (2020) <doi:10.1101/2020.07.08.193250>.
This package implements several algorithms for bundling edges in networks and flow and metro map layouts. This includes force directed edge bundling <doi:10.1111/j.1467-8659.2009.01450.x>, a flow algorithm based on Steiner trees<doi:10.1080/15230406.2018.1437359> and a multicriteria optimization method for metro map layouts <doi:10.1109/TVCG.2010.24>.
This package provides a collection of functions that would help one to build features based on external data. Very useful for Data Scientists in data to day work. Many functions create features using parallel computation. Since the nitty gritty of parallel computation is hidden under the hood, the user need not worry about creating clusters and shutting them down.
R lists, especially nested lists, can be very difficult to visualize or represent. Sometimes str() is not enough, so this suite of htmlwidgets is designed to help see, understand, and maybe even modify your R lists. The function reactjson() requires a package reactR that can be installed from CRAN or <https://github.com/timelyportfolio/reactR>.
Efficient calculation of pseudo-ranks and (pseudo)-rank based test statistics. In case of equal sample sizes, pseudo-ranks and mid-ranks are equal. When used for inference mid-ranks may lead to paradoxical results. Pseudo-ranks are in general not affected by such a problem. See Happ et al. (2020, <doi:10.18637/jss.v095.c01>) for details.
For biparental, three and four-way crosses Identity by Descent (IBD) probabilities can be calculated using Hidden Markov Models and inheritance vectors following Lander and Green (<https://www.jstor.org/stable/29713>) and Huang (<doi:10.1073/pnas.1100465108>). One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris.
This package provides bindings to Tree-sitter', an incremental parsing system for programming tools. Tree-sitter builds concrete syntax trees for source files of any language, and can efficiently update those syntax trees as the source file is edited. It also includes a robust error recovery system that provides useful parse results even in the presence of syntax errors.
This package provides a Shiny application for visualization, exploration, comparison, and filtering of CRISPR screens analyzed with MAGeCK RRA or MLE. Features include interactive plots with on-click labeling, full customization of plot aesthetics, data upload and/or download, and much more. Quickly and easily explore your CRISPR screen results and generate publication-quality figures in seconds.
This package provides functionality for performing divergence analysis as presented in Dinalankara et al, "Digitizing omics profiles by divergence from a baseline", PANS 2018. This allows the user to simplify high dimensional omics data into a binary or ternary format which encapsulates how the data is divergent from a specified baseline group with the same univariate or multivariate features.
This package allows to estimate missing values in DNA methylation data. methyLImp method is based on linear regression since methylation levels show a high degree of inter-sample correlation. Implementation is parallelised over chromosomes since probes on different chromosomes are usually independent. Mini-batch approach to reduce the runtime in case of large number of samples is available.
This package provides a convenient way to analyze and visualize PICRUSt2 output with pre-defined plots and functions. It allows for generating statistical plots about microbiome functional predictions and offers customization options. It features a one-click option for creating publication-level plots, saving time and effort in producing professional-grade figures. It streamlines the PICRUSt2 analysis and visualization process.
This package vendors an assortment of useful header-only C++ libraries. Bioconductor packages can use these libraries in their own C++ code by LinkingTo this package without introducing any additional dependencies. The use of a central repository avoids duplicate vendoring of libraries across multiple R packages, and enables better coordination of version updates across cohorts of interdependent C++ libraries.
The CommonMark specification defines a rationalized version of markdown syntax. This package uses the cmark reference implementation for converting markdown text into various formats including HTML, LaTeX and groff man. In addition, it exposes the markdown parse tree in XML format. The latest version of this package also adds support for Github extensions including tables, autolinks and strikethrough text.
This package provides a collection of functions helpful in learning the basic tenets of Bayesian statistical inference. It contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
Estimate group aggregates, where one can set user-defined conditions that each group of records must satisfy to be suitable for aggregation. If a group of records is not suitable, it is expanded using a collapsing scheme defined by the user. A paper on this package was published in the Journal of Statistical Software <doi:10.18637/jss.v112.i04>.
This package contains a range of functions covering the present development of the distributional method for the dichotomisation of continuous outcomes. The method provides estimates with standard error of a comparison of proportions (difference, odds ratio and risk ratio) derived, with similar precision, from a comparison of means. See the URL below or <arXiv:1809.03279> for more information.
Support for geostatistical analysis of multivariate data, in particular data with restrictions, e.g. positive amounts, compositions, distributional data, microstructural data, etc. It includes descriptive analysis and modelling for such data, both from a two-point Gaussian perspective and multipoint perspective. The methods mainly follow Tolosana-Delgado, Mueller and van den Boogaart (2018) <doi:10.1007/s11004-018-9769-3>.