This package provides a fast algorithm that lets researchers extract semantically related keywords using a fitted embedding model. For details about the methods applied, see Chester (2025) <doi:10.17605/OSF.IO/5B7RQ>.
Offers a graphical user interface for the evaluation of inter-rater agreement with Cohen's and Fleiss' kappa. The kappa statistics are calculated using the R package 'irr', so that KappaGUI is essentially a Shiny front-end for 'irr'.
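As an illustration of the statistic itself (not of the KappaGUI or 'irr' interfaces, which are R packages), Cohen's kappa for two raters can be sketched in Python as the observed agreement corrected for the agreement expected by chance. The function name `cohen_kappa` and the sample ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: the raters agree on 3 of 4 items.
ratings_a = ["yes", "yes", "no", "no"]
ratings_b = ["yes", "no", "no", "no"]
kappa = cohen_kappa(ratings_a, ratings_b)
```

Fleiss' kappa generalizes the same chance-correction idea to more than two raters and is not covered by this sketch.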
Helper functions for Org files (<https://orgmode.org/>): a generic function toOrg for transforming R objects into Org markup (most useful for data frames; there are also methods for Dates/POSIXt) and a function to read Org tables into data frames.
This package provides functions to access and collect data from the Brazilian Federal Senate open data API and website. Covers senators, legislative materials, committees, voting records, speeches, provisional measures, vetoes, and legislative agendas, returning results as tidy data frames ready for analysis.
An implementation of image processing effects that convert a photo into a line drawing image. For details, please refer to Tsuda, H. (2020). sketcher: An R package for converting a photo into a sketch style image. <doi:10.31234/osf.io/svmw5>.
SigClust is a statistical method for testing the significance of clustering results. SigClust can be applied to assess the statistical significance of splitting a data set into two clusters. For more than two clusters, SigClust can be used iteratively.
This package implements Bayesian methods, described in Hugh-Jones (2019) <doi:10.1007/s40881-019-00069-x>, for estimating the proportion of liars in coin flip-style experiments, where subjects report a random outcome and are paid for reporting a "good" outcome.
Defines the classes used to identify outliers (threshing) and compute the number of significant principal components and number of clusters (reaping) in a joint application of PCA and hierarchical clustering. See Wang et al. (2018) <doi:10.1186/s12859-017-1998-9>.
This package provides a minimal-dependency, performance-first R package for reading, writing, validating, streaming, and converting TOON (Token-Oriented Object Notation) data. Optimized for very large tabular files with robust diagnostics. Supports lossless JSON conversion and tabular CSV/Parquet/Feather conversion.
The goal of this method is to identify associations between bacteria and an environmental variable in 16S or other compositional data. The environmental variable is any variable that is measured for each microbiome sample, for example, a butyrate measurement paired with every sample in the data. Microbiome data is compositional, meaning that the abundances in each sample sum to 1, and this introduces severe statistical distortions. This method takes a Bayesian approach to correcting for these distortions, in which the total abundance is treated as an unknown variable. This package runs the Python implementation using reticulate.
The Refugee Population Statistics Database published by the Office of the United Nations High Commissioner for Refugees (UNHCR) contains information about forcibly displaced populations spanning more than 70 years of statistical activities. It covers displaced populations such as refugees, asylum-seekers and internally displaced people, including their demographics. Stateless people are also included, most of whom have never been displaced. The database also reflects the different types of solutions for displaced populations such as repatriation or resettlement. More information on the data and methodology can be found on the UNHCR Refugee Data Finder <https://www.unhcr.org/refugee-statistics/>.
Simulation of random orthonormal matrices from linear and quadratic exponential family distributions on the Stiefel manifold. The most general type of distribution covered is the matrix-variate Bingham-von Mises-Fisher distribution. Most of the simulation methods are presented in Hoff (2009) "Simulation of the Matrix Bingham-von Mises-Fisher Distribution, With Applications to Multivariate and Relational Data" <doi:10.1198/jcgs.2009.07177>. The package also includes functions for optimization on the Stiefel manifold based on algorithms described in Wen and Yin (2013) "A feasible method for optimization with orthogonality constraints" <doi:10.1007/s10107-012-0584-1>.
This package contains functions to create regulatory-style statistical reports. Originally designed to create tables, listings, and figures for the pharmaceutical, biotechnology, and medical device industries, these reports are generalized enough that they could be used in any industry. Generates text, rich-text, PDF, HTML, and Microsoft Word file formats. The package specializes in printing wide and long tables with automatic page wrapping and splitting. Reports can be produced with a minimum of function calls, and without relying on other table packages. The package supports titles, footnotes, page headers, page footers, spanning headers, page-by variables, and automatic page numbering.
This package provides generic three-step pre-processing for protein microarray data. It contains different data pre-processing procedures to allow comparison of their performance. These steps are background correction, coefficient of variation (CV)-based filtering, batch correction and normalization.
This package creates "Table 1", i.e., a description of baseline patient characteristics, which is essential in medical research. It supports both continuous and categorical variables, as well as p-values and standardized mean differences. Weighted data are supported via the survey package.
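For readers unfamiliar with standardized mean differences, the following Python sketch shows one common definition for a continuous variable in two groups: the absolute difference in means divided by the square root of the average of the two sample variances. This is an illustration only; whether the package uses exactly this formula is an assumption, and the function name and data are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Absolute difference in means scaled by the pooled standard deviation (sketch)."""
    # Pooled SD: square root of the average of the two sample variances.
    pooled_sd = math.sqrt((variance(group_a) + variance(group_b)) / 2)
    if pooled_sd == 0:
        raise ValueError("the SMD is undefined when both groups have zero variance")
    return abs(mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical baseline measurements for two treatment arms.
arm_a = [1, 2, 3]
arm_b = [3, 4, 5]
smd = standardized_mean_difference(arm_a, arm_b)
```

Unlike a p-value, this measure does not shrink or grow with sample size alone, which is why it is often reported alongside p-values when comparing baseline characteristics.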
This package provides an implementation of maximum likelihood estimators for a variety of heavy tailed distributions, including both the discrete and continuous power law distributions. Additionally, a goodness-of-fit based approach is used to estimate the lower cut-off for the scaling region.
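To make the estimator concrete, here is a minimal Python sketch of the standard maximum likelihood estimate of the scaling exponent for a continuous power law, alpha = 1 + n / sum(ln(x_i / xmin)), computed over the n observations at or above the cut-off. It assumes the lower cut-off `xmin` is already known; the goodness-of-fit search for the cut-off mentioned above is not shown. The function name and data are hypothetical and are not the package's API.

```python
import math


def fit_continuous_power_law(samples, xmin):
    """MLE of the exponent of a continuous power law for values >= xmin (sketch)."""
    if xmin <= 0:
        raise ValueError("xmin must be positive")
    # Only the tail at or above the cut-off enters the likelihood.
    tail = [x for x in samples if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one observation strictly above xmin")
    return 1.0 + len(tail) / log_sum


# Hypothetical data: four tail values equal to e * xmin and one value below the cut-off.
observations = [math.e, math.e, math.e, math.e, 0.5]
alpha_hat = fit_continuous_power_law(observations, xmin=1.0)
```

The discrete case requires a different likelihood (it involves a zeta-function normalization), so this sketch applies to continuous data only.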
This is a developer-focused, low-dependency package in tidymodels that provides functions to register how models are to be used. Functions to register models are complemented by accessor functions that retrieve registered model information to aid in model fitting and error handling.
This package provides ISO language, territory, currency, script and character codes. It provides ISO 639 language codes, ISO 3166 territory codes, ISO 4217 currency codes, ISO 15924 script codes, and the ISO 8859 character codes as well as the UN M.49 area codes.
Learn vector representations of words by continuous bag of words and skip-gram implementations of the word2vec algorithm. The techniques are detailed in the paper "Distributed Representations of Words and Phrases and their Compositionality" by Mikolov et al. (2013), available at <arXiv:1310.4546>.
This package allows for testing of non-nested models. It includes tests of model distinguishability and of model fit that can be applied to both nested and non-nested models. The package also includes functionality to obtain confidence intervals associated with AIC and BIC.
This package implements an API for accessing the Domain Name System (DNS) resolver service via the standard libresolv system library (whose API is often available directly via the standard libc C library) on Unix systems.
Borealis is an R library performing outlier analysis for count-based bisulfite sequencing data. It detects outlier methylated CpG sites from bisulfite sequencing (BS-seq). The core of Borealis is modeling Beta-Binomial distributions. This can be useful for rare disease diagnoses.
Doscheda focuses on quantitative chemoproteomics used to determine protein interaction profiles of small molecules from whole cell or tissue lysates using mass spectrometry data. The package provides a Shiny application to run the pipeline, several visualisations and a downloadable report of an experiment.
Calculate distances, build phylogenetic trees or perform hierarchical clustering between the samples of a VCF or FASTA file. Functions are implemented in Java 11 and called via rJava. The parallel implementation operates directly on the VCF or FASTA file for fast execution.