This package implements Heckman selection models using a Bayesian approach via Stan and compares the performance of normal, Student's t, and contaminated normal distributions in addressing complexities and selection bias (Heeju Lim, Victor E. Lachos, and Victor H. Lachos, Bayesian analysis of flexible Heckman selection models using Hamiltonian Monte Carlo, 2025, under submission).
Tools for hospital time series data analysis workflows, modeling, and automation. This library provides tools for reviewing common administrative hospital time series data, such as average length of stay and readmission rates. The aim is to provide a simple and consistent verb framework that takes the guesswork out of the analysis.
This package provides a collection of Irucka Embry's miscellaneous USGS data sets (USGS Parameter codes with fixed values, USGS global time zone codes, and US Air Force Global Engineering Weather Data). Irucka created these data sets while a Cherokee Nation Technology Solutions (CNTS) United States Geological Survey (USGS) Contractor and/or USGS employee.
This package provides a collection of functions for sensitivity analysis of model outputs (factor screening, global sensitivity analysis and robustness analysis), for variable importance measures of data, as well as for interpretability of machine learning models. Most of the functions apply to scalar outputs, but several support multi-dimensional outputs.
Read General Transit Feed Specification (GTFS) zip files into a list of R data frames. Perform validation of the data structure against the specification. Analyze the headways and frequencies at routes and stops. Create maps and perform spatial analysis on the routes and stops. See the GTFS documentation for more detail: <https://gtfs.org/>.
This package is a comprehensive visualization tool specifically designed for exploring phylomorphospace. It not only simplifies the process of generating a phylomorphospace, but also enhances it with the capability to add graphic layers to the plot using the grammar of graphics to create fully annotated phylomorphospaces. It also provides some utilities to help interpret evolutionary patterns.
immunoClust is a model-based clustering approach for Flow Cytometry samples. The cell-events of single Flow Cytometry samples are modelled by a mixture of multivariate normal- or t-distributions. The cell-event clusters of several samples are modelled by a mixture of multivariate normal-distributions, aiming at stable co-clusters across these samples.
scoreInvHap determines the samples' inversion status for known inversions. scoreInvHap uses SNP data as input and requires the following information about the inversion: genotype frequencies in the different haplotypes, R2 between the region SNPs and inversion status, and heterozygote genotypes in the reference. The package includes these data for 21 inversions.
signifinder is an R package for computing and exploring a compendium of tumor signatures. It computes a variety of signatures from the public literature, based on gene expression values, and returns single-sample (-cell/-spot) scores. Currently, signifinder collects more than 70 distinct signatures, relating to multiple tumors and multiple cancer processes.
This package allows manipulating the metadata of fat pointers:
Naming the metadata’s type (as an associated type)
Extracting metadata from a pointer
Reconstructing a pointer from a data pointer and metadata
Representing vtables, the metadata for trait objects, as a type with some limited API.
Create life tables with a Bayesian approach, which can be very useful for modelling a complex health process when considering multiple predisposing factors and multiple coexisting health conditions. Details for this method can be found in: Lynch, Scott, et al., (2022) <doi:10.1177/00811750221112398>; Zang, Emma, et al., (2022) <doi:10.1093/geronb/gbab149>.
Bit-level reading and writing are necessary when dealing with many file formats, e.g., compressed data and binary files. Currently, R connections are manipulated at the byte level. This package wraps existing connections and raw vectors so that it is possible to read bits, bit sequences, unaligned bytes and low-bit representations of integers.
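The idea of reading bits, unaligned bytes and low-bit integers from a byte-oriented source can be illustrated with a minimal Python sketch. This is not the package's R interface: the class name `BitReader`, the method `read_bits`, and the most-significant-bit-first ordering are all assumptions made for illustration.

```python
class BitReader:
    """Hypothetical helper: read bits MSB-first from a bytes object."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current position, counted in bits from the start

    def read_bits(self, n: int) -> int:
        """Return the next n bits as an unsigned integer."""
        if n < 0 or self._pos + n > 8 * len(self._data):
            raise ValueError("not enough bits left")
        value = 0
        for _ in range(n):
            byte = self._data[self._pos // 8]
            # Pick the bit at the current offset, most significant bit first.
            bit = (byte >> (7 - self._pos % 8)) & 1
            value = (value << 1) | bit
            self._pos += 1
        return value


reader = BitReader(bytes([0b10110010, 0b01000001]))
flag = reader.read_bits(1)       # a single bit
small = reader.read_bits(3)      # a 3-bit integer
unaligned = reader.read_bits(8)  # a byte that straddles two input bytes
```

Tracking the position in bits rather than bytes is what lets a read start and end anywhere inside a byte.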
Various functions to import, verify, process and plot high-resolution dendrometer data using daily and stem-cycle approaches as described in Deslauriers et al. (2007) <doi:10.1016/j.dendro.2007.05.003>. For more details about the package please see: Van der Maaten et al. (2016) <doi:10.1016/j.dendro.2016.06.001>.
Three methods are provided to estimate graphical models with latent variables: (1) Jin, Y., Ning, Y., and Tan, K. M. (2020) (preprint available); (2) Chandrasekaran, V., Parrilo, P. A. & Willsky, A. S. (2012) <doi:10.1214/11-AOS949>; (3) Tan, K. M., Ning, Y., Witten, D. M. & Liu, H. (2016) <doi:10.1093/biomet/asw050>.
Uses multiple AUCs to select a combination of predictors when the outcome has multiple (ordered) levels and the focus is discriminating one particular level from the others. This method is most naturally applied to settings where the outcome has three levels (Meisner, A, Parikh, CR, and Kerr, KF (2017) <http://biostats.bepress.com/uwbiostat/paper423/>).
This package provides tools for exchanging pedigree data between the pedsuite packages and the Familias software for forensic kinship computations (Egeland et al. (2000) <doi:10.1016/s0379-0738(00)00147-x>). These functions were split out from the forrel package to streamline maintenance and provide a lightweight alternative for packages otherwise independent of forrel.
The pharmaverse is a set of packages that compose multiple pathways through clinical data generation and reporting in the pharmaceutical industry. This package is designed to guide users to our workspaces on GitHub, Slack and LinkedIn as well as our website and examples. Learn more about the pharmaverse at <https://pharmaverse.org>.
Price comparisons within or between countries provide an overall measure of the relative difference in prices, often denoted as price levels. This package provides index number methods for such price comparisons (e.g., The World Bank, 2011, <doi:10.1596/978-0-8213-9728-2>). Moreover, it contains functions for sampling and characterizing price data.
Corrects the spelling of a given word in English using a modification of Peter Norvig's spell correct algorithm (see <http://norvig.com/spell-correct.html>) which handles up to three edits. The algorithm tries to find the spelling with maximum probability of intended correction out of all possible candidate corrections from the original word.
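The candidate-generation idea behind Norvig's approach can be sketched in a few lines of Python. This is a simplified illustration of the basic algorithm, not the package's modified version: the function names `edits1` and `correct`, the toy word counts, and the use of raw frequency as a stand-in for the probability of the intended correction are all assumptions. The `max_edits` parameter is kept at 2 here because the candidate set grows very quickly with each additional edit.

```python
from collections import Counter


def edits1(word: str) -> set[str]:
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str, counts: Counter, max_edits: int = 2) -> str:
    """Return the most frequent known word at the smallest edit distance."""
    if word in counts:
        return word
    frontier = {word}
    for _ in range(max_edits):
        # Expand every candidate by one more edit.
        frontier = {e for w in frontier for e in edits1(w)}
        known = [w for w in frontier if w in counts]
        if known:
            return max(known, key=counts.get)
    return word  # no known candidate found; leave the word unchanged


# Toy frequency table (hypothetical data for illustration only).
counts = Counter({"spelling": 10, "spilling": 2, "the": 50})
fixed = correct("speling", counts)
swapped = correct("teh", counts)
```

Candidates at a smaller edit distance are preferred over more frequent words that are further away, which mirrors the ordering used in Norvig's original write-up.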
Makes it easy to generate random numbers based upon the underlying stats distribution functions. All data is returned in a tidy and structured format, making working with the data simple and straightforward. Because the data is returned in a tidy tibble, it lends itself to working with the rest of the tidyverse.
Given a partition resulting from any clustering algorithm, the implemented tests allow valid post-clustering inference by testing if a given variable significantly separates two of the estimated clusters. Methods are detailed in: Hivert B, Agniel D, Thiebaut R & Hejblum BP (2022). "Post-clustering difference testing: valid inference and practical considerations", <arXiv:2210.13172>.
This package is a feature selection package of the mlr3 ecosystem. It selects the optimal feature set for any mlr3 learner. The package works with several optimization algorithms, e.g., random search, recursive feature elimination, and genetic search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling.
This package contains functions to implement the methodology and considerations laid out by Marks et al. in the article "Measuring abnormality in high dimensional spaces: applications in biomechanical gait analysis". Using high-dimensional datasets to measure a subject's overall level of abnormality as compared to a reference population is often needed in outcomes research.
The Autoregressive Integrated Moving Average (ARIMA) model is a very popular univariate time series model. Its application has been widened by the incorporation of exogenous variable(s) (X) in the model, modified as ARIMAX by Bierens (1987) <doi:10.1016/0304-4076(87)90086-8>. In this package we estimate the ARIMAX model using a Bayesian framework.