This package provides a low-level spell checker and morphological analyzer based on the famous hunspell library. The package can analyze or check individual words as well as parse text, LaTeX, HTML or XML documents. For a more user-friendly interface, use the spelling package, which builds on this package to automate checking of files, documentation and vignettes in all common formats.
FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. The transformation is inspired by a Kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.
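The kind of parsing and reorganization described above can be sketched in a few lines of Python. This is only an illustration, not the package's code: the helper names `bundle_to_tables` and `patient_rows` and the toy bundle are assumptions made for this example. The only thing taken from the FHIR format itself is that a Bundle lists its resources under `entry[].resource`, each tagged with a `resourceType`.

```python
import json
from collections import defaultdict


def bundle_to_tables(bundle):
    """Group the resources of a FHIR Bundle by their resourceType.

    Hypothetical helper for illustration; returns {resourceType: [resource, ...]}.
    """
    tables = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        rtype = resource.get("resourceType")
        if rtype is None:
            continue  # skip entries that carry no typed resource
        tables[rtype].append(resource)
    return dict(tables)


def patient_rows(patients):
    """Flatten Patient resources into simple row dictionaries (hypothetical helper)."""
    rows = []
    for p in patients:
        name = (p.get("name") or [{}])[0]  # use the first recorded name, if any
        rows.append({
            "id": p.get("id"),
            "family": name.get("family"),
            "given": " ".join(name.get("given", [])),
            "gender": p.get("gender"),
            "birthDate": p.get("birthDate"),
        })
    return rows


# Toy bundle invented for this sketch; real Synthea bundles are far larger.
SAMPLE_BUNDLE = json.loads("""
{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "p1",
                  "name": [{"family": "Doe", "given": ["Jane", "Q"]}],
                  "gender": "female", "birthDate": "1980-02-14"}},
    {"resource": {"resourceType": "Encounter", "id": "e1", "status": "finished"}},
    {"resource": {"resourceType": "Encounter", "id": "e2", "status": "finished"}}
  ]
}
""")

tables = bundle_to_tables(SAMPLE_BUNDLE)
rows = patient_rows(tables["Patient"])
```

Grouping by `resourceType` first keeps the flattening logic for each resource kind separate, which is one plausible way to reach per-resource tables.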
ClustIRR analyzes repertoires of B- and T-cell receptors. It starts by identifying communities of immune receptors with similar specificities, based on the sequences of their complementarity-determining regions (CDRs). Next, it employs a Bayesian probabilistic model to quantify differential community occupancy (DCO) between repertoires, allowing the identification of expanding or contracting communities in response to e.g. infection or cancer treatment.
The funOmics package aggregates or summarizes omics data into higher level functional representations such as GO terms, gene sets, or KEGG metabolic pathways. The aggregated data matrix represents functional activity scores that facilitate the analysis of functional molecular sets while reducing dimensionality and providing easier and faster biological interpretations. Coordinated functional activity scores can be as informative as single molecules!
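As a rough illustration of aggregating molecule-level data into set-level activity scores, the Python sketch below averages the values of the genes in each set, sample by sample. It is not the funOmics implementation: the function name `aggregate_by_set`, the choice of the mean as summary, and the toy data are all assumptions made for this example.

```python
from statistics import mean


def aggregate_by_set(expression, gene_sets, summarize=mean):
    """Summarize gene-level values into one score per gene set and sample.

    expression: {gene: [value for each sample]}
    gene_sets:  {set_name: [gene, ...]}
    summarize:  function applied to the values of a set's genes in one sample
    (Hypothetical helper for illustration only.)
    """
    scores = {}
    for set_name, genes in gene_sets.items():
        present = [g for g in genes if g in expression]  # ignore unmeasured genes
        if not present:
            continue  # no measured gene in this set, so no score
        n_samples = len(expression[present[0]])
        scores[set_name] = [
            summarize([expression[g][j] for g in present])
            for j in range(n_samples)
        ]
    return scores


# Toy data invented for this sketch: three genes measured in two samples.
expression = {"geneA": [1.0, 2.0], "geneB": [3.0, 4.0], "geneC": [5.0, 9.0]}
gene_sets = {"set1": ["geneA", "geneB"], "set2": ["geneC", "geneX"]}

scores = aggregate_by_set(expression, gene_sets)
```

The result has one row per set instead of one row per gene, which is where the reduction in dimensionality comes from. Passing a different `summarize` function (for example `statistics.median`) changes the summary without touching the rest of the code.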
This package provides a collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).
Allows access to the data found in the species list featured in the renowned List of the Birds of Peru by Plenge, M. A. (2023) <https://sites.google.com/site/boletinunop/checklist>. This publication stands as one of Peru's most comprehensive reviews of bird diversity. The dataset incorporates detailed species accounts and is structured for easy use within the R environment.
This package performs statistical testing to compare predictive models based on multiple observations of the A statistic (also known as Area Under the Receiver Operating Characteristic Curve, or AUC). Specifically, it implements a testing method based on the equivalence between the A statistic and the Wilcoxon statistic. For more information, see Hanley and McNeil (1982) <doi:10.1148/radiology.143.1.7063747>.
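The equivalence mentioned above can be checked numerically. The Python sketch below computes the A statistic in two ways: as the proportion of (positive, negative) pairs that are ranked correctly, and from the Wilcoxon rank sum via the Mann-Whitney U statistic. The function names and the example scores are assumptions made for this illustration. It shows only the equivalence, not the package's testing procedure.

```python
def auc_pairwise(pos, neg):
    """A statistic as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_rank_sum(pos, neg):
    """A statistic from the Wilcoxon rank sum of the positive scores."""
    combined = sorted(pos + neg)
    midrank = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j.
        midrank[combined[i]] = (i + 1 + j) / 2
        i = j
    rank_sum = sum(midrank[p] for p in pos)
    u = rank_sum - len(pos) * (len(pos) + 1) / 2  # Mann-Whitney U
    return u / (len(pos) * len(neg))


# Example scores invented for this sketch (one tie at 0.55).
pos = [0.9, 0.8, 0.6, 0.55]
neg = [0.7, 0.55, 0.4, 0.3]

a_pairs = auc_pairwise(pos, neg)  # 13.5 correct pairs out of 16
a_ranks = auc_rank_sum(pos, neg)  # same value via the rank sum
```

Both routes give 0.84375 for these scores.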
Bayesian seemingly unrelated regression with general variable selection and dense/sparse covariance matrix. The sparse seemingly unrelated regression is described in Bottolo et al. (2021) <doi:10.1111/rssc.12490>, the software paper is in Zhao et al. (2021) <doi:10.18637/jss.v100.i11>, and the model with random effects is described in Zhao et al. (2024) <doi:10.1093/jrsssc/qlad102>.
Use BirdNET, a state-of-the-art deep learning classifier, to automatically identify (bird) sounds. Analyze bioacoustic datasets without any computer science background using a pre-trained model or a custom trained classifier. Predict bird species occurrence based on location and week of the year. Kahl, S., Wood, C. M., Eibl, M., & Klinck, H. (2021) <doi:10.1016/j.ecoinf.2021.101236>.
Generate multivariate color palettes to represent two-dimensional or three-dimensional data in graphics (in contrast to standard color palettes that represent just one variable). You tell colors3d how to map color space onto your data, and it gives you a color for each data point. You can then use these colors to make plots in base R, ggplot2, or other graphics frameworks.
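To make the idea of mapping color space onto two variables concrete, here is a minimal Python sketch that rescales both variables to the unit square and blends four corner colors bilinearly. It is not the colors3d API: the function names, the default corner colors, and the sample points are assumptions chosen for this illustration.

```python
def rescale(values):
    """Linearly rescale values to the range 0..1 (constant input maps to 0.5)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bilinear_colors(x, y, corners=((0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 1, 1))):
    """Return one hex color per (x, y) point.

    corners holds RGB tuples (channels in 0..1) for the unit-square corners
    in the order (0,0), (1,0), (0,1), (1,1). Hypothetical helper.
    """
    c00, c10, c01, c11 = corners
    colors = []
    for u, v in zip(rescale(x), rescale(y)):
        weights = ((1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v)
        rgb = [
            sum(w * c[k] for w, c in zip(weights, (c00, c10, c01, c11)))
            for k in range(3)
        ]
        # Convert each channel to 0..255, rounding half up.
        r, g, b = (int(channel * 255 + 0.5) for channel in rgb)
        colors.append("#{:02x}{:02x}{:02x}".format(r, g, b))
    return colors


# Four corner points and the center of the data range (invented for this sketch).
x = [0, 10, 0, 10, 5]
y = [0, 0, 4, 4, 2]
palette = bilinear_colors(x, y)
```

Each data point gets its own color, so the resulting list can be passed to any plotting function that accepts per-point colors.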
Covariance structures are central to many disciplines within statistics. We provide a rich collection of geometric and inferential tools for convenient analysis of covariance structures, including distance measures, mean covariance estimators, covariance hypothesis tests for one-sample and two-sample cases, and covariance estimation. For an introduction to covariance in multivariate statistical analysis, see Schervish (1987) <doi:10.1214/ss/1177013111>.
This package provides API access to the Government of Canada Vehicle Recalls Database <https://tc.api.canada.ca/en/detail?api=VRDB> used by the Defect Investigations and Recalls Division for vehicles, tires, and child car seats. The API wrapper provides access to recall summary information searched using make, model, and year range, as well as detailed recall information searched using recall number.
Estimation of incidence and case fatality for a chronic disease, given partial information, using a multi-state model. Given data on age-specific mortality and either incidence or prevalence, Bayesian inference is used to estimate the posterior distributions of incidence, case fatality, and functions of these such as prevalence. The methods are described in Jackson et al. (2023) <doi:10.1093/jrsssa/qnac015>.
This package performs calculations with tree taper (or stem profile) equations, including model fitting. The package implements the methods from García, O. (2015) "Dynamic modelling of tree form" <http://mcfns.net/index.php/Journal/article/view/MCFNS7.1_2>. The models are parsimonious, describe well the tree bole shape over its full length, and are consistent with wood formation mechanisms through time.
The purpose of this package is to support the setup of the R environment. The two main features are autos, to automatically source files and/or directories into your environment, and paths, to consistently set path objects across projects for input and output. Both are implemented using a configuration file to allow easy, custom configurations that can be used for multiple or all projects.
The FisherEM algorithm, proposed by Bouveyron & Brunet (2012) <doi:10.1007/s11222-011-9249-9>, is an efficient method for the clustering of high-dimensional data. FisherEM models and clusters the data in a discriminative and low-dimensional latent subspace. It also provides a low-dimensional representation of the clustered data. A sparse version of the FisherEM algorithm is also provided.
Anonymized data from surveys conducted by Forwards <https://forwards.github.io/>, the R Foundation task force on women and other under-represented groups. Currently, the package contains a single data set of responses to a survey of attendees at useR! 2016 <https://www.r-project.org/useR-2016/>, the R user conference held at Stanford University, Stanford, California, USA, June 27 - June 30 2016.
The ability to tune models is important. finetune enhances the tune package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <doi:10.48550/arXiv.1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
This package provides functions and data that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.
Utilizes methods of the PyMongo Python library to initialize, insert and query GeoJson data (see <https://github.com/mongodb/mongo-python-driver> for more information on PyMongo). Furthermore, it allows the user to validate GeoJson objects and to use the console for MongoDB (bulk) commands. The reticulate package provides the R interface to Python modules, classes and functions.
Ease the transition between R vectors and markdown text. With gluedown and rmarkdown, users can create traditional vectors in R, glue those strings together with the markdown syntax, and print those formatted vectors directly to the document. This package primarily uses GitHub Flavored Markdown (GFM), an offshoot of the unambiguous CommonMark specification by John MacFarlane (2019) <https://spec.commonmark.org/>.
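The vector-to-markdown idea can be sketched in Python as a handful of small functions that each wrap every element of a list in markdown syntax. The helpers below are hypothetical and written only for this illustration; they are not the package's R functions, and the example list is invented.

```python
def md_bold(items):
    """Wrap each string in markdown strong emphasis."""
    return ["**{}**".format(item) for item in items]


def md_bullet(items, marker="*"):
    """Turn each string into one unordered list item."""
    return ["{} {}".format(marker, item) for item in items]


def md_order(items):
    """Turn each string into one numbered list item, counting from 1."""
    return ["{}. {}".format(i, item) for i, item in enumerate(items, start=1)]


# Example vector invented for this sketch.
states = ["Vermont", "Maine"]

bullets = "\n".join(md_bullet(md_bold(states)))
numbered = "\n".join(md_order(states))
```

Because every helper takes a list and returns a list of the same length, the calls compose freely before the final join that produces the markdown text.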
Data from the United States Centers for Medicare and Medicaid Services (CMS) is included in this package. There are ICD-9 and ICD-10 diagnostic and procedure codes, and lists of the chapter and sub-chapter headings and the ranges of ICD codes they encompass. There are also two sample datasets. These data are used by the icd package for finding comorbidities.
This package provides a pipeline to annotate chromatography peaks from the IDSL.IPA workflow <doi:10.1021/acs.jproteome.2c00120> with molecular formulas of a prioritized chemical space using an isotopic profile matching approach. The IDSL.UFA workflow only requires mass spectrometry level 1 (MS1) data for formula annotation. The IDSL.UFA method is described in <doi:10.1021/acs.analchem.2c00563>.
This package provides a set of tools designed to enhance transparency and understanding of date-time manipulation functions from the lubridate package. It provides detailed feedback about the operations performed by lubridate functions, allowing users to better comprehend and debug their code. These insights serve as both a learning tool for newcomers and a debugging aid for programmers working with date-time data.