This package performs statistical testing to compare predictive models based on multiple observations of the A statistic (also known as Area Under the Receiver Operating Characteristic Curve, or AUC). Specifically, it implements a testing method based on the equivalence between the A statistic and the Wilcoxon statistic. For more information, see Hanley and McNeil (1982) <doi:10.1148/radiology.143.1.7063747>.
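The equivalence mentioned above can be illustrated with a minimal Python sketch. It is not the package's interface: the function names `auc_from_pairs` and `hanley_mcneil_se` are hypothetical. The A statistic is computed as the Wilcoxon/Mann-Whitney proportion of correctly ordered (positive, negative) pairs, and the standard error uses the exponential-model approximation given by Hanley and McNeil (1982).

```python
import math


def auc_from_pairs(positives, negatives):
    """A statistic as the share of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ordering
    return wins / (len(positives) * len(negatives))


def hanley_mcneil_se(a, n_pos, n_neg):
    """Approximate standard error of A (Hanley and McNeil, 1982)."""
    q1 = a / (2.0 - a)            # two positives both outrank one negative
    q2 = 2.0 * a * a / (1.0 + a)  # one positive outranks two negatives
    variance = (
        a * (1.0 - a)
        + (n_pos - 1) * (q1 - a * a)
        + (n_neg - 1) * (q2 - a * a)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# Made-up scores, for illustration only.
scores_pos = [0.9, 0.8, 0.7, 0.6]
scores_neg = [0.65, 0.5, 0.4, 0.3]
a_stat = auc_from_pairs(scores_pos, scores_neg)
a_se = hanley_mcneil_se(a_stat, len(scores_pos), len(scores_neg))
print(a_stat, a_se)
```

The pairwise loop is quadratic in the sample size; a rank-based computation would be preferable for large data sets.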
Bayesian seemingly unrelated regression with general variable selection and dense/sparse covariance matrix. The sparse seemingly unrelated regression is described in Bottolo et al. (2021) <doi:10.1111/rssc.12490>, the software paper is in Zhao et al. (2021) <doi:10.18637/jss.v100.i11>, and the model with random effects is described in Zhao et al. (2024) <doi:10.1093/jrsssc/qlad102>.
Use BirdNET, a state-of-the-art deep learning classifier, to automatically identify (bird) sounds. Analyze bioacoustic datasets without any computer science background using a pre-trained model or a custom trained classifier. Predict bird species occurrence based on location and week of the year. Kahl, S., Wood, C. M., Eibl, M., & Klinck, H. (2021) <doi:10.1016/j.ecoinf.2021.101236>.
Generate multivariate color palettes to represent two-dimensional or three-dimensional data in graphics (in contrast to standard color palettes that represent just one variable). You tell colors3d how to map color space onto your data, and it gives you a color for each data point. You can then use these colors to make plots in base R, ggplot2, or other graphics frameworks.
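One way to picture a two-dimensional palette is bilinear interpolation between four corner colors. The Python sketch below is an assumption about the general idea, not the colors3d interface: the function `bivariate_colors` and the corner keys `ll`, `lr`, `ul`, `ur` are hypothetical names.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def rescale(values):
    """Rescale values to [0, 1]; a constant vector maps to 0.5."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.5 for v in values]


def bivariate_colors(xs, ys, corners):
    """Return one hex color per (x, y) point.

    corners maps 'll', 'lr', 'ul', 'ur' (lower/upper, left/right)
    to RGB tuples with channels in 0..255.
    """
    colors = []
    for u, v in zip(rescale(xs), rescale(ys)):
        rgb = []
        for c in range(3):
            bottom = lerp(corners["ll"][c], corners["lr"][c], u)
            top = lerp(corners["ul"][c], corners["ur"][c], u)
            rgb.append(round(lerp(bottom, top, v)))
        colors.append("#%02x%02x%02x" % tuple(rgb))
    return colors


# Hypothetical corner colors: black, red, blue, magenta.
corner_colors = {
    "ll": (0, 0, 0),
    "lr": (255, 0, 0),
    "ul": (0, 0, 255),
    "ur": (255, 0, 255),
}
palette = bivariate_colors([0, 1, 1], [0, 0, 1], corner_colors)
print(palette)
```

Each data point receives its own color, so the returned list can be passed directly to a plotting function that accepts per-point colors.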
Covariance is ubiquitous across the disciplines of statistics. We provide a rich collection of geometric and inferential tools for convenient analysis of covariance structures, covering distance measures, mean covariance estimators, covariance hypothesis tests for one-sample and two-sample cases, and covariance estimation. For an introduction to covariance in multivariate statistical analysis, see Schervish (1987) <doi:10.1214/ss/1177013111>.
This package provides API access to the Government of Canada Vehicle Recalls Database <https://tc.api.canada.ca/en/detail?api=VRDB> used by the Defect Investigations and Recalls Division for vehicles, tires, and child car seats. The API wrapper provides access to recall summary information searched using make, model, and year range, as well as detailed recall information searched using recall number.
Estimation of incidence and case fatality for a chronic disease, given partial information, using a multi-state model. Given data on age-specific mortality and either incidence or prevalence, Bayesian inference is used to estimate the posterior distributions of incidence, case fatality, and functions of these such as prevalence. The methods are described in Jackson et al. (2023) <doi:10.1093/jrsssa/qnac015>.
This package performs calculations with tree taper (or stem profile) equations, including model fitting. The package implements the methods from García, O. (2015) "Dynamic modelling of tree form" <http://mcfns.net/index.php/Journal/article/view/MCFNS7.1_2>. The models are parsimonious, describe well the tree bole shape over its full length, and are consistent with wood formation mechanisms through time.
The purpose of this package is to support the setup of the R environment. The two main features are autos, to automatically source files and/or directories into your environment, and paths, to consistently set path objects across projects for input and output. Both are implemented using a configuration file to allow easy, custom configurations that can be used for multiple or all projects.
Anonymized data from surveys conducted by Forwards <https://forwards.github.io/>, the R Foundation task force on women and other under-represented groups. Currently, the package contains a single data set of responses to a survey of attendees at useR! 2016 <https://www.r-project.org/useR-2016/>, the R user conference held at Stanford University, Stanford, California, USA, from June 27 to June 30, 2016.
The FisherEM algorithm, proposed by Bouveyron & Brunet (2012) <doi:10.1007/s11222-011-9249-9>, is an efficient method for the clustering of high-dimensional data. FisherEM models and clusters the data in a discriminative and low-dimensional latent subspace. It also provides a low-dimensional representation of the clustered data. A sparse version of the FisherEM algorithm is also provided.
The ability to tune models is important. finetune enhances the tune package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <doi:10.48550/arXiv.1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.
Ease the transition between R vectors and markdown text. With gluedown and rmarkdown, users can create traditional vectors in R, glue those strings together with the markdown syntax, and print those formatted vectors directly to the document. This package primarily uses GitHub Flavored Markdown (GFM), an offshoot of the unambiguous CommonMark specification by John MacFarlane (2019) <https://spec.commonmark.org/>.
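The idea of gluing a vector of strings together with markdown syntax can be sketched in a few lines of Python. This illustrates the concept only and does not reproduce the package's functions: `md_bullet` and `md_bold` here are hypothetical helpers.

```python
def md_bold(text):
    """Wrap a string in markdown strong-emphasis markers."""
    return f"**{text}**"


def md_bullet(items, marker="*"):
    """Turn a list of strings into a markdown bullet list, one item per line."""
    return "\n".join(f"{marker} {item}" for item in items)


states = ["Vermont", "Maine", "Ohio"]
bullet_list = md_bullet([md_bold(states[0])] + states[1:])
print(bullet_list)
```

Printed into a markdown document, the resulting string renders as a bullet list with the first item in bold.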
This package provides functions and data that support a course emphasizing statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.
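As a rough illustration of the cross-validation idea mentioned above, the Python sketch below estimates the out-of-sample squared error of a deliberately trivial predictor (the training mean). It is not taken from the package; `k_fold_indices` and `cv_mse` are hypothetical names, and the data are made up.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_mse(y, k=5, seed=0):
    """Cross-validated mean squared error of a training-mean predictor."""
    folds = k_fold_indices(len(y), k, seed)
    squared_error = 0.0
    for test_fold in folds:
        held_out = set(test_fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        prediction = sum(train) / len(train)  # fitted on training folds only
        squared_error += sum((y[i] - prediction) ** 2 for i in test_fold)
    return squared_error / len(y)


responses = [2.1, 1.9, 2.4, 2.0, 1.8, 2.2, 2.3, 1.7, 2.0, 2.6]
print(cv_mse(responses, k=5))
```

Each observation is predicted exactly once by a model that never saw it, which is what distinguishes this estimate from the in-sample error.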
Utilizes methods of the PyMongo Python library to initialize, insert and query GeoJson data (see <https://github.com/mongodb/mongo-python-driver> for more information on PyMongo). Furthermore, it allows the user to validate GeoJson objects and to use the console for MongoDB (bulk) commands. The reticulate package provides the R interface to Python modules, classes and functions.
This package creates styled tables for data presentation. Export to HTML, LaTeX, RTF, Word, Excel, and PowerPoint. Simple, modern interface to manipulate borders, size, position, captions, colours, text styles and number formatting. Table cells can span multiple rows and/or columns. Includes a huxreg function for creation of regression tables, and quick_* one-liners to print data to a new document.
Data from the United States Centers for Medicare & Medicaid Services (CMS) is included in this package. There are ICD-9 and ICD-10 diagnostic and procedure codes, and lists of the chapter and sub-chapter headings and the ranges of ICD codes they encompass. There are also two sample datasets. These data are used by the icd package for finding comorbidities.
This package provides a pipeline to annotate chromatography peaks from the IDSL.IPA workflow <doi:10.1021/acs.jproteome.2c00120> with molecular formulas of a prioritized chemical space using an isotopic profile matching approach. The IDSL.UFA workflow only requires mass spectrometry level 1 (MS1) data for formula annotation. The IDSL.UFA method is described in <doi:10.1021/acs.analchem.2c00563>.
This package provides code management functions, NLP tools, a Monty Hall simulator, and an implementation of the author's own variable reduction technique called Feed Reduction. The Feed Reduction technique is not yet published; it is a tool for implementing a series of binary neural networks meant for reducing data into N dimensions, where N is the number of possible values of the response variable.
This package provides a set of tools designed to enhance transparency and understanding of date-time manipulation functions from the lubridate package. It provides detailed feedback about the operations performed by lubridate functions, allowing users to better comprehend and debug their code. These insights serve as both a learning tool for newcomers and a debugging aid for programmers working with date-time data.
Semi-parametric approach for sparse canonical correlation analysis which can handle mixed data types: continuous, binary and truncated continuous. Bridge functions are provided to connect Kendall's tau to latent correlation under the Gaussian copula model. The methods are described in Yoon, Carroll and Gaynanova (2020) <doi:10.1093/biomet/asaa007> and Yoon, Mueller and Gaynanova (2021) <doi:10.1080/10618600.2021.1882468>.
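For a pair of continuous variables under the Gaussian copula model, the classical bridge function is r = sin(π τ / 2), where τ is Kendall's tau. The Python sketch below shows only this continuous/continuous case; the bridge functions for binary and truncated variables are different and are not shown. The function names are hypothetical and are not the package's interface.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)  # +1, -1, or 0 for a tie
    return score / (n * (n - 1) / 2)


def latent_correlation_continuous(tau):
    """Bridge from Kendall's tau to latent correlation (continuous pair)."""
    return math.sin(math.pi * tau / 2.0)


# Made-up data, for illustration only.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)
r_latent = latent_correlation_continuous(tau)
print(tau, r_latent)
```

Because Kendall's tau depends only on ranks, the estimate is unchanged by monotone transformations of either variable, which is what makes the approach semi-parametric.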
An interface to build machine learning models for classification and regression problems. mikropml implements the ML pipeline described by Topçuoğlu et al. (2020) <doi:10.1128/mBio.00434-20> with reasonable default options for data preprocessing, hyperparameter tuning, cross-validation, testing, model evaluation, and interpretation steps. See the website <https://www.schlosslab.org/mikropml/> for more information, documentation, and examples.
This package provides a comprehensive implementation of Petersen-type estimators and their many variants for two-sample capture-recapture studies. A conditional likelihood approach is used that allows for tag loss; non-reporting of tags; reward tags; categorical, geographical and temporal stratification; partial stratification; reverse capture-recapture; and continuous variables in modeling the probability of capture. Many examples from fisheries management are presented.
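For orientation, the simplest member of this family is the closed-form Lincoln-Petersen estimator, N = n1 * n2 / m2, together with Chapman's bias-adjusted variant. The Python sketch below shows these textbook formulas only; it does not reproduce the package's conditional likelihood approach, and the function names and counts are hypothetical.

```python
def lincoln_petersen(n1, n2, m2):
    """Population estimate from n1 marked, n2 recaptured, m2 marked recaptures."""
    if m2 == 0:
        raise ValueError("no marked animals recaptured; estimate is undefined")
    return n1 * n2 / m2


def chapman(n1, n2, m2):
    """Chapman's bias-adjusted variant; defined even when m2 is zero."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1


# Made-up counts: 100 fish tagged, 80 caught later, 20 of those carry tags.
print(lincoln_petersen(100, 80, 20))
print(chapman(100, 80, 20))
```

Both formulas assume a closed population, no tag loss, and equal catchability, which are the assumptions the stratified and likelihood-based variants are designed to relax.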
Makes it easy to push data to Power BI using R and the Power BI REST APIs (see <https://docs.microsoft.com/en-us/rest/api/power-bi/>). A set of functions for turning data frames into Power BI datasets and refreshing these datasets is provided. Administrative tasks such as monitoring refresh statuses and pulling metadata about workspaces and users are also supported.