Duplicated restaurant data (pre-processed and formatted) for entity resolution. This package contains formatted restaurant data, with the Zagats portion containing 331 records and the Fodors portion containing 533 records. The following variables are included in the data set: id, name, address, city, phone, type. A corresponding gold data set identifies which records match, based on id.
This package provides a collection of functions related to the study of etiologic heterogeneity both across disease subtypes and across individual disease markers. The included functions allow one to quantify the extent of etiologic heterogeneity in the context of a case-control study, and provide p-values to test for etiologic heterogeneity across individual risk factors. Begg CB, Zabor EC, Bernstein JL, Bernstein L, Press MF, Seshan VE (2013) <doi:10.1002/sim.5902>.
Set chunk hooks for R Markdown documents <https://rmarkdown.rstudio.com/>, and improve user experience. For example, change units of figure sizes, benchmark chunks, and number lines on code blocks.
Obtain spatial depths, spatial ranks and outliers of multivariate random variables. Also visualizes DD-plots (a multivariate generalization of QQ-plots).
This package provides functions for converting objects to scalars (vectors of length 1) and a more inclusive definition of data that can be interpreted as numbers (numeric and complex alike).
Create an interactive function map by analyzing a specified R script. It uses the find_dependencies() function from the functiondepends package to recursively trace all user-defined function dependencies.
Bindings to libfluidsynth to parse and synthesize MIDI files. It can read MIDI into a data frame, play it on the local audio device, or convert into an audio file.
Adds flow maps to ggplot2 plots. The flow maps consist of ggplot2 layers which visualize the nodes as circles and the bilateral flows between the nodes as bidirectional half-arrows.
Download and process public domain works in the Project Gutenberg collection <https://www.gutenberg.org/>. Includes metadata for all Project Gutenberg works, so that they can be searched and retrieved.
Use GTFS (General Transit Feed Specification) data for routing from nominated start and end stations, for extracting isochrones, and for calculating travel times from any nominated start station to all other stations.
Makes it easy to extract and combine variables from the HILDA (Household, Income and Labour Dynamics in Australia) survey maintained by the Melbourne Institute <https://melbourneinstitute.unimelb.edu.au/hilda>.
R interface to access the web services of the ICES (International Council for the Exploration of the Sea) DATRAS trawl survey database <https://datras.ices.dk/WebServices/Webservices.aspx>.
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
An API wrapper for the Monash University Probabilistic Footy Tipping Competition <https://probabilistic-footy.monash.edu/~footy/index.shtml>. Allows users to submit tips directly to the competition from R.
Implementation of two p-value combination techniques (inverse normal and Fisher methods). A vignette is provided to explain how to perform a meta-analysis from two independent RNA-seq experiments.
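As a rough illustration of the two combination techniques named above, the following Python sketch combines one-sided p-values from independent studies. It is not the package's implementation: the function names `fisher_combine` and `inverse_normal_combine` and the optional `weights` argument are assumptions made for this example, and the inverse normal variant shown is the unweighted or weighted Stouffer form.

```python
import math
from statistics import NormalDist


def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p_i) follows a chi-square with 2k df under H0."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # half of the Fisher statistic
    # Closed-form chi-square survival function for an even number (2k) of
    # degrees of freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def inverse_normal_combine(pvalues, weights=None):
    """Inverse normal (Stouffer) method; p-values must lie strictly in (0, 1)."""
    if weights is None:
        weights = [1.0] * len(pvalues)  # hypothetical default: equal weights
    nd = NormalDist()
    # Map each p-value to a z-score, then form a weighted, rescaled sum.
    z = sum(w * nd.inv_cdf(1.0 - p) for w, p in zip(weights, pvalues))
    z /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z)


if __name__ == "__main__":
    pvals = [0.05, 0.05]
    print(fisher_combine(pvals))
    print(inverse_normal_combine(pvals))
```

Both functions assume the input p-values come from independent tests of the same hypothesis; in a real RNA-seq meta-analysis they would be applied gene by gene.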
The Needleman-Wunsch global alignment algorithm can be used to find approximate matches between sample names in different data sets. See Wang et al. (2010) <doi:10.4137/CIN.S5613>.
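To make the idea concrete, here is a minimal Python sketch of Needleman-Wunsch global alignment scoring applied to name matching. It is an illustration only, not the package's code: the scoring values (match +1, mismatch -1, gap -1), the case-folding step, and the names `nw_score` and `best_match` are all assumptions chosen for this example.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b."""
    # prev holds the previous row of the dynamic-programming table.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # aligning a[:i] against the empty prefix of b
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def best_match(name, candidates):
    """Pick the candidate with the highest alignment score (case-insensitive)."""
    return max(candidates, key=lambda c: nw_score(name.lower(), c.lower()))


if __name__ == "__main__":
    print(best_match("sample_1", ["Sample1", "sample2x", "control"]))
```

In practice a minimum score threshold would usually be added so that names with no plausible counterpart are left unmatched rather than paired with the least bad candidate.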
Tokenizers break text into pieces that are more usable by machine learning models. Many tokenizers share some preparation steps. This package provides those shared steps, along with a simple tokenizer.
Generation of count (assuming Poisson distribution) and continuous data (using Fleishman polynomials) simultaneously. The details of the method are explained in Demirtas et al. (2012) <doi:10.1002/sim.5362>.
This package provides a series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.
This package provides methods for sampling contact matrices from diary data for use in infectious disease modelling, as discussed in Mossong et al. (2008) <doi:10.1371/journal.pmed.0050074>.
This package provides a small set of functions wrapping up the call stack and command line inspection needed to determine a running script's filename from within the script itself.
Two- and three-dimensional morphometric maps of enamel and dentine thickness and multivariate analysis. Volume calculation of dental materials. Principal component analysis of thickness maps with associated morphometric map variations.
This package contains functions to standardize tracheid profiles using the traditional method (Vaganov) and a new method to standardize tracheidograms based on the relative position of tracheids within tree rings.
This package provides tools to visualize oligonucleotide patterns and sequence motif occurrences across a large set of sequences centred at a common reference point and sorted by a user defined feature.