This package provides functions that make it easy to reveal ggplot2 graphs incrementally. The functions take a plot produced with ggplot2 and return a list of plots showing data incrementally by panels, layers, groups, the values in an axis or any arbitrary aesthetic.
Statistical analysis of monthly background checks of gun purchases for the New York Times story "What Drives Gun Sales: Terrorism, Obama and Calls for Restrictions" at <http://www.nytimes.com/interactive/2015/12/10/us/gun-sales-terrorism-obama-restrictions.html?> is provided.
Fit joint models of survival and multivariate longitudinal data. The longitudinal data is specified by generalised linear mixed models. The joint models are fit via maximum likelihood using an approximate expectation maximisation algorithm. Bernhardt (2015) <doi:10.1016/j.csda.2014.11.011>.
An extension of ggplot2 to provide quiver plots to visualise vector fields. This functionality is implemented using a geom to produce a new graphical layer, which allows aesthetic options. This layer can be overlaid on a map to improve visualisation of mapped data.
Enables the user to find IP addresses that are used as VPN anonymizers, open proxies, web proxies and Tor exits. The package looks up the proxy IP address in the IP2Proxy BIN Data file. You may visit <https://lite.ip2location.com> for a free database download.
This package provides string similarity calculations inspired by the Python thefuzz package. Compare strings by edit distance, similarity ratio, best matching substring, ordered token matching and set-based token matching. A range of edit distance measures are available thanks to the stringdist package.
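As a rough illustration of two of the comparisons named above (similarity ratio and ordered token matching), here is a minimal Python sketch built only on the standard library's `difflib`. The function names `similarity_ratio` and `token_sort_ratio` are hypothetical and chosen for this example. They are not the API of the R package, of thefuzz, or of stringdist, and `difflib`'s matching-block ratio is a stand-in for whatever edit distance measure those packages use.

```python
from difflib import SequenceMatcher


def similarity_ratio(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] based on difflib's matching blocks."""
    return SequenceMatcher(None, a, b).ratio()


def token_sort_ratio(a: str, b: str) -> float:
    """Compare two strings after lowercasing and sorting their tokens.

    Sorting the whitespace-separated tokens makes the score insensitive
    to word order (an assumption-level sketch of ordered token matching).
    """
    norm_a = " ".join(sorted(a.lower().split()))
    norm_b = " ".join(sorted(b.lower().split()))
    return similarity_ratio(norm_a, norm_b)


if __name__ == "__main__":
    print(similarity_ratio("kitten", "sitting"))
    print(token_sort_ratio("new york mets", "Mets York New"))
```

Sorting tokens before comparison means that two strings containing the same words in a different order score as identical, which a plain character-level ratio would not do.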
This package implements the computation of discrepancy statistics summarizing differences between the density of imputed and observed values and the construction of weights to balance covariates that are part of the missing data mechanism as described in Marbach (2021) <arXiv:2107.05427>.
This package contains data from the May 2021 Occupational Employment and Wage Statistics data release from the U.S. Bureau of Labor Statistics. The dataset covers employment and wages across occupations, industries, states, and at the national level. Metropolitan data is not included.
This package contains data from the May 2020 Occupational Employment and Wage Statistics data release from the U.S. Bureau of Labor Statistics. The dataset covers employment and wages across occupations, industries, states, and at the national level. Metropolitan data is not included.
This package provides methods for reducing the number of features within a data set. See Bauer JO (2021) <doi:10.1145/3475827.3475832> and Bauer JO, Drabant B (2021) <doi:10.1016/j.jmva.2021.104754> for more information on principal loading analysis.
The sinaplot is a data visualization chart suitable for plotting any single variable in a multiclass data set. It is an enhanced jitter strip chart, where the width of the jitter is controlled by the density distribution of the data within each class.
Simplifies regression tests by comparing objects produced by test code with earlier versions of those same objects. If objects are unchanged the tests pass, otherwise execution stops with error details. If in interactive mode, tests can be reviewed through the provided interactive environment.
Convert, validate, format and elegantly print geographic coordinates and waypoints (paired latitude and longitude values) in decimal degrees, degrees and minutes, and degrees, minutes and seconds using high performance C++ code to enable rapid conversion and formatting of large coordinate and waypoint datasets.
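The conversion from decimal degrees to degrees, minutes and seconds described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `to_dms` and `format_latitude`, the output format, and the rounding to hundredths of a second are assumptions made for this example, not the package's C++ implementation or its interface.

```python
def to_dms(value: float) -> tuple[int, int, float]:
    """Split the magnitude of a decimal-degree value into (degrees, minutes, seconds).

    Works in integer hundredths of a second so that rounding can never
    produce a seconds field of 60.00.
    """
    total_cs = round(abs(value) * 360000)        # hundredths of an arc second
    degrees, rem = divmod(total_cs, 360000)
    minutes, cs = divmod(rem, 6000)
    return degrees, minutes, cs / 100


def format_latitude(value: float) -> str:
    """Validate a latitude and format it as degrees, minutes, seconds plus hemisphere."""
    if not -90.0 <= value <= 90.0:
        raise ValueError(f"latitude out of range: {value}")
    degrees, minutes, seconds = to_dms(value)
    hemisphere = "S" if value < 0 else "N"
    return f"{degrees}°{minutes:02d}'{seconds:05.2f}\"{hemisphere}"


if __name__ == "__main__":
    print(format_latitude(51.5))
    print(format_latitude(-0.25))
```

The sign is carried by the hemisphere letter rather than by the degrees field, which avoids ambiguity for values between -1 and 0 degrees.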
Read and write XES Files to create event log objects used by the bupaR framework. XES (Extensible Event Stream) is the `IEEE` standard for storing and sharing event data (see <http://standards.ieee.org/findstds/standard/1849-2016.html> for more info).
Borealis is an R library performing outlier analysis for count-based bisulfite sequencing data. It detects outlier methylated CpG sites from bisulfite sequencing (BS-seq). The core of Borealis is modeling Beta-Binomial distributions. This can be useful for rare disease diagnoses.
Doscheda focuses on quantitative chemoproteomics used to determine protein interaction profiles of small molecules from whole cell or tissue lysates using Mass Spectrometry data. The package provides a shiny application to run the pipeline, several visualisations and a downloadable report of an experiment.
Testing individual SNPs, as well as arbitrarily large groups of SNPs in GWA studies, using a joint model of all SNPs. The method controls the FWER, and provides an automatic, data-driven refinement of the SNP clusters to smaller groups or single markers.
This package performs weighted p-value based multiple hypothesis tests and provides corresponding information such as ranking probability, weight, significant tests, etc. To conduct this testing procedure, the method applies a probabilistic relationship between the test rank and the corresponding test effect size.
The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.
This package provides utilities based on libpoppler for extracting text, fonts, attachments and metadata from a PDF file. It also supports high quality rendering of PDF documents into PNG, JPEG, TIFF format, or into raw bitmap vectors for further processing in R.
The goal of this method is to identify associations between bacteria and an environmental variable in 16S or other compositional data. The environmental variable is any variable which is measured for each microbiome sample, for example, a butyrate measurement paired with every sample in the data. Microbiome data is compositional, meaning that the total abundance of each sample sums to 1, and this introduces severe statistical distortions. This method takes a Bayesian approach to correcting for these statistical distortions, in which the total abundance is treated as an unknown variable. This package runs the Python implementation using reticulate.
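To make the compositional constraint concrete, the following Python sketch normalizes raw counts so that each sample sums to 1. The function name `close_composition` and the example counts are hypothetical and are not taken from the package. The sketch shows only why total abundance is lost in compositional data, not the Bayesian correction itself.

```python
def close_composition(counts: list[float]) -> list[float]:
    """Rescale non-negative counts to proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive total")
    return [c / total for c in counts]


if __name__ == "__main__":
    # Two hypothetical samples: the second has twice the absolute abundance
    # of every taxon, yet both yield the same composition.
    sample_a = close_composition([10, 30, 60])
    sample_b = close_composition([20, 60, 120])
    print(sample_a)
    print(sample_b)
```

Because both samples map to the same proportions, the observed composition alone cannot distinguish them, which is the reason the total abundance has to be treated as an unknown.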
The Refugee Population Statistics Database published by The Office of The United Nations High Commissioner for Refugees (UNHCR) contains information about forcibly displaced populations spanning more than 70 years of statistical activities. It covers displaced populations such as refugees, asylum-seekers and internally displaced people, including their demographics. Stateless people are also included, most of whom have never been displaced. The database also reflects the different types of solutions for displaced populations such as repatriation or resettlement. More information on the data and methodology can be found on the UNHCR Refugee Data Finder <https://www.unhcr.org/refugee-statistics/>.
Simulation of random orthonormal matrices from linear and quadratic exponential family distributions on the Stiefel manifold. The most general type of distribution covered is the matrix-variate Bingham-von Mises-Fisher distribution. Most of the simulation methods are presented in Hoff (2009) "Simulation of the Matrix Bingham-von Mises-Fisher Distribution, With Applications to Multivariate and Relational Data" <doi:10.1198/jcgs.2009.07177>. The package also includes functions for optimization on the Stiefel manifold based on algorithms described in Wen and Yin (2013) "A feasible method for optimization with orthogonality constraints" <doi:10.1007/s10107-012-0584-1>.
This package contains functions to create regulatory-style statistical reports. Originally designed to create tables, listings, and figures for the pharmaceutical, biotechnology, and medical device industries, these reports are generalized enough that they could be used in any industry. Generates text, rich-text, PDF, HTML, and Microsoft Word file formats. The package specializes in printing wide and long tables with automatic page wrapping and splitting. Reports can be produced with a minimum of function calls, and without relying on other table packages. The package supports titles, footnotes, page headers, page footers, spanning headers, page by variables, and automatic page numbering.