The goal of this method is to identify associations between bacteria and an environmental variable in 16S or other compositional data. The environmental variable is any variable which is measure for each microbiome sample, for example, a butyrate measurement paired with every sample in the data. Microbiome data is compositional, meaning that the total abundance of each sample sums to 1, and this introduces severe statistical distortions. This method takes a Bayesian approach to correcting for these statistical distortions, in which the total abundance is treated as an unknown variable. This package runs the python implementation using reticulate.
The Refugee Population Statistics Database published by The Office of The United Nations High Commissioner for Refugees (UNHCR) contains information about forcibly displaced populations spanning more than 70 years of statistical activities. It covers displaced populations such as refugees, asylum-seekers and internally displaced people, including their demographics. Stateless people are also included, most of who have never been displaced. The database also reflects the different types of solutions for displaced populations such as repatriation or resettlement. More information on the data and methodology can be found on the UNHCR Refugee Data Finder <https://www.unhcr.org/refugee-statistics/>.
Simulation of random orthonormal matrices from linear and quadratic exponential family distributions on the Stiefel manifold. The most general type of distribution covered is the matrix-variate Bingham-von Mises-Fisher distribution. Most of the simulation methods are presented in Hoff(2009) "Simulation of the Matrix Bingham-von Mises-Fisher Distribution, With Applications to Multivariate and Relational Data" <doi:10.1198/jcgs.2009.07177>. The package also includes functions for optimization on the Stiefel manifold based on algorithms described in Wen and Yin (2013) "A feasible method for optimization with orthogonality constraints" <doi:10.1007/s10107-012-0584-1>.
This package contains functions to create regulatory-style statistical reports. Originally designed to create tables, listings, and figures for the pharmaceutical, biotechnology, and medical device industries, these reports are generalized enough that they could be used in any industry. Generates text, rich-text, PDF, HTML, and Microsoft Word file formats. The package specializes in printing wide and long tables with automatic page wrapping and splitting. Reports can be produced with a minimum of function calls, and without relying on other table packages. The package supports titles, footnotes, page header, page footers, spanning headers, page by variables, and automatic page numbering.
This package provides host-side tooling and Rust bindings to interact with LPC55 chips via the ROM bootloader.
On the Guix System, simply installing this package will not have any effect. It is meant to be passed to the udev
service.
This package provides a collection of functions to extract citation information from R packages and to deal with files in citation file format (<https://citation-file-format.github.io/>), extending the functionality already provided by the citation()
function in the utils package.
Have you ever been tempted to create roxygen2'-style documentation comments for one of your functions that was not part of one of your packages (yet)? This is exactly what this package is about: running roxygen2 on (chunks of) a single code file.
This package provides functions for the Bayesian analysis of extreme value models, using Markov chain Monte Carlo methods. Allows the construction of both uninformative and informed prior distributions for common statistical models applied to extreme event data, including the generalized extreme value distribution.
An interface to the core Familias functions which are programmed in C++. The implementation is described in Egeland, Mostad and Olaisen (1997) <doi:10.1016/S1355-0306(97)72202-0> and Simonsson and Mostad (2016) <doi:10.1016/j.fsigen.2016.04.005>.
An extension of ggplot2 to provide quiver plots to visualise vector fields. This functionality is implemented using a geom to produce a new graphical layer, which allows aesthetic options. This layer can be overlaid on a map to improve visualisation of mapped data.
Statistical analysis of monthly background checks of gun purchases for the New York Times story "What Drives Gun Sales: Terrorism, Obama and Calls for Restrictions" at <http://www.nytimes.com/interactive/2015/12/10/us/gun-sales-terrorism-obama-restrictions.html?> is provided.
This package provides a syntax to create and combine Gaussian process kernels in greta'. You can then them to define either full rank or sparse Gaussian processes. This is an extension to the greta software, Golding (2019) <doi:10.21105/joss.01601>.
This package provides functions that make it easy to reveal ggplot2 graphs incrementally. The functions take a plot produced with ggplot2 and return a list of plots showing data incrementally by panels, layers, groups, the values in an axis or any arbitrary aesthetic.
Fit joint models of survival and multivariate longitudinal data. The longitudinal data is specified by generalised linear mixed models. The joint models are fit via maximum likelihood using an approximate expectation maximisation algorithm. Bernhardt (2015) <doi:10.1016/j.csda.2014.11.011>.
Enable user to find the IP addresses which are used as VPN anonymizer, open proxies, web proxies and Tor exits. The package lookup the proxy IP address from IP2Proxy BIN Data file. You may visit <https://lite.ip2location.com> for free database download.
This package provides string similarity calculations inspired by the Python thefuzz package. Compare strings by edit distance, similarity ratio, best matching substring, ordered token matching and set-based token matching. A range of edit distance measures are available thanks to the stringdist package.
This package provides tools to retrieve and summarize taxonomic information and synonymy data for reptile species using data scraped from The Reptile Database website (<https://reptile-database.reptarium.cz/>). Outputs include clean and structured data frames useful for ecological, evolutionary, and conservation research.
This package implements the computation of discrepancy statistics summarizing differences between the density of imputed and observed values and the construction of weights to balance covariates that are part of the missing data mechanism as described in Marbach (2021) <arXiv:2107.05427>
.
This package contains data from the May 2020 Occupational Employment and Wage Statistics data release from the U.S. Bureau of Labor Statistics. The dataset covers employment and wages across occupations, industries, states, and at the national level. Metropolitan data is not included.
This package contains data from the May 2021 Occupational Employment and Wage Statistics data release from the U.S. Bureau of Labor Statistics. The dataset covers employment and wages across occupations, industries, states, and at the national level. Metropolitan data is not included.
This package provides methods for reducing the number of features within a data set. See Bauer JO (2021) <doi:10.1145/3475827.3475832> and Bauer JO, Drabant B (2021) <doi:10.1016/j.jmva.2021.104754> for more information on principal loading analysis.
The sinaplot is a data visualization chart suitable for plotting any single variable in a multiclass data set. It is an enhanced jitter strip chart, where the width of the jitter is controlled by the density distribution of the data within each class.
Simplifies regression tests by comparing objects produced by test code with earlier versions of those same objects. If objects are unchanged the tests pass, otherwise execution stops with error details. If in interactive mode, tests can be reviewed through the provided interactive environment.
Convert, validate, format and elegantly print geographic coordinates and waypoints (paired latitude and longitude values) in decimal degrees, degrees and minutes, and degrees, minutes and seconds using high performance C++ code to enable rapid conversion and formatting of large coordinate and waypoint datasets.