Enables users to identify IP addresses that are used as VPN anonymizers, open proxies, web proxies and Tor exits. The package looks up the proxy IP address in an IP2Proxy BIN data file. A free database can be downloaded from <https://lite.ip2location.com>.
This package provides string similarity calculations inspired by the Python thefuzz package. Compare strings by edit distance, similarity ratio, best matching substring, ordered token matching and set-based token matching. A range of edit distance measures are available thanks to the stringdist package.
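As a rough illustration of the measures named above, the following Python sketch computes a Levenshtein edit distance, a length-normalized similarity ratio, and an ordered-token ratio. The function names and the normalization (one minus distance divided by the longer length) are assumptions made for this example; they are not the package's API, and thefuzz itself normalizes differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; this normalization is an assumption of the sketch."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


def token_sort_ratio(a: str, b: str) -> float:
    """Compare strings after lower-casing and sorting their whitespace tokens."""
    def normalize(s: str) -> str:
        return " ".join(sorted(s.lower().split()))
    return similarity_ratio(normalize(a), normalize(b))
```

Sorting tokens before comparison makes the score insensitive to word order, so "new york mets" and "mets york new" compare as identical.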
This package provides tools to retrieve and summarize taxonomic information and synonymy data for reptile species using data scraped from The Reptile Database website (<https://reptile-database.reptarium.cz/>). Outputs include clean and structured data frames useful for ecological, evolutionary, and conservation research.
This package implements the computation of discrepancy statistics summarizing differences between the density of imputed and observed values and the construction of weights to balance covariates that are part of the missing data mechanism as described in Marbach (2021) <arXiv:2107.05427>.
This package contains data from the May 2021 Occupational Employment and Wage Statistics data release from the U.S. Bureau of Labor Statistics. The dataset covers employment and wages across occupations, industries, states, and at the national level. Metropolitan data is not included.
This package contains data from the May 2020 Occupational Employment and Wage Statistics data release from the U.S. Bureau of Labor Statistics. The dataset covers employment and wages across occupations, industries, states, and at the national level. Metropolitan data is not included.
This package provides methods for reducing the number of features within a data set. See Bauer JO (2021) <doi:10.1145/3475827.3475832> and Bauer JO, Drabant B (2021) <doi:10.1016/j.jmva.2021.104754> for more information on principal loading analysis.
The sinaplot is a data visualization chart suitable for plotting any single variable in a multiclass data set. It is an enhanced jitter strip chart, where the width of the jitter is controlled by the density distribution of the data within each class.
Convert, validate, format and elegantly print geographic coordinates and waypoints (paired latitude and longitude values) in decimal degrees, degrees and minutes, and degrees, minutes and seconds using high performance C++ code to enable rapid conversion and formatting of large coordinate and waypoint datasets.
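To make the conversion concrete, here is a minimal Python sketch that validates a decimal-degree value and formats it as degrees, minutes and seconds. The function name `format_dms`, its parameters, and the output layout are hypothetical choices for this example and do not reflect the package's C++ implementation or its interface.

```python
def format_dms(value: float, is_latitude: bool) -> str:
    """Format decimal degrees as D°MM'SS.ss" with a hemisphere letter."""
    limit = 90.0 if is_latitude else 180.0
    if not -limit <= value <= limit:
        raise ValueError(f"coordinate {value} outside [-{limit}, {limit}]")

    if is_latitude:
        hemisphere = "S" if value < 0 else "N"
    else:
        hemisphere = "W" if value < 0 else "E"

    # Work in integer hundredths of an arc second so that rounding
    # cannot produce outputs such as 59'60.00".
    total = round(abs(value) * 360000)
    degrees, remainder = divmod(total, 360000)
    minutes, hundredths = divmod(remainder, 6000)
    seconds = hundredths / 100
    return f"{degrees}°{minutes:02d}'{seconds:05.2f}\"{hemisphere}"
```

Carrying the hemisphere as a letter rather than a signed degree value keeps the sign for coordinates between -1 and 0 degrees, where the integer degree part is zero.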
Read and write XES Files to create event log objects used by the bupaR framework. XES (Extensible Event Stream) is the `IEEE` standard for storing and sharing event data (see <http://standards.ieee.org/findstds/standard/1849-2016.html> for more info).
This package performs robust estimation and inference when using covariate adjustment and/or covariate-adaptive randomization in randomized clinical trials. Ting Ye, Jun Shao, Yanyao Yi, Qinyuan Zhao (2023) <doi:10.1080/01621459.2022.2049278>. Ting Ye, Marlena Bannick, Yanyao Yi, Jun Shao (2023) <doi:10.1080/24754269.2023.2205802>. Ting Ye, Jun Shao, Yanyao Yi (2023) <doi:10.1093/biomet/asad045>. Marlena Bannick, Jun Shao, Jingyi Liu, Yu Du, Yanyao Yi, Ting Ye (2024) <doi:10.1093/biomet/asaf029>. Xiaoyu Qiu, Yuhan Qian, Jaehwan Yi, Jinqiu Wang, Yu Du, Yanyao Yi, Ting Ye (2025) <doi:10.48550/arXiv.2408.12541>.
This package implements the algorithm by Pourahmadi and Wang (2015) <doi:10.1016/j.spl.2015.06.015> for generating a random p x p correlation matrix. Briefly, the idea is to represent the correlation matrix using Cholesky factorization and p(p-1)/2 hyperspherical coordinates (i.e., angles), sample the angles from a particular distribution and then convert to the standard correlation matrix form. The angles are sampled from a distribution with pdf proportional to sin^k(theta) (0 < theta < pi, k >= 1) using the efficient sampling algorithm described in Enes Makalic and Daniel F. Schmidt (2018) <arXiv:1809.05212>.
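The construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the package's code: the angles are drawn by plain rejection sampling rather than the efficient sampler of Makalic and Schmidt, and the exponent k = p - j for the angle in column j (1-based) is assumed here. All function names are hypothetical.

```python
import math
import random


def sample_angle(k: int, rng: random.Random) -> float:
    """Draw theta in (0, pi) with density proportional to sin(theta)**k.

    Simple rejection sampling from a uniform proposal; a stand-in for
    the more efficient sampler cited in the description.
    """
    while True:
        theta = rng.uniform(0.0, math.pi)
        if rng.random() < math.sin(theta) ** k:
            return theta


def cholesky_from_angles(angles):
    """Build a lower-triangular factor with unit-norm rows.

    angles[i] holds the i angles of row i (row 0 is empty).
    """
    p = len(angles)
    factor = [[0.0] * p for _ in range(p)]
    for i in range(p):
        sin_prod = 1.0  # running product of sines along the row
        for j in range(i):
            factor[i][j] = math.cos(angles[i][j]) * sin_prod
            sin_prod *= math.sin(angles[i][j])
        factor[i][i] = sin_prod
    return factor


def random_correlation_matrix(p: int, seed=None):
    """Return a p x p correlation matrix as a list of lists."""
    rng = random.Random(seed)
    # Assumed exponent: k = p - j for 1-based column j, i.e. p - (j + 1) here.
    angles = [[sample_angle(p - (j + 1), rng) for j in range(i)]
              for i in range(p)]
    factor = cholesky_from_angles(angles)
    return [[sum(factor[i][k] * factor[j][k] for k in range(p))
             for j in range(p)] for i in range(p)]
```

Because each row of the factor has unit Euclidean norm, the product of the factor with its transpose has ones on the diagonal, and every off-diagonal entry is an inner product of unit vectors and therefore lies in [-1, 1].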
This package offers simple statistical identification of contaminating sequence features in marker-gene or metagenomics data. It works on any kind of feature derived from environmental sequencing data (e.g. ASVs, OTUs, taxonomic groups, MAGs, etc). Requires DNA quantitation data or sequenced negative control samples.
This package helps you create simple maps; add sub-plots like pie plots to a map or any other plot; format, plot and export gridded data. The package was developed for displaying fisheries data but most functions can be used for more generic data visualisation.
This package provides a solution for analyzing digital images of plankton. In combination with ImageJ, an image analysis system, it processes digital images, measures individuals, trains for automatic classification of taxa, and finally, measures plankton samples (abundances, total and partial size spectra or biomasses, etc.).
This package simplifies regression tests by comparing objects produced by test code with earlier versions of those same objects. If objects are unchanged the tests pass, otherwise execution stops with error details. If in interactive mode, tests can be reviewed through the provided interactive environment.
Hapassoc performs likelihood inference of trait associations with haplotypes and other covariates in generalized linear models (GLMs). The functions are developed primarily for data collected in cohort or cross-sectional studies. They can accommodate uncertain haplotype phase and handle missing genotypes at some SNPs.
genArise is an easy-to-use tool for dual color microarray data. Its GUI-Tk based environment lets any inexperienced user perform a basic, but not simple, data analysis just by following a wizard. In addition, it provides some tools for the developer.
This package provides easy data analysis and quality checks that are commonly used in data science. It combines tabular and graphical visualization for ease of use. It also creates an R Notebook with detailed data exploration from a single function call. The notebook can be made interactive.
This package provides a hodgepodge of hopefully helpful functions. Two of these perform shrinkage estimation: one using a simple weighted method where the user can specify the degree of shrinkage required, and one using James-Stein shrinkage estimation for the case of unequal variances.
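For illustration only, a simple weighted shrinkage with a user-specified degree of shrinkage might look like the Python sketch below. Shrinking toward the grand mean, along with the function name and signature, is an assumption of this example; the description does not state the package's shrinkage target or interface, and the James-Stein estimator is not shown.

```python
def shrink_toward_mean(values, weight):
    """Pull each value toward the grand mean by a fixed weight in [0, 1].

    weight = 0 leaves the values unchanged; weight = 1 replaces every
    value with the grand mean. The grand-mean target is an assumption.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if not values:
        return []
    grand_mean = sum(values) / len(values)
    return [weight * grand_mean + (1.0 - weight) * v for v in values]
```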
Bayesian analysis of multivariate receptor modeling. The package consists of implementations of the methods of Park and Oh (2015) <doi:10.1016/j.chemolab.2015.08.021>. The package uses JAGS (Just Another Gibbs Sampler) to generate Markov chain Monte Carlo samples of parameters.
We provide a tidy data structure and visualisations for multiple or grouped variable correlations, general association measures, scagnostics and other pairwise scores suitable for numerical, ordinal and nominal variables. Supported measures include distance correlation, maximal information, ace correlation, Kendall's tau, and polychoric correlation.
Corbae-Ouliaris frequency domain filtering. According to Corbae and Ouliaris (2006) <doi:10.1017/CBO9781139164863.008>, this filter extracts cycles, such as business cycles, from time series. The method is valid for both stationary and non-stationary time series.