Despite there being a section in RFC 7231 <https://tools.ietf.org/html/rfc7231#section-5.5.3> defining a suggested structure for User-Agent headers, this data is notoriously difficult to parse consistently. Tools are provided that take in user agent strings and return structured R objects. This is a V8-backed package based on the ua-parser project <https://github.com/ua-parser>.
This package predicts a drug’s primary target(s) or secondary target(s) by integrating large-scale genetic and drug screens from the Cancer Dependency Map project run by the Broad Institute. It further investigates whether the drug specifically targets the wild-type or mutated target forms. To show how to use this package in practice, we provide sample data along with a step-by-step example.
Provides functions for retrieving, exploring, and visualizing the Human Protein Atlas data. HPAanalyze is designed to fulfill 3 main tasks: (1) Importing, subsetting, and exporting downloadable datasets; (2) Visualization of downloadable datasets for exploratory analysis; and (3) Working with the individual XML files. This package aims to serve researchers with little programming experience, while also allowing power users to use the imported data as desired.
Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al. 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be interoperable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.
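As a rough illustration of the block bootstrap idea mentioned above, the following Python sketch resamples point features on a single linear genome by tiling it with randomly drawn blocks. It is not the nullranges API: the function name `block_bootstrap`, its parameters, and the simplification to one-dimensional point features are all assumptions made for this example.

```python
import random


def block_bootstrap(positions, genome_length, block_length, rng):
    """Resample feature positions by tiling the genome with randomly drawn blocks.

    Hypothetical helper for illustration only; real range data would also
    carry widths, strands, and chromosome boundaries.
    """
    resampled = []
    # Walk over the output genome one tile at a time.
    for offset in range(0, genome_length, block_length):
        width = min(block_length, genome_length - offset)
        # Draw a source block of the same width, uniformly at random.
        start = rng.randrange(0, genome_length - width + 1)
        # Copy every feature inside the source block into the current tile,
        # preserving its position relative to the block start.
        for p in positions:
            if start <= p < start + width:
                resampled.append(offset + (p - start))
    return sorted(resampled)


rng = random.Random(0)
sample = block_bootstrap([5, 6, 7, 50, 90], genome_length=100, block_length=10, rng=rng)
```

Because features are copied block by block, local clustering within a block is preserved while the large-scale arrangement is shuffled, which is the property that makes block resampling useful as a null model.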
This package provides a tool to search and download a collection of tumour microenvironment single-cell RNA sequencing datasets and their metadata. TMExplorer aims to act as a single point of entry for users looking to study the tumour microenvironment at the single-cell level. Users can quickly search available datasets using the metadata table and then download the ones they are interested in for analysis.
PaleoClim <http://www.paleoclim.org> (Brown et al. 2019, <doi:10.1038/sdata.2018.254>) is a set of free, high resolution paleoclimate surfaces covering the whole globe. It includes data on surface temperature, precipitation and the standard bioclimatic variables commonly used in ecological modelling, derived from the HadCM3 general circulation model and downscaled to a spatial resolution of up to 2.5 minutes. Simulations are available for key time periods from the Late Holocene to mid-Pliocene. Data on current and Last Glacial Maximum climate is derived from CHELSA (Karger et al. 2017, <doi:10.1038/sdata.2017.122>) and reprocessed by PaleoClim to match their format; it is available at up to 30 seconds resolution. This package provides a simple interface for downloading PaleoClim data in R, with support for caching and filtering retrieved data by period, resolution, and geographic extent.
The Open MPI Project is an MPI-3 implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. Open MPI offers advantages for system and software vendors, application developers and computer science researchers.
This package provides a shiny application to assess statistical assumptions and guide users toward appropriate tests. The app is designed for researchers with minimal statistical training and provides diagnostics, plots, and test recommendations for a wide range of analyses. Many statistical assumptions are implemented using the package rstatix (Kassambara, 2019) <doi:10.32614/CRAN.package.rstatix> and performance (Lüdecke et al., 2021) <doi:10.21105/joss.03139>.
This package contains functions for testing for significant differences between multiple coefficients of variation. Includes Feltz and Miller's (1996) <DOI:10.1002/(SICI)1097-0258(19960330)15:6%3C647::AID-SIM184%3E3.0.CO;2-P> asymptotic test and Krishnamoorthy and Lee's (2014) <DOI:10.1007/s00180-013-0445-2> modified signed-likelihood ratio test. See the vignette for more, including full details of citations.
This package provides functions to download, process, and visualize German geospatial data across administrative levels, including states, districts, and municipalities. Supports interactive tables and customized maps using built-in or external datasets. Official shapefiles are accessed from the German Federal Agency for Cartography and Geodesy (BKG) <https://gdz.bkg.bund.de/>, licensed under dl-de/by-2-0 <https://www.govdata.de/dl-de/by-2-0>.
This package provides four addons for analyzing trends and unit roots in financial time series: (i) functions for the density and probability of the augmented Dickey-Fuller Test, (ii) functions for the density and probability of MacKinnon's unit root test statistics, (iii) reimplementations for the ADF and MacKinnon Test, and (iv) an urca Unit Root Test Interface for Pfaff's unit root test suite.
This package contains a set of tools for constructing and coercing into and from the "mdate" class. This date class implements ISO 8601-2:2019(E) and allows regular dates to be annotated to express unspecified date components, approximate or uncertain date components, date ranges, and sets of dates. This is useful for describing and analysing temporal information, whether historical or recent, where date precision may vary.
This package provides functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". It implements and imports a large collection of methods for multiblock data analysis with common interfaces, result and plotting functions, several real data sets, and six vignettes covering a range of different applications.
Matching with string distance has never been easier! messy.cats contains various functions that employ string distance tools to make data management easier for users working with categorical data. Categorical data, especially user-inputted categorical data, which is often plagued by typos, can be difficult to work with. messy.cats aims to provide functions that make cleaning categorical data simple and easy.
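To make the idea of string-distance matching for categorical data concrete, here is a minimal Python sketch that maps typo-ridden labels onto a list of clean categories. It uses the standard library's `difflib` rather than anything from messy.cats; the function name `match_categories` and the `cutoff` default are assumptions chosen for this example.

```python
from difflib import get_close_matches


def match_categories(messy, clean, cutoff=0.6):
    """Map each messy label to its closest clean label, or None if no label is close enough.

    Hypothetical helper for illustration; similarity is difflib's ratio,
    compared case-insensitively after trimming whitespace.
    """
    lowered = {c.lower(): c for c in clean}
    result = {}
    for label in messy:
        hits = get_close_matches(label.strip().lower(), list(lowered), n=1, cutoff=cutoff)
        result[label] = lowered[hits[0]] if hits else None
    return result


matches = match_categories(["Toyta", "hodna", "xyz"], ["Toyota", "Honda", "Ford"])
```

Returning `None` for labels with no sufficiently close candidate keeps unmatched values visible instead of silently forcing them into the nearest category.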
Scale alignment is a new procedure for rescaling dimensions of between-items multidimensional Rasch family models so that dimension scores can be compared directly (Feuerstahler & Wilson, 2019; under review) <doi:10.1111/jedm.12209>. This package includes functions for implementing delta-dimensional alignment (DDA) and logistic regression alignment (LRA) for dichotomous or polytomous data. This package also includes a wrapper for models fit using the TAM package.
Transform complex statistical output into straightforward, understandable, and context-aware natural language descriptions using Large Language Models (LLMs), making complex analyses more accessible to individuals with varying statistical expertise. It relies on the ellmer package to interface with LLM providers including OpenAI <https://openai.com/>, Google AI Studio <https://aistudio.google.com/>, and Anthropic <https://www.anthropic.com/> (API keys are required and managed via ellmer).
Flexibly simulates a dataset with time-varying covariates with user-specified exchangeable correlation structures across and within clusters. Covariates can be normal or binary and can be static within a cluster or time-varying. Time-varying normal variables can optionally have linear trajectories within each cluster. See ?make_one_dataset for the main wrapper function. See Montez-Rath et al. <arXiv:1709.10074> for methodological details.
We provide a tidy grammar of population genetics, facilitating the manipulation and analysis of data on biallelic single nucleotide polymorphisms (SNPs). tidypopgen scales to very large genetic datasets by storing genotypes on disk and performing operations on them in chunks, without ever loading all data in memory. The full functionality of the package is described in Carter et al. (2025) <doi:10.1101/2025.06.06.658325>.
This package implements triple-difference (DDD) estimators for both average treatment effects and event-study parameters. Methods include regression adjustment, inverse-probability weighting, and doubly-robust estimators, all of which rely on a conditional DDD parallel-trends assumption and allow covariate adjustment across multiple pre- and post-treatment periods. The methodology is detailed in Ortiz-Villavicencio and Sant'Anna (2025) <doi:10.48550/arXiv.2505.09942>.
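For intuition, the simplest triple-difference estimate can be computed from eight cell means: it is the difference-in-differences among one subgroup minus the difference-in-differences among the other. The Python sketch below shows only this unconditional 2x2x2 case; it does not reproduce the covariate-adjusted, inverse-probability-weighted, or doubly-robust estimators described above. The names `did`, `ddd`, the `(treated, eligible, post)` key layout, and the numbers are all made up for illustration.

```python
def did(means, eligible):
    """Difference-in-differences within one eligibility subgroup.

    `means` maps (treated, eligible, post) -> average outcome in that cell.
    """
    treated_change = means[(1, eligible, 1)] - means[(1, eligible, 0)]
    control_change = means[(0, eligible, 1)] - means[(0, eligible, 0)]
    return treated_change - control_change


def ddd(means):
    """Triple difference: DiD among eligible units minus DiD among ineligible units."""
    return did(means, 1) - did(means, 0)


# Hypothetical cell means, keyed by (treated, eligible, post).
cell_means = {
    (1, 1, 1): 10.0, (1, 1, 0): 5.0,
    (0, 1, 1): 6.0,  (0, 1, 0): 4.0,
    (1, 0, 1): 7.0,  (1, 0, 0): 5.0,
    (0, 0, 1): 5.0,  (0, 0, 0): 4.0,
}
estimate = ddd(cell_means)  # (5 - 2) - (2 - 1) = 2.0
```

Subtracting the second difference-in-differences removes any shock that hits all units in the treated group regardless of eligibility, which a single difference-in-differences would wrongly attribute to the treatment.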
This package provides a set of functions for manipulating data frames in accordance with specific business rules. In addition, it includes wrapper functions for commonly used functions from the popular tidyverse package, making it easy to integrate these functions into data analysis workflows. The package is designed to streamline data preprocessing and help users quickly and efficiently perform data transformations that are specific to their business needs.
Grammatical evolution (see O'Neill, M. and Ryan, C. (2003, ISBN:1-4020-7444-1)) uses decoders to convert linear (binary or integer) genes into programs. In addition, automatic determination of codon precision with a limited rule choice bias is provided. For a recent survey of grammatical evolution, see Ryan, C., O'Neill, M., and Collins, J. J. (2018) <doi:10.1007/978-3-319-78717-6>.
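The decoding step can be sketched with the modulo mapping commonly described in the grammatical evolution literature: each integer codon selects a production for the leftmost nonterminal as `codon % number_of_rules`. The Python sketch below is an illustration under that assumption, not this package's interface; the toy grammar, the function name `decode`, and the `max_wraps` parameter are hypothetical, and the codon-precision determination mentioned above is not shown.

```python
# Toy grammar: each nonterminal maps to a list of alternative productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["y"]],
}


def decode(codons, grammar, start="<expr>", max_wraps=2):
    """Expand the leftmost nonterminal repeatedly, choosing rules by codon modulo.

    Returns the derived program as a string, or None if the codons
    (reused up to `max_wraps` extra times) run out before the derivation ends.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:
            output.append(symbol)  # terminal: emit it as-is
            continue
        if used >= limit:
            return None  # incomplete derivation
        rules = grammar[symbol]
        # Wrap around the genome when the codons are exhausted.
        choice = rules[codons[used % len(codons)] % len(rules)]
        used += 1
        symbols = list(choice) + symbols
    return " ".join(output)


program = decode([0, 1, 0, 1, 1, 1], GRAMMAR)  # "x * y"
```

Bounding the number of wraps matters because some genomes, such as one that always selects the recursive `<expr>` rule, would otherwise expand forever.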
The package provides ready-to-use epigenomes (obtained from TWGBS) and transcriptomes (RNA-seq) from various tissues, as obtained in the study by Delacher and Imbusch (2017, PMID: 28783152). Regulatory T cells (Treg cells) perform two distinct functions: they maintain self-tolerance, and they support organ homeostasis by differentiating into specialized tissue Treg cells. The underlying dataset characterises the epigenetic and transcriptomic modifications for specialized tissue Treg cells.
This package facilitates phyloseq-based exploration and analysis of taxonomic profiling data, providing tools for the manipulation, statistical analysis, and visualization of such data. In addition to targeted case-control studies, microbiome facilitates scalable exploration of population cohorts. This package supports the independent phyloseq data format and expands the available toolkit in order to facilitate the standardization of the analyses and the development of best practices.
Estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent response and covariates are mismatched and observed intermittently within subjects. Kernel weighted estimating equations are used for generalized linear models with either time-invariant or time-dependent coefficients. Cao, H., Li, J., and Fine, J. P. (2016) <doi:10.1214/16-EJS1141>. Cao, H., Zeng, D., and Fine, J. P. (2015) <doi:10.1111/rssb.12086>.