Ref::Util introduces several functions to help identify references in a smarter (and usually faster) way. It differs from the conventional approach as follows:
No comparison against a string constant
Supports blessed variables
Supports tied variables and magic
Ignores overloading
Ignores subtle types
Usually faster
The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet objects and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).
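The pairwise search described above can be pictured with a minimal Python sketch. This is not the package's R interface: the function names `pearson` and `find_candidate_duplicates`, the toy data, and the fixed correlation threshold are all assumptions made for illustration, and only the expression-based part of the search is shown.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def find_candidate_duplicates(datasets, threshold=0.99):
    """Compare every sample of every dataset pair; flag highly correlated ones.

    `threshold` is a hypothetical fixed cutoff chosen for this sketch.
    """
    hits = []
    for (name_a, samples_a), (name_b, samples_b) in combinations(datasets.items(), 2):
        for id_a, expr_a in samples_a.items():
            for id_b, expr_b in samples_b.items():
                r = pearson(expr_a, expr_b)
                if r >= threshold:
                    hits.append((name_a, id_a, name_b, id_b, r))
    return hits


# Hypothetical toy data: dataset name -> {sample id -> expression vector}.
datasets = {
    "study1": {"s1": [1.0, 5.0, 3.0, 8.0], "s2": [2.0, 1.0, 7.0, 4.0]},
    "study2": {"t1": [1.1, 5.0, 2.9, 8.2], "t2": [9.0, 2.0, 4.0, 1.0]},
}
hits = find_candidate_duplicates(datasets)
```

Here only the pair (s1, t1) is flagged, because the two vectors are nearly identical. A real search would also need to compare phenotype data and identifiers, which this sketch leaves out.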
It computes betas-select, coefficients after standardization in structural equation models and regression models, standardizing only selected variables. Supports models with moderation, with product terms formed after standardization. It also offers confidence intervals that account for standardization, including bootstrap confidence intervals as proposed by Cheung et al. (2022) <doi:10.1037/hea0001188>.
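To make the idea of selective standardization concrete, here is a small Python sketch (not the package's R API). It shows that a product term formed after standardizing its components differs from the standardized raw product, and that a variable can be left in its raw metric. The helper `zscore` and the toy data are hypothetical.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a list of numbers to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical toy data: predictor x and moderator w.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [2.0, 1.0, 4.0, 3.0, 6.0]

# Product term formed AFTER standardization: z(x) * z(w).
zx, zw = zscore(x), zscore(w)
product_after = [a * b for a, b in zip(zx, zw)]

# Standardizing the raw product instead: z(x * w).
product_before = zscore([a * b for a, b in zip(x, w)])

# Selective standardization: x is standardized, w keeps its raw metric.
selected = {"x": zx, "w": w}
```

The two product terms are different variables: `product_before` has mean zero by construction, while `product_after` generally does not when x and w are correlated, so the coefficient of the product term depends on which one enters the model.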
Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses a stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.
This package provides a toolbox to handle and represent trophic networks in space or time across aggregation levels. This package contains a layout algorithm specifically designed for trophic networks, using dimension reduction on a diffusion graph kernel and trophic levels. Importantly, this package provides a layout method applicable for large trophic networks.
This package provides access to well-documented medical datasets for teaching. It features several datasets from the Teaching of Statistics in the Health Sciences website <https://www.causeweb.org/tshs/category/dataset/>, a few reconstructed datasets of historical significance in medical research, some reformatted and extended from existing R packages, and some data donations.
This package provides a collection of functions to search and download street view imagery (Mapillary <https://www.mapillary.com/developer/api-documentation>) and to extract, quantify, and visualize visual features. It also provides functions to generate a Qualtrics survey in TXT format from the collection of street views for various research purposes.
This package provides a framework for the creation and use of Neural ordinary differential equations with the tensorflow and keras packages. The idea of Neural ordinary differential equations comes from Chen et al. (2018) <doi:10.48550/arXiv.1806.07366>, and presents a novel way of learning and solving differential systems.
Lite interface for finding locations of addresses or businesses around the world using the ArcGIS REST API service <https://developers.arcgis.com/rest/geocode/api-reference/overview-world-geocoding-service.htm>. Address text can be converted to location candidates and a location can be converted into an address. No API key required.
Package for the access and distribution of long-term lake datasets from lakes in the Adirondack Park, northern New York state. Includes a wide variety of physical, chemical, and biological parameters from 28 lakes. Data are from multiple collection organizations and have been harmonized in both time and space for ease of reuse.
This package implements a bootstrap aggregated (bagged) version of the k-nearest neighbors survival probability prediction method (Lowsky et al. 2013). In addition to the bootstrapping of training samples, the features can be subsampled in each base learner to break the correlation between them. The Rcpp package is used to speed up the computation.
This package implements the distribution-free goodness-of-fit regression test for the mean structure of parametric models introduced in Khmaladze (2021) <doi:10.1007/s10463-021-00786-3>. The test is implemented for general functions with minimal distributional assumptions as well as common models (e.g., lm, glm) with the usual assumptions.
Hospital machine learning and AI data analysis workflow tools, modeling, and automations. This library provides many useful tools to review common administrative hospital data. Some of these include predicting length of stay and readmits. The aim is to provide a simple and consistent verb framework that takes the guesswork out of everything.
This package provides classes and methods for seismic data analysis. The base classes and methods are inspired by the Python code found in the ObsPy Python toolbox <https://github.com/obspy/obspy>. Additional classes and methods support data returned by web services provided by the EarthScope Consortium <https://service.earthscope.org/>.
Mine metrics on common places on the web through the power of their APIs (application programming interfaces). It also helps put the data in a format that is easily used for a dashboard or other purposes. An associated dashboard template and tutorials, currently under development, help you fully utilize metricminer.
The turning point method was proposed by Choi (1990) <doi:10.2307/2531453> to estimate the 50 percent effective dose (ED50) in studies of drug sensitivity. Its advantage is that it can provide robust ED50 estimation. This package contains a modified function for Choi's turning point method.
Native R tools for optimal binning workflows in predictive modeling. The package provides APIs for binary, multi-class and continuous targets, with multi-variable binning and scorecard workflows. Methods are informed by Navas-Palencia (2020) <doi:10.48550/arXiv.2001.08025> and Navas-Palencia (2021) <doi:10.48550/arXiv.2104.08619>.
Allows search and visualisation of a collection of uniformly processed skeletal transcriptomic datasets. Includes methods to identify datasets where genes of interest are differentially expressed and to find datasets with a similar gene expression pattern to a query dataset. See Soul J, Hardingham TE, Boot-Handford RP, Schwartz JM (2019) <doi:10.1093/bioinformatics/bty947>.
This package is a feature selection package of the mlr3 ecosystem. It selects the optimal feature set for any mlr3 learner. The package works with several optimization algorithms, e.g. random search, recursive feature elimination, and genetic search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling.
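As a rough illustration of feature selection by random search, the following Python sketch draws random feature subsets and keeps the best-scoring one. It is not the mlr3 API: `random_search`, `toy_score`, and the feature names are hypothetical, and the scoring function stands in for what would in practice be a resampled performance estimate of a learner.

```python
import random


def random_search(features, score, n_evals=50, seed=0):
    """Evaluate `n_evals` random non-empty feature subsets; return the best."""
    rng = random.Random(seed)
    best_subset, best_score = None, float("-inf")
    for _ in range(n_evals):
        # Draw a subset size, then sample that many distinct features.
        k = rng.randint(1, len(features))
        subset = rng.sample(features, k)
        s = score(subset)
        if s > best_score:
            best_subset, best_score = subset, s
    return best_subset, best_score


# Hypothetical scoring rule: reward two "useful" features, penalize subset size.
USEFUL = {"age", "dose"}


def toy_score(subset):
    return len(USEFUL & set(subset)) - 0.1 * len(subset)


features = ["age", "dose", "noise1", "noise2"]
best_subset, best_score = random_search(features, toy_score)
```

Random search makes no use of earlier evaluations, which keeps it simple; recursive feature elimination and genetic search instead let earlier results guide which subsets are tried next.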
This package contains functions to implement the methodology and considerations laid out by Marks et al. in the article "Measuring abnormality in high dimensional spaces: applications in biomechanical gait analysis". Using high-dimensional datasets to measure a subject's overall level of abnormality as compared to a reference population is often needed in outcomes research.
This package is a comprehensive visualization tool specifically designed for exploring phylomorphospace. It not only simplifies the process of generating a phylomorphospace, but also enhances it with the ability to add graphic layers to the plot using the grammar of graphics, producing fully annotated phylomorphospaces. It also provides some utilities to help interpret evolutionary patterns.
immunoClust is a model-based clustering approach for Flow Cytometry samples. The cell-events of single Flow Cytometry samples are modelled by a mixture of multivariate normal- or t-distributions. The cell-event clusters of several samples are modelled by a mixture of multivariate normal-distributions, aiming for stable co-clusters across these samples.
scoreInvHap can get the samples' inversion status for known inversions. scoreInvHap uses SNP data as input and requires the following information about the inversion: genotype frequencies in the different haplotypes, R2 between the region SNPs and inversion status, and heterozygote genotypes in the reference. The package includes these data for 21 inversions.
signifinder is an R package for computing and exploring a compendium of tumor signatures. It computes a variety of signatures from the public literature, based on gene expression values, and returns single-sample (-cell/-spot) scores. Currently, signifinder collects more than 70 distinct signatures, relating to multiple tumors and multiple cancer processes.