This package helps to construct standard dialog boxes for your GUI, including message boxes, input boxes, list, file or directory selection, and others. In case R cannot display GUI dialog boxes, a simpler command line version of these interactive elements is also provided as a fallback solution.
This package extends the grammar of graphics as implemented by ggplot2
to include the description of animation. It does this by providing a range of new grammar classes that can be added to the plot object in order to customise how it should change with time.
ReUseData
is an _R/Bioconductor_ software tool to provide a systematic and versatile approach for standardized and reproducible data management. ReUseData
facilitates transformation of shell or other ad hoc scripts for data preprocessing into workflow-based data recipes. Evaluation of data recipes generate curated data files in their generic formats (e.g., VCF, bed). Both recipes and data are cached using database infrastructure for easy data management and reuse. Prebuilt data recipes are available through ReUseData
portal ("https://rcwl.org/dataRecipes
/") with full annotation and user instructions. Pregenerated data are available through ReUseData
cloud bucket that is directly downloadable through "getCloudData()
".
Access data stored in REDCap databases using the Application Programming Interface (API). REDCap (Research Electronic Data CAPture; <https://projectredcap.org>, Harris, et al. (2009) <doi:10.1016/j.jbi.2008.08.010>, Harris, et al. (2019) <doi:10.1016/j.jbi.2019.103208>) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The API allows users to access data and project meta data (such as the data dictionary) from the web programmatically. The redcapAPI
package facilitates the process of accessing data with options to prepare an analysis-ready data set consistent with the definitions in a database's data dictionary.
(guix-science-nonfree packages bioinformatics)
CellAlign is a tool for quantitative comparison of expression dynamics within or between single-cell trajectories. The input to the CellAlign workflow is any trajectory vector that orders single cell expression with a pseudo-time spacing and the expression matrix for the cells used to define the trajectory.
Read and manipulate Camera Trap Data Packages ('Camtrap DP'). Camtrap DP (<https://camtrap-dp.tdwg.org>) is a data exchange format for camera trap data. With camtrapdp you can read, filter and transform data (including to Darwin Core) before further analysis in e.g. camtraptor or camtrapR
'.
Conditional Maximum Likelihood Calibration and data management of multistage tests. Supports polytomous items and incomplete designs with linear as well as multistage tests. Extended Nominal Response and Interaction models, DIF and profile analysis. See Robert J. Zwitser and Gunter Maris (2015)<doi:10.1007/s11336-013-9369-6>.
This package provides a common interface for applying dimensionality reduction methods, such as Principal Component Analysis ('PCA'), Independent Component Analysis ('ICA'), diffusion maps, Locally-Linear Embedding ('LLE'), t-distributed Stochastic Neighbor Embedding ('t-SNE'), and Uniform Manifold Approximation and Projection ('UMAP'). Has built-in support for sparse matrices.
Uses species occupancy at coarse grain sizes to predict species occupancy at fine grain sizes. Ten models are provided to fit and extrapolate the occupancy-area relationship, as well as methods for preparing atlas data for modelling. See Marsh et. al. (2018) <doi:10.18637/jss.v086.c03>.
This package provides several methods for model distillation and interpretability for general black box machine learning models and treatment effect estimation methods. For details on the algorithms implemented, see <https://forestry-labs.github.io/distillML/index.html>
Brian Cho, Theo F. Saarinen, Jasjeet S. Sekhon, Simon Walter.
Researchers carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated.
Stores small spatial datasets used to teach basic spatial analysis concepts. Datasets are based off of the GeoDa
software workbook and data site <https://geodacenter.github.io/data-and-lab/> developed by Luc Anselin and team at the University of Chicago. Datasets are stored as sf objects.
Given a high-dimensional dataset that typically represents a cytometry dataset, and a subset of the datapoints, this algorithm outputs an hyperrectangle so that datapoints within the hyperrectangle best correspond to the specified subset. In essence, this allows the conversion of clustering algorithms outputs to gating strategies outputs.
Compute several variations of the Implicit Association Test (IAT) scores, including the D scores (Greenwald, Nosek, Banaji, 2003, <doi:10.1037/0022-3514.85.2.197>) and the new scores that were developed using robust statistics (Richetin, Costantini, Perugini, and Schonbrodt, 2015, <doi:10.1371/journal.pone.0129601>).
Quick indexation of any type of vector or of any combination of those. Indexation turns a vector into an integer vector going from 1 to the number of unique elements. Indexes are important building blocks for many algorithms. The method is described at <https://github.com/lrberge/indexthis/>.
Clustering or classification of longitudinal data based on a mixture of multivariate t or Gaussian distributions with a Cholesky-decomposed covariance structure. Details in McNicholas
and Murphy (2010) <doi:10.1002/cjs.10047> and McNicholas
and Subedi (2012) <doi:10.1016/j.jspi.2011.11.026>.
This package provides a variety of association tests for microbiome data analysis including Quasi-Conditional Association Tests (QCAT) described in Tang Z.-Z. et al.(2017) <doi:10.1093/bioinformatics/btw804> and Zero-Inflated Generalized Dirichlet Multinomial (ZIGDM) tests described in Tang Z.-Z. & Chen G. (2017, submitted).
This package provides a novel framework to estimate mixed models via gradient boosting. The implemented functions are based on mboost and lme4'. Hence, the family range is predetermined by lme4'. A correction mechanism for cluster-constant covariates is implemented as well as an estimation of random effects covariance.
Enables user to perform the following: 1. Roll n number of die/dice (roll()
). 2. Toss n number of coin(s) (toss()
). 3. Play the game of Rock, Paper, Scissors. 4. Choose n number of card(s) from a pack of 52 playing cards (Joker optional).
Estimation methods for optimal treatment regimes under three different criteria, namely marginal quantile, marginal mean, and mean absolute difference. For the first two criteria, both one-stage and two-stage estimation method are implemented. A doubly robust estimator for estimating the quantile-optimal treatment regime is also included.
This package implements named semaphores from the boost C++ library <https://www.boost.org/> for interprocess communication. Multiple R sessions on the same host can block (with optional timeout) on a semaphore until it becomes positive, then atomically decrement it and unblock. Any session can increment the semaphore.
Simulate age-structured populations that vary in space and time and explore the efficacy of a range of built-in or user-defined sampling protocols to reproduce the population parameters of the known population. (See Regular et al. (2020) <doi:10.1371/journal.pone.0232822> for more details).
Delta Method implementation to estimate standard errors with known asymptotic properties within the tidyverse workflow. The Delta Method is a statistical tool that approximates an estimatorâ s behaviour using a Taylor Expansion. For a comprehensive explanation, please refer to Chapter 3 of van der Vaart (1998, ISBN: 9780511802256).
Generate SuperSigs
(supervised mutational signatures) from single nucleotide variants in the cancer genome. Functions included in the package allow the user to learn supervised mutational signatures from their data and apply them to new data. The methodology is based on the one described in Afsari (2021, ELife).