Adds a trendline and confidence interval for a linear or nonlinear regression model to a ggplot, and shows the fitted equation, as simply as possible. For a general overview of the methods used in this package, see Ritz and Streibig (2008) <doi:10.1007/978-0-387-09616-2> and Greenwell and Schubert Kabban (2014) <doi:10.32614/RJ-2014-009>.
Sports Injury Data analysis aims to identify and describe the magnitude of the injury problem, and to gain more insights (e.g. determine potential risk factors) by statistical modelling approaches. The injurytools package provides standardized routines and utilities that simplify such analyses. It offers functions for data preparation, informative visualizations and descriptive and model-based analyses.
This package performs a variety of viral quasispecies diversity analyses [see Pamornchainavakul et al. (2024) <doi:10.21203/rs.3.rs-4637890/v1>] based on long-read sequence alignment. Main functions include 1) minimization of sequencing error and other noise, and read sampling, 2) comparison of single nucleotide variant (SNV) profiles, and 3) comparison and visualization of viral quasispecies profiles.
This package implements methods to automate the Auer-Gervini graphical Bayesian approach for determining the number of significant principal components. Automation uses clustering, change points, or simple statistical models to distinguish "long" from "short" steps in a graph showing the posterior number of components as a function of a prior parameter. See <doi:10.1101/237883>.
An R package providing extended biological annotations for the SomaScan Assay, a proteomics platform developed by SomaLogic Operating Co., Inc. The annotations in this package were assembled using data from public repositories. For more information about the SomaScan assay and its data, please reference the SomaLogic/SomaLogic-Data GitHub repository.
This package provides tools for analyzing R expressions or blocks of code and determining the dependencies between them. It focuses on R scripts, but can be used on the bodies of functions. Its facilities include summarizing code or getting a high-level view of it, determining dependencies between variables, and suggesting code improvements.
This package creates dummy columns from columns that have categorical variables (character or factor types). You can also specify which columns to make dummies out of, or which columns to ignore. It also creates dummy rows from character, factor, and Date columns. This package provides a significant speed increase over creating dummy variables through model.matrix().
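The dummy-column idea described above can be illustrated with a small language-neutral sketch. It is written in Python rather than R and does not reproduce the package's API: the function name make_dummy_columns and its columns and ignore parameters are hypothetical, chosen only to mirror the "select" and "ignore" options mentioned in the description.

```python
def make_dummy_columns(rows, columns=None, ignore=()):
    """One-hot encode categorical (string-valued) columns of a list of dict rows.

    Hypothetical helper for illustration only; not the package's interface.
    """
    if not rows:
        return []
    # By default, treat every column whose values are all strings as categorical.
    if columns is None:
        columns = [c for c in rows[0] if all(isinstance(r[c], str) for r in rows)]
    # Drop any columns the caller asked to ignore.
    columns = [c for c in columns if c not in ignore]
    # Collect the distinct levels of each selected column, in sorted order.
    levels = {c: sorted({r[c] for r in rows}) for c in columns}
    encoded_rows = []
    for r in rows:
        new_row = dict(r)  # keep the original columns
        for c in columns:
            for level in levels[c]:
                new_row[f"{c}_{level}"] = 1 if r[c] == level else 0
        encoded_rows.append(new_row)
    return encoded_rows


data = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]
encoded = make_dummy_columns(data)
```

Each level of the categorical column becomes its own 0/1 column, while non-categorical columns such as id are left untouched.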
The IPC::Run3 module allows you to run a subprocess and redirect stdin, stdout, and/or stderr to files and Perl data structures. It aims to satisfy 99% of the need for using system, qx, and open3 with a simple, extremely Perlish API and none of the bloat and rarely used features of IPC::Run.
In many analyses, a large number of variables have to be tested independently against the trait/endpoint of interest, and also adjusted for covariates and confounding factors at the same time. The major bottleneck is the time it takes to complete these analyses. With RegParallel, a large number of tests can be performed simultaneously. On a 12-core system, 144 variables can be tested simultaneously, with 1000s of variables processed in a matter of seconds via nested parallel processing. Works for logistic regression, linear regression, conditional logistic regression, Cox proportional hazards and survival models, and Bayesian logistic regression. Also caters for generalised linear models that utilise survey weights created by the survey CRAN package and that utilise survey::svyglm.
Simulating bivariate survival data from copula models. Estimation of the association parameter in copula models. Two different ways to estimate the association parameter in copula models are implemented. A goodness-of-fit test for a given copula model is implemented. See Emura, Lin and Wang (2010) <doi:10.1016/j.csda.2010.03.013> for details.
Counts colors within color range(s) in images, and provides a masked version of the image with targeted pixels changed to a different color. Output includes the locations of the pixels in the images, and the proportion of the image within the target color range with optional background masking. Users can specify multiple color ranges for masking.
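A minimal sketch of the counting-and-masking idea described above, assuming an image represented as a nested list of (R, G, B) tuples and a single rectangular color range given by per-channel lower and upper bounds. The function name count_color_range, its parameters, and the default mask color are hypothetical and do not correspond to the package's own interface.

```python
def count_color_range(image, lower, upper, mask_color=(255, 0, 255)):
    """Find pixels whose channels all lie within [lower, upper].

    Returns the (row, column) locations of matching pixels, the proportion
    of the image they cover, and a copy of the image with those pixels
    replaced by mask_color. Illustrative helper only.
    """
    locations = []
    masked = []
    for y, row in enumerate(image):
        new_row = []
        for x, pixel in enumerate(row):
            # A pixel is targeted when every channel is inside its bounds.
            inside = all(lo <= ch <= hi for ch, lo, hi in zip(pixel, lower, upper))
            if inside:
                locations.append((y, x))
            new_row.append(mask_color if inside else pixel)
        masked.append(new_row)
    total = sum(len(row) for row in image)
    proportion = len(locations) / total if total else 0.0
    return locations, proportion, masked


image = [
    [(0, 200, 0), (255, 255, 255)],
    [(10, 10, 10), (20, 180, 30)],
]
# Target "greenish" pixels: low red, high green, low blue.
locations, proportion, masked = count_color_range(image, (0, 150, 0), (50, 255, 50))
```

Handling several color ranges would amount to repeating the check for each range and combining the results.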
Quick and easy access to datasets that let you replicate the empirical examples in Cameron and Trivedi (2005) "Microeconometrics: Methods and Applications" (ISBN: 9780521848053). The data are available as data frames as soon as you install and load the package (lazy-loading). The documentation includes references to the chapter sections and page numbers where the datasets are used.
Set of functions for step-wise generation of (weighted) graphs. Aimed at research in the field of single- and multi-objective combinatorial optimization. Graphs are generated by adding nodes, edges and weights. Each step may be repeated multiple times with different predefined and custom generators, resulting in high flexibility regarding the graph topology and structure of edge weights.
This package provides a sparklyr <https://spark.rstudio.com/> extension that offers an R interface for GraphFrames <https://graphframes.github.io/>. GraphFrames is a package for Apache Spark that provides a DataFrame-based API for working with graphs. Functionality includes motif finding and common graph algorithms, such as PageRank and breadth-first search.
This package provides a bridge between the loon and ggplot2 packages. Extends the grammar of ggplot to add clauses to create interactive loon plots. Existing ggplot(s) can be turned into interactive loon plots and loon plots into static ggplot(s); the function loon.ggplot() is the bridge from one plot structure to the other.
This package contains functions intended to facilitate the production of plant taxonomic monographs. The package includes functions to convert tables into taxonomic descriptions, lists of collectors, examined specimens, identification keys (dichotomous and interactive), and can generate a monograph skeleton. Additionally, wrapper functions to batch the production of phenology histograms and distributional and diversity maps are also available.
Performs variable selection in a multivariate linear model by estimating the covariance matrix of the residuals, using it to remove the dependence that may exist among the responses, and then performing variable selection with the Lasso criterion. The method is described in Perrot-Dockès et al. (2017) <arXiv:1704.00076>.
R Client for the Microsoft Cognitive Services Text Analytics REST API, including Sentiment Analysis, Topic Detection, Language Detection, and Key Phrase Extraction. An account MUST be registered at the Microsoft Cognitive Services website <https://www.microsoft.com/cognitive-services/> in order to obtain a (free) API key. Without an API key, this package will not work properly.
This package provides a collection of oversampling techniques developed from SMOTE. SMOTE is an oversampling technique that synthesizes a new minority instance between a minority instance and one of its K nearest neighbors. Other techniques adopt this concept with other criteria in order to generate a balanced dataset for the class imbalance problem.
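The basic SMOTE step described above (interpolating between a minority instance and one of its K nearest minority neighbors) can be sketched as follows. This is a simplified Python illustration, not the package's implementation: the function name smote, its parameters, and the use of Euclidean distance are assumptions, and it expects at least two minority points.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbors.
    Illustrative sketch only.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Candidate neighbors: every other minority point, nearest first.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_new=5)
```

Because every synthetic point is a convex combination of two existing minority points, it always falls within the region spanned by the minority class.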
Balancing computational and statistical efficiency, subsampling techniques offer a practical solution for handling large-scale data analysis. Subsampling methods enhance statistical modeling for massive datasets by efficiently drawing representative subsamples from the full dataset based on tailored sampling probabilities. These probabilities are optimized for specific goals, such as minimizing the variance of coefficient estimates or reducing prediction error.
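As a rough illustration of subsampling with tailored probabilities, the sketch below draws row indices with replacement according to supplied sampling probabilities and returns the inverse-probability weights commonly attached to such draws in a weighted fit. How the probabilities are optimized is not shown; the function name weighted_subsample and its parameters are hypothetical and are not taken from the package.

```python
import random


def weighted_subsample(probs, r, seed=0):
    """Draw r row indices with replacement, proportional to probs.

    Returns the sampled indices and the inverse-probability weight
    1 / (r * p_i) for each draw, where p_i is the normalized sampling
    probability of row i. Illustrative sketch only.
    """
    rng = random.Random(seed)
    total = sum(probs)
    indices = rng.choices(range(len(probs)), weights=probs, k=r)
    weights = [1.0 / (r * probs[i] / total) for i in indices]
    return indices, weights


# Uniform probabilities over 4 rows: each draw gets weight n / r = 2.
idx, w = weighted_subsample([0.25, 0.25, 0.25, 0.25], r=2)
```

Rows with larger sampling probabilities are drawn more often but receive smaller weights, which keeps weighted estimates from being dominated by the oversampled rows.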
This package is for designing CRISPR/Cas9 and Prime Editing experiments. It contains functions to (1) define and transform genomic targets, (2) find spacers, (4) count off-target (mis)matches, and (5) compute Doench2016/2014 targeting efficiency. Care has been taken for multicrispr to scale well towards large target sets, enabling the design of large CRISPR/Cas9 libraries.
When exploring data or models we often examine variables one by one. This analysis is incomplete if the relationship between these variables is not taken into account. The corrgrapher package facilitates simultaneous exploration of the Partial Dependence Profiles and the correlation between variables in the model. The package corrgrapher is a part of the DrWhy.AI universe.
This package provides tools to process and analyze chest expansion using 3D marker data from motion capture systems. Includes functions for data processing, marker position adjustment, volume calculation using convex hulls, and visualization in 2D and 3D. See Barber et al. (1996) <doi:10.1145/235815.235821> and TAMIYA Hiroyuki et al. (2021) <doi:10.1038/s41598-021-01033-8>.
Utility functions to be used to analyse datasets obtained from seed germination/emergence assays. Fits several types of seed germination/emergence models, including those reported in Onofri et al. (2018) "Hydrothermal-time-to-event models for seed germination", European Journal of Agronomy, 101, 129-139 <doi:10.1016/j.eja.2018.08.011>. Contains several datasets for practicing.