This collection of data exploration tools was developed at Yale University for the graphical exploration of complex multivariate data; barcode and gpairs now have their own packages. The big.read.table() function provided here may be useful for large files when only a subset is needed (but please see the note in the help page for this function).
Construct an explainable nomogram for a machine learning (ML) model to improve the availability of an ML prediction model in addition to a computer application, particularly in situations where a computer, a mobile phone, an internet connection, or access to the application is unreliable. This package enables nomogram creation for any ML prediction model, which is conventionally limited to linear/logistic regression models. The nomogram may indicate the explainability value per feature, e.g., the Shapley additive explanation value, for each individual. However, this package only supports nomogram creation for a model using categorical predictors, with or without a single numerical predictor. Detailed methodologies and examples are documented in our vignette, available at <https://htmlpreview.github.io/?https://github.com/herdiantrisufriyana/rmlnomogram/blob/master/doc/ml_nomogram_exemplar.html>.
Generate basic charts from custom applications, from a small script launched from the system console, or within the R console. Two ASCII text files are necessary: (1) The graph parameters file, whose name is passed to the function rplotengine(). In it the user can specify the titles, the type of graph, the output formats (e.g. png, eps), the proportion of the X-axis and Y-axis, the position of the legend, whether to show a background grid, etc. (2) The data to be plotted, whose file name is specified as a parameter ('data_filename') in the previous file. This data file has a tabulated format, with a single character (e.g. tab) between each column. Optionally, the file can include data columns for showing confidence intervals.
This package provides the ChaCha20 stream cipher (RFC 8439) implemented in pure Rust using traits from the RustCrypto `cipher` crate, with optional architecture-specific hardware acceleration (AVX2, SSE2). It additionally provides the ChaCha8, ChaCha12, XChaCha20, XChaCha12 and XChaCha8 stream ciphers, as well as optional rand_core-compatible RNGs based on those ciphers.
Finds, prioritizes and deletes erroneous taxa in a phylogenetic tree. This package calculates scores for taxa in a tree. A higher score means the taxon is more erroneous; a score of zero means the taxon is not erroneous. This package can also remove all erroneous taxa automatically by iterating score calculation and pruning the taxa with the highest score.
Calculate Bayesian marginal effects, average marginal effects, and marginal coefficients (also called population averaged coefficients) for models fit using the brms package including fixed effects, mixed effects, and location scale models. These are based on marginal predictions that integrate out random effects if necessary (see for example <doi:10.1186/s12874-015-0046-6> and <doi:10.1111/biom.12707>).
Plots a set of x,y,z co-ordinates in a contour map. Designed to be similar to plots in base R so additional elements can be added using lines(), points() etc. This package is intended to be better suited than existing packages to displaying circular plots such as those often seen in the semiconductor industry.
Statistical models fit to compositional data are often difficult to interpret due to the sum-to-1 constraint on data variables. DImodelsVis provides novel visualisation tools to aid with the interpretation of models fit to compositional data. All visualisations in the package are created using the ggplot2 plotting framework and can be extended like every other ggplot object.
This package performs exploratory data analysis and variable screening for binary classification models using weight-of-evidence (WOE) and information value (IV). In order to make the package as efficient as possible, aggregations are done in data.table and creation of WOE vectors can be distributed across multiple cores. The package also supports exploration for uplift models (NWOE and NIV).
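The weight-of-evidence and information-value idea mentioned above can be illustrated with a minimal Python sketch. It is not the package's own code: the function name `woe_iv` is hypothetical, the predictor is assumed to be already binned into categories, and the sign convention (log of event share over non-event share) is an assumption, since sources differ on it. IV itself does not depend on that sign.

```python
import math
from collections import Counter


def woe_iv(values, labels):
    """Return ({level: WOE}, IV) for a categorical predictor and 0/1 labels.

    Hypothetical helper for illustration only.
    """
    events = Counter(v for v, y in zip(values, labels) if y == 1)
    non_events = Counter(v for v, y in zip(values, labels) if y == 0)
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())

    woe = {}
    iv = 0.0
    for level in sorted(set(values)):
        # Share of all events / non-events that fall into this level.
        p_event = events[level] / total_events
        p_non_event = non_events[level] / total_non_events
        if p_event == 0 or p_non_event == 0:
            # WOE is undefined for empty cells; this sketch simply skips them.
            continue
        woe[level] = math.log(p_event / p_non_event)
        iv += (p_event - p_non_event) * woe[level]
    return woe, iv


values = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
woe, iv = woe_iv(values, labels)
print(woe, iv)
```

A predictor with a larger IV separates the two classes more strongly, which is why IV is commonly used for variable screening.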
Displays 2D matrix data in an interactive, zoomable gray-scale image viewer, providing tools for manual data inspection. The viewer window shows cursor guiding lines and the corresponding data slices for both axes at the current cursor position. A toolbar allows adjusting image display brightness/contrast through WebGL filters and performing basic high-pass/low-pass filtering.
The data analysis module for the Iterative Optimization Heuristics Profiler ('IOHprofiler'). This module provides statistical analysis methods for the benchmark data generated by optimization heuristics, which can be visualized through a web-based interface. The benchmark data is usually generated by the experimentation module, called 'IOHexperimenter'. IOHanalyzer also supports the widely used COCO (Comparing Continuous Optimisers) data format for benchmarking.
This package provides functions and S4 methods to create and manage discrete-time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous-time Markov chains depend on the suggested ctmcd package.
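As a rough illustration of what "fitting and drawing random variates" means for a discrete-time Markov chain, the following Python sketch estimates a transition matrix by maximum likelihood (row-normalized transition counts) and then samples a path from it. The function names `fit_transition_matrix` and `simulate` are hypothetical and do not correspond to the package's API.

```python
import random
from collections import defaultdict


def fit_transition_matrix(sequence):
    """Maximum-likelihood estimate: normalized counts of observed transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    states = sorted(set(sequence))
    matrix = {}
    for s in states:
        row_total = sum(counts[s].values())
        # A state with no observed outgoing transition gets an all-zero row.
        matrix[s] = {t: (counts[s][t] / row_total if row_total else 0.0)
                     for t in states}
    return matrix


def simulate(matrix, start, n_steps, seed=0):
    """Draw a path of up to n_steps transitions from the fitted chain."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        row = matrix[path[-1]]
        targets = list(row)
        weights = [row[t] for t in targets]
        if sum(weights) == 0:
            break  # no outgoing transitions were observed for this state
        path.append(rng.choices(targets, weights=weights)[0])
    return path


observed = ["a", "b", "a", "b", "a", "a"]
matrix = fit_transition_matrix(observed)
print(matrix)
print(simulate(matrix, "a", 5))
```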
It includes functions to download and process the Planet NICFI (Norway's International Climate and Forest Initiative) Satellite Imagery utilizing the Planet Mosaics API <https://developers.planet.com/docs/basemaps/reference/#tag/Basemaps-and-Mosaics>. GDAL (library for raster and vector geospatial data formats) and aria2c (parallel download utility) must be installed and configured in the user's operating system.
Computes profile extrema functions for arbitrary functions. If the function is expensive to evaluate, it computes profile extrema by emulating the function with a Gaussian process (using package 'DiceKriging'). In this case uncertainty quantification on the profile extrema can also be computed. The different plotting functions for profile extrema give the user a tool to better locate excursion sets.
Allows connecting selectizeInputs widgets as filters to a reactable table. As known from spreadsheet applications, column filters are interdependent, so each filter only shows the values that are actually available at the moment based on the current selection in other filters. Filter values that are currently not available (as well as those that are available) can be shown via popovers or tooltips.
Making specification curve analysis easy, fast, and pretty. It improves upon existing offerings with additional features and tidyverse integration. Users can easily visualize and evaluate how their models behave under different specifications with a high degree of customization. For a description and applications of specification curve analysis see Simonsohn, Simmons, and Nelson (2020) <doi:10.1038/s41562-020-0912-z>.
This package performs analysis of various genetic parameters such as genotypic and phenotypic coefficients of variation, heritability, genetic advance, and genetic advance as a percentage of the mean. The package also has functions for genotypic and phenotypic covariance, correlation and path analysis. A dataset is included to facilitate examples. For more information refer to Singh, R.K. and Chaudhary, B.D. (1977, ISBN:8176633070, 9788176633079).
Inference of ligand-receptor (LR) interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics. BulkSignalR bases its inferences on the LRdb database included in our other package, SingleCellSignalR, available from Bioconductor. It relies on a statistical model that is specific to bulk data sets. Different visualization and data summary functions are proposed to help navigate prediction results.
This package contains a number of comparative "phylogenetic" methods, mostly focusing on analysing diversification and character evolution. It contains implementations of "BiSSE" (Binary State Speciation and Extinction) and its unresolved tree extensions, "MuSSE" (Multiple State Speciation and Extinction), "QuaSSE", "GeoSSE", and "BiSSE-ness". Other methods include Markov models of discrete and continuous trait evolution and constant rate speciation and extinction.
The clusterCrit package provides an implementation of the following indices: Czekanowski-Dice, Fowlkes-Mallows, Hubert Γ, Jaccard, McNemar, Kulczynski, Phi, Rand, Rogers-Tanimoto, Russell-Rao and Sokal-Sneath. clusterCrit defines several functions which compute internal quality indices or external comparison indices. The partitions are specified as an integer vector giving the index of the cluster each observation belongs to.
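To make the partition format and the external comparison indices concrete, here is a small Python sketch that computes the Rand and Jaccard indices for two partitions given as integer vectors. It uses the standard pair-counting definitions; the function names are hypothetical and this is not the package's implementation.

```python
from itertools import combinations


def pair_counts(p1, p2):
    """Count observation pairs by whether each partition groups them together."""
    yy = yn = ny = nn = 0
    for i, j in combinations(range(len(p1)), 2):
        same1 = p1[i] == p1[j]
        same2 = p2[i] == p2[j]
        if same1 and same2:
            yy += 1  # together in both partitions
        elif same1:
            yn += 1  # together in p1 only
        elif same2:
            ny += 1  # together in p2 only
        else:
            nn += 1  # separated in both partitions
    return yy, yn, ny, nn


def rand_index(p1, p2):
    yy, yn, ny, nn = pair_counts(p1, p2)
    return (yy + nn) / (yy + yn + ny + nn)


def jaccard_index(p1, p2):
    yy, yn, ny, _ = pair_counts(p1, p2)
    denom = yy + yn + ny
    # Undefined when no pair is grouped together in either partition.
    return yy / denom if denom else float("nan")


p1 = [1, 1, 2, 2]
p2 = [1, 1, 1, 2]
print(rand_index(p1, p2))     # 0.5
print(jaccard_index(p1, p2))  # 0.25
```

The quadratic loop over pairs keeps the sketch readable; it is not meant for large data sets.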
This package contains data from an observational study concerning possible effects of light daily alcohol consumption on survival and on HDL cholesterol. It also replicates various simple analyses in Rosenbaum (2025a) <doi:10.1080/09332480.2025.2473291>. Finally, it includes new R code in wgtRankCef() that implements and replicates a new method for constructing evidence factors in observational block designs.
Given a patient-sharing network, calculate either the classic care density as proposed by Pollack et al. (2013) <doi:10.1007/s11606-012-2104-7> or the fragmented care density as proposed by Engels et al. (2024) <doi:10.1186/s12874-023-02106-0>. By utilizing the igraph and data.table packages, the provided functions scale well for very large graphs.
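The classic care density can be sketched in a few lines of Python. The sketch assumes the usual reading of Pollack et al. (2013): for each patient, the sum of the number of shared patients over all pairs of that patient's providers, divided by the number of such pairs. The function name and the input layout (an iterable of patient/provider pairs) are hypothetical, and the fragmented variant is not shown.

```python
from collections import defaultdict
from itertools import combinations


def care_density(visits):
    """visits: iterable of (patient, provider) pairs. Returns {patient: density}."""
    providers_of = defaultdict(set)
    for patient, provider in visits:
        providers_of[patient].add(provider)

    # Weight of a provider pair = number of patients the two providers share.
    shared = defaultdict(int)
    for provs in providers_of.values():
        for pair in combinations(sorted(provs), 2):
            shared[pair] += 1

    density = {}
    for patient, provs in providers_of.items():
        n = len(provs)
        if n < 2:
            density[patient] = None  # undefined with fewer than two providers
            continue
        total = sum(shared[pair] for pair in combinations(sorted(provs), 2))
        density[patient] = total / (n * (n - 1) / 2)
    return density


visits = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "A"), ("p2", "B"),
    ("p3", "A"), ("p3", "B"), ("p3", "C"),
    ("p4", "C"),
]
print(care_density(visits))
```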
The DoseFinding package provides functions for the design and analysis of dose-finding experiments (with focus on pharmaceutical Phase II clinical trials). It provides functions for: multiple contrast tests, fitting non-linear dose-response models (using Bayesian and non-Bayesian estimation), calculating optimal designs and an implementation of the MCPMod methodology (Pinheiro et al. (2014) <doi:10.1002/sim.6052>).
Duplicated data can exist in different rows and columns, and users may need to treat observations (rows) connected by duplicated data as one observation, e.g. companies can belong to one family (and thus be one company) by sharing some telephone numbers. This package finds connected rows based on data in chosen columns and collapses them into one row.
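One way to find rows connected by duplicated values is a union-find (disjoint-set) pass over the chosen columns, sketched below in Python. This is an illustrative technique, not necessarily how the package works; the function name, the dictionary-based row layout and the column names are all hypothetical. Rows that receive the same group id would then be collapsed into one.

```python
def connected_row_groups(rows, columns):
    """Return a group id per row; rows sharing any value in `columns` are linked.

    A value matches across columns too (e.g. tel1 of one row and tel2 of another).
    """
    parent = list(range(len(rows)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as root so the group id is its first row.
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # value -> index of the first row that contained it
    for idx, row in enumerate(rows):
        for col in columns:
            value = row.get(col)
            if value is None:
                continue  # missing values never connect rows
            if value in first_seen:
                union(idx, first_seen[value])
            else:
                first_seen[value] = idx
    return [find(i) for i in range(len(rows))]


companies = [
    {"name": "A", "tel1": "111", "tel2": "222"},
    {"name": "B", "tel1": "333", "tel2": "111"},
    {"name": "C", "tel1": "444", "tel2": None},
    {"name": "D", "tel1": "555", "tel2": "333"},
]
print(connected_row_groups(companies, ["tel1", "tel2"]))  # [0, 0, 2, 0]
```

Here D never shares a number with A directly, but both are linked through B, so all three fall into the same group.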