The clusterCrit package provides an implementation of the following indices: Czekanowski-Dice, Fowlkes-Mallows, Hubert Γ, Jaccard, McNemar, Kulczynski, Phi, Rand, Rogers-Tanimoto, Russell-Rao, and Sokal-Sneath. clusterCrit defines several functions which compute internal quality indices or external comparison indices. Each partition is specified as an integer vector giving the index of the cluster each observation belongs to.
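As an illustration of the external comparison indices listed for clusterCrit, the sketch below computes the Rand and Jaccard indices from pair counts, with each partition given as an integer vector of cluster labels. It is a minimal Python sketch of the textbook definitions, not the package's own code; the function names `pair_counts`, `rand_index`, and `jaccard_index` are hypothetical.

```python
from itertools import combinations

def pair_counts(p1, p2):
    """Classify every pair of observations by co-membership in p1 and p2.

    Returns (yy, yn, ny, nn): pairs grouped together in both partitions,
    only in p1, only in p2, and in neither.
    """
    if len(p1) != len(p2):
        raise ValueError("partitions must have the same length")
    yy = yn = ny = nn = 0
    for i, j in combinations(range(len(p1)), 2):
        same1 = p1[i] == p1[j]
        same2 = p2[i] == p2[j]
        if same1 and same2:
            yy += 1
        elif same1:
            yn += 1
        elif same2:
            ny += 1
        else:
            nn += 1
    return yy, yn, ny, nn

def rand_index(p1, p2):
    """Share of pairs on which the two partitions agree."""
    yy, yn, ny, nn = pair_counts(p1, p2)
    return (yy + nn) / (yy + yn + ny + nn)

def jaccard_index(p1, p2):
    """Pairs joined in both partitions over pairs joined in at least one."""
    yy, yn, ny, _ = pair_counts(p1, p2)
    denom = yy + yn + ny
    # Undefined when no pair is ever grouped together (all singletons).
    return yy / denom if denom else float("nan")

print(rand_index([1, 1, 2, 2], [1, 1, 1, 2]))     # 0.5
print(jaccard_index([1, 1, 2, 2], [1, 1, 1, 2]))  # 0.25
```

The other pair-counting indices in the list are different ratios of the same four counts, so they could reuse `pair_counts` unchanged.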
Allows for painless use of the Metopio health atlas APIs <https://metopio.com/health-atlas> to explore and import data. Metopio health atlases store open public health data. See what topics (or indicators) are available among specific populations, periods, and geographic layers. Download relevant data along with geographic boundaries or point datasets. Spatial datasets are returned as sf objects.
Summarizes characteristics of linear mixed effects models without data or a fitted model by converting model-fitting code for lmer() from lme4 and lme() from nlme into tables, equations, and visuals. Outputs can be used to learn how to fit linear mixed effects models in R and to communicate about these models in presentations, manuscripts, and analysis plans.
Automatically estimate 11 effect size measures from a well-formatted dataset. Other functions help with related tasks, for example removing dependency between several effect sizes or identifying differences between two datasets. This package is mainly designed to assist in conducting a systematic review with a meta-analysis but can be useful to any researcher interested in estimating an effect size.
Standardized survey outcome rate functions, including the response rate, contact rate, cooperation rate, and refusal rate. These outcome rates allow survey researchers to measure the quality of survey data using definitions published by the American Association for Public Opinion Research (AAPOR). For details on these standards, see AAPOR (2016) <https://www.aapor.org/Standards-Ethics/Standard-Definitions-(1).aspx>.
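The four outcome rates named above can be sketched from final disposition counts. The Python example below assumes the most conservative variants as commonly stated from the AAPOR Standard Definitions (RR1, COOP1, REF1, CON1); the function name `aapor_rates` and its argument names are hypothetical, and the formulas should be checked against the published standard before use.

```python
def aapor_rates(complete, partial, refusal, noncontact, other,
                unknown_household, unknown_other):
    """Return RR1, COOP1, REF1 and CON1 from disposition counts.

    Assumed formulas (verify against the AAPOR document):
      RR1   = I / ((I+P) + (R+NC+O) + (UH+UO))
      COOP1 = I / ((I+P) + R + O)
      REF1  = R / ((I+P) + (R+NC+O) + (UH+UO))
      CON1  = ((I+P) + R + O) / ((I+P) + R + O + NC + (UH+UO))
    """
    interviews = complete + partial
    contacted = interviews + refusal + other
    eligible_or_unknown = (contacted + noncontact
                           + unknown_household + unknown_other)
    if contacted == 0 or eligible_or_unknown == 0:
        raise ValueError("not enough cases to compute outcome rates")
    return {
        "response": complete / eligible_or_unknown,
        "cooperation": complete / contacted,
        "refusal": refusal / eligible_or_unknown,
        "contact": contacted / eligible_or_unknown,
    }

rates = aapor_rates(complete=600, partial=50, refusal=150, noncontact=100,
                    other=20, unknown_household=50, unknown_other=30)
print(rates["response"], rates["contact"])  # 0.6 0.82
```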
Computes odds ratios and 95% confidence intervals from a generalized linear model object. It also computes model significance with the chi-squared statistic and p-value, and assesses model fit using a contingency table to determine the percent of observations for which the model correctly predicts the value of the outcome. It also calculates model sensitivity and specificity.
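To make the quantities in this entry concrete, the sketch below derives an odds ratio with a 95% interval from a single logistic-regression coefficient and its standard error, and computes percent correctly predicted, sensitivity, and specificity from a 2x2 contingency table. It is a Python sketch under assumptions: the interval is a Wald interval, which may differ from the method the package uses, and the function names are hypothetical.

```python
import math

def odds_ratio_ci(beta, se, z=1.959964):
    """Odds ratio exp(beta) with a Wald interval exp(beta +/- z * se).

    z defaults to the standard normal quantile for a 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def classification_summary(tp, fp, tn, fn):
    """Fit measures from a 2x2 table of predicted versus observed outcomes."""
    total = tp + fp + tn + fn
    return {
        "percent_correct": 100.0 * (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positives among observed positives
        "specificity": tn / (tn + fp),  # true negatives among observed negatives
    }

odds, lower, upper = odds_ratio_ci(beta=0.7, se=0.25)
print(round(odds, 3), round(lower, 3), round(upper, 3))
print(classification_summary(tp=40, fp=10, tn=45, fn=5))
```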
Simulate pedigree, genetic merits and phenotypes with random/non-random matings followed by random/non-random selection with different intensities and patterns in males and females. Genotypes can be simulated for a given pedigree, or for a pedigree appended to an existing pedigree with genotypes. Mrode, R. A. (2005) <ISBN:9780851989969, 0851989969>; Nilforooshan, M.A. (2022) <doi:10.37496/rbz5120210131>.
Nonparametric density estimation for (hyper)spherical data by means of a parametrically guided kernel estimator (Alonso-Pena et al. (2024) <doi:10.1111/sjos.12737>). The package also allows the data-driven selection of the smoothing parameter and the representation of the estimated density for circular and spherical data. Estimators of the density without guide can also be obtained.
This package provides a set of functions to efficiently recognize and clean the continuous dorsal pattern of a female brown anole lizard (Anolis sagrei) traced from ImageJ, an open platform for scientific image analysis (see <https://imagej.net> for more information), and extract common features such as the pattern sinuosity indices, coefficient of variation, and max-min width.
This is a compilation of my preferred themes and related theme elements for ggplot2. I believe these themes and theme elements are aesthetically pleasing, both for pedagogical instruction and for the presentation of applied statistical research to a wide audience. These themes imply routine use of easily obtained/free fonts, simple forms of which are included in this package.
This package provides tools for reporting and forecasting viral respiratory infections, using case surveillance data. It includes report generation tools for short-term forecasts and validation metrics for an arbitrary number of customizable respiratory viruses. Estimation of the effective reproduction number is based on the EpiEstim framework described in work by Cori and colleagues (2013) <doi:10.1093/aje/kwt133>.
This collection of data exploration tools was developed at Yale University for the graphical exploration of complex multivariate data; barcode and gpairs now have their own packages. The big.read.table() function provided here may be useful for large files when only a subset is needed (but please see the note in the help page for this function).
This package provides grid grobs that fill in a user-defined area with various patterns. It includes enhanced versions of the geometric and image-based patterns originally contained in the ggpattern package as well as original pch, polygon_tiling, regular_polygon, rose, text, wave, and weave patterns plus support for custom user-defined patterns.
This package provides routines for the analysis of indirectly measured haplotypes. The statistical methods assume that all subjects are unrelated and that haplotypes are ambiguous (due to unknown linkage phase of the genetic markers). The main functions are haplo.em(), haplo.glm(), haplo.score(), and haplo.power(), all of which have detailed examples in the vignette.
Inference of ligand-receptor (LR) interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics. BulkSignalR bases its inferences on the LRdb database included in our other package, SingleCellSignalR, available from Bioconductor. It relies on a statistical model that is specific to bulk data sets. Different visualization and data summary functions are provided to help navigate prediction results.
Finds, prioritizes and deletes erroneous taxa in a phylogenetic tree. This package calculates a score for each taxon in a tree. A higher score means the taxon is more erroneous, and a score of zero means the taxon is not erroneous. The package can also remove all erroneous taxa automatically by iterating score calculation and pruning the taxa with the highest score.
Calculate Bayesian marginal effects, average marginal effects, and marginal coefficients (also called population averaged coefficients) for models fit using the brms package including fixed effects, mixed effects, and location scale models. These are based on marginal predictions that integrate out random effects if necessary (see for example <doi:10.1186/s12874-015-0046-6> and <doi:10.1111/biom.12707>).
Plots a set of x,y,z co-ordinates in a contour map. Designed to be similar to plots in base R so additional elements can be added using lines(), points() etc. This package is intended to be better suited than existing packages to displaying circular plots such as those often seen in the semiconductor industry.
Data whitening is a widely used preprocessing step to remove correlation structure since statistical models often assume independence. Here we use a probabilistic model of the observed data to apply a whitening transformation. This Gaussian Inverse Wishart Empirical Bayes model substantially reduces computational complexity and regularizes the eigenvalues of the sample covariance matrix to improve out-of-sample performance.
Statistical models fit to compositional data are often difficult to interpret due to the sum-to-1 constraint on data variables. DImodelsVis provides novel visualisation tools to aid with the interpretation of models fit to compositional data. All visualisations in the package are created using the ggplot2 plotting framework and can be extended like every other ggplot object.
Display 2D matrix data as an interactive, zoomable gray-scale image viewer, providing tools for manual data inspection. The viewer window shows cursor guiding lines and the corresponding data slices for both axes at the current cursor position. A tool-bar allows adjusting image display brightness/contrast through WebGL filters and performing basic high-pass/low-pass filtering.
This package performs exploratory data analysis and variable screening for binary classification models using weight-of-evidence (WOE) and information value (IV). In order to make the package as efficient as possible, aggregations are done in data.table and creation of WOE vectors can be distributed across multiple cores. The package also supports exploration for uplift models (NWOE and NIV).
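For readers unfamiliar with the two screening statistics, the sketch below computes weight-of-evidence per bin and the information value of a binned variable against a binary outcome. It is a Python sketch under an assumed convention (WOE as the log of the event share over the non-event share; some references use the reciprocal, which flips the sign of WOE but leaves IV unchanged). The function name `woe_iv` is hypothetical and this is not the package's implementation.

```python
import math

def woe_iv(bins):
    """Compute WOE per bin and the total IV.

    bins: list of (events, non_events) counts, one tuple per bin.
    Assumed convention: WOE_i = ln(p_event_i / p_nonevent_i), where each p
    is the bin's share of all events or of all non-events, and
    IV = sum((p_event_i - p_nonevent_i) * WOE_i).
    """
    total_events = sum(e for e, _ in bins)
    total_non_events = sum(n for _, n in bins)
    woe = []
    iv = 0.0
    for events, non_events in bins:
        if events == 0 or non_events == 0:
            # The log is undefined for empty cells; real tools merge or smooth bins.
            raise ValueError("every bin needs at least one event and one non-event")
        p_event = events / total_events
        p_non_event = non_events / total_non_events
        w = math.log(p_event / p_non_event)
        woe.append(w)
        iv += (p_event - p_non_event) * w
    return woe, iv

woe, iv = woe_iv([(10, 40), (30, 20)])
print([round(w, 4) for w in woe], round(iv, 4))
```

Each term of the IV sum is non-negative, so a larger IV indicates a variable whose bins separate the two outcome classes more strongly.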
This package provides functions and S4 methods to create and manage discrete-time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous-time Markov chains depend on the suggested ctmcd package.
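The fitting step mentioned in this entry can be illustrated with a short sketch: a transition matrix is estimated from an observed state sequence by normalizing transition counts (the maximum-likelihood estimate), and a state distribution is then advanced one step. This is a plain Python sketch with hypothetical function names, not the package's S4 interface.

```python
from collections import Counter

def fit_transition_matrix(sequence):
    """Estimate transition probabilities from consecutive pairs of states.

    Returns a nested dict: matrix[from_state][to_state] = probability.
    A state with no observed outgoing transition gets an all-zero row.
    """
    states = sorted(set(sequence))
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0)
            for t in states
        }
    return matrix

def step(distribution, matrix):
    """Advance a distribution over states by one step of the chain."""
    return {
        t: sum(distribution[s] * matrix[s][t] for s in matrix)
        for t in matrix
    }

matrix = fit_transition_matrix(list("AABAB"))
print(matrix)
print(step({"A": 1.0, "B": 0.0}, matrix))
```

Applying `step` repeatedly gives the n-step distribution; for a well-behaved chain this converges to the stationary distribution.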
Includes functions to download and process the Planet NICFI (Norway's International Climate and Forest Initiative) Satellite Imagery utilizing the Planet Mosaics API <https://developers.planet.com/docs/basemaps/reference/#tag/Basemaps-and-Mosaics>. GDAL (library for raster and vector geospatial data formats) and aria2c (parallel download utility) must be installed and configured in the user's Operating System.