Iterate and repel visually similar colors away from each other in various ggplot2 plots. When many groups are plotted at the same time on multiple axes, for instance stacked bars or scatter plots, effectively ordering colors becomes difficult. This tool iterates through color combinations to find the best solution to maximize visual distinctness of nearby groups, so plots are more friendly toward colorblind users. This is achieved with two distance measurements: the distance between groups within the plot and the CIELAB color space distance between colors, as described in Carter et al. (2018) <doi:10.25039/TR.015.2018>.
This is a one-function package that passes only the unique values of its input to a computationally expensive function and returns an output of the same length as the input. In importing and working with tidy data, it is common to have index columns, often including time stamps, that are far from unique. Some functions used on such columns, such as text conversion to other variable types (e.g. as.POSIXct()), various grep()-based functions, and often the cut() function, are relatively slow when working with tens of millions of rows or more.
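The idea of computing on unique values and expanding back to the input length can be sketched as follows. This is a minimal illustration in Python, not the package's own code; the name `apply_to_unique` and the assumption that the expensive function accepts and returns a list are hypothetical.

```python
def apply_to_unique(values, vectorized_func):
    """Run vectorized_func on distinct values only, then expand to input length.

    Hypothetical helper: assumes values are hashable and that vectorized_func
    returns one result per element it is given.
    """
    # dict.fromkeys keeps the first occurrence of each value, in input order.
    unique_values = list(dict.fromkeys(values))
    results = vectorized_func(unique_values)
    if len(results) != len(unique_values):
        raise ValueError("function must return one result per input value")
    # Map each distinct value to its result, then look up every original element.
    lookup = dict(zip(unique_values, results))
    return [lookup[v] for v in values]


# Usage: the expensive function sees 2 values instead of 5.
seen = []

def slow_upper(items):
    seen.extend(items)  # record what the expensive function was asked to process
    return [s.upper() for s in items]

out = apply_to_unique(["a", "b", "a", "a", "b"], slow_upper)
```

The saving grows with the amount of repetition: for a time stamp column with millions of rows but only a few thousand distinct values, the expensive call handles only those few thousand.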
CIFTI files contain brain imaging data in "grayordinates," which represent the gray matter as cortical surface vertices (left and right) and subcortical voxels (cerebellum, basal ganglia, and other deep gray matter). ciftiTools provides a unified environment for reading, writing, visualizing and manipulating CIFTI-format data. It supports the "dscalar," "dlabel," and "dtseries" intents. Grayordinate data is read in as a "xifti" object, which is structured for convenient access to the data and metadata, and includes support for surface geometry files to enable spatially-dependent functionality such as static or interactive visualizations and smoothing.
This package provides a set of functions implementing several measures and methodologies for detecting outlying studies (i.e., studies with extreme findings) and influential studies in network meta-analysis: simple outlier and influence detection measures; outlier and influence detection measures based on study deletion (shift the mean); plots for outlier and influence detection measures; a Q-Q plot for network meta-analysis; the Forward Search algorithm in network meta-analysis; forward plots to monitor statistics in each step of the forward search algorithm; and forward plots for summary estimates and their confidence intervals in each step of the forward search algorithm.
Power and sample size calculations for a variety of study designs and outcomes. Methods include t tests, ANOVA (including tests for interactions, simple effects and contrasts), proportions, categorical data (chi-square tests and proportional odds), linear, logistic and Poisson regression, alternative and coprimary endpoints, power for confidence intervals, correlation coefficient tests, cluster randomized trials, individually randomized group treatment trials, multisite trials, treatment-by-covariate interaction effects and nonparametric tests of location. Utilities are provided for computing various effect sizes. Companion package to the book "Power and Sample Size in R", Crespi (2025, ISBN:9781138591622).
Calculates performance criteria measures and associated Monte Carlo standard errors for simulation results. Includes functions to help run simulation studies, following a general simulation workflow that closely aligns with the approach described by Morris, White, and Crowther (2019) <DOI:10.1002/sim.8086>. Also includes functions for calculating bootstrap confidence intervals (including normal, basic, studentized, percentile, bias-corrected, and bias-corrected-and-accelerated) with tidy output, as well as for extrapolating confidence interval coverage rates and hypothesis test rejection rates following techniques suggested by Boos and Zhang (2000) <DOI:10.1080/01621459.2000.10474226>.
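As an illustration of a performance measure paired with its Monte Carlo standard error, the sketch below computes the bias of an estimator across simulation replicates, using the standard formula in which the MCSE of the bias is the sample standard deviation of the estimates divided by the square root of the number of replicates. It is written in Python with a hypothetical function name and is not the package's interface.

```python
import math
import statistics


def bias_with_mcse(estimates, true_value):
    """Return (bias, Monte Carlo standard error of the bias).

    Hypothetical helper: estimates holds one point estimate per simulation
    replicate, and true_value is the known data-generating parameter.
    """
    n_sim = len(estimates)
    if n_sim < 2:
        raise ValueError("need at least two replicates to estimate the MCSE")
    bias = statistics.fmean(estimates) - true_value
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    mcse = statistics.stdev(estimates) / math.sqrt(n_sim)
    return bias, mcse


bias, mcse = bias_with_mcse([1.0, 2.0, 3.0], true_value=2.0)
```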
This package provides a graphics output device for R that records plots in a LaTeX-friendly format. The device transforms plotting commands issued by R functions into LaTeX code blocks. When included in a LaTeX document, these blocks are interpreted with the help of TikZ, a graphics package for TeX and friends written by Till Tantau. Using the tikzDevice, the text of R plots can contain LaTeX commands such as mathematical formulas. The device also allows arbitrary LaTeX code to be inserted into the output stream.
This package summarizes and annotates genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
In the framework of Symbolic Data Analysis, a relatively new approach to the statistical analysis of multi-valued data, we consider histogram-valued data, i.e., data described by univariate histograms. The methods and the basic statistics for histogram-valued data are mainly based on the L2 Wasserstein metric between distributions, i.e., the Euclidean metric between quantile functions. The package contains unsupervised classification techniques, least squares regression and tools for histogram-valued data and for histogram time series. An introductory paper is Irpino A. and Verde R. (2015) <doi:10.1007/s11634-014-0176-4>.
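To make "the Euclidean metric between quantile functions" concrete, the following Python sketch computes the L2 Wasserstein distance between two empirical distributions given as equal-sized samples, where the sorted samples play the role of the quantile functions. This is a simplifying assumption for illustration only: the package itself works with histograms, and the function name here is hypothetical.

```python
import math


def wasserstein_l2(sample_a, sample_b):
    """L2 Wasserstein distance between two equal-sized empirical samples.

    Sorting each sample gives its empirical quantile function evaluated on a
    common grid, so the distance is the root mean squared difference between
    matched order statistics.
    """
    if len(sample_a) != len(sample_b) or len(sample_a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    quantiles_a = sorted(sample_a)
    quantiles_b = sorted(sample_b)
    mean_squared_gap = sum(
        (a - b) ** 2 for a, b in zip(quantiles_a, quantiles_b)
    ) / len(quantiles_a)
    return math.sqrt(mean_squared_gap)


shifted = wasserstein_l2([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])    # pure shift by 1
same = wasserstein_l2([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])       # same distribution
```

A pure location shift by 1 gives a distance of 1, and two samples with the same values in a different order give 0, since only the distributions matter.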
Generate Mermaid syntax for a pedigree flowchart from a pedigree data frame. Mermaid is a textual syntax for creating reproducible plots, charts, diagrams, and flowcharts. The generated syntax can be embedded in a Markdown or R Markdown file, or viewed in Mermaid editors and renderers. Link shape, style, and orientation can be customized via function arguments, and node shapes and styles can be customized via optional columns in the pedigree data frame.
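A rough sketch of this kind of conversion, in Python rather than R, is shown below. The record layout (keys `id`, `sire`, `dam`) and the function name are assumptions made for the example and do not describe the package's actual interface; only the `flowchart` header and the `-->` link syntax come from Mermaid itself.

```python
def pedigree_to_mermaid(rows, orientation="TD"):
    """Build Mermaid flowchart text with one parent --> child link per known parent.

    Hypothetical helper: rows is a list of dicts with keys "id", "sire" and
    "dam"; a missing or None parent is treated as unknown and skipped.
    """
    lines = [f"flowchart {orientation}"]
    for row in rows:
        for parent_key in ("sire", "dam"):
            parent = row.get(parent_key)
            if parent:
                lines.append(f"    {parent} --> {row['id']}")
    return "\n".join(lines)


pedigree = [
    {"id": "A", "sire": None, "dam": None},
    {"id": "B", "sire": None, "dam": None},
    {"id": "C", "sire": "A", "dam": "B"},
]
mermaid_text = pedigree_to_mermaid(pedigree)
```

The resulting text can be pasted into a Mermaid code block in a Markdown file or into a Mermaid live editor.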
Fits a multinomial regression model with a Lasso penalty and performs statistical inference (calculating confidence intervals of coefficients and p-values for individual variables). It implements 1) the coordinate descent algorithm to fit an l1-penalized multinomial regression model (parameterized with a reference level); 2) the debiasing approach to obtain the inference results, which is described in "Tian, Y., Rusinek, H., Masurkar, A. V., & Feng, Y. (2024). L1-Penalized Multinomial Regression: Estimation, Inference, and Prediction, With an Application to Risk Factor Identification for Different Dementia Subtypes. Statistics in Medicine, 43(30), 5711-5747.".
A framework for fitting semiparametric regression estimators of the total parameter of a finite population when the variable of interest is asymmetrically distributed. The main references for this package are Sarndal C.E., Swensson B., and Wretman J. (2003, ISBN:978-0-387-40620-6) "Model Assisted Survey Sampling", Springer-Verlag; Cardozo C.A., Paula G.A. and Vanegas L.H. (2022) "Generalized log-gamma additive partial linear models with P-spline smoothing", Statistical Papers; and Cardozo C.A. and Alonso-Malaver C.E. (2022) "Semi-parametric model assisted estimation in finite populations", in preparation.
This package provides tools to estimate pollinator body size and co-varying traits. This package contains novel Bayesian predictive models of pollinator body size (for bees and hoverflies) as well as preexisting predictive models for pollinator body size (currently implemented for ants, bees, butterflies, flies, moths and wasps) as well as bee tongue length and foraging distance, total field nectar loads and wing loading. An additional GitHub repository <https://github.com/liamkendall/pollimetrydata> provides model objects to use the bodysize function internally. All models are described in Kendall et al. (2018) <doi:10.1101/397604>.
This package supports non-robust and robust computations of the sample autocovariance (ACOVF) and sample autocorrelation functions (ACF) of univariate and multivariate processes. The methodology consists in reversing the diagonalization procedure involving the periodogram or the cross-periodogram and the Fourier transform vectors, and, thus, obtaining the ACOVF or the ACF as discussed in Fuller (1995) <doi:10.1002/9780470316917>. The robust version is obtained by fitting robust M-regressors to obtain the M-periodogram or M-cross-periodogram as discussed in Reisen et al. (2017) <doi:10.1016/j.jspi.2017.02.008>.
Supports the analysis of oceanographic data recorded by Argo autonomous drifting profiling floats. Functions are provided to (a) download and cache data files, (b) subset data in various ways, (c) handle quality-control flags and (d) plot the results according to oceanographic conventions. A shiny app is provided for easy exploration of datasets. The package is designed to work well with the oce package, providing a wide range of processing capabilities that are particular to oceanographic analysis. See Kelley, Harbin, and Richards (2021) <doi:10.3389/fmars.2021.635922> for more on the scientific context and applications.
This package provides a set of functions to allow analysis of count data (such as faecal egg count data) using Bayesian MCMC methods. Returns information on the possible values for mean count, coefficient of variation and zero inflation (true prevalence) present in the data. A complete faecal egg count reduction test (FECRT) model is implemented, which returns inference on the true efficacy of the drug from the pre- and post-treatment data provided, using non-parametric bootstrapping as well as using Bayesian MCMC. Functions to perform power analyses for faecal egg counts (including FECRT) are also provided.
Fit growth models to otoliths and/or tagging data, using the RTMB package and maximum likelihood. The otoliths (or similar measurements of age) provide direct observed coordinates of age and length. The tagging data provide information about the observed length at release and length at recapture at a later time, where the age at release is unknown and estimated as a vector of parameters. The growth models provided by this package can be fitted to otoliths only, tagging data only, or a combination of the two. Growth variability can be modelled as constant or increasing with length.
It provides classifiers which can be used for discrete variables and for continuous variables based on the Naive Bayes and Fuzzy Naive Bayes hypothesis. Those methods were developed by researchers belonging to the Laboratory of Technologies for Virtual Teaching and Statistics (LabTEVE) and the Laboratory of Applied Statistics to Image Processing and Geoprocessing (LEAPIG) at the Federal University of Paraiba, Brazil. They considered some statistical distributions and their papers were published in the scientific literature, for instance the Gaussian classifier using fuzzy parameters proposed by Moraes, Ferreira and Machado (2021) <doi:10.1007/s40815-020-00936-4>.
Write beautiful yet customizable letters in R Markdown and directly obtain the finished PDF. Smooth generation of PDFs is realized by rmarkdown, the pandoc-letter template and the KOMA-Script letter class. KOMA-Script provides enhanced replacements for the standard LaTeX classes with emphasis on typography and versatility. KOMA-Script is particularly useful for international writers as it handles various paper formats well, provides layouts for many common window envelope types (e.g. German, US, French, Japanese) and lets you define your own layouts. The package comes with a default letter layout based on DIN 5008B.
This package provides a tidy workflow for landscape-scale analysis. multilandr offers tools to generate landscapes at multiple spatial scales and compute landscape metrics, primarily using the landscapemetrics package. It also features utility functions for plotting and analyzing multi-scale landscapes, exploring correlations between metrics, filtering landscapes based on specific conditions, generating landscape gradients for a given metric, and preparing datasets for further statistical analysis. Documentation about multilandr is provided in an introductory vignette included in this package and in the paper by Huais (2024) <doi:10.1007/s10980-024-01930-z>; see citation("multilandr") for details.
An exploratory and heuristic approach for specification search in Structural Equation Modeling. The basic idea is to subsample the original data and then search for optimal models on each subset. Optimality is defined through two objectives: model fit and parsimony. As these objectives are conflicting, we apply a multi-objective optimization method, specifically NSGA-II, to obtain optimal models for the whole range of model complexities. From these optimal models, we consider only the relevant model specifications (structures), i.e., those that are both stable (occur frequently) and parsimonious, and use those to infer a causal model.
Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression project (GTEx), Sequence Read Archive (SRA) and user-provided data. The tool interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included.
Helps visualize what is summarized in Pearson's correlation coefficient. That is, it visualizes its main constituent, namely the distances of the single values to their respective mean. The visualization thereby shows what the etymology of the word correlation contains: in pairwise combination, bringing back (see the package vignette for more details). I hope that the correlatio package may benefit some people in understanding and critically evaluating what Pearson's correlation coefficient summarizes in a single number, i.e., to what degree and why Pearson's correlation coefficient may (or may not) be warranted as a measure of association.
An implementation of the cross-validated difference in means (CVDM) test by Desmarais and Harden (2014) <doi:10.1007/s11135-013-9884-7> (see also Harden and Desmarais, 2011 <doi:10.1177/1532440011408929>) and the cross-validated median fit (CVMF) test by Desmarais and Harden (2012) <doi:10.1093/pan/mpr042>. These tests use leave-one-out cross-validated log-likelihoods to assist in selecting among model estimations. You can also utilize data from Golder (2010) <doi:10.1177/0010414009341714> and Joshi & Mason (2008) <doi:10.1177/0022343308096155> that are included to facilitate examples from real-world analysis.