The standard index of DNA methylation (beta) is computed from methylated and unmethylated signal intensities. Betas calculated from raw signal intensities perform well, but using 11 methylomic datasets we demonstrate that quantile normalization methods produce a marked improvement. The commonly used procedure of normalizing betas is inferior to the separate normalization of M and U, and it is also advantageous to normalize Type I and Type II assays separately. This package provides 15 flavours of betas and three performance metrics, with methods for objects produced by the methylumi and minfi packages.
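As a rough illustration of the idea of normalizing M and U separately before computing beta, here is a minimal Python sketch. It is not one of the package's 15 flavours: the function names are hypothetical, ties are broken by position, and the offset of 100 in the beta formula is the conventional stabilizing constant, assumed here rather than taken from the text above.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one column per sample)."""
    n = len(columns[0])
    # For each sample, the indices of its values from smallest to largest.
    orders = [sorted(range(n), key=col.__getitem__) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[order[k]] for col, order in zip(columns, orders)) / len(columns)
        for k in range(n)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # value at this rank gets the reference value
        normalized.append(out)
    return normalized


def beta(m, u, offset=100.0):
    """Beta value from methylated (m) and unmethylated (u) intensities.

    The offset of 100 is an assumed, conventional stabilizer.
    """
    return m / (m + u + offset)


def normalized_betas(m_columns, u_columns):
    """Normalize M and U separately, then compute beta per probe and sample."""
    m_norm = quantile_normalize(m_columns)
    u_norm = quantile_normalize(u_columns)
    return [
        [beta(m, u) for m, u in zip(m_col, u_col)]
        for m_col, u_col in zip(m_norm, u_norm)
    ]
```

To treat Type I and Type II assays separately under this sketch, one would call `normalized_betas` once per probe-type subset.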
Calculates distances from point locations to features. The usual approach for, e.g., resource selection function analyses is to generate a complete distance-to-features surface and then sample it with your observed and random points. Since these raster-based approaches can be costly for large areas and often lead to memory issues in R, the distanceto package computes these distances using efficient, vector-based approaches. As a helper, there's a decidedly low-res raster-based approach for visually inspecting your region's distance surface. But the workhorse is distance_to.
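To make the vector-based idea concrete, the following Python sketch computes the distance from a point to a polyline feature directly, without building a raster surface. It assumes planar (projected) coordinates and is only an illustration; the function names are hypothetical and this is not the package's implementation.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b (planar)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment's line, clamped to the segment itself.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p, vertices):
    """Minimum distance from p to a polyline given as a list of (x, y) vertices."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(vertices, vertices[1:])
    )
```

Only the observed and random points are evaluated, so the cost scales with the number of points and feature segments rather than with the area covered.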
Provides estimation and data generation tools for some new multivariate frailty models. This version includes the gamma, inverse Gaussian, weighted Lindley, Birnbaum-Saunders, truncated normal, mixture of inverse Gaussian, mixture of Birnbaum-Saunders and generalized exponential as the distributions for the frailty terms. For the basal model, a parametric approach based on the exponential, Weibull and piecewise exponential distributions is considered, as well as a semiparametric approach. For details, see Gallardo and Bourguignon (2025) <doi:10.1002/bimj.70044> and Gallardo et al. (2024) <doi:10.1007/s11222-024-10458-w>.
This package provides a Bayesian model selection approach for generalized linear mixed models. Currently, GLMMselect can be used for Poisson GLMM and Bernoulli GLMM. GLMMselect can select fixed effects and random effects simultaneously. Covariance structures for the random effects are a product of an unknown scalar and a known positive semi-definite matrix. GLMMselect can be widely used in areas such as longitudinal studies, genome-wide association studies, and spatial statistics. GLMMselect is based on Xu, Ferreira, Porter, and Franck (202X), Bayesian Model Selection Method for Generalized Linear Mixed Models, Biometrics, under review.
There are occasions where you need a piece of HTML with integrated styles. A prime example of this is HTML email. This transformation involves moving the CSS and associated formatting instructions from the style block in the head of your document into the body of the HTML. Many prominent email clients require integrated styles in HTML email; otherwise a received HTML email will be displayed without any styling. This package will quickly and precisely perform these CSS transformations when given HTML text, and it does so by using the JavaScript juice library.
Allows for fitting of maximum likelihood models using Markov chains on phylogenetic trees for analysis of discrete character data. Examples of such discrete character data include restriction sites, gene family presence/absence, intron presence/absence, and gene family size data. Hypothesis-driven, user-specified substitution rate matrices can be estimated. Allows for biologically realistic models combining constrained substitution rate matrices, site rate variation, site partitioning, branch-specific rates, non-stationary prior root probabilities, correction for sampling bias, etc. See Dang and Golding (2016) <doi:10.1093/bioinformatics/btv541> for more details.
Spatial homogeneous regions (SHRs) in tissues are domains that are homogeneous with respect to cell type composition. We present a method for identifying SHRs using spatial transcriptomics data, and demonstrate that it is efficient and effective at finding SHRs for a wide variety of tissue types. concordex relies on analysis of k-nearest-neighbor (kNN) graphs. The tool is also useful for analysis of non-spatial transcriptomics data, and can elucidate the extent of concordance between partitions of cells derived from clustering algorithms and transcriptomic similarity as represented in kNN graphs.
Covered uses modern Ruby features to generate comprehensive coverage, including support for templates which are compiled into Ruby. It has the following features:
Incremental coverage -- if you run your full test suite, and then run a subset, it will still report the correct coverage, so you can incrementally work on improving coverage.
Integration with RSpec, Minitest, Travis & Coveralls - no need to configure anything - out of the box support for these platforms.
It supports coverage of views -- templates compiled to Ruby code can be tracked for coverage reporting.
Considering an (n x m) data matrix X, this package is based on the method proposed by Gower, Groener, and Velden (2010) <doi:10.1198/jcgs.2010.07134>, and utilizes the resulting matrices from the extended version of the NIPALS decomposition to determine n triangles whose areas are used to visually estimate the elements of a specific column of X. After a 90-degree rotation of the sample points, the triangles are drawn using the following points: 1. the origin of the axes; 2. the sample points; 3. the vector endpoint representing some variable.
CIFTI files contain brain imaging data in "grayordinates," which represent the gray matter as cortical surface vertices (left and right) and subcortical voxels (cerebellum, basal ganglia, and other deep gray matter). ciftiTools provides a unified environment for reading, writing, visualizing and manipulating CIFTI-format data. It supports the "dscalar," "dlabel," and "dtseries" intents. Grayordinate data is read in as a "xifti" object, which is structured for convenient access to the data and metadata, and includes support for surface geometry files to enable spatially-dependent functionality such as static or interactive visualizations and smoothing.
This package provides tools for visualization of, and inference on, the calibration of prediction models on the cumulative domain. This provides a method for evaluating calibration of risk prediction models without having to group the data or use tuning parameters (e.g., loess bandwidth). This package implements the methodology described in Sadatsafavi and Patkau (2024) <doi:10.1002/sim.10138>. The core of the package is cumulcalib(), which takes in vectors of binary responses and predicted risks. The plot() and summary() methods are implemented for the results returned by cumulcalib().
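A loose Python sketch of what "calibration on the cumulative domain" can look like: observations are ordered by predicted risk and the prediction errors are accumulated, with no grouping or bandwidth involved. The scaling by the total Bernoulli variance and the variance-based time axis are assumptions made for this illustration; the sketch does not reproduce cumulcalib() or its exact standardization.

```python
import math


def cumulative_calibration(y, p):
    """Partial sums of prediction errors, ordered by predicted risk.

    y: binary responses (0 or 1); p: predicted risks strictly between 0 and 1.
    Returns (times, path): a variance-based time axis in (0, 1] and the
    scaled cumulative sum of (y - p). The scaling is an assumption.
    """
    pairs = sorted(zip(p, y), key=lambda pair: pair[0])
    total_var = sum(pi * (1.0 - pi) for pi, _ in pairs)
    if total_var <= 0.0:
        raise ValueError("predicted risks must not all be exactly 0 or 1")
    times, path = [], []
    var_so_far = 0.0
    err_so_far = 0.0
    for pi, yi in pairs:
        var_so_far += pi * (1.0 - pi)
        err_so_far += yi - pi
        times.append(var_so_far / total_var)
        path.append(err_so_far / math.sqrt(total_var))
    return times, path
```

For a well-calibrated model the path should wander around zero; a sustained drift away from zero suggests systematic over- or under-prediction over that range of risks.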
This is a one-function package that will pass only unique values to a computationally-expensive function that returns an output of the same length as the input. In importing and working with tidy data, it is common to have index columns, often including time stamps, that are far from unique. Some functions to work with these, such as text conversion to other variable types (e.g. as.POSIXct()), various grep()-based functions, and often the cut() function, are relatively slow when working with tens of millions of rows or more.
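The underlying trick is easy to sketch in Python: evaluate the expensive function once per distinct value and map the results back to the original positions. The helper name `apply_to_unique` is hypothetical, and the sketch assumes the input values are hashable and that the function is applied element by element.

```python
def apply_to_unique(values, func):
    """Apply func once per distinct value and return results in input order."""
    cache = {}
    out = []
    for v in values:
        if v not in cache:
            cache[v] = func(v)  # the expensive call happens only here
        out.append(cache[v])
    return out
```

With a column of ten million time stamps that contains only a few thousand distinct strings, the parser would run a few thousand times instead of ten million.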
Iterate and repel visually similar colors away in various ggplot2 plots. When many groups are plotted at the same time on multiple axes, for instance stacked bars or scatter plots, effectively ordering colors becomes difficult. This tool iterates through color combinations to find the best solution to maximize visual distinctness of nearby groups, so plots are more friendly toward colorblind users. This is achieved by two distance measurements: the distance between groups within the plot, and CIELAB color space distances between colors as described in Carter et al. (2018) <doi:10.25039/TR.015.2018>.
This package provides a set of functions implementing several outlier (i.e., studies with extreme findings) and influential detection measures and methodologies in network meta-analysis: simple outlier and influential detection measures; outlier and influential detection measures based on study deletion (shift the mean); plots for outlier and influential detection measures; Q-Q plots for network meta-analysis; the Forward Search algorithm in network meta-analysis; forward plots to monitor statistics in each step of the forward search algorithm; and forward plots for summary estimates and their confidence intervals in each step of the forward search algorithm.
Power and sample size calculations for a variety of study designs and outcomes. Methods include t tests, ANOVA (including tests for interactions, simple effects and contrasts), proportions, categorical data (chi-square tests and proportional odds), linear, logistic and Poisson regression, alternative and coprimary endpoints, power for confidence intervals, correlation coefficient tests, cluster randomized trials, individually randomized group treatment trials, multisite trials, treatment-by-covariate interaction effects and nonparametric tests of location. Utilities are provided for computing various effect sizes. Companion package to the book "Power and Sample Size in R", Crespi (2025, ISBN:9781138591622).
Calculates performance criteria measures and associated Monte Carlo standard errors for simulation results. Includes functions to help run simulation studies, following a general simulation workflow that closely aligns with the approach described by Morris, White, and Crowther (2019) <DOI:10.1002/sim.8086>. Also includes functions for calculating bootstrap confidence intervals (including normal, basic, studentized, percentile, bias-corrected, and bias-corrected-and-accelerated) with tidy output, as well as for extrapolating confidence interval coverage rates and hypothesis test rejection rates following techniques suggested by Boos and Zhang (2000) <DOI:10.1080/01621459.2000.10474226>.
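As an illustration of performance measures with Monte Carlo standard errors (MCSEs), the Python sketch below computes bias and confidence interval coverage across simulation replicates, using the standard formulas for these two measures. The function names are hypothetical and are not the package's API.

```python
import math
import statistics


def bias_with_mcse(estimates, true_value):
    """Bias of an estimator across replicates and its Monte Carlo SE."""
    k = len(estimates)
    bias = statistics.fmean(estimates) - true_value
    # MCSE of the bias: sample standard deviation of the estimates over sqrt(K).
    mcse = statistics.stdev(estimates) / math.sqrt(k)
    return bias, mcse


def coverage_with_mcse(lowers, uppers, true_value):
    """Proportion of intervals containing the true value and its Monte Carlo SE."""
    k = len(lowers)
    covered = sum(lo <= true_value <= hi for lo, hi in zip(lowers, uppers))
    p = covered / k
    # MCSE of a proportion estimated from K independent replicates.
    return p, math.sqrt(p * (1.0 - p) / k)
```

Reporting the MCSE next to each measure shows whether an apparent difference between methods exceeds simulation noise.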
This package provides a graphics output device for R that records plots in a LaTeX-friendly format. The device transforms plotting commands issued by R functions into LaTeX code blocks. When included in a LaTeX document, these blocks are interpreted with the help of TikZ, a graphics package for TeX and friends written by Till Tantau. Using the tikzDevice, the text of R plots can contain LaTeX commands such as mathematical formulas. The device also allows arbitrary LaTeX code to be inserted into the output stream.
This package provides tools for the summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
In the framework of Symbolic Data Analysis, a relatively new approach to the statistical analysis of multi-valued data, we consider histogram-valued data, i.e., data described by univariate histograms. The methods and the basic statistics for histogram-valued data are mainly based on the L2 Wasserstein metric between distributions, i.e., the Euclidean metric between quantile functions. The package contains unsupervised classification techniques, least squares regression and tools for histogram-valued data and for histogram time series. An introductory paper is Irpino A. and Verde R. (2015) <doi:10.1007/s11634-014-0176-4>.
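The link between the L2 Wasserstein metric and quantile functions can be shown with a small Python sketch. It uses two empirical samples of equal size instead of histograms, an assumption made to keep the example short: in that case the quantile functions are step functions and the distance reduces to a root mean squared difference between sorted values. The function name is hypothetical.

```python
import math


def wasserstein_l2(sample_x, sample_y):
    """L2 Wasserstein distance between two equal-size empirical distributions.

    Sorting each sample gives its empirical quantile function, so the
    distance is the Euclidean (root mean square) gap between quantiles.
    """
    if len(sample_x) != len(sample_y):
        raise ValueError("this sketch assumes samples of equal size")
    xs, ys = sorted(sample_x), sorted(sample_y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))
```

Shifting a distribution by a constant c moves every quantile by c, so the distance between a sample and its shifted copy is exactly |c|.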
This package fits a multinomial regression model with a Lasso penalty and performs statistical inference (calculating confidence intervals for coefficients and p-values for individual variables). It implements 1) the coordinate descent algorithm to fit an l1-penalized multinomial regression model (parameterized with a reference level); 2) the debiasing approach to obtain the inference results, which is described in "Tian, Y., Rusinek, H., Masurkar, A. V., & Feng, Y. (2024). L1-Penalized Multinomial Regression: Estimation, Inference, and Prediction, With an Application to Risk Factor Identification for Different Dementia Subtypes. Statistics in Medicine, 43(30), 5711-5747.".
Generate Mermaid syntax for a pedigree flowchart from a pedigree data frame. Mermaid syntax is commonly used to generate plots, charts, diagrams, and flowcharts. It is a textual syntax for creating reproducible illustrations. This package generates Mermaid syntax from a pedigree data frame to visualize a pedigree flowchart. The Mermaid syntax can be embedded in a Markdown or R Markdown file, or viewed on Mermaid editors and renderers. Link shape, style, and orientation can be customized via function arguments, and node shapes and styles can be customized via optional columns in the pedigree data frame.
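A minimal Python sketch of the pedigree-to-Mermaid idea follows. The record keys `id`, `sire` and `dam` and the function name are assumptions made for illustration, not the package's interface, and the sketch assumes identifiers are already valid Mermaid node names (no spaces or special characters).

```python
def pedigree_to_mermaid(rows, orientation="TD"):
    """Build Mermaid flowchart text with one parent-to-offspring link per line.

    rows: list of dicts with hypothetical keys 'id', 'sire' and 'dam';
    a missing or empty parent is skipped.
    """
    lines = [f"flowchart {orientation}"]
    for row in rows:
        for parent_key in ("sire", "dam"):
            parent = row.get(parent_key)
            if parent:
                lines.append(f"    {parent} --> {row['id']}")
    return "\n".join(lines)


pedigree = [
    {"id": "A", "sire": None, "dam": None},
    {"id": "B", "sire": None, "dam": None},
    {"id": "C", "sire": "A", "dam": "B"},
]
print(pedigree_to_mermaid(pedigree))
```

The resulting text can be pasted into a Mermaid code fence in a Markdown file; changing `orientation` to `LR` lays the chart out left to right.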
This is a framework to fit semiparametric regression estimators for the total parameter of a finite population when the variable of interest is asymmetrically distributed. The main references for this package are Sarndal C.E., Swensson B., and Wretman J. (2003, ISBN: 978-0-387-40620-6) "Model Assisted Survey Sampling", Springer-Verlag; Cardozo C.A., Paula G.A. and Vanegas L.H. (2022) "Generalized log-gamma additive partial linear models with P-spline smoothing", Statistical Papers; and Cardozo C.A. and Alonso-Malaver C.E. (2022) "Semi-parametric model assisted estimation in finite populations", in preparation.
This package provides tools to estimate pollinator body size and co-varying traits. It contains novel Bayesian predictive models of pollinator body size (for bees and hoverflies) and preexisting predictive models for pollinator body size (currently implemented for ants, bees, butterflies, flies, moths and wasps), as well as for bee tongue length, foraging distance, total field nectar loads and wing loading. An additional GitHub repository <https://github.com/liamkendall/pollimetrydata> provides model objects to use the bodysize function internally. All models are described in Kendall et al. (2018) <doi:10.1101/397604>.
This package supports non-robust and robust computations of the sample autocovariance (ACOVF) and sample autocorrelation functions (ACF) of univariate and multivariate processes. The methodology consists of reversing the diagonalization procedure involving the periodogram or the cross-periodogram and the Fourier transform vectors, thus obtaining the ACOVF or the ACF as discussed in Fuller (1995) <doi:10.1002/9780470316917>. The robust version is obtained by fitting robust M-regressors to obtain the M-periodogram or M-cross-periodogram as discussed in Reisen et al. (2017) <doi:10.1016/j.jspi.2017.02.008>.