Simultaneous tests and confidence intervals are provided for one-way experimental designs with one or many normally distributed, primary response variables (endpoints). Differences (Hasler and Hothorn, 2011 <doi:10.2202/1557-4679.1258>) or ratios (Hasler and Hothorn, 2012 <doi:10.1080/19466315.2011.633868>) of means can be considered. Various contrasts can be chosen, and unbalanced sample sizes are allowed, as are heterogeneous variances (Hasler and Hothorn, 2008 <doi:10.1002/bimj.200710466>) or covariance matrices (Hasler, 2014 <doi:10.1515/ijb-2012-0015>).
Easily override the default visual choices in ggplot2 to make your time series plots look more like the Wall Street Journal. Specific theme design choices include omitting x-axis grid lines and displaying sparse light grey y-axis grid lines. Additionally, this lets you label the y-axis scale with your units displayed only on the top-most number, while also removing the bottom-most number (unless specifically overridden). The goal is visual simplicity, because who has time to waste looking at a cluttered graph?
This package provides connections to the epiviz web app (http://epiviz.cbcb.umd.edu) for interactive visualization of genomic data. Objects in R/bioc interactive sessions can be displayed in genome browser tracks or plots to be explored by navigation through genomic regions. Fundamental Bioconductor data structures are supported (e.g., GenomicRanges and RangedSummarizedExperiment objects), while providing an easy mechanism to support other data structures (through package epivizrData). Visualizations (using d3.js) can be easily added to the web app as well.
BANDITS is a Bayesian hierarchical model for detecting differential splicing of genes and transcripts, via DTU (differential transcript usage), between two or more conditions. The method uses a Bayesian hierarchical framework, which allows for sample-specific proportions in a Dirichlet-Multinomial model, and samples the allocation of fragments to the transcripts. Parameters are inferred via MCMC (Markov chain Monte Carlo) techniques, and a DTU test is performed via a multivariate Wald test on the posterior densities for the average relative abundance of transcripts.
This package implements beta regression for modeling beta-distributed dependent variables on the open unit interval (0, 1), e.g., rates and proportions, see Cribari-Neto and Zeileis (2010) <doi:10.18637/jss.v034.i02>. Moreover, extended-support beta regression models can accommodate dependent variables with boundary observations at 0 and/or 1. For the classical beta regression model, alternative specifications are provided: Bias-corrected and bias-reduced estimation, finite mixture models, and recursive partitioning for beta regression, see <doi:10.18637/jss.v048.i11>.
This package provides support for the foreach looping construct. foreach is an idiom that allows for iterating over elements in a collection, without the use of an explicit loop counter. This package in particular is intended to be used for its return value, rather than for its side effects. In that sense, it is similar to the standard lapply function, but doesn't require the evaluation of a function. Using foreach without side effects also facilitates executing the loop in parallel.
Average population attributable fractions (AF) are calculated for a set of risk factors (either binary or ordinal valued). Confidence intervals are found by Monte Carlo simulation. The method can be applied to either prospective or case-control designs, provided an estimate of disease prevalence is available. In addition to an exact calculation of AF, an approximate calculation based on randomly sampling permutations has been implemented to keep the calculation computationally tractable when the number of risk factors is large.
This package implements Bayesian hierarchical models with flexible Gaussian process priors, focusing on Extended Latent Gaussian Models and incorporating various Gaussian process priors for Bayesian smoothing. Computations leverage finite element approximations and adaptive quadrature for efficient inference. Methods are detailed in Zhang, Stringer, Brown, and Stafford (2023) <doi:10.1177/09622802221134172>; Zhang, Stringer, Brown, and Stafford (2024) <doi:10.1080/10618600.2023.2289532>; Zhang, Brown, and Stafford (2023) <doi:10.48550/arXiv.2305.09914>; and Stringer, Brown, and Stafford (2021) <doi:10.1111/biom.13329>.
BAYesian inference for MEDical designs in R. Functions for the computation of Bayes factors for common biomedical research designs. Implemented are functions to test the equivalence (equiv_bf), non-inferiority (infer_bf), and superiority (super_bf) of an experimental group compared to a control group on a continuous outcome measure. Bayes factors for these three tests can be computed based on raw data (x, y) or summary statistics (n_x, n_y, mean_x, mean_y, sd_x, sd_y [or ci_margin and ci_level]).
This package provides tools for estimation and clustering of spherical data, seamlessly integrated with the flexmix package. Includes the necessary M-step implementations for both Poisson Kernel-Based Distribution (PKBD) and spherical Cauchy distribution. Additionally, the package provides random number generators for PKBD and spherical Cauchy distribution. Methods are based on Golzy M., Markatou M. (2020) <doi:10.1080/10618600.2020.1740713>, Kato S., McCullagh P. (2020) <doi:10.3150/20-bej1222> and Sablica L., Hornik K., Leydold J. (2023) <doi:10.1214/23-ejs2149>.
Visualize contact tracing data using a shiny app and estimate the incubation or latency time of an infectious disease, accounting for the following characteristics in the analysis: (i) doubly interval censoring with (partly) overlapping or distinct windows; (ii) an infection risk corresponding to exponential growth; (iii) right truncation allowing for individual truncation times; (iv) different choices concerning the family of the distribution. For our earlier work, we refer to Arntzen et al. (2023) <doi:10.1002/sim.9726>. A paper describing our approach in detail will follow.
Support in preparing a raw ESM dataset for statistical analysis. Preparation includes the handling of errors (mostly due to technological reasons) and the generating of new variables that are necessary and/or helpful in meeting the conditions when statistically analyzing ESM data. The functions in esmprep are meant to hierarchically lead from bottom, i.e. the raw (separated) ESM dataset(s), to top, i.e. a single ESM dataset ready for statistical analysis. This hierarchy evolved out of my personal experience in working with ESM data.
Calculates additive and dominance genetic relationship matrices and their inverses, in matrix and tabular-sparse formats. It includes functions for checking and processing pedigree, calculating inbreeding coefficients (Meuwissen & Luo, 1992 <doi:10.1186/1297-9686-24-4-305>), as well as functions to calculate the matrix of genetic group contributions (Q), and adding those contributions to the genetic merit of animals (Quaas (1988) <doi:10.3168/jds.S0022-0302(88)79691-5>). Calculating Q is computationally expensive, so computationally optimized functions are provided for it.
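As a rough illustration of what an additive relationship matrix and an inbreeding coefficient are, the sketch below uses the classical tabular method in plain Python. It is not the package's code and not the Meuwissen & Luo algorithm cited above; the function name `additive_relationship` and the pedigree layout (tuples of animal, sire, dam, with parents listed before offspring) are assumptions made for this example.

```python
def additive_relationship(pedigree):
    """Build the additive relationship matrix A with the tabular method.

    pedigree: list of (animal, sire, dam) tuples. Parents must appear
    before their offspring; unknown parents are given as None.
    """
    index = {animal: i for i, (animal, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (_, sire, dam) in enumerate(pedigree):
        s = index.get(sire)  # None when the parent is unknown
        d = index.get(dam)
        # Diagonal: 1 + F, where F is half the relationship between parents.
        both_known = s is not None and d is not None
        A[i][i] = 1.0 + (0.5 * A[s][d] if both_known else 0.0)
        # Off-diagonal: average relationship of animal j with i's parents.
        for j in range(i):
            a_js = A[j][s] if s is not None else 0.0
            a_jd = A[j][d] if d is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
    return A


# Hypothetical pedigree: "e" is the offspring of full sibs "c" and "d".
pedigree = [
    ("a", None, None),
    ("b", None, None),
    ("c", "a", "b"),
    ("d", "a", "b"),
    ("e", "c", "d"),
]
A = additive_relationship(pedigree)
inbreeding_e = A[4][4] - 1.0  # inbreeding coefficient of "e"
```

The inbreeding coefficient of each animal is its diagonal element minus one. The tabular method fills a dense n-by-n matrix, which is why dedicated algorithms are preferred for large pedigrees.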
This package contains an engine for spatially-explicit eco-evolutionary mechanistic models with a modular implementation and several support functions. It allows exploring the consequences of ecological and macroevolutionary processes across realistic or theoretical spatio-temporal landscapes on biodiversity patterns as a general term. Reference: Oskar Hagen, Benjamin Flueck, Fabian Fopp, Juliano S. Cabral, Florian Hartig, Mikael Pontarp, Thiago F. Rangel, Loic Pellissier (2021) "gen3sis: A general engine for eco-evolutionary simulations of the processes that shape Earth's biodiversity" <doi:10.1371/journal.pbio.3001340>.
The algorithm of semi-supervised learning is based on finite Gaussian mixture models and includes a mechanism for handling missing data. It aims to fit a g-class Gaussian mixture model using maximum likelihood. The algorithm treats the labels of unclassified features as missing data, building on the framework introduced by Rubin (1976) <doi:10.2307/2335739> for missing data analysis. By taking into account the dependencies in the missing pattern, the algorithm provides more information for determining the optimal classifier, as specified by Bayes rule.
This package provides HE plot and other functions for visualizing hypothesis tests in multivariate linear models. HE plots represent sums-of-squares-and-products matrices for linear hypotheses and for error using ellipses (in two dimensions) and ellipsoids (in three dimensions). It also provides other tools for analysis and graphical display of the models such as robust methods and homogeneity of variance covariance matrices. The related candisc package provides visualizations in a reduced-rank canonical discriminant space when there are more than a few response variables.
This package provides an efficient implementation of the Isolate-Detect methodology for the consistent estimation of the number and location of multiple change-points in one-dimensional data sequences from the "deterministic + noise" model. For details on the Isolate-Detect methodology, please see Anastasiou and Fryzlewicz (2018) <https://docs.wixstatic.com/ugd/24cdcc_6a0866c574654163b8255e272bc0001b.pdf>. Currently implemented scenarios are: piecewise-constant signal with Gaussian noise, piecewise-constant signal with heavy-tailed noise, continuous piecewise-linear signal with Gaussian noise, continuous piecewise-linear signal with heavy-tailed noise.
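To give a feel for change-point detection in the piecewise-constant scenario, the sketch below computes a standard CUSUM contrast and returns the most likely location of a single change in the mean. It is only a simplified illustration, not the Isolate-Detect procedure itself, which handles multiple change-points; the function name `cusum_change_point` is an assumption for this example.

```python
import math


def cusum_change_point(x):
    """Return (b, stat): the split point b maximizing the absolute CUSUM
    contrast, where x[:b] and x[b:] are the two candidate segments."""
    n = len(x)
    total = sum(x)
    best_b, best_stat = None, -1.0
    left = 0.0
    for b in range(1, n):
        left += x[b - 1]       # sum of the first b observations
        right = total - left   # sum of the remaining n - b observations
        stat = abs(
            math.sqrt((n - b) / (n * b)) * left
            - math.sqrt(b / (n * (n - b))) * right
        )
        if stat > best_stat:
            best_b, best_stat = b, stat
    return best_b, best_stat


# Noise-free toy signal: the mean jumps from 0 to 5 after 10 observations.
signal = [0.0] * 10 + [5.0] * 10
b_hat, stat = cusum_change_point(signal)
```

In practice the maximal contrast would be compared against a threshold that depends on the noise level before a change-point is declared.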
Nonparametric Failure Time (NFT) Bayesian Additive Regression Trees (BART): Time-to-event Machine Learning with Heteroskedastic Bayesian Additive Regression Trees (HBART) and Low Information Omnibus (LIO) Dirichlet Process Mixtures (DPM). An NFT BART model is of the form Y = mu + f(x) + sd(x) E, where functions f and sd have BART and HBART priors, respectively, while E is a nonparametric error distribution due to a DPM LIO prior hierarchy. For a complete description of the model, see <doi:10.1111/biom.13857>.
This package implements a method that builds the coefficients of a polynomial model that performs almost equivalently to a given (densely connected) neural network. This is achieved using Taylor expansion at the activation functions. The obtained polynomial coefficients can be used to explain the importance of features (and their interactions) in the neural network, therefore working as a tool for interpretability or eXplainable Artificial Intelligence (XAI). See Morala et al. 2021 <doi:10.1016/j.neunet.2021.04.036>, and 2023 <doi:10.1109/TNNLS.2023.3330328>.
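The core idea of replacing an activation function by its Taylor expansion can be sketched for a single neuron. The example below is a minimal illustration in Python, assuming a tanh activation expanded around 0; it is not the package's algorithm, which builds the coefficients for a whole network, and the names `neuron` and `neuron_poly` are hypothetical.

```python
import math

# Taylor coefficients of tanh around 0 up to degree 5:
# tanh(z) = z - z^3/3 + 2 z^5/15 + ...
TANH_TAYLOR = [0.0, 1.0, 0.0, -1.0 / 3.0, 0.0, 2.0 / 15.0]


def pre_activation(weights, bias, x):
    """Linear part of a neuron: bias + sum of weight * input."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def neuron(weights, bias, x):
    """Exact neuron output with a tanh activation."""
    return math.tanh(pre_activation(weights, bias, x))


def neuron_poly(weights, bias, x, degree=3):
    """Polynomial surrogate: tanh replaced by its truncated Taylor series."""
    z = pre_activation(weights, bias, x)
    return sum(c * z**k for k, c in enumerate(TANH_TAYLOR[: degree + 1]))


# Hypothetical weights and inputs; the pre-activation here is 0.05.
w, b, x = [0.2, -0.1], 0.05, [0.5, 1.0]
exact = neuron(w, b, x)
approx = neuron_poly(w, b, x, degree=3)
```

Because the surrogate is a polynomial in the linear pre-activation, expanding it yields coefficients for each input and each product of inputs. The approximation is only accurate while the pre-activation stays close to the expansion point.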
Estimation of two- and three-way dynamic panel threshold regression models (Di Lascio and Perazzini (2024) <https://repec.unibz.it/bemps104.pdf>; Di Lascio and Perazzini (2022, ISBN:978-88-9193-231-0); Seo and Shin (2016) <doi:10.1016/j.jeconom.2016.03.005>) through the generalized method of moments based on the first difference transformation and the use of instrumental variables. The models can be used to detect a change point in the time series. Random number generation is also implemented.
This package provides tools to convert from specific formats to more general forms of spatial data. Using tables to store the actual entities present in spatial data provides flexibility, and the functions here deliberately minimize the level of interpretation applied, leaving that for specific applications. Includes support for simple features, round-trip for Spatial classes and long-form tables, analogous to ggplot2::fortify. There is also a more normal form representation that decomposes simple features and their kin to tables of objects, parts, and unique coordinates.
Calculates a Satorra-Bentler scaled chi-squared difference test between nested models that were estimated using maximum likelihood (ML) with robust standard errors, which cannot be calculated the traditional way. For details see Satorra & Bentler (2001) <doi:10.1007/bf02296192> and Satorra & Bentler (2010) <doi:10.1007/s11336-009-9135-y>. This package may be particularly helpful when used in conjunction with Mplus software, specifically when implementing the complex survey option. In such cases, the model estimator in Mplus defaults to ML with robust standard errors.
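The 2001 scaled difference test is commonly computed from the scaled chi-square values, scaling correction factors, and degrees of freedom of the two nested models. The following Python sketch shows that arithmetic under the usual convention that model 0 is the more restricted (nested) model; the function and argument names are assumptions for this example and do not mirror the package's interface.

```python
def sb_scaled_difference(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler (2001) scaled chi-square difference.

    t0, c0, d0: scaled chi-square, scaling correction factor, and degrees
                of freedom of the nested (more restricted) model.
    t1, c1, d1: the same quantities for the comparison model.
    Returns (scaled difference statistic, difference in degrees of freedom).
    """
    df_diff = d0 - d1
    # Scaling correction for the difference test.
    cd = (d0 * c0 - d1 * c1) / df_diff
    # t * c recovers each model's unscaled ML chi-square.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df_diff


# Hypothetical values, chosen only to make the arithmetic easy to follow.
trd, df_diff = sb_scaled_difference(t0=100.0, c0=1.2, d0=10,
                                    t1=80.0, c1=1.1, d1=8)
```

The resulting statistic is referred to a chi-square distribution with `df_diff` degrees of freedom. The correction `cd` can turn out negative in some samples, which is the situation addressed by the 2010 paper.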
Identifying cell types based on expression profiles is a pillar of single cell analysis. scROSHI identifies cell types from single cell expression profiles by utilizing previously obtained cell type specific gene sets. It takes into account the hierarchical nature of cell type relationships and does not require training or annotated data. A detailed description of the method can be found at: Prummer, Bertolini, Bosshard, Barkmann, Yates, Boeva, The Tumor Profiler Consortium, Stekhoven, and Singer (2022) <doi:10.1101/2022.04.05.487176>.
Determine sample sizes, draw samples, and conduct data analysis using data frames. It specifically enables you to determine simple random sample sizes, stratified sample sizes, and complex stratified sample sizes using a secondary variable such as population; draw simple random samples and stratified random samples from sampling data frames; determine which observations are missing from a random sample, missing by strata, duplicated within a dataset; and perform data analysis, including proportions, margins of error and upper and lower bounds for simple, stratified and cluster sample designs.
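As an illustration of determining a simple random sample size, the sketch below uses Cochran's formula for a proportion with an optional finite population correction. This is a textbook formula shown in Python and is an assumption about the kind of calculation involved, not a description of the package's functions; the name `srs_sample_size` is hypothetical.

```python
import math


def srs_sample_size(margin, p=0.5, z=1.96, population=None):
    """Sample size for estimating a proportion from a simple random sample.

    margin:     desired margin of error (e.g. 0.05 for 5 points).
    p:          anticipated proportion (0.5 is the most conservative).
    z:          normal quantile for the confidence level (1.96 for 95%).
    population: population size for the finite population correction,
                or None to treat the population as effectively infinite.
    """
    n0 = z**2 * p * (1.0 - p) / margin**2
    if population is not None:
        n0 = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n0)


n_infinite = srs_sample_size(0.05)
n_finite = srs_sample_size(0.05, population=10_000)
```

With these inputs the required sample is smaller for the finite population, because each sampled unit covers a larger share of it.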