Bootstrap resampling methods have been widely studied in the context of survey data. This package implements various bootstrap resampling techniques tailored for survey data, with a focus on stratified simple random sampling and stratified two-stage cluster sampling. It provides tools for precise and consistent bootstrap variance estimation for population totals, means, and quartiles. Additionally, it enables easy generation of bootstrap samples for in-depth analysis.
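As a rough illustration of bootstrap variance estimation under stratified simple random sampling, the Python sketch below resamples with replacement within each stratum and recomputes the estimated population total. It is a naive bootstrap that ignores finite-population corrections and rescaling, not this package's implementation; the function names `estimate_total` and `bootstrap_variance` and the example data are hypothetical.

```python
import random
from statistics import mean, variance


def estimate_total(strata):
    """Stratified estimate of a population total: sum over strata of N_h * sample mean."""
    return sum(N_h * mean(sample) for N_h, sample in strata)


def bootstrap_variance(strata, n_boot=1000, seed=0):
    """Naive stratified bootstrap: resample each stratum independently with replacement."""
    rng = random.Random(seed)  # fixed seed so the replicates are reproducible
    replicates = []
    for _ in range(n_boot):
        resampled = [
            (N_h, rng.choices(sample, k=len(sample))) for N_h, sample in strata
        ]
        replicates.append(estimate_total(resampled))
    # Sample variance of the bootstrap replicates of the total
    return variance(replicates)


# Hypothetical data: (stratum population size N_h, observed sample values)
strata = [(100, [2.0, 3.0, 5.0, 4.0]), (50, [10.0, 12.0, 11.0])]
total = estimate_total(strata)
var_hat = bootstrap_variance(strata)
```

With small stratum sample sizes the naive bootstrap is known to underestimate the variance, which is one reason survey-specific variants exist.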
An implementation of the deliberative reasoning index (DRI) and related tools for the analysis of deliberation survey data. Calculation of the DRI, plotting of intersubjective correlations (IC), generation of large language model (LLM) survey data, and permutation tests are supported. Example datasets and a graphical user interface (GUI) are also available to support analysis. For more information, see Niemeyer and Veri (2022) <doi:10.1093/oso/9780192848925.003.0007>.
Estimate prior variable weights for Bayesian Additive Regression Trees (BART). These weights correspond to the probabilities of the variables being selected in the splitting rules of the sum-of-trees. Weights are estimated using empirical Bayes and external information on the explanatory variables (co-data). BART models are fitted using the dbarts R package. See Goedhart and others (2023) <doi:10.1002/sim.70004> for details.
This package provides tools for systematically exploring large quantities of temporal data across cyclic temporal granularities (deconstructions of time) by visualizing probability distributions. Cyclic time granularities can be circular, quasi-circular or aperiodic. gravitas computes cyclic single-order-up or multiple-order-up granularities, checks the feasibility of creating plots for any two cyclic granularities and recommends probability distribution plots for exploring periodicity in the data.
Many tools for Geometric Data Analysis (Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0>), such as MCA variants (Specific Multiple Correspondence Analysis, Class Specific Analysis), many graphical and statistical aids to interpretation (structuring factors, concentration ellipses, inductive tests, bootstrap validation, etc.) and multiple-table analysis (Multiple Factor Analysis, between- and inter-class analysis, Principal Component Analysis and Correspondence Analysis with Instrumental Variables, etc.).
Implementation of the Generalized Score Matching estimator in Yu et al. (2019) <https://jmlr.org/papers/v20/18-278.html> for non-negative graphical models (truncated Gaussian, exponential square-root, gamma, a-b models) and univariate truncated Gaussian distributions. Also includes the original estimator for untruncated Gaussian graphical models from Lin et al. (2016) <doi:10.1214/16-EJS1126>, with the addition of a diagonal multiplier.
Run a Gibbs sampler for a multivariate Bayesian sparse group selection model with Dirac, continuous and hierarchical spike priors for detecting pleiotropy on the traits. This package is designed for summary statistics containing estimated regression coefficients and their estimated covariance matrix. The methodology is described in Baghfalaki, T., Sugier, P. E., Truong, T., Pettitt, A. N., Mengersen, K., & Liquet, B. (2021) <doi:10.1002/sim.8855>.
This package implements large-scale hypothesis testing by variance mixing. It takes two statistics per testing unit -- an estimated effect and its associated squared standard error -- and fits a nonparametric, shape-constrained mixture separately on two latent parameters. It reports local false discovery rates (lfdr) and local false sign rates (lfsr). The MixTwice algorithm is described in Zheng et al. (2021) <doi:10.1093/bioinformatics/btab162>.
Following Sommer (2022) <https://mediatum.ub.tum.de/1658240>, portfolio-level risk estimates (e.g. Value at Risk, Expected Shortfall) are obtained by modeling each asset univariately with an ARMA-GARCH model and then modeling their cross dependence with a Vine Copula model in a rolling window fashion. One can also condition on variables or time series at certain quantile levels to stress test the risk measure estimates.
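For readers unfamiliar with the two risk measures named above, the Python sketch below computes plain empirical versions from a sample of returns. It does not involve ARMA-GARCH or vine copula modeling and is not this package's API; the function names and the example returns are hypothetical, and the convention assumed here reports both measures as return quantiles (negative numbers) rather than as positive losses.

```python
import math


def value_at_risk(returns, alpha):
    """Empirical alpha-quantile of the returns (lower tail)."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))  # number of observations in the tail
    return ordered[k - 1]


def expected_shortfall(returns, alpha):
    """Mean of the returns at or below the empirical Value at Risk."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


# Hypothetical sample of ten portfolio returns
returns = [-0.05, -0.02, 0.01, 0.03, -0.01, 0.02, 0.00, 0.04, -0.03, 0.01]
var_20 = value_at_risk(returns, alpha=0.2)
es_20 = expected_shortfall(returns, alpha=0.2)
```

Expected Shortfall is never larger than Value at Risk under this convention, because it averages the tail observations that lie at or below the quantile.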
Because your linear models deserve better than console output. A sleek color palette and kable styling to make your regression results look sharper than they are. Includes support for Partial Least Squares (PLS) regression via both the SVD and NIPALS algorithms, along with a unified interface for model fitting and fabulous LaTeX and console output formatting. See the package website at <https://finitesample.space/snazzier>.
This package implements methods to fit Virtual Twins models (Foster et al. (2011) <doi:10.1002/sim.4322>) for identifying subgroups with differential effects in the context of clinical trials. The parameter selection methods proposed in Wolf et al. (2022) <doi:10.1177/17407745221095855> are used to control the probability of falsely detecting a differential effect when the conditional average treatment effect is uniform across the study population.
This package provides functions for tabulating and summarising categorical variables. Most functions are designed to work with dataframes, and use the tidyverse idiom of taking the dataframe as the first argument so they work within pipelines. Equivalent functions that operate directly on vectors are also provided where it makes sense. This package aims to make exploratory data analysis involving categorical variables quicker, simpler and more robust.
Various functions to fit models for non-normal repeated measurements, such as Binary Random Effects Models with Two Levels of Nesting, Bivariate Beta-binomial Regression Models, Marginal Bivariate Binomial Regression Models, Cormack capture-recapture models, Continuous-time Hidden Markov Chain Models, Discrete-time Hidden Markov Chain Models, Changepoint Location Models using a Continuous-time Two-state Hidden Markov Chain, generalized nonlinear autoregression models, multivariate Gaussian copula models, generalized nonlinear mixed models with one random effect, generalized nonlinear mixed models using h-likelihood for one random effect, Repeated Measurements Models for Counts with Frailty or Serial Dependence, Repeated Measurements Models for Continuous Variables with Frailty or Serial Dependence, Ordinal Random Effects Models with Dropouts, marginal homogeneity models for square contingency tables, and correlated negative binomial models with Kalman update. References include Lindsey's textbooks, JK Lindsey (2001) <isbn:10-0198508123> and JK Lindsey (1999) <isbn:10-0198505590>.
This package provides a fast, flexible, and comprehensive framework for quantitative text analysis in R. It provides functionality for corpus management, creating and manipulating tokens and ngrams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Enrich your ggplots with group-wise comparisons. This package provides an easy way to indicate if two groups are significantly different. Commonly this is shown by a bracket on top connecting the groups of interest which itself is annotated with the level of significance. The package provides a single layer that takes the groups for comparison and the test as arguments and adds the annotation to the plot.
Efficient C++ optimized functions for numerical and symbolic calculus. It includes basic symbolic arithmetic, tensor calculus, the Einstein summation convention, fast computation of the Levi-Civita symbol and generalized Kronecker delta, Taylor series expansion, multivariate Hermite polynomials, accurate high-order derivatives, differential operators (Gradient, Jacobian, Hessian, Divergence, Curl, Laplacian) and numerical integration in arbitrary orthogonal coordinate systems: Cartesian, polar, spherical, cylindrical, parabolic or user defined by custom scale factors.
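To make the Levi-Civita symbol and index summation concrete, the Python sketch below computes the symbol as the sign of a permutation and uses it to form a three-dimensional cross product, (a x b)_i = sum over j, k of epsilon_ijk a_j b_k. It is an illustration only, not this package's C++ implementation; the function names `levi_civita` and `cross` are hypothetical.

```python
from itertools import product


def levi_civita(*indices):
    """Return +1 or -1 for even or odd orderings of distinct indices, 0 if any index repeats."""
    if len(set(indices)) != len(indices):
        return 0
    # Parity of the permutation equals the parity of its inversion count
    inversions = sum(
        1
        for i in range(len(indices))
        for j in range(i + 1, len(indices))
        if indices[i] > indices[j]
    )
    return -1 if inversions % 2 else 1


def cross(a, b):
    """Cross product of two 3-vectors via explicit summation over repeated indices."""
    result = [0, 0, 0]
    for i, j, k in product(range(3), repeat=3):
        result[i] += levi_civita(i, j, k) * a[j] * b[k]
    return result


e_x, e_y = [1, 0, 0], [0, 1, 0]
e_z = cross(e_x, e_y)
```

The explicit triple loop visits all 27 index combinations, of which only 6 are non-zero; an optimized implementation would typically enumerate just those.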
This library lets you write interactive programs without callbacks or side-effects. Functional Reactive Programming (FRP) uses composable events and time-varying values to describe interactive systems as pure functions. Just like other pure functional code, functional reactive code is easier to get right on the first try, maintain, and reuse. Reflex is a fully-deterministic, higher-order FRP interface and an engine that efficiently implements that interface.
SPsimSeq uses a specially designed exponential family for density estimation to construct the distribution of gene expression levels from a given real RNA sequencing dataset (single-cell or bulk), and subsequently simulates a new dataset from the estimated marginal distributions using Gaussian copulas to retain the dependence between genes. It allows simulation of multiple groups and batches with any required sample size and library size.
Fits linear or generalized linear regression models using Bayesian global-local shrinkage prior hierarchies as described in Polson and Scott (2010) <doi:10.1093/acprof:oso/9780199694587.003.0017>. Provides an efficient implementation of ridge, lasso, horseshoe and horseshoe+ regression with logistic, Gaussian, Laplace, Student-t, Poisson or geometric distributed targets using the algorithms summarized in Makalic and Schmidt (2016) <doi:10.48550/arXiv.1611.06649>.
This package implements methods for Bayesian analysis of State Space Models. Includes implementations of the Particle Marginal Metropolis-Hastings algorithm described in Andrieu et al. (2010) <doi:10.1111/j.1467-9868.2009.00736.x> and automatic tuning inspired by Pitt et al. (2012) <doi:10.1016/j.jeconom.2012.06.004> and J. Dahlin and T. B. Schön (2019) <doi:10.18637/jss.v088.c02>.
This package provides tools for penalized estimation of flexible hidden Markov models for time series of counts without the need to specify a (parametric) family of distributions. These include functions for model fitting, model checking, and state decoding. For details, see Adam, T., Langrock, R., and Weiß, C.H. (2019): Penalized Estimation of Flexible Hidden Markov Models for Time Series of Counts. <arXiv:1901.03275>.
Fits a variety of cure models using excess hazard modeling methodology, such as the mixture model proposed by Phillips et al. (2002) <doi:10.1002/sim.1101>. The Weibull distribution is used to represent the survival function of the uncured patients. Also fits non-mixture cure models, such as the time-to-null excess hazard model proposed by Boussari et al. (2020) <doi:10.1111/biom.13361>.
Calculate the confidence interval and p value for the change in C-statistic. The adjusted C-statistic is calculated as "Somers Dxy rank correlation"/2 + 0.5. The confidence interval is calculated using the bootstrap method, and the p value is calculated using the Z test. Please refer to the article by Peter Ganz et al. (2016) <doi:10.1001/jama.2016.5951>.
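The relation between Somers' Dxy and the C-statistic quoted above can be sketched in Python for a binary outcome as follows. This is a minimal illustration of the formula only, assuming Dxy is taken over all pairs with different outcomes; it omits the bootstrap interval and the Z test, and the function names and example data are hypothetical rather than this package's interface.

```python
def somers_dxy(scores, outcomes):
    """Somers' Dxy for a binary outcome: (concordant - discordant) / comparable pairs."""
    concordant = discordant = pairs = 0
    for s_pos, y_pos in zip(scores, outcomes):
        if y_pos != 1:
            continue
        for s_neg, y_neg in zip(scores, outcomes):
            if y_neg != 0:
                continue
            pairs += 1  # one event / non-event pair
            if s_pos > s_neg:
                concordant += 1
            elif s_pos < s_neg:
                discordant += 1
            # tied scores count toward neither concordant nor discordant
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return (concordant - discordant) / pairs


def c_statistic(scores, outcomes):
    """C-statistic from Somers' Dxy: C = Dxy / 2 + 0.5."""
    return somers_dxy(scores, outcomes) / 2 + 0.5


# Hypothetical predicted risks and observed binary outcomes
scores = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]
dxy = somers_dxy(scores, outcomes)
c = c_statistic(scores, outcomes)
```

In this example three of the four event / non-event pairs are concordant and one is discordant, so Dxy is 0.5 and the C-statistic is 0.75.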
Creates code names that users can consider for their organizations, their projects, themselves, or people in their organizations or projects. The user can also supply a numeric seed (or a character seed) for reproducibility. Several types of code name are available, depending on what the user wants as a code name or nickname.