It implements many univariate and multivariate permutation (and rotation) tests. Allowed tests: the t one and two samples, ANOVA, linear models, Chi Squared test, rank tests (i.e. Wilcoxon, Mann-Whitney, Kruskal-Wallis), Sign test and Mc Nemar. Test on Linear Models are performed also in presence of covariates (i.e. nuisance parameters). The permutation and the rotation methods to get the null distribution of the test statistics are available. It also implements methods for multiplicity control such as Westfall & Young minP procedure and Closed Testing (Marcus, 1976) and k-FWER. Moreover, it allows to test for fixed effects in mixed effects models.
This package provides Gaussian process (GP) regression tools for social science inference problems. GPs combine flexible nonparametric regression with principled uncertainty quantification: rather than committing to a single model fit, the posterior reflects lesser knowledge at the edge of or beyond the observed data, where other approaches become highly model-dependent. The package reduces user-chosen hyperparameters from three to zero and supplies convenience functions for regression discontinuity (gp_rdd()), interrupted time-series (gp_its()), and general GP fitting (gpss(), gp_train(), gp_predict()). Methods are described in Cho, Kim, and Hazlett (2026) <doi:10.1017/pan.2026.10032>.
Non-parametric estimators for casual effects based on longitudinal modified treatment policies as described in Diaz, Williams, Hoffman, and Schenck <doi:10.1080/01621459.2021.1955691>, traditional point treatment, and traditional longitudinal effects. Continuous, binary, categorical treatments, and multivariate treatments are allowed as well are censored outcomes. The treatment mechanism is estimated via a density ratio classification procedure irrespective of treatment variable type. For both continuous and binary outcomes, additive treatment effects can be calculated and relative risks and odds ratios may be calculated for binary outcomes. Supports survival outcomes with competing risks (Diaz, Hoffman, and Hejazi; <doi:10.1007/s10985-023-09606-7>).
Climate-sensitive, single-tree forest simulator based on data-driven machine learning. It simulates the main forest processesâ radial growth, height growth, mortality, crown recession, regeneration, and harvestingâ so users can assess stand development under climate and management scenarios. The height model is described by Skudnik and JevÅ¡enak (2022) <doi:10.1016/j.foreco.2022.120017>, the basal-area increment model by JevÅ¡enak and Skudnik (2021) <doi:10.1016/j.foreco.2020.118601>, and an overview of the MLFS package, workflow, and applications is provided by JevÅ¡enak, ArniÄ , Krajnc, and Skudnik (2023), Ecological Informatics <doi:10.1016/j.ecoinf.2023.102115>.
This package provides functionality to process text files created by Emacs Org mode, and decompose the content to the smallest components (headlines, body, tag, clock entries etc). Emacs is an extensible, customizable text editor and Org mode is for keeping notes, maintaining TODO lists, planning projects. Allows users to analyze org files as data frames in R, e.g., to convieniently group tasks by tag into project and calculate total working hours. Also provides some help functions like search.parent, gg.pie (visualise working hours in ggplot2) and tree.headlines (visualise headline stricture in tree format) to help user managing their complex org files.
This package provides a comprehensive framework for planning and executing analyses in R. It provides a structured approach to running the same function multiple times with different arguments, executing multiple functions on the same datasets, and creating systematic analyses across multiple strata or variables. The framework is particularly useful for applying the same analysis across multiple strata (e.g., locations, age groups), running statistical methods on multiple variables (e.g., exposures, outcomes), generating multiple tables or graphs for reports, and creating systematic surveillance analyses. Key features include efficient data management, structured analysis planning, flexible execution options, built-in debugging tools, and hash-based caching.
Set of tools to find coherent patterns in gene expression (microarray) data using a Bayesian Sparse Latent Factor Model (SLFM) <DOI:10.1007/978-3-319-12454-4_15>. Considerable effort has been put to build a fast and memory efficient package, which makes this proposal an interesting and computationally convenient alternative to study patterns of gene expressions exhibited in matrices. The package contains the implementation of two versions of the model based on different mixture priors for the loadings: one relies on a degenerate component at zero and the other uses a small variance normal distribution for the spike part of the mixture.
This package implements the Vine Copula Change Point (VCCP) methodology for the estimation of the number and location of multiple change points in the vine copula structure of multivariate time series. The method uses vine copulas, various state-of-the-art segmentation methods to identify multiple change points, and a likelihood ratio test or the stationary bootstrap for inference. The vine copulas allow for various forms of dependence between time series including tail, symmetric and asymmetric dependence. The functions have been extensively tested on simulated multivariate time series data and fMRI data. For details on the VCCP methodology, please see Xiong & Cribben (2021).
In order to achieve accurate estimation without sparsity assumption on the precision matrix, element-wise inference on the precision matrix, and joint estimation of multiple Gaussian graphical models, a novel method is proposed and efficient algorithm is implemented. FLAG() is the main function given a data matrix, and FlagOneEdge() will be used when one pair of random variables are interested where their indices should be given. Flexible and Accurate Methods for Estimation and Inference of Gaussian Graphical Models with Applications, see Qian Y (2023) <doi:10.14711/thesis-991013223054603412>, Qian Y, Hu X, Yang C (2023) <doi:10.48550/arXiv.2306.17584>.
This package provides a collection of string functions designed for writing compact and expressive R code. yasp (Yet Another String Package) is simple, fast, dependency-free, and written in pure R. The package provides: a coherent set of abbreviations for paste() from package base with a variety of defaults, such as p() for "paste" and pcc() for "paste and collapse with commas"; wrap(), bracket(), and others for wrapping a string in flanking characters; unwrap() for removing pairs of characters (at any position in a string); and sentence() for cleaning whitespace around punctuation and capitalization appropriate for prose sentences.
This package provides methods for manipulating regression models and for describing these in a style adapted for medical journals. It contains functions for generating an HTML table with crude and adjusted estimates, plotting hazard ratio, plotting model estimates and confidence intervals using forest plots, extending this to comparing multiple models in a single forest plots. In addition to the descriptive methods, there are functions for the robust covariance matrix provided by the sandwich package, a function for adding non-linearities to a model, and a wrapper around the Epi package's Lexis() functions for time-splitting a dataset when modeling non-proportional hazards in Cox regressions.
Dino normalizes single-cell, mRNA sequencing data to correct for technical variation, particularly sequencing depth, prior to downstream analysis. The approach produces a matrix of corrected expression for which the dependency between sequencing depth and the full distribution of normalized expression; many existing methods aim to remove only the dependency between sequencing depth and the mean of the normalized expression. This is particuarly useful in the context of highly sparse datasets such as those produced by 10X genomics and other uninque molecular identifier (UMI) based microfluidics protocols for which the depth-dependent proportion of zeros in the raw expression data can otherwise present a challenge.
Statistical methods and related graphical representations for the Desirability of Outcome Ranking (DOOR) methodology. The DOOR is a paradigm for the design, analysis, interpretation of clinical trials and other research studies based on the patient centric benefit risk evaluation. The package provides functions for generating summary statistics from individual level/summary level datasets, conduct DOOR probability-based inference, and visualization of the results. For more details of DOOR methodology, see Hamasaki and Evans (2025) <doi:10.1201/9781003390855>. For more explanation of the statistical methods and the graphics, see the technical document and user manual of the DOOR Shiny apps at <https://methods.bsc.gwu.edu>.
Computes characteristics of independent rainfall events (duration, total rainfall depth, and intensity) extracted from a sub-daily rainfall time series based on the inter-event time definition (IETD) method. To have a reference value of IETD, it also analyzes/computes IETD values through three methods: autocorrelation analysis, the average annual number of events analysis, and coefficient of variation analysis. Ideal for analyzing the sensitivity of IETD to characteristics of independent rainfall events. Adams B, Papa F (2000) <ISBN: 978-0-471-33217-6>. Joo J et al. (2014) <doi:10.3390/w6010045>. Restrepo-Posada P, Eagleson P (1982) <doi:10.1016/0022-1694(82)90136-6>.
This package provides a general framework of two directional simultaneous inference is provided for high-dimensional as well as the fixed dimensional models with manifest variable or latent variable structure, such as high-dimensional mean models, high- dimensional sparse regression models, and high-dimensional latent factors models. It is making the simultaneous inference on a set of parameters from two directions, one is testing whether the estimated zero parameters indeed are zero and the other is testing whether there exists zero in the parameter set of non-zero. More details can be referred to Wei Liu, et al. (2022) <doi:10.48550/arXiv.2012.11100>.
The package obtains parameter estimation, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net.
Chinese numerals processing in R, such as conversion between Chinese numerals and Arabic numerals as well as detection and extraction of Chinese numerals in character objects and string. This package supports the casual scale naming system and the respective SI prefix systems used in mainland China and Taiwan: "The State Council's Order on the Unified Implementation of Legal Measurement Units in Our Country" The State Council of the People's Republic of China (1984) "Names, Definitions and Symbols of the Legal Units of Measurement and the Decimal Multiples and Submultiples" Ministry of Economic Affairs (2019) <https://gazette.nat.gov.tw/egFront/detail.do?metaid=108965>.
Bayesian factor models are effective tools for dimension reduction. This is especially applicable to multivariate large-scale datasets. It allows researchers to understand the latent factors of the data which are the linear or non-linear combination of the variables. Dynamic Intrinsic Conditional Autocorrelative Priors (ICAR) Spatiotemporal Factor Models DIFM package provides function to run Markov Chain Monte Carlo (MCMC), evaluation methods and visual plots from Shin and Ferreira (2023)<doi:10.1016/j.spasta.2023.100763>. Our method is a class of Bayesian factor model which can account for spatial and temporal correlations. By incorporating these correlations, the model can capture specific behaviors and provide predictions.
Kernel Learning Integrative Clustering (KLIC) is an algorithm that allows to combine multiple kernels, each representing a different measure of the similarity between a set of observations. The contribution of each kernel on the final clustering is weighted according to the amount of information carried by it. As well as providing the functions required to perform the kernel-based clustering, this package also allows the user to simply give the data as input: the kernels are then built using consensus clustering. Different strategies to choose the best number of clusters are also available. For further details please see Cabassi and Kirk (2020) <doi:10.1093/bioinformatics/btaa593>.
Calculation routines based on the FOCUS Kinetics Report (2006, 2014). Includes a function for conveniently defining differential equation models, model solution based on eigenvalues if possible or using numerical solvers. If a C compiler (on windows: Rtools') is installed, differential equation models are solved using automatically generated C functions. Non-constant errors can be taken into account using variance by variable or two-component error models <doi:10.3390/environments6120124>. Hierarchical degradation models can be fitted using nonlinear mixed-effects model packages as a back end <doi:10.3390/environments8080071>. Please note that no warranty is implied for correctness of results or fitness for a particular purpose.
This package implements multivariate Fay-Herriot models for small area estimation. It uses empirical best linear unbiased prediction (EBLUP) estimator. Multivariate models consider the correlation of several target variables and borrow strength from auxiliary variables to improve the effectiveness of a domain sample size. Models which accommodated by this package are univariate model with several target variables (model 0), multivariate model (model 1), autoregressive multivariate model (model 2), and heteroscedastic autoregressive multivariate model (model 3). Functions provide EBLUP estimators and mean squared error (MSE) estimator for each model. These models were developed by Roberto Benavent and Domingo Morales (2015) <doi:10.1016/j.csda.2015.07.013>.
Support for a variety of commonly used precision agriculture operations. Includes functions to download and process raw satellite images from Sentinel-2 <https://documentation.dataspace.copernicus.eu/APIs/OData.html>. Includes functions that download vegetation index statistics for a given period of time, without the need to download the raw images <https://documentation.dataspace.copernicus.eu/APIs/SentinelHub/Statistical.html>. There are also functions to download and visualize weather data in a historical context. Lastly, the package also contains functions to process yield monitor data. These functions can build polygons around recorded data points, evaluate the overlap between polygons, clean yield data, and smooth yield maps.
Calculates nonparametric pointwise confidence intervals for the survival distribution for right censored data, and for medians [Fay and Brittain <DOI:10.1002/sim.6905>]. Has two-sample tests for dissimilarity (e.g., difference, ratio or odds ratio) in survival at a fixed time, and differences in medians [Fay, Proschan, and Brittain <DOI:10.1111/biom.12231>]. Basically, the package gives exact inference methods for one- and two-sample exact inferences for Kaplan-Meier curves (e.g., generalizing Fisher's exact test to allow for right censoring), which are especially important for latter parts of the survival curve, small sample sizes or heavily censored data. Includes mid-p options.
This package provides a collection of functions that perform jump regression and image analysis such as denoising, deblurring and jump detection. The implemented methods are based on the following research: Qiu, P. (1998) <doi:10.1214/aos/1024691468>, Qiu, P. and Yandell, B. (1997) <doi: 10.1080/10618600.1997.10474746>, Qiu, P. (2009) <doi: 10.1007/s10463-007-0166-9>, Kang, Y. and Qiu, P. (2014) <doi: 10.1080/00401706.2013.844732>, Qiu, P. and Kang, Y. (2015) <doi: 10.5705/ss.2014.054>, Kang, Y., Mukherjee, P.S. and Qiu, P. (2018) <doi: 10.1080/00401706.2017.1415975>, Kang, Y. (2020) <doi: 10.1080/10618600.2019.1665536>.