There are several functions to implement the method for analysis in a randomized clinical trial with strata with following key features. A stratified Mann-Whitney estimator addresses the comparison between two randomized groups for a strictly ordinal response variable. The multivariate vector of such stratified Mann-Whitney estimators for multivariate response variables can be considered for one or more response variables such as in repeated measurements and these can have missing completely at random (MCAR) data. Non-parametric covariance adjustment is also considered with the minimal assumption of randomization. The p-value for hypothesis test and confidence interval are provided.
Computes Chatterjee's non-parametric correlation coefficient for time series data. It extends the original metric to time series analysis by providing the Xi-Autocorrelation Function (Xi-ACF) and Xi-Cross-Correlation Function (Xi-CCF). The package allows users to test for non-linear dependence using Iterative Amplitude Adjusted Fourier Transform (IAAFT) surrogate data. Main functions include xi_acf() and xi_ccf() for computation, along with matrix extraction tools. Methodologies are based on Chatterjee (2021) <doi:10.1080/01621459.2020.1758115> and surrogate data testing methods by Schreiber and Schmitz (1996) <doi:10.1103/PhysRevLett.77.635>.
This package provides utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data.
Designed for simplicity, a mirai evaluates an R expression asynchronously in a parallel process, locally or distributed over the network. The result is automatically available upon completion. Modern networking and concurrency, built on nanonext and NNG (Nanomsg Next Gen), ensures reliable and efficient scheduling over fast inter-process communications or TCP/IP secured by TLS. Distributed computing can launch remote resources via SSH or cluster managers. An inherently queued architecture handles many more tasks than available processes, and requires no storage on the file system. Innovative features include support for otherwise non-exportable reference objects, event-driven promises, and asynchronous parallel map.
Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.
This package provides comprehensive tools for blinded sample size re-estimation (BSSR) in two-arm clinical trials with binary endpoints. Unlike traditional fixed-sample designs, BSSR allows adaptive sample size adjustments during trials while maintaining statistical integrity and study blinding. Implements five exact statistical tests: Pearson chi-squared, Fisher exact, Fisher mid-p, Z-pooled exact unconditional, and Boschloo exact unconditional tests. Supports restricted, unrestricted, and weighted BSSR approaches with exact Type I error control. Statistical methods based on Mehrotra et al. (2003) <doi:10.1111/1541-0420.00051> and Kieser (2020) <doi:10.1007/978-3-030-49528-2_21>.
This package implements a very fast C++ algorithm to quickly bootstrap receiver operating characteristics (ROC) curves and derived performance metrics, including the area under the curve (AUC) and the partial area under the curve as well as the true and false positive rate. The analysis of paired receiver operating curves is supported as well, so that a comparison of two predictors is possible. You can also plot the results and calculate confidence intervals. On a typical desktop computer the time needed for the calculation of 100000 bootstrap replicates given 500 observations requires time on the order of magnitude of one second.
Density, distribution function, quantile function, and random generation for the generalized Beta and Beta prime distributions. The family of generalized Beta distributions is conjugate for the Bayesian binomial model, and the generalized Beta prime distribution is the posterior distribution of the relative risk in the Bayesian two Poisson samples model when a Gamma prior is assigned to the Poisson rate of the reference group and a Beta prime prior is assigned to the relative risk. References: Laurent (2012) <doi:10.1214/11-BJPS139>, Hamza & Vallois (2016) <doi:10.1016/j.spl.2016.03.014>, Chen & Novick (1984) <doi:10.3102/10769986009002163>.
Run simple direct gravitational N-body simulations. The package can access different external N-body simulators (e.g. GADGET-4 by Springel et al. (2021) <doi:10.48550/arXiv.2010.03567>), but also has a simple built-in simulator. This default simulator uses a variable block time step and lets the user choose between a range of integrators, including 4th and 6th order integrators for high-accuracy simulations. Basic top-hat smoothing is available as an option. The code also allows the definition of background particles that are fixed or in uniform motion, not subject to acceleration by other particles.
This package provides small area estimation for count data type and gives option whether to use covariates in the estimation or not. By implementing Empirical Bayes (EB) Poisson-Gamma model, each function returns EB estimators and mean squared error (MSE) estimators for each area. The EB estimators without covariates are obtained using the model proposed by Clayton & Kaldor (1987) <doi:10.2307/2532003>, the EB estimators with covariates are obtained using the model proposed by Wakefield (2006) <doi:10.1093/biostatistics/kxl008> and the MSE estimators are obtained using Jackknife method by Jiang et. al. (2002) <doi:10.1214/aos/1043351257>.
This package provides a suite of auxiliary functions that enhance time series estimation and forecasting, including a robust anomaly detection routine based on Chen and Liu (1993) <doi:10.2307/2290724> (imported and wrapped from the tsoutliers package), utilities for managing calendar and time conversions, performance metrics to assess both point forecasts and distributional predictions, advanced simulation by allowing the generation of time series componentsâ such as trend, seasonal, ARMA, irregular, and anomaliesâ in a modular fashion based on the innovations form of the state space model and a number of transformation methods including Box-Cox, Logit, Softplus-Logit and Sigmoid.
The analysis of environmental data often requires the detection of trends and change-points. This package includes tests for trend detection (Cox-Stuart Trend Test, Mann-Kendall Trend Test, (correlated) Hirsch-Slack Test, partial Mann-Kendall Trend Test, multivariate (multisite) Mann-Kendall Trend Test, (Seasonal) Sen's slope, partial Pearson and Spearman correlation trend test), change-point detection (Lanzante's test procedures, Pettitt's test, Buishand Range Test, Buishand U Test, Standard Normal Homogeinity Test), detection of non-randomness (Wallis-Moore Phase Frequency Test, Bartels rank von Neumann's ratio test, Wald-Wolfowitz Test) and the two sample Robust Rank-Order Distributional Test.
This package provides a friendly (flexible) Markov Chain Monte Carlo (MCMC) framework for implementing Metropolis-Hastings algorithm in a modular way allowing users to specify automatic convergence checker, personalized transition kernels, and out-of-the-box multiple MCMC chains using parallel computing. Most of the methods implemented in this package can be found in Brooks et al. (2011, ISBN 9781420079425). Among the methods included, we have: Haario (2001) <doi:10.1007/s11222-011-9269-5> Adaptive Metropolis, Vihola (2012) <doi:10.1007/s11222-011-9269-5> Robust Adaptive Metropolis, and Thawornwattana et al. (2018) <doi:10.1214/17-BA1084> Mirror transition kernels.
This is a set of functions to retrieve information about GIMMS NDVI3g files currently available online; download (and re-arrange, in the case of NDVI3g.v0) the half-monthly data sets; import downloaded files from ENVI binary (NDVI3g.v0) or NetCDF format (NDVI3g.v1) directly into R based on the widespread raster package; conduct quality control; and generate monthly composites (e.g., maximum values) from the half-monthly input data. As a special gimmick, a method is included to conveniently apply the Mann-Kendall trend test upon Raster* images, optionally featuring trend-free pre-whitening to account for lag-1 autocorrelation.
This package provides a method for detecting outliers with a Kalman filter on impulsed noised outliers and prediction on cleaned data. kfino is a robust sequential algorithm allowing to filter data with a large number of outliers. This algorithm is based on simple latent linear Gaussian processes as in the Kalman Filter method and is devoted to detect impulse-noised outliers. These are data points that differ significantly from other observations. ML (Maximization Likelihood) and EM (Expectation-Maximization algorithm) algorithms were implemented in kfino'. The method is described in full details in the following arXiv e-Print: <arXiv:2208.00961>.
Estimates random effect latent measurement models, wherein the loadings, residual variances, intercepts, latent means, and latent variances all vary across groups. The random effect variances of the measurement parameters are then modeled using a hierarchical inclusion model, wherein the inclusion of the variances (i.e., whether it is effectively zero or non-zero) is informed by similar parameters (of the same type, or of the same item). This additional hierarchical structure allows the evidence in favor of partial invariance to accumulate more quickly, and yields more certain decisions about measurement invariance. Martin, Williams, and Rast (2020) <doi:10.31234/osf.io/qbdjt>.
This R package provides an implementation of multivariate extensions of a well-known fractal analysis technique, Detrended Fluctuations Analysis (DFA; Peng et al., 1995<doi:10.1063/1.166141>), for multivariate time series: multivariate DFA (mvDFA). Several coefficients are implemented that take into account the correlation structure of the multivariate time series to varying degrees. These coefficients may be used to analyze long memory and changes in the dynamic structure that would by univariate DFA. Therefore, this R package aims to extend and complement the original univariate DFA (Peng et al., 1995) for estimating the scaling properties of nonstationary time series.
This package provides methods for estimating and utilizing the multivariate generalized propensity score (mvGPS) for multiple continuous exposures described in Williams, J.R, and Crespi, C.M. (2020) <arxiv:2008.13767>. The methods allow estimation of a dose-response surface relating the joint distribution of multiple continuous exposure variables to an outcome. Weights are constructed assuming a multivariate normal density for the marginal and conditional distribution of exposures given a set of confounders. Confounders can be different for different exposure variables. The weights are designed to achieve balance across all exposure dimensions and can be used to estimate dose-response surfaces.
The library allows to perform a multivariate time series classification based on the use of Discrete Wavelet Transform for feature extraction, a step wise discriminant to select the most relevant features and finally, the use of a linear or quadratic discriminant for classification. Note that all these steps can be done separately which allows to implement new steps. Velasco, I., Sipols, A., de Blas, C. S., Pastor, L., & Bayona, S. (2023) <doi:10.1186/S12938-023-01079-X>. Percival, D. B., & Walden, A. T. (2000,ISBN:0521640687). Maharaj, E. A., & Alonso, A. M. (2014) <doi:10.1016/j.csda.2013.09.006>.
Several fast random number generators are provided as C++ header-only libraries: the PCG family as well as Xoroshiro128+ and Xoshiro256+. Additionally, fast functions for generating random numbers according to a uniform, normal and exponential distribution are included. The latter two use the Ziggurat algorithm originally proposed by Marsaglia and Tsang. These functions are exported to R and as a C++ interface and are enabled for use with the default 64 bit generator from the PCG family, Xoroshiro128+ and Xoshiro256+ as well as the 64 bit version of the 20 rounds Threefry engine (Salmon et al., 2011) as provided by the package sitmo.
SPIAT (**Sp**atial **I**mage **A**nalysis of **T**issues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.
This package provides functions to standardise the analysis of Differential Allelic Representation (DAR). DAR compromises the integrity of Differential Expression analysis results as it can bias expression, influencing the classification of genes (or transcripts) as being differentially expressed. DAR analysis results in an easy-to-interpret value between 0 and 1 for each genetic feature of interest, where 0 represents identical allelic representation and 1 represents complete diversity. This metric can be used to identify features prone to false-positive calls in Differential Expression analysis, and can be leveraged with statistical methods to alleviate the impact of such artefacts on RNA-seq data.
Exclusion-based parentage assignment is essential for studies in biodiversity conservation and breeding programs - Kang Huang, Rui Mi, Derek W Dunn, Tongcheng Wang, Baoguo Li, (2018), <doi:10.1534/genetics.118.301592>. The tool compares multilocus genotype data of potential parents and offspring, identifying likely parentage relationships while accounting for genotyping errors, missing data, and duplicate genotypes. acoRn includes two algorithms: one generates synthetic genotype data based on user-defined parameters, while the other analyzes existing genotype data to identify parentage patterns. The package is versatile, applicable to diverse organisms, and offers clear visual outputs, making it a valuable resource for researchers.
This package provides a tool for causal meta-analysis. This package implements the aggregation formulas and inference methods proposed in Berenfeld et al. (2025) <doi:10.48550/arXiv.2505.20168>. Users can input aggregated data across multiple studies and compute causally meaningful aggregated effects of their choice (risk difference, risk ratio, odds ratio, etc) under user-specified population weighting. The built-in function camea() allows to obtain precise variance estimates for these effects and to compare the latter to a classical meta-analysis aggregate, the random effect model, as implemented in the metafor package <https://CRAN.R-project.org/package=metafor>.