This package provides tools designed to perform and evaluate cluster analysis (including Tocher's algorithm), discriminant analysis and path analysis (standard and under collinearity), as well as some useful miscellaneous tools for dealing with sample size and optimum plot size calculations. A test for seed sample heterogeneity is now available. Mantel's permutation test can be found in this package. A new approach for calculating its power is implemented. biotools also contains tests for genetic covariance components. Heuristic approaches for performing non-parametric spatial predictions of generic response variables and spatial gene diversity are implemented.
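As a rough illustration of the Mantel permutation test mentioned in the biotools description, the sketch below correlates the upper triangles of two distance matrices and builds a null distribution by jointly permuting the rows and columns of one of them. It is a minimal, one-sided (upper-tail) version under stated assumptions, not the package's implementation; the function names `upper_triangle`, `pearson` and `mantel_test` are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(d1, d2, n_perm=999, seed=0):
    """One-sided Mantel test (hypothetical helper, not the biotools API).

    Returns the observed matrix correlation and a permutation p-value.
    """
    rng = random.Random(seed)
    n = len(d1)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    hits = 0
    for _ in range(n_perm):
        # Relabel the objects of d2: rows and columns are permuted together.
        perm = list(range(n))
        rng.shuffle(perm)
        permuted = [[d2[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            hits += 1
    # The observed arrangement counts as one permutation (hence the +1 terms).
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

Permuting rows and columns together, rather than shuffling individual distances, keeps each permuted matrix a valid distance matrix over the same objects.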
General optimisation and specific tools for the parameter estimation (i.e. calibration) of complex models, including stochastic ones. It implements generic functions that can be used for fitting any type of model, especially those with non-differentiable objective functions, with the same syntax as base::optim. It supports multi-phase estimation (sequential parameter masking), constrained optimization (bounding box restrictions) and automatic parallel computation of numerical gradients. Some common maximum likelihood estimation methods and automated construction of the objective function from simulated model outputs are provided. See <https://roliveros-ramos.github.io/calibrar/> for more details.
Statistical methods for retrospectively detecting changes in location and/or dispersion of univariate and multivariate variables. Data values are assumed to be independent and can be individual (one observation at each instant of time) or subgrouped (more than one observation at each instant of time). Control limits are computed, often using a permutation approach, so that a prescribed false alarm probability is guaranteed without making any parametric assumptions on the stable (in-control) distribution. See G. Capizzi and G. Masarotto (2018) <doi:10.1007/978-3-319-75295-2_1> for an introduction to the package.
The EM algorithm is a powerful tool for computing maximum likelihood estimates with incomplete data. This package helps apply the EM algorithm to triangular and trapezoidal fuzzy numbers (two kinds of incomplete data). A method is proposed for estimating the unknown parameter in a parametric statistical model when the observations are triangular or trapezoidal fuzzy numbers. This method is based on maximizing the observed-data likelihood defined as the conditional probability of the fuzzy data; for more details and formulas see Denoeux (2011) <doi:10.1016/j.fss.2011.05.022>.
This package provides tools for simulating mathematical models of infectious disease dynamics. Epidemic model classes include deterministic compartmental models, stochastic individual-contact models, and stochastic network models. Network models use the robust statistical methods of exponential-family random graph models (ERGMs) from the Statnet suite of software packages in R. Standard templates for epidemic modeling include SI, SIR, and SIS disease types. EpiModel features an API for extending these templates to address novel scientific research aims. Full methods for EpiModel are detailed in Jenness et al. (2018, <doi:10.18637/jss.v084.i08>).
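To make the deterministic compartmental SIR template concrete, here is a minimal sketch that integrates the standard SIR equations with a fixed-step Euler scheme. It does not use or reproduce the EpiModel API; the function `simulate_sir` and its parameters `beta` (transmission rate), `gamma` (recovery rate) and `dt` (step size) are assumptions made for illustration.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Euler integration of a closed-population SIR model (illustrative only).

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in this model
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameters: 1000 people, 10 initially infected.
    run = simulate_sir(990, 10, 0, beta=0.5, gamma=0.1, days=100)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak of {peak_infected:.0f} infected at day {peak_time:.1f}")
```

An SI model follows by setting `gamma` to zero, and an SIS model by returning recoveries to the susceptible compartment instead of the recovered one.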
This package provides a shiny-based front end (the ExPanD app) and a set of functions for exploratory data analysis. Run as a web-based app, ExPanD enables users to assess the robustness of empirical evidence without providing them access to the underlying data. You can export a notebook containing the analysis of ExPanD and/or use the functions of the package to support your exploratory data analysis workflow. Refer to the vignettes of the package for more information on how to use ExPanD and/or the functions of this package.
Identifying labeled compounds in a 13C-tracer experiment in a non-targeted fashion is a cumbersome process. This package facilitates this type of analysis by providing high-level quality control plots, deconvoluting and evaluating spectra, and performing a multitude of tests automatically. The main idea is to use changing intensity ratios of ion pairs from peak lists generated with xcms as candidates and to evaluate those automatically against base peak chromatograms and spectra information within the raw measurement data. The functionality is described in Hoffmann et al. (2018) <doi:10.1021/acs.analchem.8b00356>.
Handles neutrosophic data of the form N = D + I (where N is a neutrosophic number, D is the determinate part of the number and I is the indeterminacy part) using the neutrosophic two-way ANOVA test, which keeps the type I error low. The algorithm calculates the Fisher statistics for neutrosophic data and tests two hypotheses: the first concerns differences between treatments, and the second concerns differences between sectors. For more information see Miari, Mahmoud; Anan, Mohamad Taher; Zeina, Mohamed Bisher (2022) <https://www.americaspg.com/articleinfo/21/show/1058>.
Normative data are often used to estimate the relative position of a raw test score in the population. This package allows for deriving regression-based normative data. It includes functions that enable the fitting of regression models for the mean and residual (or variance) structures, test the model assumptions, derive the normative data in the form of normative tables or automatic scoring sheets, and estimate confidence intervals for the norms. This package accompanies the book Van der Elst, W. (2024). Regression-based normative data for psychological assessment. A hands-on approach using R. Springer Nature.
Knowledge of epigenetic modifications in plants, including the role of histone modifications in regulating gene expression, is still incomplete. Histone deacetylation and histone H3 lysine 27 trimethylation (H3K27me3) play a role in repressing transcription in eukaryotes. In contrast, histone acetylation (H3K9ac) and H3K4me3 have been consistently linked to the stimulation of gene expression, which significantly influences plant development and plays a role in plant responses to biotic and abiotic stresses. To our knowledge, this is the first multiclass classifier for predicting histone modification in plants. See <doi:10.1186/s12864-019-5489-4>.
Datetimes and timestamps are invariably an imprecise notation, with any partial representation implying some amount of uncertainty. To handle this, parttime provides classes for embedding partial missingness as a central part of its datetime classes. This central feature allows for more ergonomic use of datetimes for challenging datetime computation, including calculations of overlapping date ranges, imputations, and more thoughtful handling of ambiguity that arises from uncertain time zones. This package was developed first and foremost with pharmaceutical applications in mind, but aims to be agnostic to application to accommodate general use cases just as conveniently.
The goal of PlotFTIR is to easily and quickly kick-start the production of journal-quality Fourier Transform Infra-Red (FTIR) spectral plots in R using ggplot2. The produced plots can be published directly or further modified by ggplot2 functions.
Researchers working with Qualitative Comparative Analysis (QCA) can use the package to estimate the power of a sufficient term using permutation tests. A term can be anything: a condition, conjunction or disjunction of any combination of these. The package further allows users to plot the estimation results and to estimate the number of cases required to achieve a certain level of power, given a prespecified null and alternative hypothesis. The article introducing power estimation for QCA is Rohlfing, Ingo (2018) <doi:10.1017/pan.2017.30> (ungated version: <doi:10.17605/OSF.IO/PC4DF>).
Dual interfaces, graphical and programmatic, designed for intuitive applications of Multilevel Regression and Poststratification (MRP). Users can apply the method to a variety of datasets, from electronic health records to sample survey data, through an end-to-end Bayesian data analysis workflow. The package provides robust tools for data cleaning, exploratory analysis, flexible model building, and insightful result visualization. For more details, see Si et al. (2020) <https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2020002/article/00003-eng.pdf?st=iF1_Fbrh> and Si (2025) <doi:10.1214/24-STS932>.
Rapidly computes, bootstraps and plots up to fourth-order Sobol'-based sensitivity indices using several state-of-the-art first and total-order estimators. Sobol' indices can be computed either for models that yield a scalar as a model output or for systems of differential equations. The package also provides a suite of benchmark test functions and several options to obtain publication-ready figures of the model output uncertainty and sensitivity-related analysis. An overview of the package can be found in Puy et al. (2022) <doi:10.18637/jss.v102.i05>.
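To show what first and total-order Sobol' indices measure, the sketch below estimates both for a scalar-output model with independent uniform inputs on [0, 1]. It uses one common pair of Monte Carlo estimators (usually attributed to Saltelli et al. for first order and Jansen for total order) and plain pseudo-random sampling. This is an assumption-laden illustration, not the package's code: the function `sobol_indices` and its arguments are hypothetical, and no bootstrapping or higher-order indices are included.

```python
import random


def sobol_indices(model, n_inputs, n_samples=40000, seed=0):
    """Monte Carlo estimates of first and total-order Sobol' indices.

    `model` maps a list of n_inputs values in [0, 1] to a scalar.
    Returns (first_order, total_order) as two lists (illustrative helper).
    """
    rng = random.Random(seed)
    # Two independent sample matrices, A and B.
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]

    pooled = y_a + y_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)

    first, total = [], []
    for i in range(n_inputs):
        # AB_i: matrix A with its i-th column taken from B.
        y_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # First-order variance contribution of input i.
        v_i = sum(yb * (yab - ya)
                  for ya, yb, yab in zip(y_a, y_b, y_ab)) / n_samples
        # Total-order variance contribution of input i.
        vt_i = sum((ya - yab) ** 2
                   for ya, yab in zip(y_a, y_ab)) / (2 * n_samples)
        first.append(v_i / variance)
        total.append(vt_i / variance)
    return first, total


if __name__ == "__main__":
    # Additive test model: the analytic first-order indices are 0.2 and 0.8.
    first, total = sobol_indices(lambda x: x[0] + 2.0 * x[1], n_inputs=2)
    print(first, total)
```

For a purely additive model such as the one in the usage example, first and total-order indices coincide in theory; a gap between them in other models signals interactions between inputs.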
This package provides a user-friendly R shiny app for performing various statistical tests on datasets. It allows users to upload data in numerous formats and perform statistical analyses. The app dynamically adapts its options based on the selected columns and supports both single and multiple column comparisons. The app's user interface is designed to streamline the process of selecting datasets, columns, and test options, making it easy for users to explore and interpret their data. The underlying functions for statistical tests are well-organized and can be used independently within other R scripts.
AMARETTO is an algorithm that integrates copy number, DNA methylation and gene expression data to identify a set of driver genes by analyzing cancer samples, and connects them to clusters of co-expressed genes, which we define as modules. AMARETTO can be applied in a pancancer setting to identify cancer driver genes and their modules on multiple cancer sites. AMARETTO captures modules enriched in angiogenesis, cell cycle and EMT, and modules that accurately predict survival and molecular subtypes. This allows AMARETTO to identify novel cancer driver genes directing canonical cancer pathways.
This package provides a lightweight unit testing framework. Main features:
install tests with the package;
test results are treated as data that can be stored and manipulated;
test files are R scripts interspersed with test commands, that can be programmed over;
fully automated build-install-test sequence for packages;
skip tests when not run locally (e.g. on CRAN);
flexible and configurable output printing;
compare computed output with output stored with the package;
run tests in parallel;
extensible by other packages;
report side effects.
Color and visualize wildlife distributions in space-time using raster data. In addition to enabling display of sequential change in distributions through the use of small multiples, colorist provides functions for extracting several features of interest from a sequence of distributions and for visualizing those features using HCL (hue-chroma-luminance) color palettes. Resulting maps allow for "fair" visual comparison of intensity values (e.g., occurrence, abundance, or density) across space and time and can be used to address questions about where, when, and how consistently a species, group, or individual is likely to be found.
Enables the construction of flexible urban delineations that can be tailored to specific applications or research questions, see Van Migerode et al. (2024) <DOI:10.1177/23998083241262545> and Van Migerode et al. (2025) <DOI:10.5281/zenodo.15173220>. Originally developed to flexibly reconstruct the Degree of Urbanisation classification of cities, towns and rural areas developed by Dijkstra et al. (2021) <DOI:10.1016/j.jue.2020.103312>. It now also supports a broader range of delineation approaches, using multiple datasets (including population, built-up area, and night-time light grids) and different thresholding methods.
This package implements methods for centrality-related analyses of networks. While the package includes the possibility to build more than 20 indices, its main focus lies on index-free assessment of centrality via partial rankings obtained by neighborhood-inclusion or positional dominance. These partial rankings can be analyzed with different methods, including probabilistic methods like computing expected node ranks and relative rank probabilities (how likely is it that a node is more central than another?). The methodology is described in depth in the vignettes and in Schoch (2018) <doi:10.1016/j.socnet.2017.12.003>.
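The neighborhood-inclusion idea behind these partial rankings can be sketched in a few lines. The sketch assumes the usual definition, in which node u is dominated by node v when every neighbor of u is either v itself or a neighbor of v. The function `neighborhood_inclusion` and the adjacency-dictionary input format are hypothetical choices for this illustration and do not mirror the package's interface.

```python
def neighborhood_inclusion(adjacency):
    """Return the set of pairs (u, v) such that u is dominated by v.

    `adjacency` maps each node to the set of its neighbors (undirected graph).
    u is dominated by v when N(u) is a subset of N[v] = N(v) plus v itself.
    """
    dominated = set()
    for u, neighbors_u in adjacency.items():
        for v, neighbors_v in adjacency.items():
            if u != v and neighbors_u <= (neighbors_v | {v}):
                dominated.add((u, v))
    return dominated


if __name__ == "__main__":
    # Star graph: "c" is the hub, the other nodes are leaves.
    star = {
        "c": {"a", "b", "d"},
        "a": {"c"},
        "b": {"c"},
        "d": {"c"},
    }
    pairs = neighborhood_inclusion(star)
    print(("a", "c") in pairs)  # every leaf is dominated by the hub
    print(("c", "a") in pairs)  # the hub is dominated by no leaf
```

The result is a preorder rather than a full ranking: pairs of nodes that dominate each other are tied, and pairs with no dominance in either direction stay incomparable, which is what leaves room for the probabilistic rank analyses described above.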
This package provides tools for analyzing and optimizing PTSD (Post-Traumatic Stress Disorder) diagnostic criteria using PCL-5 (PTSD Checklist for DSM-5) data. Functions identify optimal subsets of PCL-5 items that maintain diagnostic accuracy while reducing assessment burden. Includes tools for both hierarchical (cluster-based) and non-hierarchical symptom combinations, calculation of diagnostic metrics, and comparison with standard DSM-5 criteria. Model validation is conducted using holdout and cross-validation methods to assess robustness and generalizability of the results. For more details see Weidmann et al. (2025) <doi:10.31219/osf.io/6rk72_v1>.
Genomic alterations, including single nucleotide substitutions, copy number alterations, and others, are the major forces driving cancer initiation and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called signatures (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele, Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.
This is an interface for the Python package StepMix. It is a Python package following the scikit-learn API for model-based clustering and generalized mixture modeling (latent class/profile analysis) of continuous and categorical data. StepMix handles missing values through Full Information Maximum Likelihood (FIML) and provides multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory. Additional features include support for covariates and distal outcomes, various simulation utilities, and non-parametric bootstrapping, which allows inference in semi-supervised and unsupervised settings. Software paper available at <doi:10.18637/jss.v113.i08>.