This package provides movies to help students to understand statistical concepts. The rpanel package <https://cran.r-project.org/package=rpanel> is used to create interactive plots that move to illustrate key statistical ideas and methods. There are movies to: visualise probability distributions (including user-supplied ones); illustrate sampling distributions of the sample mean (central limit theorem), the median, the sample maximum (extremal types theorem) and (the Fisher transformation of the) product moment correlation coefficient; examine the influence of an individual observation in simple linear regression; illustrate key concepts in statistical hypothesis testing. Also provided are dpqr functions for the distribution of the Fisher transformation of the correlation coefficient under sampling from a bivariate normal distribution.
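As a rough illustration of the Fisher transformation mentioned above, the following Python sketch computes z = atanh(r) and an approximate interval for the correlation. It relies on the standard large-sample result that, under bivariate normal sampling, z is approximately normal with variance 1/(n - 3). The function names `fisher_z` and `fisher_z_interval` are hypothetical and are not the package's dpqr functions.

```python
import math

def fisher_z(r):
    """Fisher transformation of a correlation coefficient r in (-1, 1)."""
    return math.atanh(r)

def fisher_z_interval(r, n, z_crit=1.959964):
    """Approximate interval for the correlation, built on the z scale.

    Assumes bivariate normal sampling, so that z has standard error
    1 / sqrt(n - 3). z_crit defaults to the 97.5% standard normal quantile.
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = fisher_z(r)
    half_width = z_crit / math.sqrt(n - 3)
    # Back-transform the end points to the correlation scale.
    return math.tanh(z - half_width), math.tanh(z + half_width)

lo, hi = fisher_z_interval(0.6, n=50)
print(round(lo, 3), round(hi, 3))
```

The interval is asymmetric around r on the correlation scale, which is the usual reason for working on the z scale in the first place.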
Statistical methods for analyzing case-control point data. Methods include the ratio of kernel densities, the difference in K Functions, the spatial scan statistic, and q nearest neighbors of cases.
This package provides tools for smoothing and tidying spatial features (i.e. lines and polygons) to make them more aesthetically pleasing. Smooth curves, fill holes, and remove small fragments from lines and polygons.
This package provides a collection of recycled and modified R functions to aid in file manipulation, data exploration, wrangling, optimization, and object manipulation. Other functions aid in convenient data visualization, loop progression, software packaging, and installation.
Test for univariate and bivariate spatial patterns in spatial omics data with single-molecule resolution. The tests implemented allow for analysis of nested designs and are automatically calibrated to different biological specimens. Tests for aggregation, colocalization, gradients and vicinity to cell edge or centroid are provided.
This package provides functions for creating and annotating a composite plot in ggplot2. Offers background themes and shortcut plotting functions that produce figures that are appropriate for the format of scientific journals. Some methods are described in Min and Zhou (2021) <doi:10.3389/fgene.2021.802894>.
This package enables automated selection of group-specific signatures, especially for rare populations. The package is developed for generating specific lists of signature genes based on modified Term Frequency-Inverse Document Frequency (TF-IDF) methods. It can also be used as a new gene-set scoring method or data transformation method. Multiple visualization functions are implemented in this package.
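To make the TF-IDF idea concrete, here is a minimal Python sketch of the textbook form, treating each group as a "document" and each gene as a "term". It is not the package's modified method, which the description does not spell out. The function name `tf_idf`, the toy counts, and the gene labels are all made up for illustration.

```python
import math

def tf_idf(counts):
    """counts: {group: {gene: count}} -> {group: {gene: tf-idf score}}.

    Textbook TF-IDF: term frequency within the group times
    log(number of groups / number of groups containing the gene).
    """
    n_groups = len(counts)
    doc_freq = {}
    for genes in counts.values():
        for gene, c in genes.items():
            if c > 0:
                doc_freq[gene] = doc_freq.get(gene, 0) + 1
    scores = {}
    for group, genes in counts.items():
        total = sum(genes.values())
        scores[group] = {
            gene: (c / total) * math.log(n_groups / doc_freq[gene])
            for gene, c in genes.items() if c > 0
        }
    return scores

# Hypothetical toy data: GENE_A is seen only in the rare group.
toy = {
    "rare": {"GENE_A": 8, "GENE_B": 2},
    "common1": {"GENE_B": 5, "GENE_C": 5},
    "common2": {"GENE_B": 4, "GENE_C": 6},
}
scores = tf_idf(toy)
top = max(scores["rare"], key=scores["rare"].get)
print(top)
```

A gene present in every group gets a score of zero, so only genes that discriminate between groups can rank highly, which is why this kind of weighting suits signature selection.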
This package provides a flexible moving average algorithm for modeling drug exposure in pharmacoepidemiology studies as presented in the article: Ouchi, D., Giner-Soriano, M., Gómez-Lumbreras, A., Vedia Urgell, C., Torres, F., & Morros, R. (2022). "Automatic Estimation of the Most Likely Drug Combination in Electronic Health Records Using the Smooth Algorithm: Development and Validation Study." JMIR Medical Informatics, 10(11), e37976. <doi:10.2196/37976>.
Bayesian analysis of censored linear mixed-effects models that replace Gaussian assumptions with a flexible class of distributions, such as the scale mixture of normal family distributions, considering a damped exponential correlation structure which was employed to account for within-subject autocorrelation among irregularly observed measures. For more details, see Kelin Zhong, Fernanda L. Schumacher, Luis M. Castro, Victor H. Lachos (2025) <doi:10.1002/sim.10295>.
This tiny package contains one function, smirnov(), which calculates two scaled taxonomic coefficients, Txy (coefficient of similarity) and Txx (coefficient of originality). These two characteristics may be used for the analysis of similarities between any number of taxonomic groups, and also for assessing the uniqueness of a given taxon. It is possible to use smirnov() output as a distance measure: convert it to distance by "as.dist(1 - smirnov(x))".
This package provides the SMOTE with Boosting (SMOTEWB) algorithm. See F. Sağlam, M. A. Cengiz (2022) <doi:10.1016/j.eswa.2022.117023>. It is a SMOTE-based resampling technique which creates synthetic data on the links between nearest neighbors. SMOTEWB uses boosting weights to determine where to generate new samples and automatically decides the number of neighbors for each sample. It is robust to noise and outperforms most of the alternatives according to the Matthews correlation coefficient metric. Alternative resampling methods are also available in the package.
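The phrase "synthetic data on the links between nearest neighbors" refers to the interpolation step shared by SMOTE-type methods. The Python sketch below shows only that basic step, with a fixed number of neighbors. It is plain SMOTE, not SMOTEWB: the boosting weights and the automatic choice of neighbor counts are not reproduced. The names `nearest_neighbors` and `smote_sample` are hypothetical.

```python
import random

def nearest_neighbors(point, others, k):
    """Return the k points in `others` closest to `point` (squared Euclidean)."""
    def sq_dist(q):
        return sum((a - b) ** 2 for a, b in zip(point, q))
    return sorted(others, key=sq_dist)[:k]

def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate n_new synthetic points by interpolating toward a neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbor = rng.choice(nearest_neighbors(base, others, k))
        u = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + u * (nb - b) for b, nb in zip(base, neighbor)))
    return synthetic

# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(minority)
print(new_points)
```

Every synthetic point lies on a line segment joining two existing minority points, so it stays inside the region the minority class already occupies.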
Flexible multidimensional scaling (MDS) methods and extensions to the package smacof. This package contains various functions, wrappers, methods and classes for fitting, plotting and displaying a large number of different flexible MDS models. These are: Torgerson scaling (Torgerson, 1958, ISBN:978-0471879459) with powers, Sammon mapping (Sammon, 1969, <doi:10.1109/T-C.1969.222678>) with ratio and interval optimal scaling, Multiscale MDS (Ramsay, 1977, <doi:10.1007/BF02294052>) with ratio and interval optimal scaling, s-stress MDS (ALSCAL; Takane, Young & De Leeuw, 1977, <doi:10.1007/BF02293745>) with ratio and interval optimal scaling, elastic scaling (McGee, 1966, <doi:10.1111/j.2044-8317.1966.tb00367.x>) with ratio and interval optimal scaling, r-stress MDS (De Leeuw, Groenen & Mair, 2016, <https://rpubs.com/deleeuw/142619>) with ratio, interval, splines and nonmetric optimal scaling, power-stress MDS (POST-MDS; Buja & Swayne, 2002, <doi:10.1007/s00357-001-0031-0>) with ratio and interval optimal scaling, restricted power-stress (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>) with ratio and interval optimal scaling, approximate power-stress with ratio optimal scaling (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>), Box-Cox MDS (Chen & Buja, 2013, <https://jmlr.org/papers/v14/chen13a.html>), local MDS (Chen & Buja, 2009, <doi:10.1198/jasa.2009.0111>), curvilinear component analysis (Demartines & Herault, 1997, <doi:10.1109/72.554199>), curvilinear distance analysis (Lee, Lendasse & Verleysen, 2004, <doi:10.1016/j.neucom.2004.01.007>), nonlinear MDS with optimal dissimilarity powers functions (De Leeuw, 2024, <https://github.com/deleeuw/smacofManual/blob/main/smacofPO(power)/smacofPO.pdf>), sparsified (power) MDS and sparsified multidimensional (power) distance analysis aka extended curvilinear (power) component analysis and extended curvilinear (power) distance analysis (Rusch, 2024, <doi:10.57938/355bf835-ddb7-42f4-8b85-129799fc240e>). Some functions are suitably flexible to allow any other sensible combination of explicit power transformations for weights, distances and input proximities with implicit ratio, interval, splines or nonmetric optimal scaling of the input proximities. Most functions use a Majorization-Minimization algorithm. Currently the methods are only available for one-mode two-way data (symmetric dissimilarity matrices).
Perform two-dimensional smoothing for spatial fields using FFT and the convolution theorem (see Gilleland 2013, <doi:10.5065/D61834G2>).
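The convolution theorem states that the Fourier transform of a convolution is the pointwise product of the transforms, so a field can be smoothed by multiplying in the frequency domain and transforming back. The Python sketch below checks this on a tiny grid using a deliberately naive 2-D DFT and circular (wrap-around) convolution. It is a teaching sketch only: the function names are hypothetical, and a practical implementation would use a real FFT and typically pad the field to avoid wrap-around at the edges.

```python
import cmath

def dft2(grid, inverse=False):
    """Naive 2-D discrete Fourier transform of a rectangular list of lists."""
    rows, cols = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = 2 * cmath.pi * (u * r / rows + v * c / cols)
                    total += grid[r][c] * cmath.exp(sign * 1j * angle)
            out[u][v] = total / (rows * cols) if inverse else total
    return out

def smooth_fft(field, kernel):
    """Circular convolution computed as inverse DFT of the product of DFTs."""
    f_hat, k_hat = dft2(field), dft2(kernel)
    product = [[a * b for a, b in zip(fr, kr)] for fr, kr in zip(f_hat, k_hat)]
    return [[z.real for z in row] for row in dft2(product, inverse=True)]

def smooth_direct(field, kernel):
    """Same circular convolution computed directly, for comparison."""
    rows, cols = len(field), len(field[0])
    return [[sum(kernel[i][j] * field[(r - i) % rows][(c - j) % cols]
                 for i in range(rows) for j in range(cols))
             for c in range(cols)] for r in range(rows)]

# A single spike on a 4x4 grid, and a small kernel centred at [0][0]
# (wrap-around layout) whose weights sum to 1.
field = [[0.0] * 4 for _ in range(4)]
field[1][2] = 16.0
kernel = [[0.0] * 4 for _ in range(4)]
kernel[0][0] = 0.5
for i, j in [(0, 1), (0, 3), (1, 0), (3, 0)]:
    kernel[i][j] = 0.125

via_fft = smooth_fft(field, kernel)
via_sum = smooth_direct(field, kernel)
```

Both routes give the same smoothed field up to rounding error. The frequency-domain route pays off on large grids, where an FFT replaces the naive transform used here.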
Allows users to easily build custom Docker images <https://docs.docker.com/> from Amazon Web Services SageMaker <https://aws.amazon.com/sagemaker/> using Amazon Web Services CodeBuild <https://aws.amazon.com/codebuild/>.
This package provides a collection of methods for smoothing numerical data, commencing with a port of the MATLAB Gaussian window smoothing function. In addition, several functions typically used in smoothing of financial data are included.
Mosaic diagram, scatterplot matrix, Andrews curves, parallel coordinate diagram, radar diagram, and Chernoff plots as Shiny apps, which allow the order of variables to be changed interactively. The apps are intended as teaching examples.
This package provides the filtering algorithms for the state space models on the Stiefel manifold as well as the corresponding sampling algorithms for uniform, vector Langevin-Bingham and matrix Langevin-Bingham distributions on the Stiefel manifold.
Simple class to hold the contents of a SMET file as specified in Bavay (2021) <https://code.wsl.ch/snow-models/meteoio/-/blob/master/doc/SMET_specifications.pdf>. The numerical meteorological measurements are all based on MKS (SI) units, and timestamps are standardized to UTC.
Exploratory analysis of any input data, describing the structure and the relationships present in the data. The package automatically selects the variables and computes the related descriptive statistics. Information value, weight of evidence, custom tables, summary statistics, and graphical techniques are applied to both numeric and categorical predictors.
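For readers unfamiliar with weight of evidence (WoE) and information value (IV), the Python sketch below computes both for one binned predictor against a binary outcome. It uses one common convention, WoE = ln(share of non-events / share of events) per bin and IV = sum over bins of (share of non-events - share of events) x WoE. Sign conventions differ between sources, and the package's own choice is not stated in the description. The function name `woe_iv` and the counts are hypothetical, and every bin is assumed to contain at least one event and one non-event.

```python
import math

def woe_iv(bins):
    """bins: {label: (n_non_events, n_events)} -> ({label: woe}, iv).

    Assumes all counts are positive, so no smoothing of empty cells is done.
    """
    total_non = sum(non for non, _ in bins.values())
    total_evt = sum(evt for _, evt in bins.values())
    woe, iv = {}, 0.0
    for label, (non, evt) in bins.items():
        p_non = non / total_non  # share of all non-events falling in this bin
        p_evt = evt / total_evt  # share of all events falling in this bin
        woe[label] = math.log(p_non / p_evt)
        iv += (p_non - p_evt) * woe[label]
    return woe, iv

# Hypothetical counts for a predictor split into two bins.
woe, iv = woe_iv({"low": (80, 20), "high": (20, 80)})
print(woe, iv)
```

Each term of the IV sum is non-negative because the difference in shares and its log ratio always have the same sign, so a larger IV indicates a predictor whose bins separate events from non-events more strongly.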
This package provides flexible hazard ratio curves allowing non-linear relationships between continuous predictors and survival. To better understand the effects that each continuous covariate has on the outcome, results are expressed in terms of hazard ratio curves, taking a specific covariate value as reference. Confidence bands for these curves are also derived.
Introduces a fast and efficient Surrogate Variable Analysis algorithm that captures variation of unknown sources (batch effects) for high-dimensional data sets. The algorithm is built on the irwsva.build function of the sva package and proposes a revision of it that runs an order of magnitude faster with no loss of accuracy.
Fast computation of multivariate analyses of small (10s to 100s markers) to big (1000s to 100000s) genotype data. Runs Principal Component Analysis allowing for centering, z-score standardization and scaling for genetic drift, projection of ancient samples to modern genetic space and multivariate tests for differences in group location (Permutation-Based Multivariate Analysis of Variance) and dispersion (Permutation-Based Multivariate Analysis of Dispersion).
Preview spatial data as leaflet maps with minimal effort. smartmap is optimized for interactive use and distinguishes itself from similar packages because it does not need real spatial (sp or sf) objects as input; instead, it tries to automatically coerce everything that looks like spatial data to sf objects or leaflet maps. For example, it supports direct mapping of: a vector containing a single coordinate pair, a two-column matrix, a data.frame with longitude and latitude columns, or the path or URL to a (possibly compressed) shapefile.
Implementation of the SIC epsilon-telescope method, either using single or distributional (multiparameter) regression. Includes classical regression with normally distributed errors and robust regression, where the errors are from the Laplace distribution. The "smooth generalized normal distribution" is used, where the estimation of an additional shape parameter allows the user to move smoothly between both types of regression. See O'Neill and Burke (2022) "Robust Distributional Regression with Automatic Variable Selection" <doi:10.48550/arXiv.2212.07317> for more details. This package also contains the data analyses from O'Neill and Burke (2023) "Variable selection using a smooth information criterion for distributional regression models" <doi:10.1007/s11222-023-10204-8>.