Helps the occasional R user with synthesis and enhanced graphical visualization of redundancy analysis (RDA) and principal component analysis (PCA) methods and objects. Inputs are: data frame, RDA (package 'vegan') and PCA (package 'FactoMineR') objects. Outputs are: synthesized results of RDA, displayed in the console and saved in tables; and displayed and saved objects of PCA graphic visualization of individual and variable projections with multiple graphic parameters.
Efficient Bayesian multinomial logistic regression based on heavy-tailed (hyper-LASSO, non-convex) priors. The posterior of coefficients and hyper-parameters is sampled with restricted Gibbs sampling for leveraging the high-dimensionality and Hamiltonian Monte Carlo for handling the high-correlation among coefficients. A detailed description of the method: Li and Yao (2018), Journal of Statistical Computation and Simulation, 88:14, 2827-2851, <arXiv:1405.3319>.
The haversine is a function used to calculate the distance between a pair of latitude and longitude points, under the assumption that the points lie on a spherical globe. This package provides a fast, data-frame-compatible haversine function. For the first publication on the haversine calculation see Joseph de Mendoza y Ríos (1795) <https://books.google.cat/books?id=030t0OqlX2AC> (in Spanish).
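The haversine calculation described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not this package's R interface; the function name `haversine_km` and the mean Earth radius of 6371.0088 km are assumptions chosen for the example.

```python
import math

# Assumed constant: mean Earth radius in kilometres (not taken from the package).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees.

    Uses a = sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    and  d = 2 * R * asin(sqrt(a)), treating the globe as a sphere.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))
```

Because the formula assumes a perfect sphere, results differ slightly from ellipsoidal (geodesic) distances; the size of that difference depends on the points chosen.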
Fit latent space network cluster models using an expectation-maximization algorithm. Enables flexible modeling of unweighted or weighted network data (with or without noise edges), supporting both directed and undirected networks (with or without degree heterogeneity). Designed to handle large networks efficiently, it allows users to explore network structure through latent space representations, identify clusters within network data, and simulate networks with varying clustering and connectivity patterns.
Miscellaneous functions for classification and visualization, e.g. regularized discriminant analysis, sknn() kernel-density naive Bayes, an interface to svmlight and stepclass() wrapper variable selection for supervised classification, partimat() visualization of classification rules and shardsplot() of cluster results as well as kmodes() clustering for categorical data, corclust() variable clustering, variable extraction from different variable clustering models and weight of evidence preprocessing.
Density evaluation and random number generation for the Matrix-Normal Inverse-Wishart (MNIW) distribution, as well as the Matrix-Normal, Matrix-T, Wishart, and Inverse-Wishart distributions. Core calculations are implemented in a portable (header-only) C++ library, with matrix manipulations using the Eigen library for linear algebra. Also provided is a Gibbs sampler for Bayesian inference on a random-effects model with multivariate normal observations.
Test whether equality and order constraints hold for all individuals simultaneously by comparing Bayesian mixed models through Bayes factors. A tutorial-style vignette and a quickstart guide are available via vignette("manual", "quid") and vignette("quickstart", "quid"), respectively. See Haaf and Rouder (2017) <doi:10.1037/met0000156>; Haaf, Klaassen and Rouder (2019) <doi:10.31234/osf.io/a4xu9>; and Rouder & Haaf (2021) <doi:10.5334/joc.131>.
This package creates stratum orthogonal arrays (also known as strong orthogonal arrays). These are arrays with more levels per column than the typical orthogonal array, and whose low order projections behave like orthogonal arrays, when collapsing levels to coarser strata. Details are described in Groemping (2022) "A unifying implementation of stratum (aka strong) orthogonal arrays" <http://www1.bht-berlin.de/FB_II/reports/Report-2022-002.pdf>.
Markov chain Monte Carlo samplers for posterior simulations of conjugate Bayesian nonparametric mixture models. Functionality is provided for Gibbs sampling as in Algorithm 3 of Neal (2000) <DOI:10.1080/10618600.2000.10474879>, restricted Gibbs merge-split sampling as described in Jain & Neal (2004) <DOI:10.1198/1061860043001>, and sequentially-allocated merge-split sampling <DOI:10.1080/00949655.2021.1998502>, as well as summary and utility functions.
The curl() and curl_download() functions provide highly configurable drop-in replacements for base url() and download.file() with better performance, support for encryption, gzip compression, authentication, and other libcurl goodies. The core of the package implements a framework for performing fully customized requests where data can be processed either in memory, on disk, or streaming via the callback or connection interfaces.
Logging functions in RcppSpdlog provide access to the logging functionality from the spdlog C++ library. This package offers shorter convenience wrappers for the R functions which match the C++ functions, namely via, say, spdl::debug() at the debug level. The actual formatting is done by the fmt::format() function from the fmtlib library (that is also std::format() in C++20 or later).
Network meta-analyses using a Bayesian framework following Dias et al. (2013) <DOI:10.1177/0272989X12458724>. Based on the data input, creates the prior, model file, and initial values needed to run models in 'rjags'. Able to handle binomial, normal and multinomial arm-level data. Can handle multi-arm trials and includes methods to incorporate covariate and baseline risk effects. Includes standard diagnostics and visualization tools to evaluate the results.
This package provides a collection of functions for calculating the M2 model fit statistic for diagnostic classification models as described by Liu et al. (2016) <DOI:10.3102/1076998615621293>. These functions provide multiple sources of information for model fit according to the M2 statistic, including the M2 statistic, the *p* value for that M2 statistic, and the Root Mean Square Error of Approximation based on the M2 statistic.
An implementation by Chen, Li, and Zhang (2022) <doi:10.1093/bioadv/vbac041> of the Depth Importance in Precision Medicine (DIPM) method in Chen and Zhang (2022) <doi:10.1093/biostatistics/kxaa021> and Chen and Zhang (2020) <doi:10.1007/978-3-030-46161-4_16>. The DIPM method is a classification tree that searches for subgroups with especially poor or strong performance in a given treatment group.
The automated clustering and quantification of the digital PCR data is based on the combination of DBSCAN (Hahsler et al. (2019) <doi:10.18637/jss.v091.i01>) and c-means (Bezdek et al. (1981) <doi:10.1007/978-1-4757-0450-1>) algorithms. The analysis is independent of multiplexing geometry, dPCR system, and input amount. The details about input data and parameters are available in the vignette.
An implementation of European Forestry Dynamics Model (EFDM) and an estimation algorithm for the transition probabilities. The EFDM is a large-scale forest model that simulates the development of the forest and estimates volume of wood harvested for any given forested area. This estimate can be broken down by, for example, species, site quality, management regime and ownership category. See Packalen et al. (2015) <doi:10.2788/153990>.
This package provides access to a range of functions for computing and visualizing the Full Bayesian Significance Test (FBST) and the e-value for testing a sharp hypothesis against its alternative, and the Full Bayesian Evidence Test (FBET) and the (generalized) Bayesian evidence value for testing a composite (or interval) hypothesis against its alternative. The methods are widely applicable as long as a posterior MCMC sample is available.
This package provides the following types of models: models for contingency tables (i.e. log-linear models); graphical Gaussian models for multivariate normal data (i.e. covariance selection models); and mixed interaction models. Documentation about gRim is provided by vignettes included in this package and the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>); see citation("gRim") for details.
Automated General-to-Specific (GETS) modelling of the mean and variance of a regression, and indicator saturation methods for detecting and testing for structural breaks in the mean; see Pretis, Reade and Sucarrat (2018) <doi:10.18637/jss.v086.i03> for an overview of the package. In advanced use, the estimator and diagnostic tests can be fully user-specified; see Sucarrat (2021) <doi:10.32614/RJ-2021-024>.
This package provides seamless access to the WEkEO Harmonised Data Access (HDA) API, enabling users to query, download, and process data efficiently from the HDA platform. With 'hdar', researchers and data scientists can integrate the extensive HDA datasets into their R workflows, enhancing their data analysis capabilities. Comprehensive information on the API functionality and usage is available at <https://gateway.prod.wekeo2.eu/hda-broker/docs>.
This package provides functions to implement a hierarchical approach which is designed to perform joint analysis of summary statistics using the framework of Mendelian Randomization or transcriptome analysis. Reference: Lai Jiang, Shujing Xu, Nicholas Mancuso, Paul J. Newcombe, David V. Conti (2020). "A Hierarchical Approach Using Marginal Summary Statistics for Multiple Intermediates in a Mendelian Randomization or Transcriptome Analysis." <bioRxiv><doi:10.1101/2020.02.03.924241>.
An R package that implements the JICO algorithm [Wang, P., Wang, H., Li, Q., Shen, D., & Liu, Y. (2024). <Journal of Computational and Graphical Statistics, 33(3), 763-773>]. It aims at solving the multi-group regression problem. The algorithm decomposes the responses from multiple groups into shared and group-specific components, which are driven by low-rank approximations of joint and individual structures from the covariates respectively.
Conduct a noncompartmental analysis with industrial strength. Some features are: 1) CDISC SDTM terms; 2) automatic or manual slope selection; 3) support for both the linear-up linear-down and linear-up log-down methods; 4) interval (partial) AUCs with linear or log interpolation; 5) PDF, RTF, and text report files. Reference: Gabrielsson J, Weiner D. Pharmacokinetic and Pharmacodynamic Data Analysis - Concepts and Applications. 5th ed. 2016. (ISBN:9198299107).
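The linear-up log-down method named in the feature list can be illustrated with a short sketch. This is a generic Python illustration of the textbook rule (linear trapezoid where concentration rises or stays flat, logarithmic trapezoid where it falls), not this package's implementation; the function name `auc_linear_up_log_down` and the fallback to the linear rule when a concentration is zero are assumptions made for the example.

```python
import math


def auc_linear_up_log_down(times, conc):
    """Area under a concentration-time curve by the linear-up log-down rule.

    Rising or flat intervals use the linear trapezoid:
        (c1 + c2) * dt / 2
    Falling intervals use the logarithmic trapezoid:
        (c1 - c2) * dt / ln(c1 / c2)
    Assumption: if the later concentration is zero, the log rule is
    undefined, so the linear trapezoid is used for that interval instead.
    """
    if len(times) != len(conc):
        raise ValueError("times and conc must have the same length")
    total = 0.0
    for i in range(len(times) - 1):
        t1, t2 = times[i], times[i + 1]
        c1, c2 = conc[i], conc[i + 1]
        dt = t2 - t1
        if 0 < c2 < c1:
            total += (c1 - c2) * dt / math.log(c1 / c2)
        else:
            total += (c1 + c2) * dt / 2
    return total
```

For a mono-exponential decline the logarithmic trapezoid integrates each interval exactly, which is the usual motivation for preferring it over the linear rule in the elimination phase.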
Fitting of non-parametric production frontiers for use in efficiency analysis. Methods are provided for both a smooth analogue of Data Envelopment Analysis (DEA) and a non-parametric analogue of Stochastic Frontier Analysis (SFA). Frontiers are constructed for multiple inputs and a single output using constrained kernel smoothing as in Racine et al. (2009), which allows for the imposition of monotonicity and concavity constraints on the estimated frontier.