This package provides helper functions to perform Bayesian model averaging using Markov chain Monte Carlo samples from separate models. Calculates weights and obtains draws from the model-averaged posterior for quantities of interest specified by the user. Weight calculations can be done using marginal likelihoods or log-predictive likelihoods as in Ando, T., & Tsay, R. (2010) <doi:10.1016/j.ijforecast.2009.08.001>.
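The weighting step described above can be sketched in a few lines. This is an illustrative Python sketch, not the package's own code: the function names `bma_weights` and `bma_draws` are hypothetical, and it assumes each model supplies a log marginal likelihood together with a list of posterior draws for the quantity of interest.

```python
import math
import random

def bma_weights(log_marginal_likelihoods, prior=None):
    """Posterior model probabilities from log marginal likelihoods.

    Assumes a uniform prior over models unless `prior` is given.
    """
    n = len(log_marginal_likelihoods)
    if prior is None:
        prior = [1.0 / n] * n
    log_post = [l + math.log(p) for l, p in zip(log_marginal_likelihoods, prior)]
    peak = max(log_post)  # subtract the maximum to avoid underflow in exp()
    unnorm = [math.exp(v - peak) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_draws(draws_per_model, weights, n_draws, seed=0):
    """Sample from the model-averaged posterior as a mixture:
    pick a model according to its weight, then pick one of its draws."""
    rng = random.Random(seed)
    models = rng.choices(range(len(draws_per_model)), weights=weights, k=n_draws)
    return [rng.choice(draws_per_model[m]) for m in models]
```

For example, two models whose marginal likelihoods differ by a factor of three receive weights of 0.25 and 0.75 under a uniform prior.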
This package implements zero-modified versions of the Complex Tri-Parametric Pearson distribution for overdispersed count data. The package addresses limitations of existing implementations when the parameter b approaches zero. It provides distribution functions, maximum likelihood estimation, and diagnostic tools for modeling count data with excess zeros. The methodology is based on Rodriguez-Avi and coauthors (2003) <doi:10.1007/s00362-002-0134-7>.
Generates both total- and level-specific R-squared measures from Rights and Sterba's (2019) <doi:10.1037/met0000184> framework of R-squared measures for multilevel models with random intercepts and/or slopes, which is based on a complete decomposition of variance. Additionally generates graphical representations of these R-squared measures to allow visualizing and interpreting all measures in the framework together as an integrated set. This framework subsumes 10 previously-developed R-squared measures for multilevel models as special cases of 5 measures from the framework, and it also includes several newly-developed measures. Measures in the framework can be used to compute R-squared differences when comparing multilevel models (following procedures in Rights & Sterba (2020) <doi:10.1080/00273171.2019.1660605>). Bootstrapped confidence intervals can also be calculated. To use the confidence interval functionality, download bootmlm from <https://github.com/marklhc/bootmlm>.
The bit64 package provides serializable S3 atomic 64 bit (signed) integers that can be used in vectors, matrices, arrays and data.frames. Methods are available for coercion from and to logicals, integers, doubles, characters and factors as well as many elementwise and summary functions. Many fast algorithmic operations such as match and order support interactive data exploration and manipulation and optionally leverage caching.
This package computes the areas under the precision-recall (PR) and ROC curve for weighted (e.g. soft-labeled) and unweighted data. In contrast to other implementations, the interpolation between points of the PR curve is done by a non-linear piecewise function. In addition to the areas under the curves, the curves themselves can also be computed and plotted by a specific S3-method.
This package provides functions to allow you to easily pass command-line arguments into R, and functions to aid in submitting your R code in parallel on a cluster and joining the results afterward (e.g. multiple parameter values for simulations running in parallel, splitting up a permutation test in parallel, etc.). See `parseCommandArgs(...)` for the main example of how to use this package.
Collection of indices and tools relating to clinical research that aid epidemiological cohort or retrospective chart review with big data. All indices and tools take commonly used lab values, patient demographics, and clinical measurements to compute various risk and predictive values for survival or further classification/stratification. References to the original literature and validation are contained in each function's documentation. Includes the calculators commonly available online.
This package provides a dimension reduction technique for outlier detection. DOBIN (a Distance based Outlier BasIs using Neighbours) constructs a set of basis vectors for outlier detection. This is not an outlier detection method; rather it is a pre-processing method for outlier detection. It brings outliers to the forefront using fewer basis vectors (Kandanaarachchi and Hyndman, 2020) <doi:10.1080/10618600.2020.1807353>.
Implementation of an Event Categorization Matrix (ECM) detonation detection model and a Bayesian variant. Functions are provided for importing and exporting data, fitting models, and applying decision criteria for categorizing new events. This package implements methods described in the paper "Bayesian Event Categorization Matrix Approach for Nuclear Detonations" Koermer, Carmichael, and Williams (2024) available on arXiv at <doi:10.48550/arXiv.2409.18227>.
Easy way to plot regular/weighted/conditional distributions by using formulas. The core of the package concerns distribution plots which are automatic: the many options are tailored to the data at hand to offer the nicest and most meaningful graphs possible -- with no/minimum user input. Further provides functions to plot conditional trends and box plots. See <https://lrberge.github.io/fplot/> for more information.
The Occluded Surface (OS) algorithm is a widely used approach for analyzing atomic packing in biomolecules as described by Pattabiraman N, Ward KB, Fleming PJ (1995) <doi:10.1002/jmr.300080603>. Here, we introduce fibos, an R and Python package that extends the OS methodology, as presented in Soares HHM, Romanelli JPR, Fleming PJ, da Silveira CH (2024) <doi:10.1101/2024.11.01.621530>.
Forest Many-Objective Robust Decision Making ('FoRDM') is an R toolkit for supporting robust forest management under deep uncertainty. It provides a forestry-focused application of Many-Objective Robust Decision Making ('MORDM') to forest simulation outputs, enabling users to evaluate robustness using regret- and satisficing-based measures. FoRDM identifies robust solutions, generates Pareto fronts, and offers interactive 2D, 3D, and parallel-coordinate visualizations.
This package provides a framework and functions to create MOODLE quizzes. GIFTr takes a data frame of questions of four types: multiple choice, numerical, true or false and short answer questions, and exports a text file formatted in MOODLE GIFT format. You can prepare a spreadsheet in any software and import it into R to generate any number of questions with HTML, markdown and LaTeX support.
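As a rough illustration of what the exported GIFT text looks like for the four question types, here is a minimal Python sketch. It is not GIFTr's API: the function names and the dictionary keys (`type`, `name`, `text`, `answer`, `wrong`, `answers`, `tolerance`) are hypothetical, and only the basic GIFT syntax is covered (no feedback, partial credit or categories).

```python
GIFT_SPECIAL = "~=#{}:"  # control characters that GIFT requires to be backslash-escaped

def escape(text):
    """Backslash-escape GIFT control characters in free text."""
    return "".join("\\" + ch if ch in GIFT_SPECIAL else ch for ch in text)

def to_gift(q):
    """Format one question (a plain dict with hypothetical keys) as a GIFT line."""
    head = "::" + escape(q["name"]) + ":: " + escape(q["text"]) + " "
    kind = q["type"]
    if kind == "mcq":
        # '=' marks the correct option, '~' marks each wrong option
        options = ["=" + escape(q["answer"])] + ["~" + escape(w) for w in q["wrong"]]
        body = " ".join(options)
    elif kind == "truefalse":
        body = "T" if q["answer"] else "F"
    elif kind == "short":
        # every accepted answer is listed with '='
        body = " ".join("=" + escape(a) for a in q["answers"])
    elif kind == "numeric":
        # '#value:tolerance' accepts answers within value +/- tolerance
        body = "#" + str(q["answer"]) + ":" + str(q["tolerance"])
    else:
        raise ValueError("unknown question type: " + str(kind))
    return head + "{" + body + "}"
```

A question such as `{"type": "mcq", "name": "Q2", "text": "What is 2 plus 2?", "answer": "4", "wrong": ["3", "5"]}` is rendered as `::Q2:: What is 2 plus 2? {=4 ~3 ~5}`.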
In high-dimensional settings: Estimate the number of distant spikes based on the Generalized Spiked Population (GSP) model. Estimate the population eigenvalues, angles between the sample and population eigenvectors, correlations between the sample and population PC scores, and the asymptotic shrinkage factors. Adjust the shrinkage bias in the predicted PC scores. Dey, R. and Lee, S. (2019) <doi:10.1016/j.jmva.2019.02.007>.
This package implements some item response models for multiple ratings, including the hierarchical rater model, conditional maximum likelihood estimation of linear logistic partial credit model and a wrapper function to the commercial FACETS program. See Robitzsch and Steinfeld (2018) for a description of the functionality of the package. See Wang, Su and Qiu (2014; <doi:10.1111/jedm.12045>) for an overview of modeling alternatives.
Builds and interprets multi-response machine learning models using tidymodels syntax. Users can supply a tidy model, and mrIML automates the process of fitting multiple response models to multivariate data and applying interpretable machine learning techniques across them. For more details see Fountain-Jones (2021) <doi:10.1111/1755-0998.13495> and Fountain-Jones et al. (2024) <doi:10.22541/au.172676147.77148600/v1>.
It includes four methods: DCOL-based K-profiles clustering, non-linear network reconstruction, non-linear hierarchical clustering, and variable selection for generalized additive models. References: Tianwei Yu (2018) <doi:10.1002/sam.11381>; Haodong Liu and others (2016) <doi:10.1371/journal.pone.0158247>; Kai Wang and others (2015) <doi:10.1155/2015/918954>; Tianwei Yu and others (2010) <doi:10.1109/TCBB.2010.73>.
This package contains functions useful for debugging, set operations on vectors, and UTC date and time functionality. It adds a few vector manipulation verbs to purrr and dplyr packages. It can also generate an R file to install and update packages to simplify deployment into production. The functions were developed at the data science firm Numeract LLC and are used in several packages and projects.
Different estimators are provided to solve the blind source separation problem for multivariate time series with stochastic volatility and the supervised dimension reduction problem for multivariate time series. Different functions based on AMUSE and SOBI are also provided for estimating the dimension of the white noise subspace. The package is fully described in Nordhausen, Matilainen, Miettinen, Virta and Taskinen (2021) <doi:10.18637/jss.v098.i15>.
This package provides functions to compute Wasserstein barycenters of subset posteriors using the swapping algorithm developed by Puccetti, Rüschendorf and Vanduffel (2020) <doi:10.1016/j.jmaa.2017.02.003>. The Wasserstein barycenter is a geometric approach for combining subset posteriors. It allows for parallel and distributed computation of the posterior in case of complex models and/or big datasets, thereby increasing computational speed tremendously.
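To give a feel for how a barycenter combines subset posteriors, the one-dimensional case can be sketched directly. This is not the swapping algorithm mentioned above: for a scalar parameter, the 2-Wasserstein barycenter of empirical distributions with equal sample sizes is obtained by averaging their order statistics (quantile averaging). The function name `barycenter_1d` is hypothetical, and equal draw counts per subset are assumed.

```python
def barycenter_1d(subset_draws, weights=None):
    """2-Wasserstein barycenter of 1-D empirical distributions.

    Assumes every subset posterior has the same number of draws;
    the barycenter's atoms are weighted averages of matching order statistics.
    """
    k = len(subset_draws)
    n = len(subset_draws[0])
    if any(len(d) != n for d in subset_draws):
        raise ValueError("all subsets must have the same number of draws")
    if weights is None:
        weights = [1.0 / k] * k
    ordered = [sorted(d) for d in subset_draws]  # order statistics per subset
    return [sum(w * s[i] for w, s in zip(weights, ordered)) for i in range(n)]
```

Because each subset is sorted independently, the sorting can run in parallel on separate machines, with only the final averaging needing the combined results. The multivariate case has no such closed form, which is where an iterative method is needed.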
Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) provides powerful methods for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and where there is multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) makes it possible to separately model the variation correlated (predictive) to the factor of interest and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation.
This package provides implementations of PCA, PLS, and OPLS for multivariate analysis and feature selection of omics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients).
Fits linear models with endogenous regressors using latent instrumental variable approaches. The methods included in the package are Lewbel's (1997) <doi:10.2307/2171884> higher moments approach as well as Lewbel's (2012) <doi:10.1080/07350015.2012.643126> heteroscedasticity approach, Park and Gupta's (2012) <doi:10.1287/mksc.1120.0718> joint estimation method that uses a Gaussian copula, and Kim and Frees's (2007) <doi:10.1007/s11336-007-9008-1> multilevel generalized method of moments approach that deals with endogeneity in a multilevel setting. These are statistical techniques to address the endogeneity problem where no external instrumental variables are needed. See the publication related to this package in the Journal of Statistical Software for more details: <doi:10.18637/jss.v107.i03>. Note that with version 2.0.0 sweeping changes were introduced which greatly improve functionality and usability but break backwards compatibility.
MBECS provides a set of functions to evaluate and mitigate unwanted noise due to processing in batches. To that end it incorporates a host of batch correcting algorithms (BECA) from various packages. In addition it offers a correction and reporting pipeline that provides a preliminary look at the characteristics of a dataset before and after correcting for batch effects.
ravanan is a CWL implementation that is powered by GNU Guix and provides strong reproducibility guarantees. ravanan provides strong caching of intermediate results so the same step of a workflow is never run twice. ravanan captures logs from every step of the workflow for easy tracing back in case of job failures. ravanan currently runs on single machines and on slurm via its API.