The Bayesian Federated Inference ('BFI') method combines inference results obtained from local data sets in the separate centers. In this version of the package, the BFI methodology is programmed for linear, logistic and survival regression models. For GLMs, see Jonker, Pazira and Coolen (2024) <doi:10.1002/sim.10072>; for survival models, see Pazira, Massa, Weijers, Coolen and Jonker (2025) <doi:10.48550/arXiv.2404.17464>; and for heterogeneous populations, see Jonker, Pazira and Coolen (2025) <doi:10.1017/rsm.2025.6>.
This package provides a complete toolkit for connecting R environments with Large Language Models (LLMs). It provides utilities for describing R objects, package documentation, and workspace state in plain text formats optimized for LLM consumption. It supports multiple workflows: interactive copy-paste to external chat interfaces, programmatic tool registration with ellmer chat clients, batteries-included chat applications via shinychat, and exposure to external coding agents through the Model Context Protocol. Project configuration files enable stable, repeatable conversations with project-specific context and preferred LLM settings.
This package provides a client for retrieving data and metadata from central bank APIs including Banco de España (BdE), Banco de Portugal (BdP), Bank for International Settlements (BIS), Bank of Canada (BoC), Bank of England (BoE), Bank of Japan (BoJ), Banque de France (BdF), Czech National Bank (CNB), Deutsche Bundesbank (BBk), European Central Bank (ECB), National Bank of Poland (NBP), Norges Bank (NoB), Oesterreichische Nationalbank (OeNB), Sveriges Riksbank (SRb), and Swiss National Bank (SNB).
It provides functions to compute the values of different modifications of the Rand and Wallace indices. The indices are used to measure the stability or similarity of two partitions obtained on two different sets of units with a non-empty intersection. Depending on the selected index, splitting and merging of clusters can have different effects on the value of the indices. The indices are proposed in Cugmas and Ferligoj (2018) <http://ibmi.mf.uni-lj.si/mz/2018/no-1/Cugmas2018.pdf>.
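For orientation, the unmodified Rand index that these modifications start from can be sketched as follows. This is a minimal illustration only: it assumes both partitions are given as label lists over the same units (for example, the units in the intersection of the two sets), and the function name `rand_index` is hypothetical rather than part of the package.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classic (unmodified) Rand index of two partitions of the same units."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same units")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two units")
    agreements = 0
    for i, j in pairs:
        together_in_a = labels_a[i] == labels_a[j]
        together_in_b = labels_b[i] == labels_b[j]
        # A pair agrees when both partitions keep it together or both split it.
        if together_in_a == together_in_b:
            agreements += 1
    return agreements / len(pairs)


# The second partition splits one cluster of the first partition.
first = [0, 0, 1, 1]
second = [0, 0, 1, 2]
score = rand_index(first, second)
print(round(score, 4))
```

Splitting the cluster breaks one of the six unit pairs, so the index drops below 1. The classic index treats splitting and merging symmetrically, which is the behaviour the modified indices are described as changing.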
This package provides a framework for organizing R projects with a standardized structure. Most analyses consist of three main components: code, results, and data, each with different requirements such as version control, sharing, and encryption. This package provides tools to set up and manage project directories, handle file paths consistently across operating systems, organize results using date-based structures, source code from specified directories, and perform file operations safely. It ensures consistency across projects while accommodating different requirements for various types of content.
This package contains functions for calculating the Federal Highway Administration (FHWA) Transportation Performance Management (TPM) performance measures. Currently, the package provides methods for the System Reliability and Freight (PM3) performance measures calculated from travel time data provided by the National Performance Management Research Data Set (NPMRDS), including Level of Travel Time Reliability (LOTTR), Truck Travel Time Reliability (TTTR), and Peak Hour Excessive Delay (PHED) metric scores for calculating statewide reliability performance measures. Implements <https://www.fhwa.dot.gov/tpm/guidance/pm3_hpms.pdf>.
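As a rough illustration of the two reliability ratios, LOTTR is commonly computed as the 80th percentile travel time divided by the 50th percentile, and TTTR as the 95th percentile divided by the 50th. The sketch below assumes a nearest-rank percentile and made-up travel times; the function names are hypothetical, and the percentile method, time periods, and rounding required by the FHWA guidance are not reproduced here.

```python
import math


def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    if not values:
        raise ValueError("need at least one travel time")
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def travel_time_ratio(travel_times, upper_percent):
    """Ratio of an upper-percentile travel time to the median travel time."""
    upper = nearest_rank_percentile(travel_times, upper_percent)
    median = nearest_rank_percentile(travel_times, 50)
    return upper / median


# Hypothetical travel times (seconds) for one segment and one time period.
times = [60, 62, 64, 66, 68, 70, 75, 80, 90, 120]
lottr = travel_time_ratio(times, 80)  # 80th percentile / 50th percentile
tttr = travel_time_ratio(times, 95)   # 95th percentile / 50th percentile
print(round(lottr, 2), round(tttr, 2))
```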
This package provides a model for the growth of self-limiting populations using three-, four-, or five-parameter functions, which have wide applications in a variety of fields. In dynamical modeling, the dependent variable could be the population size at time x, where x is the independent variable. In the analysis of quantitative polymerase chain reaction (qPCR), the dependent variable would be the fluorescence intensity and the independent variable the cycle number. In that case, the package would calculate the TWW cycle threshold.
Suite of tropical geometric tools for use in machine learning applications. These methods may be summarized in the following references: Yoshida et al. (2022) <doi:10.2140/astat.2023.14.37>, Barnhill et al. (2023) <doi:10.48550/arXiv.2303.02539>, Barnhill and Yoshida (2023) <doi:10.3390/math11153433>, Aliatimis et al. (2023) <doi:10.1007/s11538-024-01327-8>, Yoshida et al. (2022) <doi:10.1109/TCBB.2024.3420815>, and Yoshida et al. (2019) <doi:10.1007/s11538-018-0493-4>.
This package provides a general toolkit for downloading, managing, analyzing, and presenting data from the U.S. Census, including SF1 (Decennial short-form), SF3 (Decennial long-form), and the American Community Survey (ACS). Confidence intervals provided with ACS data are converted to standard errors to be bundled with estimates in complex acs objects. The package provides new methods to conduct standard operations on acs objects and present/plot data in statistically appropriate ways.
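The conversion from ACS confidence intervals to standard errors can be sketched as below. The sketch assumes the published ACS margins of error correspond to a 90 percent confidence level (z = 1.645); the function names and the numbers are hypothetical and are not taken from the package.

```python
Z_90 = 1.645  # assumed critical value for 90 percent ACS margins of error


def moe_to_se(margin_of_error, z=Z_90):
    """Convert a margin of error to a standard error."""
    return margin_of_error / z


def interval_to_se(lower, upper, z=Z_90):
    """Convert a symmetric confidence interval to a standard error."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    return (upper - lower) / (2 * z)


# Hypothetical estimate of 1000 with a published margin of error of 329.
se_from_moe = moe_to_se(329)
se_from_interval = interval_to_se(1000 - 329, 1000 + 329)
print(round(se_from_moe, 2), round(se_from_interval, 2))
```

Storing the standard error next to the estimate is what lets later operations on the estimates carry their uncertainty along.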
This LPE library is used for significance analysis of microarray data with a small number of replicates. It uses resampling-based FDR adjustment, and gives less conservative results than the traditional BH or BY procedures. Accepted input is raw data in txt format from MAS4, MAS5 or dChip. Data can also be supplied after normalization. The LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see the LPEP library. For using LPE in multiple conditions, use the HEM library.
This package provides a framework for testing the multivariate point null hypothesis, as described in Elder et al. (2022) <arXiv:2203.01897>. After the user selects a parameter of interest and defines the assumed data generating mechanism, this information should be encoded in functions for the parameter estimator and its corresponding influence curve. Some parameter and data generating mechanism combinations have codings in this package, and are explained in detail in the article.
This code provides a method to fit the hidden compact representation model as well as to identify the causal direction on discrete data. We implement an effective solution to recover the above hidden compact representation under the likelihood framework. Please see "Causal Discovery from Discrete Data using Hidden Compact Representation" from NIPS 2018 by Ruichu Cai, Jie Qiao, Kun Zhang, Zhenjie Zhang and Zhifeng Hao (2018) <https://nips.cc/Conferences/2018/Schedule?showEvent=11274> for a description of some of our methods.
This package provides density, distribution, quantile, and random generation functions for the Modified Half-Normal (MHN) distribution, along with moments, mode, and the Fox-Wright Psi function used as the normalizing constant. The MHN distribution arises as a conditional posterior in Bayesian MCMC and generalizes the half-normal, truncated normal, and square-root gamma distributions. Implements efficient sampling via the Sun, Kong & Pal (2023) <doi:10.1080/03610926.2021.1934700> algorithms and the Gao & Wang (2025) <doi:10.1080/03610918.2025.2524551> RTDR method.
Variable selection techniques are essential tools for model selection and estimation in high-dimensional statistical models. Through this publicly available package, we provide a unified environment to carry out variable selection using iterative sure independence screening (SIS) (Fan and Lv (2008) <doi:10.1111/j.1467-9868.2008.00674.x>) and all of its variants in generalized linear models (Fan and Song (2009) <doi:10.1214/10-AOS798>) and the Cox proportional hazards model (Fan, Feng and Wu (2010) <doi:10.1214/10-IMSCOLL606>).
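A single, non-iterative screening step for a linear model can be sketched as ranking predictors by the absolute marginal correlation with the response and keeping the top few. This is a simplified illustration under that assumption; the names `pearson` and `sis_screen` are hypothetical, and the iterative variants and the GLM and Cox versions provided by the package are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, response, keep):
    """Indices of the `keep` predictors with the largest |correlation|."""
    scored = [(abs(pearson(col, response)), idx) for idx, col in enumerate(columns)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(idx for _, idx in scored[:keep])


# Hypothetical data: predictors 0 and 2 track the response, predictor 1 does not.
response = [1.0, 2.0, 3.0, 4.0, 5.0]
columns = [
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [1.0, -1.0, 1.0, -1.0, 1.0],
    [5.0, 4.0, 3.2, 1.9, 1.0],
]
selected = sis_screen(columns, response, keep=2)
print(selected)
```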
The Uniform Error Index is the weighted average of different error measures. It utilizes output from different error functions and gives more robust and stable error values. This package has been developed to compute the Uniform Error Index from ten different loss functions, such as Error Square, Square of Square Error, Quasi Likelihood Error, LogR-Square, Absolute Error and Absolute Square Error. The weights are determined using the Principal Component Analysis (PCA) algorithm of Yeasin and Paul (2024) <doi:10.1007/s11227-023-05542-3>.
This package provides functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.
This package provides fast and easy access to German census grid data from the 2011 and 2022 censuses <https://www.zensus2022.de/>, including a wide range of socio-economic indicators at multiple spatial resolutions (100m, 1km, 10km). Enables efficient download, processing, and analysis of large census datasets covering population, households, families, dwellings, and buildings. Harmonized data structures allow direct comparison with the 2011 census, supporting temporal and spatial analyses. Facilitates conversion of data into common formats for spatial analysis and mapping ('terra', 'sf', 'ggplot2').
This package aims to streamline and accelerate the process of saving and loading R objects, improving speed and compression compared to other methods. The package provides two compression formats: the qs2 format, which uses R serialization via the C API while optimizing compression and disk I/O, and the qdata format, featuring custom serialization for slightly faster performance and better compression. Additionally, the qs2 format can be directly converted to the standard RDS format, ensuring long-term compatibility with future versions of R.
This package provides a new method for interpretable heterogeneous treatment effects characterization in terms of decision rules via an extensive exploration of heterogeneity patterns by an ensemble-of-trees approach, enforcing high stability in the discovery. It relies on a two-stage pseudo-outcome regression, and it is supported by theoretical convergence guarantees. See Bargagli-Stoffi, F. J., Cadei, R., Lee, K., & Dominici, F. (2023), "Causal rule ensemble: Interpretable Discovery and Inference of Heterogeneous Treatment Effects", arXiv preprint <doi:10.48550/arXiv.2009.09036>.
Interpretive Structural Modelling (ISM) was developed by Warfield in 1974. ISM is the process of assembling distinct or related elements into a simplified and organized format. Hence, ISM is a methodology that seeks the interrelationships among the various elements considered and endows them with a hierarchical and multilevel structure. To run this package, the user needs to provide a matrix (VAXO) converted into 0's and 1's. See Warfield, J.N. (1974) <doi:10.1109/TSMC.1974.5408524>; Warfield, J.N. (1974, E-ISSN:2168-2909).
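In textbook ISM, the hierarchy is derived from a reachability matrix, obtained by adding self-reachability to the 0/1 input matrix and taking its transitive closure. The sketch below illustrates that step with Warshall's algorithm. It is an assumption about the general method, not a description of this package's code, and the function name `reachability_matrix` is hypothetical.

```python
def reachability_matrix(adjacency):
    """Reflexive transitive closure of a square 0/1 matrix (Warshall)."""
    n = len(adjacency)
    if any(len(row) != n for row in adjacency):
        raise ValueError("the input matrix must be square")
    # Every element reaches itself, plus whatever it reaches directly.
    reach = [
        [1 if (i == j or adjacency[i][j]) else 0 for j in range(n)]
        for i in range(n)
    ]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1  # i reaches j through k
    return reach


# Hypothetical 0/1 matrix: element 0 influences 1, and 1 influences 2.
example = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
closure = reachability_matrix(example)
for row in closure:
    print(row)
```

Element 0 reaches element 2 only indirectly, and the closure makes that indirect relation explicit before any levels are assigned.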
Computes the implied weights of linear regression models for estimating average causal effects and provides diagnostics based on these weights. These diagnostics rely on the analyses in Chattopadhyay and Zubizarreta (2023) <doi:10.1093/biomet/asac058> where several regression estimators are represented as weighting estimators, in connection to inverse probability weighting. lmw provides tools to diagnose representativeness, balance, extrapolation, and influence for these models, clarifying the target population of inference. Tools are also available to simplify estimating treatment effects for specific target populations of interest.
This package provides a facility to generate balanced semi-Latin rectangles with any cell size (preferably up to ten) for a given number of treatments; see Uto, N.P. and Bailey, R.A. (2020). "Balanced Semi-Latin rectangles: properties, existence and constructions for block size two". Journal of Statistical Theory and Practice, 14(3), 1-11, <doi:10.1007/s42519-020-00118-3>. It also provides a facility to generate partially balanced semi-Latin rectangles for cell sizes 2, 3 and 4 for any number of treatments.
Efficient Bayesian implementations of probit, logit, multinomial logit and binomial logit models. Functions for plotting and tabulating the estimation output are available as well. Estimation is based on Gibbs sampling where the Markov chain Monte Carlo algorithms are based on the latent variable representations and marginal data augmentation algorithms described in "Gregor Zens, Sylvia Frühwirth-Schnatter & Helga Wagner (2023). Ultimate Pólya Gamma Samplers - Efficient MCMC for possibly imbalanced binary and categorical data, Journal of the American Statistical Association <doi:10.1080/01621459.2023.2259030>".
This is an implementation of the Generalized Discrimination Score (also known as Two Alternatives Forced Choice Score, 2AFC) for various representations of forecasts and verifying observations. The Generalized Discrimination Score is a generic forecast verification framework which can be applied to any of the following verification contexts: dichotomous, polychotomous (ordinal and nominal), continuous, probabilistic, and ensemble. A comprehensive description of the Generalized Discrimination Score, including all equations used in this package, is provided by Mason and Weigel (2009) <doi:10.1175/MWR-D-10-05069.1>.
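For the simplest context, dichotomous observations with continuous forecasts, the score can be read as the proportion of (event, non-event) pairs in which the event received the higher forecast, with ties counted as half. The sketch below illustrates only that case under this reading. The function name `two_afc_dichotomous` and the data are hypothetical, and the other verification contexts covered by the package are not shown.

```python
def two_afc_dichotomous(observations, forecasts):
    """2AFC score for 0/1 observations and continuous forecasts."""
    if len(observations) != len(forecasts):
        raise ValueError("observations and forecasts must have equal length")
    events = [f for o, f in zip(observations, forecasts) if o]
    non_events = [f for o, f in zip(observations, forecasts) if not o]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    correct = 0.0
    for event_forecast in events:
        for non_event_forecast in non_events:
            if event_forecast > non_event_forecast:
                correct += 1.0  # pair correctly discriminated
            elif event_forecast == non_event_forecast:
                correct += 0.5  # tie counts as a coin flip
    return correct / (len(events) * len(non_events))


# Hypothetical observations and forecasts.
obs = [1, 0, 1, 0]
fcst = [0.9, 0.2, 0.4, 0.4]
skill = two_afc_dichotomous(obs, fcst)
print(skill)
```

A score of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.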