Efficient Bayesian implementations of probit, logit, multinomial logit and binomial logit models. Functions for plotting and tabulating the estimation output are available as well. Estimation is based on Gibbs sampling where the Markov chain Monte Carlo algorithms are based on the latent variable representations and marginal data augmentation algorithms described in "Gregor Zens, Sylvia Frühwirth-Schnatter & Helga Wagner (2023). Ultimate Pólya Gamma Samplers – Efficient MCMC for possibly imbalanced binary and categorical data, Journal of the American Statistical Association <doi:10.1080/01621459.2023.2259030>".
This is an implementation of the Generalized Discrimination Score (also known as Two Alternatives Forced Choice Score, 2AFC) for various representations of forecasts and verifying observations. The Generalized Discrimination Score is a generic forecast verification framework which can be applied to any of the following verification contexts: dichotomous, polychotomous (ordinal and nominal), continuous, probabilistic, and ensemble. A comprehensive description of the Generalized Discrimination Score, including all equations used in this package, is provided by Mason and Weigel (2009) <doi:10.1175/MWR-D-10-05069.1>.
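As an illustration of the dichotomous case only, the sketch below computes a 2AFC score for binary observations paired with continuous forecasts: the proportion of (event, non-event) pairs in which the event received the higher forecast, with ties counted as one half. The function name `two_afc_binary` and the sample data are hypothetical and are not part of the package's API.

```python
def two_afc_binary(observations, forecasts):
    """2AFC score for binary observations (0/1) and continuous forecasts.

    Illustrative sketch only; not the package's implementation.
    """
    events = [f for o, f in zip(observations, forecasts) if o == 1]
    non_events = [f for o, f in zip(observations, forecasts) if o == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    score = 0.0
    for f_event in events:
        for f_non_event in non_events:
            if f_event > f_non_event:
                score += 1.0   # pair correctly discriminated
            elif f_event == f_non_event:
                score += 0.5   # tie: equivalent to a coin flip
    return score / (len(events) * len(non_events))


obs = [0, 0, 1, 1]
fcst = [0.1, 0.4, 0.35, 0.8]
print(two_afc_binary(obs, fcst))  # 3 of 4 pairs are ordered correctly -> 0.75
```

A score of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.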
Column Text Format (CTF) is a new tabular data format designed for simplicity and performance. CTF is the simplest column store you can imagine: plain text files for each column in a table, and a metadata file. The underlying plain text means the data is human readable and familiar to programmers, unlike specialized binary formats. CTF is faster than row oriented formats like CSV when loading a subset of the columns in a table. This package provides functions to read and write CTF data from R.
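The column-store idea described above can be sketched in a few lines. The layout used here (one `<column>.txt` file per column plus a `metadata.json` file) and the helper names `write_columns` and `read_columns` are assumptions made for illustration; they are not the actual CTF specification or this package's functions.

```python
import json
import tempfile
from pathlib import Path


def write_columns(table, directory):
    """Write each column of `table` (dict: name -> list) to its own text file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    n_rows = len(next(iter(table.values()), []))
    for name, values in table.items():
        text = "\n".join(str(v) for v in values) + "\n"
        (directory / f"{name}.txt").write_text(text)
    # Hypothetical metadata layout: column order and row count.
    meta = {"columns": list(table), "rows": n_rows}
    (directory / "metadata.json").write_text(json.dumps(meta))


def read_columns(directory, columns):
    """Read only the requested columns; other column files are never opened."""
    directory = Path(directory)
    return {
        name: (directory / f"{name}.txt").read_text().splitlines()
        for name in columns
    }


with tempfile.TemporaryDirectory() as tmp:
    write_columns({"id": [1, 2, 3], "city": ["Oslo", "Lima", "Pune"]}, tmp)
    print(read_columns(tmp, ["city"]))
```

Because each column lives in its own file, loading a subset of columns touches only those files, which is where the advantage over row-oriented formats comes from. Values are read back as strings in this sketch; a real reader would use the metadata to restore types.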
Functions, S4 classes/methods and a graphical user interface (GUI) to design surveys to substantiate freedom from disease using a modified hypergeometric function (see Cameron and Baldock, 1997, <doi:10.1016/s0167-5877(97)00081-0>). Herd sensitivities are computed according to sampling strategies "individual sampling" or "limited sampling" (see M. Ziller, T. Selhorst, J. Teuffert, M. Kramer and H. Schlueter, 2002, <doi:10.1016/S0167-5877(01)00245-8>). Methods to compute the a-posteriori alpha-error are implemented. Risk-based targeted sampling is supported.
Diagnostic tools such as residual analysis and global, local and total-local influence for the multivariate model from the random intercept Poisson generalized log gamma model are available in this package. Estimation by the maximum likelihood method is also included. For details see Fabio, L. C.; Villegas, C. L.; Carrasco, J. M. F. and de Castro, M. (2023) <doi:10.1080/03610926.2021.1939380> and Fábio, L. C.; Villegas, C.; Mamun, A. S. M. A. and Carrasco, J. M. F. (2025) <doi:10.28951/bjb.v43i1.728>.
Population dynamic models underpin a range of analyses and applications in ecology and epidemiology. The various approaches for analysing population dynamic models (MPMs, IPMs, ODEs, POMPs, PVA) each require the model to be defined in a different way. This makes it difficult to combine different modelling approaches and data types to solve a given problem. pop aims to provide a flexible and easy-to-use common interface for constructing population dynamic models and enabling them to be fitted and analysed in many different ways.
In the big data setting, working data sets are often distributed on multiple machines. However, classical statistical methods are often developed to solve the problems of single estimation or inference. We employ a novel parallel quasi-likelihood method in generalized linear models, to make the variances between different sub-estimators relatively similar. Estimates are obtained from projection subsets of data and later combined by suitably-chosen unknown weights. The philosophy of the package is described in Guo G. (2020) <doi:10.1007/s00180-020-00974-4>.
Optogenetics is a new tool to study neuronal circuits that have been genetically modified to allow stimulation by flashes of light. This package implements the methodological framework, Point-process Response model for Optogenetics (PRO), for analyzing data from these experiments. This method provides explicit nonlinear transformations to link the flash point-process with the spiking point-process. Such response functions can be used to provide important and interpretable scientific insights into the properties of the biophysical process that governs neural spiking in response to optogenetic stimulation.
This package implements a method for fitting a bounded probability distribution to quantiles (for example stated by an expert), see Bornkamp and Ickstadt (2009) for details. For this purpose B-splines are used, and the density is obtained by penalized least squares based on a Brier entropy penalty. The package provides methods for fitting the distribution as well as methods for evaluating the underlying density and cdf. In addition methods for plotting the distribution, drawing random numbers and calculating quantiles of the obtained distribution are provided.
This package provides users with the ability to query the Human Cell Atlas data repository for single-cell experiment data. The `projects()`, `files()`, `samples()` and `bundles()` functions retrieve summary information on each of these indexes; corresponding `*_details()` are available for individual entries of each index. File-based resources can be downloaded using `files_download()`. Advanced use of the package allows the user to page through large result sets, and to flexibly query the list-of-lists structure representing query responses.
The Desirable Dietary Pattern (DDP)/PPH score measures the variety of food consumption. The (weighted) score is calculated based on the type of food. This package calculates the DDP/PPH score faster than the traditional manual calculation by BKP (2017) <http://bkp.pertanian.go.id/storage/app/uploads/public/5bf/ca9/06b/5bfca906bc654274163456.pdf> and more simply than the nutrition survey <http://www.nutrisurvey.de>. The weights and baseline values are derived from the 2017 Indonesian national survey.
Facilitate the analysis of teams in a corporate setting: assess the diversity per grade and job, present the results, search for bias (in hiring and/or promoting processes). It also provides methods to simulate the effect of bias, random team-data, etc. White paper: Philippe J.S. De Brouwer (2021) <http://www.de-brouwer.com/assets/div/div-white-paper.pdf>. Book (chapter 36): Philippe J.S. De Brouwer (2020, ISBN:978-1-119-63272-6) and Philippe J.S. De Brouwer (2020) <doi:10.1002/9781119632757>.
This package selects the treatments of a one-way layout that are equivalent to a control. Bonferroni-adjusted "two one-sided t-tests" (TOST) and related simultaneous confidence intervals are given for either differences or ratios of means of normally distributed data. For the case of equal variances and balanced sample sizes for the treatment groups, the single-step procedure of Bofinger and Bofinger (1995) <doi:10.1111/j.2517-6161.1995.tb02058.x> can be chosen. For non-normal data, the Wilcoxon test is applied.
Multiple testing procedures for heterogeneous and discrete tests as described in Döhler and Roquain (2020) <doi:10.1214/20-EJS1771>. The main algorithms of the paper are available as continuous, discrete and weighted versions. They take as input the results of a test procedure from package 'DiscreteTests', or a set of observed p-values and their discrete support under their nulls. A shortcut function to obtain such p-values and supports is also provided, along with wrappers that allow discrete procedures to be applied directly to data.
Builds and runs C++ code for classes that encapsulate state space model, particle filtering algorithm pairs. Algorithms include the Bootstrap Filter from Gordon et al. (1993) <doi:10.1049/ip-f-2.1993.0015>, the generic SISR filter, the Auxiliary Particle Filter from Pitt et al. (1999) <doi:10.2307/2670179>, and a variety of Rao-Blackwellized particle filters inspired by Andrieu et al. (2002) <doi:10.1111/1467-9868.00363>. For more details on the C++ library 'pf', see Brown (2020) <doi:10.21105/joss.02599>.
Developed to perform the estimation and inference for regression coefficient parameters in longitudinal marginal models using the method of quadratic inference functions. Like generalized estimating equations, this method is also a quasi-likelihood inference method. It has been shown that the method gives consistent estimators of the regression coefficients even if the correlation structure is misspecified, and it is more efficient than GEE when the correlation structure is misspecified. Based on Qu, A., Lindsay, B.G. and Li, B. (2000) <doi:10.1093/biomet/87.4.823>.
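For readers unfamiliar with the bootstrap filter named above, the following is a minimal sketch of the general technique in plain Python, not the package's C++ code. It assumes a toy linear-Gaussian model chosen purely for illustration (state x_t = phi * x_{t-1} + noise, observation y_t = x_t + noise, with |phi| < 1); the function name and all parameter defaults are hypothetical.

```python
import math
import random


def bootstrap_filter(observations, n_particles=500, phi=0.9,
                     sigma_x=1.0, sigma_y=1.0, seed=0):
    """Return filtered state means for an assumed AR(1)-plus-noise model."""
    rng = random.Random(seed)
    # Initial particles from the stationary distribution of the AR(1) state.
    sd0 = sigma_x / math.sqrt(1.0 - phi ** 2)
    particles = [rng.gauss(0.0, sd0) for _ in range(n_particles)]
    filtered_means = []
    for t, y in enumerate(observations):
        if t > 0:
            # Propagate through the state transition (the bootstrap proposal).
            particles = [phi * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weight each particle by the Gaussian observation likelihood,
        # working with log-weights for numerical stability.
        log_w = [-0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        filtered_means.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling in proportion to the normalized weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means


means = bootstrap_filter([5.0] * 20)
print(len(means), all(math.isfinite(m) for m in means))
```

Resampling at every step keeps the sketch short; practical implementations often resample only when the effective sample size drops below a threshold.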
This package provides researchers with a simple set of diagnostic tools for monitoring the progress and reliability of raters conducting content coding tasks. Goehring (2024) <https://bengoehring.github.io/improving-content-analysis-tools-for-working-with-undergraduate-research-assistants.pdf> argues that supervisors---especially supervisors of small teams---should utilize computational tools to monitor reliability in real time. As such, this package provides easy-to-use functions for calculating inter-rater reliability statistics and measuring the reliability of one coder compared to the rest of the team.
Single-cell RNA-sequencing (scRNA-seq) is widely used to explore cellular variation. The analysis of scRNA-seq data often starts from clustering cells into subpopulations. This initial step has a high impact on downstream analyses, and hence it is important to be accurate. However, no unsupervised metric has been designed for scRNA-seq to evaluate clustering performance. Hence, we propose the clustering deviation index (CDI), an unsupervised metric based on the modeling of scRNA-seq UMI counts to evaluate clustering of cells.
Compute differential causal effects (dce) on (biological) networks. Given observational samples from a control experiment and non-control (e.g., cancer) for two genes A and B, we can compute differential causal effects with a (generalized) linear regression. If the causal effect of gene A on gene B in the control samples is different from the causal effect in the non-control samples the dce will differ from zero. We regularize the dce computation by the inclusion of prior network information from pathway databases such as KEGG.
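The core idea for a single edge A -> B can be sketched as the difference between two regression slopes. This is a simplified illustration that assumes no confounders and a plain linear model; it omits the adjustment sets, generalized linear models, and pathway-based regularization the description mentions. The function names and the sign convention (non-control minus control) are assumptions, not the package's API.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def differential_causal_effect(a_ctrl, b_ctrl, a_case, b_case):
    """Effect of A on B in non-control samples minus the effect in controls."""
    return ols_slope(a_case, b_case) - ols_slope(a_ctrl, b_ctrl)


# Toy data: B = 2*A in controls, B = 0.5*A in non-control samples.
a = [1.0, 2.0, 3.0, 4.0]
b_ctrl = [2.0, 4.0, 6.0, 8.0]
b_case = [0.5, 1.0, 1.5, 2.0]
print(differential_causal_effect(a, b_ctrl, a, b_case))  # -1.5
```

A value of zero would indicate the same effect of A on B in both conditions.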
The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps control the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.
The method implemented in this package performs bottom-up hierarchical clustering, using a Dirichlet Process (infinite mixture) to model uncertainty in the data and Bayesian model selection to decide at each step which clusters to merge. This avoids several limitations of traditional methods, for example the need to decide how many clusters there should be and how to choose a principled distance metric. This implementation accepts multinomial (i.e. discrete, with 2+ categories) or time-series data. This version also includes a randomised algorithm which is more efficient for larger data sets.
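A constrained random placement of this kind can be sketched as follows. The sketch uses a simple restart-on-dead-end strategy rather than true backtracking, and it assumes one particular spatial constraint (no two samples of the same group in horizontally or vertically adjacent wells). The constraint, the function name `place_samples`, and its parameters are illustrative assumptions, not WPM's actual algorithm or interface.

```python
import random


def place_samples(groups, n_rows, n_cols, seed=0, max_tries=1000):
    """Randomly assign samples (given by group label) to wells, row by row.

    Returns a dict mapping (row, col) -> group label.
    """
    rng = random.Random(seed)
    wells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    if len(groups) > len(wells):
        raise ValueError("more samples than wells")

    for _ in range(max_tries):
        remaining = list(groups)
        plate = {}
        dead_end = False
        for r, c in wells[:len(groups)]:
            # Groups already placed above and to the left are forbidden here.
            forbidden = {plate.get((r - 1, c)), plate.get((r, c - 1))}
            candidates = [i for i, g in enumerate(remaining)
                          if g not in forbidden]
            if not candidates:
                dead_end = True   # no admissible sample: start over
                break
            plate[(r, c)] = remaining.pop(rng.choice(candidates))
        if not dead_end:
            return plate
    raise RuntimeError("no valid placement found within max_tries")


plate = place_samples(["A"] * 4 + ["B"] * 4 + ["C"] * 4, n_rows=3, n_cols=4)
for r in range(3):
    print(" ".join(plate[(r, c)] for c in range(4)))
```

Checking only the upper and left neighbours is sufficient because wells are filled in row-major order, so the lower and right neighbours are still empty when a well is assigned.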
This package provides functions useful in the design and ANOVA of experiments. The content falls into the following groupings:
- data, 
- factor manipulation functions, 
- design functions, 
- ANOVA functions, 
- matrix functions, 
- projector and canonical efficiency functions, and 
- miscellaneous functions. 
There is a vignette called DesignNotes describing how to use the design functions for randomizing and assessing designs. The ANOVA functions facilitate the extraction of information when the Error function has been used in the call to aov.
A generalized factor model is implemented for ultra-high dimensional data with mixed-type variables. Two algorithms, variational EM and alternate maximization, are provided to fit the generalized factor model. The factor matrix and loading matrix together with the number of factors can be well estimated. This model can be employed in social and behavioral sciences, economy and finance, and genomics, to extract interpretable nonlinear factors. For more details, see Wei Liu, Huazhen Lin, Shurong Zheng and Jin Liu (2021) <doi:10.1080/01621459.2021.1999818>.
This package provides a collection of tools to create, use and maintain modularized model code written in the modeling language GAMS (<https://www.gams.com/>). Out-of-the-box GAMS does not come with support for modularized model code. This package provides the tools necessary to convert a standard GAMS model to a modularized one by introducing a modularized code structure together with a naming convention which emulates local environments. In addition, this package provides tools to monitor the compliance of the model code with modular coding guidelines.