This package provides tools for conditional and spatially dependent density estimation using Spatial Logistic Gaussian Processes (SLGPs). The approach represents probability densities through finite-rank Gaussian process priors transformed via a spatial logistic density transformation, enabling flexible non-parametric modeling of heterogeneous data. Functionality includes density prediction, quantile and moment estimation, sampling methods, and preprocessing routines for basis functions. Applications arise in spatial statistics, machine learning, and uncertainty quantification. The methodology builds on the framework of Leonard (1978) <doi:10.1111/j.2517-6161.1978.tb01655.x>, Lenk (1988) <doi:10.1080/01621459.1988.10478625>, Tokdar (2007) <doi:10.1198/106186007X210206>, Tokdar (2010) <doi:10.1214/10-BA605>, and is further aligned with recent developments in Bayesian non-parametric modelling: see Gautier (2023) <https://boristheses.unibe.ch/4377/> and Gautier (2025) <doi:10.48550/arXiv.2110.02876>.
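The logistic density transformation mentioned above can be illustrated with a minimal sketch. It assumes a one-dimensional response on [0, 1], a hypothetical cosine basis, and random weights; it drops the spatial (covariate) index entirely and is not the package's implementation.

```python
import math
import random

def logistic_density(weights, grid):
    """Map basis weights to a density on `grid` by exponentiating and normalising."""
    # Finite-rank latent function: Z(t) = sum_k w_k * cos(k * pi * t)
    # (the cosine basis is an assumption made for this illustration).
    z = [sum(w * math.cos((k + 1) * math.pi * t) for k, w in enumerate(weights))
         for t in grid]
    z_max = max(z)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - z_max) for v in z]
    # Trapezoidal rule for the normalising constant on a uniform grid.
    h = grid[1] - grid[0]
    total = h * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return [u / total for u in unnorm]

rng = random.Random(0)
grid = [i / 200 for i in range(201)]
weights = [rng.gauss(0.0, 1.0) for _ in range(5)]  # one draw of the weights
density = logistic_density(weights, grid)
```

Each draw of the weights yields a different positive function that integrates to one, which is what makes the transformed process usable as a prior over densities.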
This package provides parametric Risk Neutral Densities (RNDs) and cumulative densities of futures prices on fixed-income products. It relies on options on Short Term Interest Rate futures or options on government bond futures, and models the futures price as a mixture of lognormal densities. It also provides the RNDs and cumulative densities of the money market rate or the government bond yield inferred from the futures price, using the RND of the futures price. Finally, it provides the probability that each bond in the delivery basket of a government bond futures contract is the cheapest to deliver at maturity, using the RND of the bond futures price. The package builds on the work of Melick, W. R. and Thomas, C. P. (1997) <doi:10.2307/2331318> and B. Bahra (1998) <doi:10.2139/ssrn.77429>.
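As a rough illustration of the mixture-of-lognormals idea, the sketch below evaluates the density and cumulative distribution of a two-component mixture. The weights, log-means, and volatilities are made-up values, not calibrated to any option prices, and the function names are hypothetical rather than taken from the package.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lognormal_pdf(x, mu, sigma):
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_cdf(x, mu, sigma):
    if x <= 0.0:
        return 0.0
    return norm_cdf((math.log(x) - mu) / sigma)

def mixture_pdf(x, components):
    """Density of a mixture; components are (weight, mu, sigma) triples."""
    return sum(w * lognormal_pdf(x, mu, s) for w, mu, s in components)

def mixture_cdf(x, components):
    """Cumulative distribution of the same mixture."""
    return sum(w * lognormal_cdf(x, mu, s) for w, mu, s in components)

# Illustrative parameters only: weights sum to one.
components = [(0.7, math.log(97.0), 0.004), (0.3, math.log(96.5), 0.008)]
prob_below_97 = mixture_cdf(97.0, components)
```

In practice the component parameters would be chosen so that model option prices match observed ones; that calibration step is outside the scope of this sketch.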
This package implements a spatially varying change point model with unique intercepts, slopes, variance intercepts and slopes, and change points at each location. Inference is within the Bayesian setting using Markov chain Monte Carlo (MCMC). The response variable can be modeled as Gaussian (no nugget) or with a probit or Tobit link, and the five spatially varying parameters are modeled jointly using a multivariate conditional autoregressive (MCAR) prior. The MCAR is a unique process that allows for a dissimilarity metric to dictate the local spatial dependencies. Full details of the package can be found in the accompanying vignette and in the corresponding paper published in Spatial Statistics by Berchuck et al. (2019): "A spatially varying change points model for monitoring glaucoma progression using visual field data", <doi:10.1016/j.spasta.2019.02.001>.
Four ensemble-based methods (SMOTEBoost, RUSBoost, UnderBagging, and SMOTEBagging) for the class imbalance problem are implemented for binary classification. These methods combine ensemble learning with data re-sampling techniques to improve model performance in the presence of class imbalance. One special feature is the possibility to choose among multiple supervised learning algorithms to build the weak learners within the ensemble models. References: Nitesh V. Chawla, Aleksandar Lazarevic, Lawrence O. Hall, and Kevin W. Bowyer (2003) <doi:10.1007/978-3-540-39804-2_12>, Chris Seiffert, Taghi M. Khoshgoftaar, Jason Van Hulse, and Amri Napolitano (2010) <doi:10.1109/TSMCA.2009.2029559>, R. Barandela, J. S. Sanchez, R. M. Valdovinos (2003) <doi:10.1007/s10044-003-0192-z>, Shuo Wang and Xin Yao (2009) <doi:10.1109/CIDM.2009.4938667>, Yoav Freund and Robert E. Schapire (1997) <doi:10.1006/jcss.1997.1504>.
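The two re-sampling ingredients behind these ensembles can be sketched as follows: SMOTE-style interpolation between minority points and random undersampling of the majority class. This is a simplified illustration with hypothetical function names and toy data; it omits the boosting and bagging loops and is not the package's code.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create synthetic minority points by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding the point itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(p, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class (no replacement)."""
    return random.Random(seed).sample(majority, n_keep)

# Toy data: 4 minority points and 20 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(float(i), float(i % 5)) for i in range(20)]

new_points = smote(minority, n_new=6)
kept_majority = random_undersample(majority, n_keep=len(minority))
```

In a SMOTEBoost- or RUSBoost-type scheme, a re-sampling step of this kind would be repeated in every boosting round before the weak learner is fitted.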
The Inductive Subgroup Comparison Approach ('ISCA') offers a way to compare groups that are internally differentiated and heterogeneous. It starts by identifying the social structure of a reference group against which a minority or another group is to be compared, yielding empirical subgroups to which minority members are then matched based on how similar they are. The modelling of specific outcomes then occurs within specific subgroups in which majority and minority members are matched. ISCA is characterized by its data-driven, probabilistic, and iterative approach and combines fuzzy clustering, Monte Carlo simulation, and regression analysis. ISCA_random_assignments() assigns subjects probabilistically to subgroups. ISCA_clustertable() provides summary statistics of each cluster across iterations. ISCA_modeling() provides Ordinary Least Squares regression results for each cluster across iterations. For further details please see Drouhot (2021) <doi:10.1086/712804>.
This package implements a regularization method for cumulative link models using the Smooth-Effect-on-Response Penalty (SERP). This method allows flexible modeling of ordinal data by enabling a smooth transition from a general cumulative link model to a simplified version of the same model. As the tuning parameter increases from zero to infinity, the subject-specific effects for each variable converge to a single global effect. The approach addresses common issues in cumulative link models, such as parameter unidentifiability and numerical instability, by maximizing a penalized log-likelihood instead of the standard non-penalized version. Fitting is performed using a modified Newton's method. Additionally, the package includes various model performance metrics and descriptive tools. For details on the implemented penalty method, see Ugba (2021) <doi:10.21105/joss.03705> and Ugba et al. (2021) <doi:10.3390/stats4030037>.
Inference concerning equilibrium and random mating in autopolyploids. Methods are available to test for equilibrium and random mating at any even ploidy level (>2) in the presence of double reduction at biallelic loci. For autopolyploid populations in equilibrium, methods are available to estimate the degree of double reduction. We also provide functions to calculate genotype frequencies at equilibrium, or after one or several rounds of random mating, given rates of double reduction. The main function is hwefit(). This material is based upon work supported by the National Science Foundation under Grant No. 2132247. The opinions, findings, and conclusions or recommendations expressed are those of the author and do not necessarily reflect the views of the National Science Foundation. For details of these methods, see Gerard (2023a) <doi:10.1111/biom.13722> and Gerard (2023b) <doi:10.1111/1755-0998.13856>.
This package provides functions for analyzing the association between one single response categorical variable (SRCV) and one multiple response categorical variable (MRCV), or between two or three MRCVs. A modified Pearson chi-square statistic can be used to test for marginal independence for the one or two MRCV case, or a more general loglinear modeling approach can be used to examine various other structures of association for the two or three MRCV case. Bootstrap- and asymptotic-based standardized residuals and model-predicted odds ratios are available, in addition to other descriptive information. Statistical methods implemented are described in Bilder et al. (2000) <doi:10.1080/03610910008813665>, Bilder and Loughin (2004) <doi:10.1111/j.0006-341X.2004.00147.x>, Bilder and Loughin (2007) <doi:10.1080/03610920600974419>, and Koziol and Bilder (2014) <https://journal.r-project.org/articles/RJ-2014-014/>.
The Sequence of Physical Processes (SPP) framework is a way of interpreting the transient data derived from oscillatory rheological tests. It is designed to allow both the linear and non-linear deformation regimes to be understood within a single unified framework. This code provides a convenient way to determine the SPP framework metrics for a given sample of oscillatory data. It will produce a text file containing the SPP metrics, which the user can then plot using their software of choice. It can also produce a second text file with additional derived data (components of tangent, normal, and binormal vectors), as well as pre-plotted figures if so desired. It is the R version of the Package SPP by Simon Rogers Group for Soft Matter (Simon A. Rogers, Brian M. Erwin, Dimitris Vlassopoulos, Michel Cloitre (2011) <doi:10.1122/1.3544591>).
Mixture Nested Effects Models (mnem) is an extension of Nested Effects Models and allows for the analysis of single cell perturbation data provided by methods like Perturb-Seq (Dixit et al., 2016) or Crop-Seq (Datlinger et al., 2017). In those experiments each of many cells is perturbed by a knock-down of a specific gene, i.e. several cells are perturbed by a knock-down of gene A, several by a knock-down of gene B, and so forth. The observed read-out has to be multi-trait and, in the case of Perturb-/Crop-Seq, consists of gene expression profiles for each cell. mnem uses a mixture model to simultaneously cluster the cell population into k clusters and infer k networks causally linking the perturbed genes for each cluster. The mixture components are inferred via an expectation maximization algorithm.
This package implements the EM algorithm with a one-step Gradient Descent method to estimate the parameters of the Block-Basu bivariate Pareto distribution with location and scale. It also computes parametric bootstrap and asymptotic confidence intervals based on the observed Fisher information of the scale and shape parameters, and exact confidence intervals for the location parameters. Details are in Biplab Paul and Arabin Kumar Dey (2023) <doi:10.48550/arXiv.1608.02199> "An EM algorithm for absolutely continuous Marshall-Olkin bivariate Pareto distribution with location and scale"; E L Lehmann and George Casella (1998) <doi:10.1007/b98854> "Theory of Point Estimation"; Bradley Efron and R J Tibshirani (1994) <doi:10.1201/9780429246593> "An Introduction to the Bootstrap"; A P Dempster, N M Laird and D B Rubin (1977) <www.jstor.org/stable/2984875> "Maximum Likelihood from Incomplete Data via the EM Algorithm".
The recovery of visual sensitivity in a dark environment is known as dark adaptation. In a clinical or research setting the recovery is typically measured after a dazzling flash of light and can be described by the Mahroo, Lamb and Pugh (MLP) model of dark adaptation. The functions in this package take dark adaptation data and use nonlinear regression to find the parameters of the model that best describe the data. They do this by first generating rapid initial objective estimates of the dark adaptation parameters and then using a multi-start algorithm to reduce the possibility of converging to a local minimum. There is also a bootstrap method to calculate parameter confidence intervals. The functions rely upon a dark list or object. This object is created as the first step in the workflow and parts of the object are updated as it is processed.
This package provides a collection of functions developed to support the tutorial on using Exploratory Structural Equation Modeling (ESEM) (Asparouhov & Muthén, 2009) <https://www.statmodel.com/download/EFACFA810.pdf> with the Longitudinal Study of Australian Children (LSAC) dataset (Mohal et al., 2023) <doi:10.26193/QR4L6Q>. The package uses 'tidyverse', 'psych', 'lavaan', and 'semPlot' and provides additional functions to conduct ESEM. These include general functions to complete ESEM, such as esem_c(), creation of the target matrix (if one is used) with make_target(), and generation of the Confirmatory Factor Analysis (CFA) model syntax with esem_cfa_syntax(). The package includes sample data from the Strengths and Difficulties Questionnaire of the Longitudinal Study of Australian Children (SDQ LSAC) in sdq_lsac(). The ESEM package vignette presents the tutorial demonstrating the use of ESEM on SDQ LSAC data.
This package provides functions to conduct a model-agnostic asymptotic hypothesis test for the identification of interaction effects in black-box machine learning models. The null hypothesis assumes that a given set of covariates does not contribute to interaction effects in the prediction model. The test statistic is based on the difference of variances of partial dependence functions (Friedman (2008) <doi:10.1214/07-AOAS148> and Welchowski (2022) <doi:10.1007/s13253-021-00479-7>) with respect to the original black-box predictions and the predictions under the null hypothesis. The hypothesis test can be applied to any black-box prediction model, and the null hypothesis of the test can be flexibly specified according to the research question of interest. Furthermore, the test is computationally fast to apply as the null distribution does not require resampling or refitting black-box prediction models.
This package provides new data-structure support for multi-precision computing for R users. The package supports 16-bit, 32-bit, and 64-bit operations. To the best of our knowledge, MPCR differs from the currently available packages in the following: MPCR introduces a new data structure that supports three different precisions (16-bit, 32-bit, and 64-bit), allowing for optimized memory allocation based on the desired precision. This feature offers significant advantages in memory optimization. MPCR extends support to all basic linear algebra methods across different precisions. Optional GPU acceleration via CUDA is available for 32-bit and 64-bit operations when CUDA Toolkit is detected during installation, while 16-bit operations are GPU-only and limited to matrix-matrix multiplication. MPCR maintains a consistent interface with normal R functions, allowing for seamless code integration and a user-friendly experience.
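The memory and accuracy trade-off between the three precisions can be seen in a small sketch that uses only Python's standard `struct` module. It illustrates IEEE 754 half, single, and double storage in general and says nothing about how MPCR stores its data internally.

```python
import struct

# struct format codes: 'e' = 16-bit half, 'f' = 32-bit single, 'd' = 64-bit double
FORMATS = {"16-bit": "<e", "32-bit": "<f", "64-bit": "<d"}

def round_trip(value, fmt):
    """Store `value` at the given precision and read it back as a Python float."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

# Bytes needed per stored element at each precision.
bytes_per_element = {name: struct.calcsize(fmt) for name, fmt in FORMATS.items()}

# Absolute error from storing 0.1 at each precision (relative to the double value).
storage_error = {name: abs(round_trip(0.1, fmt) - 0.1) for name, fmt in FORMATS.items()}
```

A 16-bit element takes a quarter of the memory of a 64-bit one, at the cost of a much larger rounding error, which is the trade-off a multi-precision data structure lets the user control.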
This package provides functions for generating pseudo-random numbers that follow a uniform distribution on [0,1]. Randomness tests were conducted using the National Institute of Standards and Technology test suite <https://csrc.nist.gov/pubs/sp/800/22/r1/upd1/final>, along with additional tests. The sequence generated depends on the initial values and parameters. The package includes a linear congruence map as the decision map and three chaotic maps to generate the pseudo-random sequence, which follows a uniform distribution. Other distributions can be generated from the uniform distribution using the Inversion Principle Method and the Box-Muller transformation. Small perturbations in seed values result in entirely different sequences of numbers due to the sensitive nature of the maps being used. The chaotic nature of the maps helps achieve randomness in the generator. Additionally, the generator is capable of producing random bits.
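A minimal sketch of the general idea follows, assuming a single logistic map with parameter 4 as the chaotic source. The description does not name the package's three chaotic maps or its decision map, so this is a stand-in, and the function names are hypothetical.

```python
import math

def logistic_map_uniforms(seed, n, burn_in=100):
    """Iterate x -> 4x(1-x) and map each state to an approximately uniform value."""
    x = seed
    out = []
    for i in range(burn_in + n):
        x = 4.0 * x * (1.0 - x)
        if i >= burn_in:
            # The r = 4 logistic map has an arcsine invariant density;
            # (2/pi) * asin(sqrt(x)) turns it into a uniform one on [0, 1].
            out.append(2.0 / math.pi * math.asin(math.sqrt(min(x, 1.0))))
    return out

def box_muller(u1, u2):
    """Box-Muller transformation: two uniforms in, two standard normal draws out."""
    r = math.sqrt(-2.0 * math.log(max(u1, 1e-300)))  # guard against log(0)
    angle = 2.0 * math.pi * u2
    return r * math.cos(angle), r * math.sin(angle)

def exponential_by_inversion(u, rate):
    """Inversion method: apply the exponential quantile function to a uniform."""
    return -math.log(max(1.0 - u, 1e-300)) / rate

uniforms = logistic_map_uniforms(seed=0.123456789, n=4000)
nearby = logistic_map_uniforms(seed=0.123456790, n=4000)  # seed perturbed by 1e-9

# Pair values from the two halves of the sequence, since consecutive states
# of a deterministic map are strongly related to each other.
half = len(uniforms) // 2
normals = [z for u1, u2 in zip(uniforms[:half], uniforms[half:])
           for z in box_muller(u1, u2)]
```

The two seeds differ only in the ninth decimal place, yet after the burn-in the sequences no longer agree, which is the sensitivity to initial values described above.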
The strength of evidence provided by epidemiological and observational studies is inherently limited by the potential for unmeasured confounding. We focus on three key quantities: the observed bound of the confidence interval closest to the null, the relationship between an unmeasured confounder and the outcome, for example a plausible residual effect size for an unmeasured continuous or binary confounder, and the relationship between an unmeasured confounder and the exposure, for example a realistic mean difference or prevalence difference for this hypothetical confounder between exposure groups. Building on the methods put forth by Cornfield et al. (1959), Bross (1966), Schlesselman (1978), Rosenbaum & Rubin (1983), Lin et al. (1998), Lash et al. (2009), Rosenbaum (1986), Cinelli & Hazlett (2020), VanderWeele & Ding (2017), and Ding & VanderWeele (2016), we can use these quantities to assess how an unmeasured confounder may tip our result to insignificance.
Learning and inference over dynamic Bayesian networks of arbitrary Markovian order. Extends some of the functionality offered by the bnlearn package to learn the networks from data and perform exact inference. It offers three structure learning algorithms for dynamic Bayesian networks: Trabelsi G. (2013) <doi:10.1007/978-3-642-41398-8_34>, Santos F.P. and Maciel C.D. (2014) <doi:10.1109/BRC.2014.6880957>, Quesada D., Bielza C. and Larrañaga P. (2021) <doi:10.1007/978-3-030-86271-8_14>. It also offers the possibility to perform forecasts of arbitrary length. A tool for visualizing the structure of the net is also provided via the visNetwork package. Further detailed information and examples can be found in our Journal of Statistical Software paper Quesada D., Larrañaga P. and Bielza C. (2025) <doi:10.18637/jss.v115.i06>.
This package provides the probability density function (PDF), cumulative distribution function (CDF), the first-order and second-order partial derivatives of the PDF, and a fitting function for the diffusion decision model (DDM; e.g., Ratcliff & McKoon, 2008, <doi:10.1162/neco.2008.12-06-420>) with across-trial variability in the drift rate. Because the PDF, its partial derivatives, and the CDF of the DDM all contain an infinite sum, they need to be approximated. fddm implements all published approximations (Navarro & Fuss, 2009, <doi:10.1016/j.jmp.2009.02.003>; Gondan, Blurton, & Kesselmeier, 2014, <doi:10.1016/j.jmp.2014.05.002>; Blurton, Kesselmeier, & Gondan, 2017, <doi:10.1016/j.jmp.2016.11.003>; Hartmann & Klauer, 2021, <doi:10.1016/j.jmp.2021.102550>) plus new approximations. All approximations are implemented purely in C++ providing faster speed than existing packages.
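To show what approximating the infinite sum means in practice, here is a sketch of the truncated "large-time" series for the first-passage density at the lower boundary, in the form given by Navarro & Fuss (2009). It assumes a constant drift rate (no across-trial variability), a fixed number of terms instead of an error-controlled truncation, and hypothetical function names; it is not fddm's code.

```python
import math

def ddm_lower_density(t, v, a, w, n_terms=100):
    """Truncated large-time series for the lower-boundary density of the DDM.

    t: decision time, v: drift rate, a: boundary separation,
    w: relative starting point in (0, 1).
    """
    if t <= 0.0:
        return 0.0
    series = sum(
        k * math.exp(-((k * math.pi) ** 2) * t / (2.0 * a * a)) * math.sin(k * math.pi * w)
        for k in range(1, n_terms + 1)
    )
    return (math.pi / (a * a)) * math.exp(-v * a * w - 0.5 * v * v * t) * series

def lower_boundary_probability(v, a, w):
    """Closed-form probability of absorption at the lower boundary (v != 0)."""
    return (math.exp(-2.0 * v * a) - math.exp(-2.0 * v * a * w)) / (math.exp(-2.0 * v * a) - 1.0)

density_at_half_second = ddm_lower_density(0.5, v=1.0, a=1.0, w=0.5)
```

The series converges quickly for large t and slowly for small t, which is why the published approximations differ mainly in how they choose the number of terms and when they switch to a small-time representation.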
A variety of latent Markov models, including hidden Markov models, hidden semi-Markov models, state-space models and continuous-time variants, can be formulated and estimated within the same framework by directly maximising the likelihood function using the so-called forward algorithm. Applied researchers often need custom models that standard software does not easily support. Writing tailored R code offers flexibility but suffers from slow estimation speeds. We address these issues by providing easy-to-use functions (written in C++ for speed) for common tasks like the forward algorithm. These functions can be combined into custom models in a Lego-type approach, offering up to 10-20 times faster estimation via standard numerical optimisers. To aid in building fully custom likelihood functions, several vignettes are included that show how to simulate data from and estimate all the above model classes.
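For readers unfamiliar with the forward algorithm, the sketch below computes the log-likelihood of a small hidden Markov model with scaling at every step to avoid numerical underflow. The two-state Gaussian model, its parameter values, and the function names are assumptions made for illustration; this is plain Python, not the package's C++ routine.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def hmm_log_likelihood(delta, gamma, allprobs):
    """Scaled forward algorithm.

    delta: initial state distribution, gamma: transition matrix,
    allprobs[t][j]: density of observation t under state j.
    """
    n_states = len(delta)
    alpha = [delta[j] * allprobs[0][j] for j in range(n_states)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for probs in allprobs[1:]:
        # Propagate through the transition matrix, then weight by the densities.
        alpha = [sum(alpha[i] * gamma[i][j] for i in range(n_states)) * probs[j]
                 for j in range(n_states)]
        scale = sum(alpha)
        log_lik += math.log(scale)  # accumulate the log of each scaling factor
        alpha = [a / scale for a in alpha]
    return log_lik

# Toy two-state model with Gaussian state-dependent distributions.
delta = [0.5, 0.5]
gamma = [[0.9, 0.1], [0.2, 0.8]]
means, sds = [0.0, 2.0], [1.0, 1.0]
observations = [0.1, 2.3, 1.9, -0.4]
allprobs = [[normal_pdf(x, m, s) for m, s in zip(means, sds)] for x in observations]

log_lik = hmm_log_likelihood(delta, gamma, allprobs)
```

Wrapping such a function in a negative log-likelihood that takes a parameter vector is what allows a standard numerical optimiser to fit the model directly.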
This package provides functions to fit point process models using the Palm likelihood. First proposed by Tanaka, Ogata, and Stoyan (2008) <DOI:10.1002/bimj.200610339>, maximisation of the Palm likelihood can provide computationally efficient parameter estimation for point process models in situations where the full likelihood is intractable. This package is chiefly focused on Neyman-Scott point processes, but can also fit the void processes proposed by Jones-Todd et al. (2019) <DOI:10.1002/sim.8046>. The development of this package was motivated by the analysis of capture-recapture surveys on which individuals cannot be identified---the data from which can conceptually be seen as a clustered point process (Stevenson, Borchers, and Fewster, 2019 <DOI:10.1111/biom.12983>). As such, some of the functions in this package are specifically for the estimation of cetacean density from two-camera aerial surveys.
This package contains functions for statistical data analysis based on spatially-clustered techniques. The package allows estimating the spatially-clustered spatial regression models presented in Cerqueti, Maranzano & Mattera (2024), "Spatially-clustered spatial autoregressive models with application to agricultural market concentration in Europe", arXiv preprint 2407.15874 <doi:10.48550/arXiv.2407.15874>. Specifically, the current release allows the estimation of the spatially-clustered linear regression model (SCLM), the spatially-clustered spatial autoregressive model (SCSAR), the spatially-clustered spatial Durbin model (SCSEM), and the spatially-clustered linear regression model with spatially-lagged exogenous covariates (SCSLX). From release 0.0.2, the library contains functions to estimate spatial clustering based on Adjacent Matrix K-Means (AMKM) as described in Zhou, Liu & Zhu (2019), "Weighted adjacent matrix for K-means clustering", Multimedia Tools and Applications, 78 (23) <doi:10.1007/s11042-019-08009-x>.
An advanced version of package 's2dverification'. Intended for seasonal to decadal (s2d) climate forecast verification, but also applicable to other types of forecasts or general climate analysis. This package is specifically designed for comparing experimental and observational datasets. It provides functionality for data retrieval, post-processing, skill score computation against observations, and visualization. Compared to 's2dverification', 's2dv' is more compatible with the package 'startR', able to use multiple cores for computation and handle multi-dimensional arrays with a higher flexibility. The Climate Data Operators (CDO) version used in development is 1.9.8. Implements methods described in Wilks (2011) <doi:10.1016/B978-0-12-385022-5.00008-7>, DelSole and Tippett (2016) <doi:10.1175/MWR-D-15-0218.1>, Kharin et al. (2012) <doi:10.1029/2012GL052647>, Doblas-Reyes et al. (2003) <doi:10.1007/s00382-003-0350-4>.
Computes the penalized maximum likelihood estimates of factor loadings and unique variances for various tuning parameters. Pathwise coordinate descent is used together with the EM algorithm. This package also includes a new graphical tool which outputs a path diagram, goodness-of-fit indices and model selection criteria for each regularization parameter (Yamamoto, M., Hirose, K. and Nagata, H., 2017 <doi:10.1007/s41237-016-0007-3>). The user can change the regularization parameter by manipulating scrollbars, which helps in finding a suitable value of the regularization parameter. As a penalty, we can choose either the minimax concave penalty (Hirose, K. and Yamamoto, M., 2015 <doi:10.1007/s11222-014-9458-0>; Hirose, K. and Yamamoto, M., 2014 <doi:10.1016/j.csda.2014.05.011>) or the product-based elastic net penalty (Hirose, K. and Terada, Y., 2023 <doi:10.1007/s11336-022-09868-4>).
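The minimax concave penalty and the closed-form update it yields in a single coordinate-descent step can be sketched as follows. The sketch covers only the generic univariate problem with unit curvature and gamma > 1, using hypothetical function names; the factor-analysis likelihood, the EM steps, and the path over tuning parameters are not shown.

```python
def mcp_penalty(theta, lam, gamma):
    """Minimax concave penalty with regularization lam and concavity gamma."""
    a = abs(theta)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam  # constant beyond gamma * lam

def soft_threshold(z, lam):
    """Lasso update: shrink z towards zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def mcp_threshold(z, lam, gamma):
    """Minimiser of 0.5 * (z - theta)**2 + mcp_penalty(theta, lam, gamma), gamma > 1."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # large coefficients are left unshrunk

# Small, medium, and large inputs with lam = 1 and gamma = 3.
updates = [mcp_threshold(z, lam=1.0, gamma=3.0) for z in (0.5, 1.5, 4.0)]
```

Small inputs are set exactly to zero, which produces sparse loadings, while large inputs are not shrunk at all, which is the main difference from the lasso's soft-thresholding.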