The wavelet-based variance transformation method is used for system modelling and prediction. It refines predictor spectral representation using Wavelet Theory, which leads to improved model specifications and prediction accuracy. Details of methodologies used in the package can be found in Jiang, Z., Sharma, A., & Johnson, F. (2020) <doi:10.1029/2019WR026962>, Jiang, Z., Rashid, M. M., Johnson, F., & Sharma, A. (2020) <doi:10.1016/j.envsoft.2020.104907>, and Jiang, Z., Sharma, A., & Johnson, F. (2021) <doi:10.1016/J.JHYDROL.2021.126816>.
This package creates complex autoregressive distributed lag (ARDL) models and constructs the underlying unrestricted and restricted error correction model (ECM) automatically, just by providing the order. It also performs the bounds-test for cointegration as described in Pesaran et al. (2001) <doi:10.1002/jae.616> and provides the multipliers and the cointegrating equation. The validity and the accuracy of this package have been verified by successfully replicating the results of Pesaran et al. (2001) in Natsiopoulos and Tzeremes (2022) <doi:10.1002/jae.2919>.
MCMC algorithms & processing functions for: 1. single response multiple regression, see Papageorgiou, G. (2018) <doi: 10.32614/RJ-2018-069>, 2. multivariate response multiple regression, with nonparametric models for the means, the variances and the correlation matrix, with variable selection, see Papageorgiou, G. and Marshall, B. C. (2020) <doi: 10.1080/10618600.2020.1739534>, 3. joint mean-covariance models for multivariate responses, see Papageorgiou, G. (2022) <doi: 10.1002/sim.9376>, and 4.Dirichlet process mixtures, see Papageorgiou, G. (2019) <doi: 10.1111/anzs.12273>.
Detects multiple changes in slope using the CPOP dynamic programming approach of Fearnhead, Maidstone, and Letchford (2019) <doi:10.1080/10618600.2018.1512868>. This method finds the best continuous piecewise linear fit to data under a criterion that measures fit to data using the residual sum of squares, but penalizes complexity based on an L0 penalty on changes in slope. Further information regarding the use of this package with detailed examples can be found in Fearnhead and Grose (2024) <doi:10.18637/jss.v109.i07>.
This package provides a set of functions to perform Raju, van der Linden and Fleer's (1995, <doi:10.1177/014662169501900405>) Differential Functioning of Items and Tests (DFIT) analyses. It includes functions to use the Monte Carlo Item Parameter Replication approach (Oshima, Raju, & Nanda, 2006, <doi:10.1111/j.1745-3984.2006.00001.x>) for obtaining the associated statistical significance tests cut-off points. They may also be used for a priori and post-hoc power calculations (Cervantes, 2017, <doi:10.18637/jss.v076.i05>).
This package provides functions for treatment effect estimation, hypothesis testing, and future study design for settings where the surrogate is used in place of the primary outcome for individuals for whom the surrogate is valid, and the primary outcome is purposefully measured in the remaining patients. More details are available in: Knowlton, R., Parast, L. (2024) ``Efficient Testing Using Surrogate Information," Biometrical Journal, 67(6): e70086, <doi:10.1002/bimj.70086>. A tutorial for this package can be found at <https://www.laylaparast.com/etsi>.
Fits the space-time Epidemic Type Aftershock Sequence ('ETAS') model to earthquake catalogs using a stochastic declustering approach. The ETAS model is a spatio-temporal marked point process model and a special case of the Hawkes process. The package is based on a Fortran program by Jiancang Zhuang (available at <https://bemlar.ism.ac.jp/zhuang/software.html>), which is modified and translated into C++ and C such that it can be called from R. Parallel computing with OpenMP is possible on supported platforms.
User-friendly and fast set of functions for estimating parameters of hierarchical Bayesian species distribution models (Latimer and others 2006 <doi:10.1890/04-0609>). Such models allow interpreting the observations (occurrence and abundance of a species) as a result of several hierarchical processes including ecological processes (habitat suitability, spatial dependence and anthropogenic disturbance) and observation processes (species detectability). Hierarchical species distribution models are essential for accurately characterizing the environmental response of species, predicting their probability of occurrence, and assessing uncertainty in the model results.
Identify Cancer Dysfunctional Sub-pathway by integrating gene expression, DNA methylation and copy number variation, and pathway topological information. 1)We firstly calculate the gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2)Secondly, we perform a greedy search algorithm to identify the key dysfunctional sub-pathways within the pathways for which the discriminative scores were locally maximal. 3)Finally, the permutation test was used to calculate statistical significance level for these key dysfunctional sub-pathways.
Interpretability of complex machine learning models is a growing concern. This package helps to understand key factors that drive the decision made by complicated predictive model (so called black box model). This is achieved through local approximations that are either based on additive regression like model or CART like model that allows for higher interactions. The methodology is based on Tulio Ribeiro, Singh, Guestrin (2016) <doi:10.1145/2939672.2939778>. More details can be found in Staniak, Biecek (2018) <doi:10.32614/RJ-2018-072>.
To determine the number of quantitative assays needed for a sample of data using pooled testing methods, which include mini-pooling (MP), MP with algorithm (MPA), and marker-assisted MPA (mMPA). To estimate the number of assays needed, the package also provides a tool to conduct Monte Carlo (MC) to simulate different orders in which the sample would be collected to form pools. Using MC avoids the dependence of the estimated number of assays on any specific ordering of the samples to form pools.
Functions, examples and data from the first and the second edition of "Numerical Methods and Optimization in Finance" by M. Gilli, D. Maringer and E. Schumann (2019, ISBN:978-0128150658). The package provides implementations of optimisation heuristics (Differential Evolution, Genetic Algorithms, Particle Swarm Optimisation, Simulated Annealing and Threshold Accepting), and other optimisation tools, such as grid search and greedy search. There are also functions for the valuation of financial instruments such as bonds and options, for portfolio selection and functions that help with stochastic simulations.
This package provides functions for classifying sparseness in 2 x 2 categorical data where one or more cells have zero counts. The classification uses three widely applied summary measures: Risk Difference (RD), Relative Risk (RR), and Odds Ratio (OR). Helps in selecting suitable continuity corrections for zero cells in multi-centre or meta-analysis studies. Also supports sensitivity analysis and can detect phenomena such as Simpson's paradox. The methodology is based on Subbiah and Srinivasan (2008) <doi:10.1016/j.spl.2008.06.023>.
This is the R API for the nfer formalism (<http://nfer.io/>). nfer was developed to specify event stream abstractions for spacecraft telemetry such as the Mars Science Laboratory. Users write rules using a syntax that borrows heavily from Allen's Temporal Logic that, when applied to an event stream, construct a hierarchy of temporal intervals with data. The R API supports loading rules from a file or mining them from historical data. Traces of events or pools of intervals are provided as data frames.
We provide a toolbox to fit and simulate a univariate or multivariate damped random walk process that is also known as an Ornstein-Uhlenbeck process or a continuous-time autoregressive model of the first order, i.e., CAR(1) or CARMA(1, 0). This process is suitable for analyzing univariate or multivariate time series data with irregularly-spaced observation times and heteroscedastic measurement errors. When it comes to the multivariate case, the number of data points (measurements/observations) available at each observation time does not need to be the same, and the length of each time series can vary. The number of time series data sets that can be modeled simultaneously is limited to ten in this version of the package. We use Kalman-filtering to evaluate the resulting likelihood function, which leads to a scalable and efficient computation in finding maximum likelihood estimates of the model parameters or in drawing their posterior samples. Please pay attention to loading the data if this package is used for astronomical data analyses; see the details in the manual. Also see Hu and Tak (2020) <arXiv:2005.08049>.
This package provides smooth additive quantile regression models, fitted using the methods of Fasiolo et al. (2017). Differently from quantreg, the smoothing parameters are estimated automatically by marginal loss minimization, while the regression coefficients are estimated using either PIRLS or Newton algorithm. The learning rate is determined so that the Bayesian credible intervals of the estimated effects have approximately the correct coverage. The main function is qgam() which is similar to gam() in the mgcv package, but fits non-parametric quantile regression models.
Rasqal is a C library that handles Resource Description Framework (RDF) query language syntaxes, query construction and execution of queries returning results as bindings, boolean, RDF graphs/triples or syntaxes. The supported query languages are SPARQL Query 1.0, SPARQL Query 1.1, SPARQL Update 1.1 (no executing) and the Experimental SPARQL extensions (LAQRS). Rasqal can write binding query results in the SPARQL XML, SPARQL JSON, CSV, TSV, HTML, ASCII tables, RDF/XML and Turtle/N3 and read them in SPARQL XML, RDF/XML and Turtle/N3.
Genome-wide association studies (GWAS) is a widely used tool for identification of genetic variants associated with phenotypes and diseases, though complex diseases featuring many genetic variants with small effects present difficulties for traditional these studies. By leveraging pleiotropy, the statistical power of a single GWAS can be increased. This package provides functions for fitting graph-GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy. GGPA package provides user-friendly interface to fit graph-GPA models, implement association mapping, and generate a phenotype graph.
Efficient methods for Bayesian inference of state space models via Markov chain Monte Carlo (MCMC) based on parallel importance sampling type weighted estimators (Vihola, Helske, and Franks, 2020, <doi:10.1111/sjos.12492>), particle MCMC, and its delayed acceptance version. Gaussian, Poisson, binomial, negative binomial, and Gamma observation densities and basic stochastic volatility models with linear-Gaussian state dynamics, as well as general non-linear Gaussian models and discretised diffusion models are supported. See Helske and Vihola (2021, <doi:10.32614/RJ-2021-103>) for details.
How to fit a straight line through a set of points with errors in both coordinates? The bfsl package implements the York regression (York, 2004 <doi:10.1119/1.1632486>). It provides unbiased estimates of the intercept, slope and standard errors for the best-fit straight line to independent points with (possibly correlated) normally distributed errors in both x and y. Other commonly used errors-in-variables methods, such as orthogonal distance regression, geometric mean regression or Deming regression are special cases of the bfsl solution.
Facilitates the identification of counterfactual queries in structural causal models via the ID* and IDC* algorithms by Shpitser, I. and Pearl, J. (2007, 2008) <doi:10.48550/arXiv.1206.5294>, <https://jmlr.org/papers/v9/shpitser08a.html>. Provides a simple interface for defining causal diagrams and counterfactual conjunctions. Construction of parallel worlds graphs and counterfactual graphs is carried out automatically based on the counterfactual query and the causal diagram. See Tikka, S. (2023) <doi:10.32614/RJ-2023-053> for a tutorial of the package.
Fitting (hierarchical) hidden Markov models to financial data via maximum likelihood estimation. See Oelschläger, L. and Adam, T. "Detecting Bearish and Bullish Markets in Financial Time Series Using Hierarchical Hidden Markov Models" (2021, Statistical Modelling) <doi:10.1177/1471082X211034048> for a reference on the method. A user guide is provided by the accompanying software paper "fHMM: Hidden Markov Models for Financial Time Series in R", Oelschläger, L., Adam, T., and Michels, R. (2024, Journal of Statistical Software) <doi:10.18637/jss.v109.i09>.
This package implements statistical methods for group factor analysis, focusing on estimating the number of global and local factors and extracting them. Several algorithms are implemented, including Canonical Correlation-based Estimation by Choi et al. (2021) <doi:10.1016/j.jeconom.2021.09.008>, Generalised Canonical Correlation Estimation by Lin and Shin (2023) <doi:10.2139/ssrn.4295429>, Circularly Projected Estimation by Chen (2022) <doi:10.1080/07350015.2022.2051520>, and the Aggregated Projection Method by Hu et al. (2025) <doi:10.1080/01621459.2025.2491154>.
This package provides an extension of the shadow-test approach to computerized adaptive testing (CAT) implemented in the TestDesign package for the assessment framework involving multiple tests administered periodically throughout the year. This framework is referred to as the Multiple Administrations Adaptive Testing (MAAT) and supports multiple item pools vertically scaled and multiple phases (stages) of CAT within each test. Between phases and tests, transitioning from one item pool (and associated constraints) to another is allowed as deemed necessary to enhance the quality of measurement.