The Central Bank of the Republic of Turkey (CBRT) provides one of the most comprehensive time series databases on the Turkish economy. The CBRT package provides functions for accessing the CBRT's electronic data delivery system <https://evds3.tcmb.gov.tr/>. It contains the lists of all data categories and data groups for searching the available variables (data series). As of February 17, 2026, there were 47,986 variables in the dataset. The lists of data categories and data groups can be updated by the user at any time. A specific variable, a group of variables, or all variables in a data group can be downloaded at different frequencies using a variety of aggregation methods.
Nonnegative matrix factorization (NMF) is a technique to factorize a matrix with nonnegative values into the product of two nonnegative matrices. Covariates are also allowed. Parallel computing is available to improve speed, and high-dimensional, large-scale, and/or sparse data are supported. Relevant papers include: Wang Y. X. and Zhang Y. J. (2012). Nonnegative matrix factorization: A comprehensive review. IEEE Transactions on Knowledge and Data Engineering, 25(6): 1336-1353 <doi:10.1109/TKDE.2012.51> and Kim H. and Park H. (2008). Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM Journal on Matrix Analysis and Applications, 30(2): 713-730 <doi:10.1137/07069239X>.
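As a rough illustration of what the factorization computes, the sketch below approximates a small nonnegative matrix V by W times H using the classic multiplicative update rules of Lee and Seung. This is not the package's algorithm (the cited Kim and Park paper uses alternating nonnegativity-constrained least squares), and the function names nmf() and frobenius_error() are hypothetical.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Approximate v (n x m, nonnegative) by w (n x rank) times h (rank x m).

    Uses multiplicative updates, which keep every entry nonnegative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of v - w h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5


V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 1.0, 2.0], [4.0, 3.0, 5.0]]
W, H = nmf(V, rank=2)
```

The reconstruction error after the updates can be compared with the error of the random starting point (iters=0) to confirm that the factorization improved.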
Find the permutation symmetry group such that the covariance matrix of the given data is approximately invariant under it. Discovering such a permutation decreases the number of observations needed to fit a Gaussian model, which is of great use when the number of observations is smaller than the number of variables. Even if that is not the case, the covariance matrix found with gips approximates the actual covariance with less statistical error. The methods implemented in this package are described in Graczyk et al. (2022) <doi:10.1214/22-AOS2174>. Documentation about gips is provided via its website at <https://przechoj.github.io/gips/> and the paper by Chojecki, Morgen, Kołodziejek (2025, <doi:10.18637/jss.v112.i07>).
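To make "invariant under a permutation" concrete, the sketch below takes a covariance matrix and a known permutation, and averages the matrix over the cyclic group generated by that permutation. It only illustrates the projection step under the assumption that the permutation is already given; it does not search for the permutation, and the function names are hypothetical rather than part of gips.

```python
def cyclic_group(perm):
    """Return all powers of perm, where perm[i] is the image of index i."""
    identity = list(range(len(perm)))
    group, current = [identity], list(perm)
    while current != identity:
        group.append(current)
        current = [perm[i] for i in current]  # compose with perm once more
    return group


def project(cov, perm):
    """Average cov over the cyclic group generated by perm.

    The result is unchanged when rows and columns are permuted by perm.
    """
    group = cyclic_group(perm)
    p = len(cov)
    return [[sum(cov[g[i]][g[j]] for g in group) / len(group) for j in range(p)]
            for i in range(p)]


# Hypothetical 3 x 3 sample covariance and the cyclic shift 0 -> 1 -> 2 -> 0.
S = [[1.0, 0.5, 0.2],
     [0.5, 2.0, 0.3],
     [0.2, 0.3, 3.0]]
shift = [1, 2, 0]
S_invariant = project(S, shift)
```

Because the averaged matrix has fewer free parameters (here one common variance and one common covariance), fewer observations are needed to estimate it.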
The minorization-maximization (MM) algorithm is a powerful tool for maximizing nonconcave target functions. However, for most existing MM algorithms, the surrogate function in the minorization step is constructed in a case-specific manner and requires manual programming. To address this limitation, we develop the R package MMAD, which systematically integrates the assembly-decomposition technology in the MM framework. This new package provides a comprehensive computational toolkit for one-stop inference of complex target functions, including function construction, evaluation, minorization and optimization via the MM algorithm. By representing the target function through a hierarchical composition of assembly functions, we design a hierarchical algorithmic structure that supports both bottom-up operations (construction, evaluation) and the top-down operation (minorization).
marr (Maximum Rank Reproducibility) is a nonparametric approach that detects reproducible signals using a maximal rank statistic for high-dimensional biological data. In this R package, we implement functions that measure the reproducibility of features per sample pair and sample pairs per feature in high-dimensional biological replicate experiments. The user-friendly plot functions in this package also plot histograms of the reproducibility of features per sample pair and sample pairs per feature. Furthermore, our approach allows users to select optimal filtering threshold values for the identification of reproducible features and sample pairs based on output visualization checks (histograms). This package also provides the subset of data filtered by reproducible features and/or sample pairs.
This package provides an efficient implementation of the Cross-Covariance Isolate Detect (CCID) methodology for the estimation of the number and location of multiple change-points in the second-order (cross-covariance or network) structure of multivariate, possibly high-dimensional time series. The method is motivated by the detection of change-points in functional connectivity networks for functional magnetic resonance imaging (fMRI), electroencephalography (EEG), magnetoencephalography (MEG) and electrocorticography (ECoG) data. The main routines in the package have been extensively tested on fMRI data. For details on the CCID methodology, please see Anastasiou et al. (2022), Cross-covariance isolate detect: A new change-point method for estimating dynamic functional connectivity. Medical Image Analysis, Volume 75.
Fit response surfaces for datasets with latent-variable Gaussian process modeling, predict responses for new inputs, and plot latent variable locations in the latent space (only 1D or 2D). The input variables of the datasets can be quantitative, qualitative/categorical or mixed. The output variable of the datasets is a scalar (quantitative). The optimization of the likelihood function is done using a successive approximation/relaxation algorithm similar to that of another GP modeling package, "GPM". The modeling method is published in "A Latent Variable Approach to Gaussian Process Modeling with Qualitative and Quantitative Factors" by Yichi Zhang, Siyu Tao, Wei Chen, and Daniel W. Apley (2018) <arXiv:1806.07504>. The package is developed in IDEAL of Northwestern University.
This package provides functions to perform the Sequential Probability Ratio Test (SPRT) for hypothesis testing in Binomial, Poisson and Normal distributions. The package allows users to specify Type I and Type II error probabilities and decision thresholds, and to compare null and alternative hypotheses sequentially as data accumulate. It includes visualization tools for plotting the likelihood ratio path and decision boundaries, making it easier to interpret results. The methods are based on Wald (1945) <doi:10.1214/aoms/1177731118>, who introduced the SPRT as one of the earliest and most powerful sequential analysis techniques. This package is useful in quality control, clinical trials, and other applications requiring early decision-making. The term SPRT is an abbreviation and is used intentionally.
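The sequential decision rule can be sketched for binary (Bernoulli) observations as follows. This is a minimal illustration that uses Wald's approximate boundaries, log((1 - beta) / alpha) and log(beta / (1 - alpha)). The function name sprt_bernoulli() and its arguments are hypothetical and are not the package's interface.

```python
import math


def sprt_bernoulli(data, p0, p1, alpha=0.05, beta=0.10):
    """Wald's SPRT for H0: p = p0 against H1: p = p1 on 0/1 observations.

    Returns (decision, number of observations used).
    """
    upper = math.log((1 - beta) / alpha)   # crossing it rejects H0
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0
    llr = 0.0                              # cumulative log-likelihood ratio
    for n, x in enumerate(data, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", len(data)


# A run of successes quickly favours p1 = 0.8 over p0 = 0.5.
decision, n_used = sprt_bernoulli([1] * 20, p0=0.5, p1=0.8)
```

The test stops as soon as the cumulative log-likelihood ratio leaves the interval between the two boundaries, which is why it can reach a decision with fewer observations than a fixed-sample test.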
The Molecular Signatures Database ('MSigDB') is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis <doi:10.1016/j.cels.2015.12.004>. The msig package provides powerful, easy-to-use and flexible query functions for the 'MSigDB' database. There are 2 query modes in the msig package: online query and local query. Both queries contain 2 steps: gene set name and gene. The online search is divided into 2 modes: registered search and non-registered browse. For registered search, the email address that you registered with must be provided. Local queries can be made from a local database, which can be updated with the msig_update() function.
Uncertainty propagation analysis in spatial environmental modelling following the methodology described in Heuvelink et al. (2007) <doi:10.1080/13658810601063951> and Brown and Heuvelink (2007) <doi:10.1016/j.cageo.2006.06.015>. The package provides functions for examining the uncertainty propagation starting from input data and model parameters, via the environmental model onto model outputs. The functions include uncertainty model specification, stochastic simulation and propagation of uncertainty using Monte Carlo (MC) techniques. Uncertain variables are described by probability distributions. Both numerical and categorical data types are handled. Spatial auto-correlation within an attribute and cross-correlation between attributes are accommodated. The MC realizations may be used as input to the environmental models called from R, or externally.
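The Monte Carlo idea can be sketched in a few lines: draw each uncertain input from its probability distribution, run the model on every draw, and summarize the spread of the outputs. The example below is a non-spatial toy under stated assumptions: the model load() and its two independent normal inputs are hypothetical, and spatial auto-correlation and cross-correlation are not represented.

```python
import random
import statistics


def propagate(model, input_samplers, n_draws=20000, seed=1):
    """Run model once per Monte Carlo draw and return all outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_draws):
        # One realization of every uncertain input.
        draw = {name: sampler(rng) for name, sampler in input_samplers.items()}
        outputs.append(model(**draw))
    return outputs


def load(concentration, discharge):
    """Hypothetical environmental model: load = concentration * discharge."""
    return concentration * discharge


# Uncertainty models for the inputs (assumed independent and normal).
inputs = {
    "concentration": lambda rng: rng.gauss(2.0, 0.1),
    "discharge": lambda rng: rng.gauss(10.0, 1.0),
}

outputs = propagate(load, inputs)
mean_load = statistics.mean(outputs)
sd_load = statistics.stdev(outputs)
```

The standard deviation of the outputs is the propagated uncertainty. For this toy model it should be close to the analytical value of about 2.24.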
This package provides two main functionalities. 1 - Given a system of simultaneous equations, it decomposes the matrix of coefficients weighting the endogenous variables into three submatrices: one includes the subset of coefficients that have a causal nature in the model, two include the subset of coefficients that have an interdependent nature in the model, either at the systematic level or induced by the correlation between error terms. 2 - Given a decomposed model, it tests for the significance of the interdependent relationships acting in the system, via maximum likelihood and Wald tests, which can be built starting from the function output. For theoretical reference see Faliva (1992) <doi:10.1007/BF02589085> and Faliva and Zoia (1994) <doi:10.1007/BF02589041>.
This package implements Weighted-Average Least Squares model averaging for negative binomial regression models of Huynh (2024) <doi:10.48550/arXiv.2404.11324>, generalized linear models of De Luca, Magnus, Peracchi (2018) <doi:10.1016/j.jeconom.2017.12.007> and linear regression models of Magnus, Powell, Pruefer (2010) <doi:10.1016/j.jeconom.2009.07.004>, see also Magnus, De Luca (2016) <doi:10.1111/joes.12094>. Weighted-Average Least Squares for the linear regression model is based on the original MATLAB code by Magnus and De Luca <https://www.janmagnus.nl/items/WALS.pdf>, see also Kumar, Magnus (2013) <doi:10.1007/s13571-013-0060-9> and De Luca, Magnus (2011) <doi:10.1177/1536867X1201100402>.
Over sixty clustering algorithms are provided in this package with consistent input and output, which enables the user to try out algorithms swiftly. Additionally, 26 statistical approaches for the estimation of the number of clusters as well as the mirrored density plot (MD-plot) of clusterability are implemented. The package is published in Thrun, M.C., Stier Q.: "Fundamental Clustering Algorithms Suite" (2021), SoftwareX, <DOI:10.1016/j.softx.2020.100642>. Moreover, the fundamental clustering problems suite (FCPS) offers a variety of clustering challenges any algorithm should handle when facing real-world data, see Thrun, M.C., Ultsch A.: "Clustering Benchmark Datasets Exploiting the Fundamental Clustering Problems" (2020), Data in Brief, <DOI:10.1016/j.dib.2020.105501>.
This package provides a tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from GENEActiv <https://activinsights.com/>, binary (.gt3x) and .csv-export data from Actigraph <https://ametris.com/> devices, and binary (.cwa) and .csv-export data from Axivity <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data are stored in csv format. The package also allows for external function embedding.
Price volatility refers to the degree of variation in a price series over a certain period of time. This volatility is especially noticeable in agricultural commodities, adding uncertainty for farmers, traders, and others in the agricultural supply chain. Four commonly used volatility models are selected and implemented: GARCH, the Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model, the exponentially weighted moving average (EWMA) model and the Multiplicative Error Model (MEM). PWAVE, a weighted ensemble model based on particle swarm optimization (PSO), is proposed to combine the forecasts obtained from all the candidate models. This package has been developed using the algorithms of Paul et al. <doi:10.1007/s40009-023-01218-x> and Yeasin and Paul (2024) <doi:10.1007/s11227-023-05542-3>.
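Two of the building blocks named above can be sketched briefly: the EWMA variance recursion and a weighted combination of candidate forecasts. This is an illustration only. The smoothing constant 0.94, the initialization with the first squared return, and the fixed weights are assumptions, whereas the package chooses the ensemble weights by particle swarm optimization. The function names are hypothetical.

```python
def ewma_variance(returns, lam=0.94):
    """EWMA variance: var_t = lam * var_{t-1} + (1 - lam) * r_t ** 2.

    The recursion is started from the first squared return (an assumption).
    """
    var = returns[0] ** 2
    for r in returns[1:]:
        var = lam * var + (1 - lam) * r ** 2
    return var


def combine_forecasts(forecasts, weights):
    """Weighted ensemble of candidate forecasts; weights are normalized to sum to 1."""
    total = sum(weights)
    return sum(f * w for f, w in zip(forecasts, weights)) / total


# Hypothetical daily returns and hypothetical forecasts from two candidate models.
variance = ewma_variance([0.01, 0.02])
ensemble = combine_forecasts([1.0, 3.0], [0.25, 0.75])
```

In a PSO-based ensemble, the weights passed to combine_forecasts() would be the quantities being searched over, with forecast error as the objective.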
Although model selection is ubiquitous in scientific discovery, the stability and uncertainty of the selected model are often hard to evaluate. Characterizing the random behavior of the model selection procedure is the key to understanding and quantifying model selection uncertainty. This R package offers several graphical tools to visualize the distribution of the selected model, for example Gplot(), Hplot(), VDSM_scatterplot() and VDSM_heatmap(). To the best of our knowledge, this is the first attempt to visualize such a distribution. For what the distribution of the selected model is and how it works, please see Qin, Y. and Wang, L. (2021) "Visualization of Model Selection Uncertainty" <https://homepages.uc.edu/~qinyn/VDSM/VDSM.html>.
Estimation of the average treatment effect when controlling for high-dimensional confounders using debiased inverse propensity score weighting (DIPW). DIPW relies on the propensity score following a sparse logistic regression model, but the regression curves are not required to be estimable. Despite this, our package also allows the users to estimate the regression curves and take the estimated curves as input to our methods. Details of the methodology can be found in Yuhao Wang and Rajen D. Shah (2020) "Debiased Inverse Propensity Score Weighting for Estimation of Average Treatment Effects with High-Dimensional Confounders" <arXiv:2011.08661>. The package relies on the optimisation software MOSEK <https://www.mosek.com/> which must be installed separately; see the documentation for 'Rmosek'.
Computation of predictive information criteria (PIC) from select model object classes for model selection in predictive contexts. In contrast to the more widely used Akaike Information Criterion (AIC), which is derived under the assumption that target(s) of prediction (i.e. validation data) are independently and identically distributed to the fitting data, the PIC are derived under less restrictive assumptions and thus generalize AIC to the more practically relevant case of training/validation data heterogeneity. The methodology featured in this package is based on Flores (2021) <https://iro.uiowa.edu/esploro/outputs/doctoral/A-new-class-of-information-criteria/9984097169902771?institution=01IOWA_INST> "A new class of information criteria for improved prediction in the presence of training/validation data heterogeneity".
This package provides a novel meta-learning framework for forecast model selection using time series features. Many applications require a large number of time series to be forecast. Providing better forecasts for these time series is important in decision and policy making. We propose a classification framework which selects forecast models based on features calculated from the time series. We call this framework FFORMS (Feature-based FORecast Model Selection). FFORMS builds a mapping that relates the features of time series to the best forecast model using a random forest. The seer package is the implementation of the FFORMS algorithm. For more details see our paper at <https://www.monash.edu/business/econometrics-and-business-statistics/research/publications/ebs/wp06-2018.pdf>.
This package provides a collection of procedures for analysing, visualising, and managing single-case data. Multi-phase and multi-baseline designs are supported. Analysis methods include regression models (multilevel, multivariate, Bayesian), between-case standardised mean difference, overlap indices ('PND', 'PEM', 'PAND', 'NAP', 'PET', 'tau-u', 'IRD', 'baseline corrected tau', 'CDC'), and randomization tests. Data preparation functions support outlier detection, handling missing values, scaling, and custom transformations. An export function helps to generate HTML, Word, and LaTeX tables in a publication-friendly style. A Shiny app allows scan to be used through a graphical user interface. More details can be found in the online book 'Analyzing single-case data with R and scan', Juergen Wilbert (2026) <https://jazznbass.github.io/scan-Book/>.
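Two of the overlap indices listed above are simple enough to sketch directly. The example assumes that an increase from phase A (baseline) to phase B (intervention) is the expected direction. PND is the percentage of phase B values above the highest phase A value, and NAP is the percentage of all A-B pairs in which the B value is higher, with ties counted as half. The function names pnd() and nap() are hypothetical and are not the scan interface.

```python
def pnd(phase_a, phase_b):
    """Percentage of non-overlapping data: share of B values above max(A)."""
    ceiling = max(phase_a)
    return 100.0 * sum(b > ceiling for b in phase_b) / len(phase_b)


def nap(phase_a, phase_b):
    """Nonoverlap of all pairs: share of (a, b) pairs with b > a, ties count 0.5."""
    score = sum(
        1.0 if b > a else 0.5 if b == a else 0.0
        for a in phase_a
        for b in phase_b
    )
    return 100.0 * score / (len(phase_a) * len(phase_b))


# Hypothetical single-case data.
baseline = [2, 3, 4]
intervention = [4, 5, 6, 3]
pnd_value = pnd(baseline, intervention)
nap_value = nap(baseline, intervention)
```

PND depends only on the single most extreme baseline value, whereas NAP uses every pairwise comparison, which is why the two indices can differ noticeably on the same data.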
In randomized controlled trials (RCTs), balancing covariates is often one of the most important concerns. The CARM package provides functions to balance the covariates and generate the allocation sequence by covariate-adjusted Adaptive Randomization via Mahalanobis-distance (ARM) for RCTs. For what ARM is and how it works, please see Y. Qin, Y. Li, W. Ma, H. Yang, and F. Hu (2024). "Adaptive randomization via Mahalanobis distance" Statistica Sinica. <doi:10.5705/ss.202020.0440>. In addition, the package is also suitable for the randomization process of multi-arm trials. For details, please see Yang H, Qin Y, Wang F, et al. (2023). "Balancing covariates in multi-arm trials via adaptive randomization" Computational Statistics & Data Analysis. <doi:10.1016/j.csda.2022.107642>.
This package provides functions to download, parse, and tidy statistical data published by HM Revenue and Customs ('HMRC') on 'GOV.UK'. Returns annotated hmrc_tbl data frames with provenance metadata (source URL, fetch time, vintage, cell methods) for reproducible fiscal research. Covers monthly tax receipts (41 tax heads from 2008), VAT (from 1973), fuel duties (from 1990), tobacco duties (from 1991), annual Corporation Tax receipts, stamp duty, research and development tax credit statistics (from 2000), tax gap estimates, Income Tax liabilities by income range, and monthly property transaction counts. File URLs are resolved at runtime via the GOV.UK Content API <https://www.gov.uk/api/content>, so data is always current without hardcoded URLs. Files are cached locally between sessions.
This package implements indirect demographic methods for estimating adult mortality from orphanhood data. The package includes the standard Brass and Hill (1973) method <https://scholar.google.com/scholar_lookup?&title=Estimating%20Adult%20Mortality%20from%20Orphanhood&pages=111-23&publication_year=1973&author=Brass%2CW.&author=Hill.%2CK.>, the regression-based approach developed by Timaeus (1992) <https://pubmed.ncbi.nlm.nih.gov/12317481/>, and the adjustments proposed by Luy (2012) <doi:10.1007/s13524-012-0101-4> for low-mortality populations. A relational model is used to harmonize estimates into comparable adult mortality indicators. The package also provides diagnostic tools to assess the sensitivity of results to assumptions about the mean age of childbearing and the choice of model life table family.
This package performs iterative proportional updating given a seed table and an arbitrary number of marginal distributions. This is commonly used in population synthesis, survey raking, matrix rebalancing, and other applications. For example, a household survey may be weighted to match the known distribution of households by size from the census. An origin/destination trip matrix might be balanced to match traffic counts. The approach used by this package is based on a paper from Arizona State University (Ye, Xin, et al. (2009) <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.537.723&rep=rep1&type=pdf>). Some enhancements have been made to their work, including primary and secondary target balance/importance, general marginal agreement, and weight restriction.
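The matrix-balancing case mentioned above can be sketched with classic two-dimensional iterative proportional fitting, a simpler relative of the iterative proportional updating this package performs. The sketch alternately rescales rows and columns of a seed table until both sets of marginal targets are met. The function name ipf() and its arguments are hypothetical, and the enhancements described above (primary and secondary targets, weight restriction) are not included.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Scale a 2D seed table so that its row and column sums match the targets."""
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(max_iter):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [cell * target / total for cell in table[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = target / total
                for row in table:
                    row[j] *= factor
        # Column scaling may disturb the row sums; stop once they agree again.
        gap = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if gap < tol:
            break
    return table


# Hypothetical origin/destination seed with known trip ends.
balanced = ipf([[1.0, 1.0], [1.0, 1.0]], row_targets=[30.0, 70.0], col_targets=[40.0, 60.0])
```

The row and column targets must have the same grand total (100 here), otherwise the two sets of constraints cannot be satisfied at once.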