Perform seasonal adjustment of weekly data. The package provides a user-friendly interface for computing seasonally adjusted estimates of weekly data and includes functions for the creation of country-specific prior adjustment variables, as well as diagnostic tools to assess the quality of the adjustments. The method is described in more detail in Ginker (2023) <doi:10.13140/RG.2.2.12221.44000>.
This package performs brace expansions on strings. Made popular by Unix shells, brace expansion allows users to concisely generate certain character vectors by taking a single string and (recursively) expanding the comma-separated lists and double-period-separated integer and character sequences enclosed within braces in that string. The double-period-separated numeric integer expansion also supports padding the resulting numbers with zeros.
Calculate various cardiovascular disease risk scores from the Framingham Heart Study (FHS), the American College of Cardiology (ACC), and the American Heart Association (AHA) as described in Dâ agostino, et al (2008) <doi:10.1161/circulationaha.107.699579>, Goff, et al (2013) <doi:10.1161/01.cir.0000437741.48606.98>, and Mclelland, et al (2015) <doi:10.1016/j.jacc.2015.08.035>.
Automatic specification and estimation of reserve demand curves for central bank operations. The package can help to choose the best demand curve and identify additional explanatory variables. Various plot and predict options are included. For more details, see Chen et al. (2023) <https://www.imf.org/en/Publications/WP/Issues/2023/09/01/Modeling-the-Reserve-Demand-to-Facilitate-Central-Bank-Operations-538754>.
Dynamic treatment regime estimation and inference via G-estimation, dynamic weighted ordinary least squares (dWOLS) and Q-learning. Inference via bootstrap and recursive sandwich estimation. Estimation and inference for survival outcomes via Dynamic Weighted Survival Modeling (DWSurv). Extension to continuous treatment variables. Wallace et al. (2017) <DOI:10.18637/jss.v080.i02>; Simoneau et al. (2020) <DOI:10.1080/00949655.2020.1793341>.
This package contains a single function dclust() for divisive hierarchical clustering based on recursive k-means partitioning (k = 2). Useful for clustering large datasets where computation of a n x n distance matrix is not feasible (e.g. n > 10,000 records). For further information see Steinbach, Karypis and Kumar (2000) <http://glaros.dtc.umn.edu/gkhome/fetch/papers/docclusterKDDTMW00.pdf>.
This package provides a collection of datasets essential for functional genomic analysis. Gene names, gene positions, cytoband information, sourced from Ensembl and phenotypes association graph prepared from GWAScatalog are included. Data is available in both GRCh37 and 38 builds. These datasets facilitate a wide range of genomic studies, including the identification of genetic variants, exploration of genomic features, and post-GWAS functional analysis.
Generalized LassO applied to knot selection in multivariate B-splinE Regression (GLOBER) implements a novel approach for estimating functions in a multivariate nonparametric regression model based on an adaptive knot selection for B-splines using the Generalized Lasso. For further details we refer the reader to the paper Savino, M. E. and Lévy-Leduc, C. (2023), <arXiv:2306.00686>.
An approach to analyzing Likert response items, with an emphasis on visualizations. The stacked bar plot is the preferred method for presenting Likert results. Tabular results are also implemented along with density plots to assist researchers in determining whether Likert responses can be used quantitatively instead of qualitatively. See the likert(), summary.likert(), and plot.likert() functions to get started.
The proposed method aims at predicting the longitudinal mean response trajectory by a kernel-based estimator. The kernel estimator is constructed by imposing weights based on subject-wise similarity on L2 metric space between predictor trajectories as well as time proximity. Users could also perform variable selections to derive functional predictors with predictive significance by the proposed multiplicative model with multivariate Gaussian kernels.
Complete analytical environment for the construction and analysis of matrix population models and integral projection models. Includes the ability to construct historical matrices, which are 2d matrices comprising 3 consecutive times of demographic information. Estimates both raw and function-based forms of historical and standard ahistorical matrices. It also estimates function-based age-by-stage matrices and raw and function-based Leslie matrices.
Generate maximum projection (MaxPro) designs for quantitative and/or qualitative factors. Details of the MaxPro criterion can be found in: (1) Joseph, Gul, and Ba. (2015) "Maximum Projection Designs for Computer Experiments", Biometrika, 102, 371-380, and (2) Joseph, Gul, and Ba. (2018) "Designing Computer Experiments with Multiple Types of Factors: The MaxPro Approach", Journal of Quality Technology, to appear.
Implementation of hypothesis testing procedures described in Hansen (1992) <doi:10.1002/jae.3950070506>, Carrasco, Hu, & Ploberger (2014) <doi:10.3982/ECTA8609>, Dufour & Luger (2017) <doi:10.1080/07474938.2017.1307548>, and Rodriguez Rondon & Dufour (2024) <https://grodriguezrondon.com/files/RodriguezRondon_Dufour_2025_MonteCarlo_LikelihoodRatioTest_MarkovSwitchingModels_20251014.pdf> that can be used to identify the number of regimes in Markov switching models.
Additive proportional odds model for ordinal data using Laplace P-splines. The combination of Laplace approximations and P-splines enable fast and flexible inference in a Bayesian framework. Specific approximations are proposed to account for the asymmetry in the marginal posterior distributions of non-penalized parameters. For more details, see Lambert and Gressani (2023) <doi:10.1177/1471082X231181173> ; Preprint: <arXiv:2210.01668>).
Tool for producing Pen's parade graphs, useful for visualizing inequalities in income, wages or other variables, as proposed by Pen (1971, ISBN: 978-0140212594). Income or another economic variable is captured by the vertical axis, while the population is arranged in ascending order of income along the horizontal axis. Pen's income parades provide an easy-to-interpret visualization of economic inequalities.
This package provides a fast implementation of the SWAG algorithm for Generalized Linear Models which allows to perform a meta-learning procedure that combines screening and wrapper methods to find a set of extremely low-dimensional attribute combinations. The package then performs test on the network of selected models to identify the variables that are highly predictive by using entropy-based network measures.
This package provides tools for modeling non-continuous linear responses of ecological communities to environmental data. The package is straightforward through three steps: (1) data ordering (function OrdData()), (2) split-moving-window analysis (function SMW()) and (3) piecewise redundancy analysis (function pwRDA()). Relevant references include Cornelius and Reynolds (1991) <doi:10.2307/1941559> and Legendre and Legendre (2012, ISBN: 9780444538697).
The goal of SAFEPG is to predict climate-related extreme losses by fitting a frequency-severity model. It improves predictive performance by introducing a sign-aligned regularization term, which ensures consistent signs for the coefficients across the frequency and severity components. This enhancement not only increases model accuracy but also enhances its interpretability, making it more suitable for practical applications in risk assessment.
This package provides functions to perform split robust least angle regression. The approach first uses the least angle regression algorithm to split the variables into the models of an ensemble and robust estimates of the correlation between predictors. An elastic net estimator is then applied to the selected predictors in each model using the imputed data from the detect deviating cell (DDC) method.
This package infers the V genotype of an individual from immunoglobulin (Ig) repertoire sequencing data (AIRR-Seq, Rep-Seq). Includes detection of any novel alleles. This information is then used to correct existing V allele calls from among the sample sequences. Citations: Gadala-Maria, et al (2015) <doi:10.1073/pnas.1417683112>, Gadala-Maria, et al (2019) <doi:10.3389/fimmu.2019.00129>.
This package provides functions for defining and conducting a time series prediction process including pre(post)processing, decomposition, modelling, prediction and accuracy assessment. The generated models and its yielded prediction errors can be used for benchmarking other time series prediction methods and for creating a demand for the refinement of such methods. For this purpose, benchmark data from prediction competitions may be used.
This package provides a Tcl/Tk Graphical User Interface (GUI) to display images than can be zoomed and panned using the mouse and keyboard shortcuts. tkImgR read and write different image formats (PPM/PGM, PNG and GIF) using the standard Tcl/Tk distribution (>=8.6), but other formats (JPEG, TIFF, CR2) can be handled using the tkImg package for Tcl/Tk'.
The vcfpp.h (<https://github.com/Zilong-Li/vcfpp>) provides an easy-to-use C++ API of htslib', offering full functionality for manipulating Variant Call Format (VCF) files. The vcfppR package serves as the R bindings of the vcfpp.h library, enabling rapid processing of both compressed and uncompressed VCF files. Explore a range of powerful features for efficient VCF data manipulation.
An implementation of three procedures developed by John Tukey: FUNOP (FUll NOrmal Plot), FUNOR-FUNOM (FUll NOrmal Rejection-FUll NOrmal Modification), and vacuum cleaner. Combined, they provide a way to identify, treat, and analyze outliers in two-way (i.e., contingency) tables, as described in his landmark paper "The Future of Data Analysis", Tukey, John W. (1962) <https://www.jstor.org/stable/2237638>.