Search, query, and download tabular and geospatial data from the British Columbia Data Catalogue (<https://catalogue.data.gov.bc.ca/>). Search catalogue data records based on keywords, data licence, sector, data format, and B.C. government organization. View metadata directly in R, download many data formats, and query geospatial data available via the B.C. government Web Feature Service ('WFS') using dplyr syntax.
Automatic specification and estimation of reserve demand curves for central bank operations. The package can help to choose the best demand curve and identify additional explanatory variables. Various plot and predict options are included. For more details, see Chen et al. (2023) <https://www.imf.org/en/Publications/WP/Issues/2023/09/01/Modeling-the-Reserve-Demand-to-Facilitate-Central-Bank-Operations-538754>.
This package contains a single function, dclust(), for divisive hierarchical clustering based on recursive k-means partitioning (k = 2). Useful for clustering large datasets where computation of an n x n distance matrix is not feasible (e.g. n > 10,000 records). For further information see Steinbach, Karypis and Kumar (2000) <http://glaros.dtc.umn.edu/gkhome/fetch/papers/docclusterKDDTMW00.pdf>.
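The recursive 2-means idea described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the dclust() implementation or its R interface: the function names `two_means` and `divisive_cluster`, the `max_leaf` stopping rule, and the sample points are all assumptions made for the example.

```python
import random


def two_means(points, iters=20, seed=0):
    """Split points into two groups with plain k-means (k = 2)."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)  # two distinct positions as initial centers
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            # Squared Euclidean distance to each of the two centers.
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[0 if d[0] <= d[1] else 1].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split (e.g. duplicated points)
        # Move each center to the mean of its group.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) for g in groups]
    return groups


def divisive_cluster(points, max_leaf=3, seed=0):
    """Recursively bisect until every cluster has at most max_leaf points.

    Only point-to-center distances are computed, so no n x n distance
    matrix is ever built.
    """
    if len(points) <= max_leaf:
        return [points]
    left, right = two_means(points, seed=seed)
    if not left or not right:
        return [points]  # cannot split further
    return divisive_cluster(left, max_leaf, seed) + divisive_cluster(right, max_leaf, seed)


# Hypothetical data: two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = divisive_cluster(pts, max_leaf=3)
```

Here the function returns a flat list of leaf clusters; a hierarchical variant would keep the tree of splits instead.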
Dynamic treatment regime estimation and inference via G-estimation, dynamic weighted ordinary least squares (dWOLS) and Q-learning. Inference via bootstrap and recursive sandwich estimation. Estimation and inference for survival outcomes via Dynamic Weighted Survival Modeling (DWSurv). Extension to continuous treatment variables. Wallace et al. (2017) <DOI:10.18637/jss.v080.i02>; Simoneau et al. (2020) <DOI:10.1080/00949655.2020.1793341>.
This package implements the Fourier cumulative sum (CUSUM) cointegration test for detecting cointegration relationships in time series data with structural breaks. The test uses Fourier approximations to capture smooth structural changes and CUSUM statistics to test for cointegration stability. Based on methodology described in Zaghdoudi (2025) <doi:10.46557/001c.144076>. The corrected Akaike Information Criterion (AICc) is used for optimal frequency selection.
This package provides a collection of datasets essential for functional genomic analysis. It includes gene names, gene positions, and cytoband information sourced from Ensembl, as well as a phenotype association graph prepared from GWAScatalog. Data are available in both the GRCh37 and GRCh38 builds. These datasets facilitate a wide range of genomic studies, including the identification of genetic variants, exploration of genomic features, and post-GWAS functional analysis.
Generalized LassO applied to knot selection in multivariate B-splinE Regression (GLOBER) implements a novel approach for estimating functions in a multivariate nonparametric regression model based on an adaptive knot selection for B-splines using the Generalized Lasso. For further details we refer the reader to the paper Savino, M. E. and Lévy-Leduc, C. (2023), <arXiv:2306.00686>.
An approach to analyzing Likert response items, with an emphasis on visualizations. The stacked bar plot is the preferred method for presenting Likert results. Tabular results are also implemented along with density plots to assist researchers in determining whether Likert responses can be used quantitatively instead of qualitatively. See the likert(), summary.likert(), and plot.likert() functions to get started.
Complete analytical environment for the construction and analysis of matrix population models and integral projection models. Includes the ability to construct historical matrices, which are 2d matrices comprising 3 consecutive times of demographic information. Estimates both raw and function-based forms of historical and standard ahistorical matrices. It also estimates function-based age-by-stage matrices and raw and function-based Leslie matrices.
The proposed method predicts the longitudinal mean response trajectory with a kernel-based estimator. The kernel estimator is constructed by imposing weights based on subject-wise similarity in the L2 metric space between predictor trajectories as well as on time proximity. Users can also perform variable selection to identify functional predictors with predictive significance using the proposed multiplicative model with multivariate Gaussian kernels.
The goal of McMiso is to provide functions for isotonic regression when there are multiple independent variables. The functions solve the optimization problem using recursion and leverage parallel computing to improve speed, and are useful for situations with a relatively large number of covariates. The estimation method follows the projective Bayes solution described in Cheung and Diaz (2023) <doi:10.1093/jrsssb/qkad014>.
Implementation of hypothesis testing procedures described in Hansen (1992) <doi:10.1002/jae.3950070506>, Carrasco, Hu, & Ploberger (2014) <doi:10.3982/ECTA8609>, Dufour & Luger (2017) <doi:10.1080/07474938.2017.1307548>, and Rodriguez Rondon & Dufour (2024) <https://grodriguezrondon.com/files/RodriguezRondon_Dufour_2025_MonteCarlo_LikelihoodRatioTest_MarkovSwitchingModels_20251014.pdf> that can be used to identify the number of regimes in Markov switching models.
Generate maximum projection (MaxPro) designs for quantitative and/or qualitative factors. Details of the MaxPro criterion can be found in: (1) Joseph, Gul, and Ba. (2015) "Maximum Projection Designs for Computer Experiments", Biometrika, 102, 371-380, and (2) Joseph, Gul, and Ba. (2018) "Designing Computer Experiments with Multiple Types of Factors: The MaxPro Approach", Journal of Quality Technology, to appear.
Additive proportional odds model for ordinal data using Laplace P-splines. The combination of Laplace approximations and P-splines enables fast and flexible inference in a Bayesian framework. Specific approximations are proposed to account for the asymmetry in the marginal posterior distributions of non-penalized parameters. For more details, see Lambert and Gressani (2023) <doi:10.1177/1471082X231181173> (preprint: <arXiv:2210.01668>).
Tool for producing Pen's parade graphs, useful for visualizing inequalities in income, wages or other variables, as proposed by Pen (1971, ISBN: 978-0140212594). Income or another economic variable is captured by the vertical axis, while the population is arranged in ascending order of income along the horizontal axis. Pen's income parades provide an easy-to-interpret visualization of economic inequalities.
The goal of SAFEPG is to predict climate-related extreme losses by fitting a frequency-severity model. It improves predictive performance by introducing a sign-aligned regularization term, which ensures consistent signs for the coefficients across the frequency and severity components. This enhancement not only increases model accuracy but also enhances its interpretability, making it more suitable for practical applications in risk assessment.
This package provides a fast implementation of the SWAG algorithm for Generalized Linear Models, a meta-learning procedure that combines screening and wrapper methods to find a set of extremely low-dimensional attribute combinations. The package then performs tests on the network of selected models to identify highly predictive variables using entropy-based network measures.
This package provides tools for modeling non-continuous linear responses of ecological communities to environmental data. The workflow has three steps: (1) data ordering (function OrdData()), (2) split-moving-window analysis (function SMW()) and (3) piecewise redundancy analysis (function pwRDA()). Relevant references include Cornelius and Reynolds (1991) <doi:10.2307/1941559> and Legendre and Legendre (2012, ISBN: 9780444538697).
This package provides functions for defining and conducting a time series prediction process including pre(post)processing, decomposition, modelling, prediction and accuracy assessment. The generated models and their prediction errors can be used for benchmarking other time series prediction methods and for creating a demand for the refinement of such methods. For this purpose, benchmark data from prediction competitions may be used.
This package provides a Tcl/Tk Graphical User Interface (GUI) to display images that can be zoomed and panned using the mouse and keyboard shortcuts. tkImgR reads and writes different image formats (PPM/PGM, PNG and GIF) using the standard Tcl/Tk distribution (>=8.6), but other formats (JPEG, TIFF, CR2) can be handled using the tkImg package for Tcl/Tk.
Utilities for restricted mean survival time (RMST) and time-varying restricted mean survival time quantities computed from survival curves provided on a time grid. The package is model-agnostic and accepts only a time vector and survival matrices, returning RMST-based quantities and bootstrap summaries. For restricted mean survival time methodology, see Royston and Parmar (2013) <doi:10.1186/1471-2288-13-152>.
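As a rough illustration of the quantity involved, RMST up to a horizon tau is the area under the survival curve from 0 to tau. The sketch below computes it from a curve given on a time grid. It is written in plain Python and is not the package's API: the function name `rmst`, the trapezoidal rule, and the linear interpolation at tau are assumptions chosen for the example, and the package may integrate differently (for instance, treating the curve as a step function).

```python
def rmst(times, surv, tau):
    """Approximate RMST as the area under S(t) from times[0] to tau.

    Assumes times is increasing, starts at 0, and surv[i] = S(times[i]).
    Uses the trapezoidal rule, interpolating S linearly at tau.
    """
    if len(times) != len(surv):
        raise ValueError("times and surv must have the same length")
    if not times[0] <= tau <= times[-1]:
        raise ValueError("tau must lie inside the time grid")
    area = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        s0, s1 = surv[i - 1], surv[i]
        if t0 >= tau:
            break  # remaining grid cells lie past the horizon
        if t1 > tau:
            # Truncate the last cell at tau.
            s1 = s0 + (s1 - s0) * (tau - t0) / (t1 - t0)
            t1 = tau
        area += 0.5 * (s0 + s1) * (t1 - t0)
    return area


# Hypothetical survival curve S(t) = 1 - t / 10 sampled on a coarse grid.
grid = [0, 2, 4, 6, 8, 10]
curve = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]
value = rmst(grid, curve, tau=5)
```

For a survival matrix with one row per subject, the same function could be applied row by row, and bootstrap summaries would repeat the calculation over resampled curves.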
This package infers the V genotype of an individual from immunoglobulin (Ig) repertoire sequencing data (AIRR-Seq, Rep-Seq), including detection of any novel alleles. This information is then used to correct existing V allele calls from among the sample sequences. Citations: Gadala-Maria et al. (2015) <doi:10.1073/pnas.1417683112>, Gadala-Maria et al. (2019) <doi:10.3389/fimmu.2019.00129>.
An implementation of three procedures developed by John Tukey: FUNOP (FUll NOrmal Plot), FUNOR-FUNOM (FUll NOrmal Rejection-FUll NOrmal Modification), and vacuum cleaner. Combined, they provide a way to identify, treat, and analyze outliers in two-way (i.e., contingency) tables, as described in his landmark paper "The Future of Data Analysis", Tukey, John W. (1962) <https://www.jstor.org/stable/2237638>.
vcfpp.h (<https://github.com/Zilong-Li/vcfpp>) provides an easy-to-use C++ API for htslib, offering full functionality for manipulating Variant Call Format (VCF) files. The vcfppR package serves as the R bindings of the vcfpp.h library, enabling rapid processing of both compressed and uncompressed VCF files. Explore a range of powerful features for efficient VCF data manipulation.