Searches for, accesses, and retrieves Statistics Canada data tables, as well as individual vectors, as tidy data frames. This package enriches the tables with metadata, deals with encoding issues, allows for bilingual English or French language data retrieval, and bundles convenience functions to make it easier to work with retrieved table data. For more efficient data access the package allows for caching data in a local database and database level filtering, data manipulation and summarizing.
This package contains methods for observed-score linking and equating under the single-group, equivalent-groups, and nonequivalent-groups with anchor test(s) designs. Equating types include identity, mean, linear, general linear, equipercentile, circle-arc, and composites of these. Equating methods include synthetic, nominal weights, Tucker, Levine observed score, Levine true score, Braun/Holland, frequency estimation, and chained equating. Plotting and summary methods, and methods for multivariate presmoothing and bootstrap error estimation are also provided.
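As an illustration of two of the equating types named above, mean and linear equating under a single-group design can be sketched in a few lines. This is a minimal sketch, not the package's interface: the function names `mean_equate` and `linear_equate` and the toy score lists are assumptions made for the example.

```python
from statistics import mean, stdev

def mean_equate(x, scores_x, scores_y):
    """Shift a form-X score by the difference in form means."""
    return x + (mean(scores_y) - mean(scores_x))

def linear_equate(x, scores_x, scores_y):
    """Map a form-X score onto the form-Y scale by matching means and SDs."""
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = stdev(scores_x), stdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Hypothetical toy data: the same examinees scored on two forms.
form_x = [10, 20, 30]
form_y = [20, 40, 60]

mean_result = mean_equate(25, form_x, form_y)      # shifts by +20
linear_result = linear_equate(25, form_x, form_y)  # also rescales by sd_y / sd_x
```

Mean equating only shifts the scale, while linear equating also adjusts for a difference in spread between the two forms.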
This package provides a toolset for generating Ecological Limit Function (ELF) models and evaluating potential species loss resulting from flow change, based on the elfgen framework. ELFs describe the relation between aquatic species richness (fish or benthic macroinvertebrates) and stream size characteristics (streamflow or drainage area). Journal publications are available outlining framework methodology (Kleiner et al. (2020) <doi:10.1111/1752-1688.12876>) and application (Rapp et al. (2020) <doi:10.1111/1752-1688.12877>).
The summation notation suggested by Einstein (1916) <doi:10.1002/andp.19163540702> is a concise mathematical notation that implicitly sums over repeated indices of n-dimensional arrays. Many ordinary matrix operations (e.g. transpose, matrix multiplication, scalar product, diag(), trace, etc.) can be written using Einstein notation. The notation is particularly convenient for expressing operations on arrays with more than two dimensions because the respective operators ('tensor products') might not have a standardized name.
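To make the implicit summation concrete, the sketch below spells out three index expressions as explicit sums over the repeated index. It uses plain nested lists and hypothetical helper names (`matmul`, `trace`, `contract`); it illustrates the notation only and is not the package's interface.

```python
def matmul(a, b):
    """C_ik = A_ij B_jk: the repeated index j is summed."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][j] * b[j][k] for j in range(m)) for k in range(p)]
            for i in range(n)]

def trace(a):
    """t = A_ii: the repeated index i is summed."""
    return sum(a[i][i] for i in range(len(a)))

def contract(t, v):
    """R_ij = T_ijk v_k: contract the last axis of a 3-d array with a vector."""
    return [[sum(t_ijk * v_k for t_ijk, v_k in zip(row, v)) for row in plane]
            for plane in t]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
T = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
v = [1, 10]

product = matmul(A, B)
diagonal_sum = trace(A)
contracted = contract(T, v)
```

The third operation has no common name, which is the situation where writing the index expression directly is most convenient.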
This package provides a function that uses a genetic algorithm to search for a subset of size k from the integers 1:n, such that a user-supplied objective function is minimized at that subset. The selection step is done by tournament selection based on ranks, and elitism may be used to retain a portion of the best solutions from one generation to the next. Population objective function values may optionally be evaluated in parallel.
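The search described above can be sketched as follows. This is a simplified sketch under stated assumptions, not the package's algorithm: the function name `ga_subset`, the crossover rule (sample k elements from the union of two parents), the mutation rate, and all default values are choices made for the example. Parallel evaluation is omitted.

```python
import random

def ga_subset(n, k, objective, pop_size=30, generations=50, elite=2, seed=0):
    """Search for a size-k subset of 1..n that minimizes objective(subset)."""
    rng = random.Random(seed)
    universe = range(1, n + 1)
    pop = [tuple(sorted(rng.sample(universe, k))) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=objective)  # rank 0 is the best solution

        def tournament():
            # Rank-based tournament: draw two ranks, keep the better one.
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return ranked[min(i, j)]

        children = ranked[:elite]  # elitism: carry the best over unchanged
        while len(children) < pop_size:
            pool = sorted(set(tournament()) | set(tournament()))
            child = set(rng.sample(pool, k))  # assumed crossover rule
            if rng.random() < 0.2:            # assumed mutation rate
                outside = [v for v in universe if v not in child]
                if outside:
                    child.remove(rng.choice(sorted(child)))
                    child.add(rng.choice(outside))
            children.append(tuple(sorted(child)))
        pop = children
    best = min(pop, key=objective)
    return best, objective(best)

# Toy objective: the sum of the chosen integers.
best_subset, best_value = ga_subset(10, 3, sum)
```

Because the elite solutions are copied forward unchanged, the best objective value in the population never gets worse from one generation to the next.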
L1 estimation for linear regression using the Barrodale and Roberts method <doi:10.1145/355616.361024> and the EM algorithm <doi:10.1023/A:1020759012226>. Estimation of the mean and covariance matrix using the multivariate Laplace distribution, together with the density, distribution function, quantile function, and random number generation for the univariate and multivariate Laplace distributions <doi:10.1080/03610929808832115>. An implementation of the generalized spatial median estimator of Naik and Plungpongpun <doi:10.1007/0-8176-4487-3_7> is included.
This package provides a set of functions for some multivariate analyses utilizing a structural equation modeling (SEM) approach through the OpenMx package. These analyses include canonical correlation analysis (CANCORR), redundancy analysis (RDA), and multivariate principal component regression (MPCR). It implements procedures discussed in Gu and Cheung (2023) <doi:10.1111/bmsp.12301>, Gu, Yung, and Cheung (2019) <doi:10.1080/00273171.2018.1512847>, and Gu et al. (2023) <doi:10.1080/00273171.2022.2141675>.
This package implements variable selection procedures for low- to moderate-size generalized linear regression models. It includes the STOPES functions for linear regression (Capanu M, Giurcanu M, Begg C, Gonen M, Optimized variable selection via repeated data splitting, Statistics in Medicine, 2020, 19(6):2167-2184) as well as subsampling based optimization methods for generalized linear regression models (Marinela Capanu, Mihai Giurcanu, Colin B Begg, Mithat Gonen, Subsampling based variable selection for generalized linear models).
This package provides functionality for image processing and shape analysis in the context of reconstructed medical images generated by deep learning-based methods or standard image processing algorithms and produced from different medical imaging types, such as X-ray, Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and pathology imaging. Specifically, it offers tools to segment regions of interest and to extract quantitative shape descriptors for applications in signal processing, statistical analysis and modeling, and machine learning.
Using any importation code designed for SAS users to read ASCII files into sas7bdat files, this package parses through the INPUT block of a .sas syntax file to design the parameters needed for a read.fwf() function call. This allows the user to specify the location of the ASCII file (often a .dat file) and the location of the SAS syntax file, and then load the data frame directly into R in just one step.
The ta-test is a modified two-sample or two-group t-test of Gosset (1908). In small samples with fewer than 15 replicates, the ta-test significantly reduces the type I error rate but has almost the same power as the t-test, and hence can greatly enhance the reliability or reproducibility of discoveries in biology and medicine. The ta-test can test a single null hypothesis or multiple null hypotheses without needing to correct p-values.
R-msigdb provides the Molecular Signatures Database as R-accessible objects. Signatures are stored in GeneSet class objects from the GSEABase package and the entire database is stored in a GeneSetCollection object. These data are then hosted on the ExperimentHub. Data used in this package was obtained from the MSigDB of the Broad Institute. Metadata for each gene set is stored along with the gene set in the GeneSet class object.
Postprocessors refine predictions output by machine learning models to improve predictive performance or better satisfy distributional limitations. This package introduces tailor objects, which compose iterative adjustments to model predictions. A number of pre-written adjustments are provided with the package, such as calibration. See Lichtenstein, Fischhoff, and Phillips (1977) <doi:10.1007/978-94-010-1276-8_19>. Other methods and utilities to compose new adjustments are also included. Tailors are tightly integrated with the tidymodels framework.
With the functions in this package you can check the validity of the Greek Tax Identification Number (AFM) and the Greek Personal Number (PA) <https://pa.gov.gr>. The PA is a new universal ID for Greek citizens across all public services and it is to replace older numbers issued by various Greek state agencies. Its format is a 12-character ID consisting of three alphanumeric characters followed by the nine numerical digits of the AFM.
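The checks described above can be sketched as follows. The description does not spell out the AFM check-digit rule; the sketch assumes the commonly documented one (the first eight digits weighted by descending powers of two, summed, then taken modulo 11 and modulo 10). The function names `is_valid_afm` and `is_valid_pa` are hypothetical, and the PA check covers only the 12-character format stated above, so the package itself may apply further rules.

```python
def is_valid_afm(afm: str) -> bool:
    """Check a 9-digit AFM against the assumed mod-11 check-digit rule."""
    if len(afm) != 9 or not (afm.isascii() and afm.isdigit()):
        return False
    if afm == "000000000":  # passes the checksum but is rejected here (assumption)
        return False
    digits = [int(c) for c in afm]
    # Weights 256, 128, ..., 2 for the first eight digits.
    weighted = sum(d * 2 ** (8 - i) for i, d in enumerate(digits[:8]))
    return weighted % 11 % 10 == digits[8]

def is_valid_pa(pa: str) -> bool:
    """Check the PA format: three alphanumeric characters plus a valid AFM."""
    if len(pa) != 12:
        return False
    prefix, afm = pa[:3], pa[3:]
    return prefix.isascii() and prefix.isalnum() and is_valid_afm(afm)

afm_ok = is_valid_afm("090000045")
afm_bad = is_valid_afm("123456789")
pa_ok = is_valid_pa("AB1090000045")  # made-up prefix, for illustration only
```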
This package implements the efficient estimator of bid-ask spreads from open, high, low, and close prices described in Ardia, Guidotti, & Kroencke (JFE, 2024) <doi:10.1016/j.jfineco.2024.103916>. It also provides an implementation of the estimators described in Roll (JF, 1984) <doi:10.1111/j.1540-6261.1984.tb03897.x>, Corwin & Schultz (JF, 2012) <doi:10.1111/j.1540-6261.2012.01729.x>, and Abdi & Ranaldo (RFS, 2017) <doi:10.1093/rfs/hhx084>.
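Of the estimators listed above, the Roll (1984) estimator is the simplest to write down: the spread is twice the square root of the negated first-order autocovariance of price changes. The sketch below is a minimal illustration with a hypothetical function name, `roll_spread`. Returning zero when the autocovariance is non-negative is a convention assumed for the example, not necessarily what the package does.

```python
from math import sqrt

def roll_spread(prices):
    """Roll (1984) spread: 2 * sqrt(-cov(dp_t, dp_{t-1})), in price units."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    x, y = changes[:-1], changes[1:]  # lagged and current price changes
    n = len(x)
    if n < 2:
        raise ValueError("need at least four prices")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # The estimator is undefined for non-negative autocovariance;
    # this sketch returns 0.0 in that case (an assumption).
    return 2 * sqrt(-cov) if cov < 0 else 0.0

bouncing = roll_spread([100, 101, 100, 101, 100])  # bid-ask bounce pattern
trending = roll_spread([1, 2, 4, 7, 11])           # positively autocorrelated
```

Passing log prices instead of raw prices gives the spread as a fraction of the price rather than in price units.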
Multilevel ecological data series (MEDS) are sequences of observations ordered according to temporal/spatial hierarchies that are defined by sample designs, with sample variability confined to ecological factors. Dendroclimatic MEDS of tree rings and climate are modeled into normalized fluctuations of tree growth and aridity. Modeled fluctuations (model frames) are compared with Mantel correlograms on multiple levels defined by sample design. The package implementation can be understood by running the examples in the modelFrame() and muleMan() functions.
This package provides a collection of ergonomic large language model assistants designed to help you complete repetitive, hard-to-automate tasks quickly. After selecting some code, press the keyboard shortcut you've chosen to trigger the package app, select an assistant, and watch your chore be carried out. While the package ships with a number of chore helpers for R package development, users can create custom helpers just by writing some instructions in a markdown file.
Hardware-based support for the CRC32C cyclic redundancy checksum function is made available for x86_64 systems with SSE2 support as well as for arm64, and detected at build-time via cmake, with a software-based fallback. This functionality is exported at the C-language level for use by other packages. CRC32C is described in RFC 3720 at <https://datatracker.ietf.org/doc/html/rfc3720> and is based on Castagnoli et al. <doi:10.1109/26.231911>.
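For orientation, a software computation of the CRC32C checksum described above can be sketched bit by bit. This is a plain reference sketch using the reflected Castagnoli polynomial 0x82F63B78. It is not the package's C code, which would normally use lookup tables or hardware instructions, and the function name `crc32c` is chosen for the example.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (Castagnoli), reflected form, init and final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

# Standard check input for CRC algorithms.
check = crc32c(b"123456789")
```

The standard check value for the input "123456789" is 0xE3069283, which is a quick way to confirm that a CRC32C implementation is wired up correctly.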
Interconverts between ordered lists and compact string notation. Useful for capturing code lists, and pair-wise codes and decodes, for text storage. Analogous to factor levels and labels. Generics encode() and decode() perform interconversion, while codes() and decodes() extract components of an encoding. The function encoded() checks whether something is interpretable as an encoding. If a vector has an encoded guide attribute, as_factor() uses it to coerce to factor.
This package provides tools for transforming R expressions. Provides functions for finding, extracting, and replacing patterns in R language objects, similarly to how regular expressions can be used to find, extract, and replace patterns in text. Also provides functions for generating code using specially-formatted template files and for translating R expressions into similar expressions in other programming languages. The package may be helpful for advanced uses of R expressions, such as developing domain-specific languages.
This package contains functions to simplify the use of data mining methods (classification, regression, clustering, etc.), for students and beginners in R programming. Various R packages are used and wrappers are built around the main functions, to standardize the use of data mining methods (input/output): it brings a certain loss of flexibility, but also a gain of simplicity. The package name came from the French "Fouille de Données en Master 2 Informatique Décisionnelle".
The Flow Analysis Summary Statistics Tool for R, fasstr, provides various functions to tidy and screen daily stream discharge data, calculate and visualize various summary statistics and metrics, and compute annual trending and volume frequency analyses. It features useful function arguments for filtering and handling dates and for customizing data and metrics, as well as the ability to pull daily data directly from the Water Survey of Canada hydrometric database (<https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>).
This package provides the standard operations for signal processing on graphs: graph Fourier transform, spectral graph wavelet transform, and visualization tools. It also implements a data-driven method for graph signal denoising/regression; for details see De Loynes, Navarro, Olivier (2019) <arxiv:1906.01882>. The package also provides an interface to the SuiteSparse Matrix Collection, <https://sparse.tamu.edu/>, a large and widely used set of sparse matrix benchmarks collected from a wide range of applications.
Creates animated biplots that enable dynamic visualisation of temporal or sequential changes in multivariate data by animating a single biplot across the levels of a time variable. It builds on objects from the biplotEZ package, Lubbe S, le Roux N, Nienkemper-Swanepoel J, Ganey R, Buys R, Adams Z, Manefeldt P (2024) <doi:10.32614/CRAN.package.biplotEZ>, allowing users to create animated biplots that reveal how both samples and variables evolve over time.