Provides live interpretations and explanations of statistical functions in R. These interpretations and explanations are shown when the user calls the explained function. They can interact with the values of the explained function's actual results to offer relevant, meaningful insights. The xplain interpretations and explanations are based on an easy-to-use XML format that can include R code to interact with the return values of the explained function.
ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures, derived either from knowledge-based literature or from perturbation experiments, to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified as a score, which represents the extent to which a patient sample exhibits the pathway deregulation/activation signature.
Banksy is an R package that incorporates spatial information to cluster cells in a feature space (e.g. gene expression). To incorporate spatial information, BANKSY computes the mean neighborhood expression and azimuthal Gabor filters that capture gene expression gradients. These features are combined with the cell's own expression to embed cells in a neighbor-augmented product space which can then be clustered, allowing for accurate and spatially-aware cell typing and tissue domain segmentation.
This package provides functionality for untargeted LC-MS metabolomics research as specified in the associated protocol article in the Metabolomics Data Processing and Data Analysis—Current Best Practices special issue of the Metabolites journal (2020). This includes tabular data preprocessing and quality control, uni- and multivariate analysis, as well as quality control visualizations, feature-wise visualizations and results visualizations. Raw data preprocessing and functionality related to biological context, such as pathway analysis, are not included.
The spicyR package provides a framework for performing inference on changes in spatial relationships between pairs of cell types for cell-resolution spatial omics technologies. spicyR consists of three primary steps: (i) summarizing the degree of spatial localization between pairs of cell types for each image; (ii) modelling the variability in localization summary statistics as a function of cell counts and (iii) testing for changes in spatial localizations associated with a response variable.
The goal is to print an "aperçu", a short view of a vector, a matrix, a data.frame, a list or an array. By default, it prints the first 5 elements of each dimension. By default, the number of columns printed equals the number of rows printed. To control which elements are selected, pass a list in which each element is a vector giving the selection for the corresponding dimension.
Searches for, accesses, and retrieves Statistics Canada data tables, as well as individual vectors, as tidy data frames. This package enriches the tables with metadata, deals with encoding issues, allows for bilingual English or French language data retrieval, and bundles convenience functions to make it easier to work with retrieved table data. For more efficient data access, the package supports caching data in a local database along with database-level filtering, data manipulation and summarizing.
The summation notation suggested by Einstein (1916) <doi:10.1002/andp.19163540702> is a concise mathematical notation that implicitly sums over repeated indices of n-dimensional arrays. Many ordinary matrix operations (e.g. transpose, matrix multiplication, scalar product, diag(), trace, etc.) can be written using Einstein notation. The notation is particularly convenient for expressing operations on arrays with more than two dimensions because the respective operators ('tensor products') might not have a standardized name.
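To make the idea of summing over repeated indices concrete, here is a minimal Python sketch (not the package's R interface). The `einsum` helper, its nested-list inputs, and the `"ij,jk->ik"` specification string are assumptions chosen for illustration, following a convention common in array libraries: every index letter that does not appear after `->` is summed over.

```python
from itertools import product


def shape_of(array):
    """Return the dimensions of a rectangular nested list."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        array = array[0]
    return shape


def element(array, index):
    """Look up one entry of a nested list by a sequence of positions."""
    for i in index:
        array = array[i]
    return array


def einsum(spec, *arrays):
    """Evaluate an Einstein-notation expression on nested lists (illustrative only)."""
    inputs, output = spec.split("->")
    terms = inputs.split(",")

    # Record the extent of every index letter.
    sizes = {}
    for term, array in zip(terms, arrays):
        for letter, extent in zip(term, shape_of(array)):
            sizes[letter] = extent

    # Letters absent from the output are the repeated (summed) indices.
    summed = sorted(set("".join(terms)) - set(output))

    def entry(out_index):
        fixed = dict(zip(output, out_index))
        total = 0
        for sum_index in product(*(range(sizes[s]) for s in summed)):
            env = {**fixed, **dict(zip(summed, sum_index))}
            term_product = 1
            for term, array in zip(terms, arrays):
                term_product *= element(array, [env[letter] for letter in term])
            total += term_product
        return total

    def build(prefix, remaining):
        if not remaining:
            return entry(prefix)
        return [build(prefix + (i,), remaining[1:]) for i in range(sizes[remaining[0]])]

    return build((), output)


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
matrix_product = einsum("ij,jk->ik", A, B)  # sums over the repeated index j
trace = einsum("ii->", A)                   # sums over the repeated index i
transpose = einsum("ij->ji", A)             # no summation, only reordering
```

The same helper handles arrays with more than two dimensions, for example `einsum("ijk,kl->ijl", T, M)`, which is the case where no standard operator name exists.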
This package contains methods for observed-score linking and equating under the single-group, equivalent-groups, and nonequivalent-groups with anchor test(s) designs. Equating types include identity, mean, linear, general linear, equipercentile, circle-arc, and composites of these. Equating methods include synthetic, nominal weights, Tucker, Levine observed score, Levine true score, Braun/Holland, frequency estimation, and chained equating. Plotting and summary methods, and methods for multivariate presmoothing and bootstrap error estimation are also provided.
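As a rough illustration of the two simplest equating types listed above, the following Python sketch (not the package's R interface) applies the textbook mean and linear equating formulas under an equivalent-groups design. The function names and the sample score vectors are hypothetical.

```python
from statistics import mean, pstdev


def mean_equate(x, form_x_scores, form_y_scores):
    """Mean equating: shift a form X score by the difference in form means."""
    return x - mean(form_x_scores) + mean(form_y_scores)


def linear_equate(x, form_x_scores, form_y_scores):
    """Linear equating: match both the mean and the standard deviation of form Y."""
    slope = pstdev(form_y_scores) / pstdev(form_x_scores)
    return mean(form_y_scores) + slope * (x - mean(form_x_scores))


# Hypothetical observed scores from two randomly equivalent groups.
form_x = [10, 12, 14, 16, 18]
form_y = [20, 24, 28, 32, 36]

equated_mean = mean_equate(15, form_x, form_y)
equated_linear = linear_equate(15, form_x, form_y)
```

Mean equating only adjusts for a constant difference in difficulty, while linear equating also lets the difference vary along the score scale through the ratio of standard deviations.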
This package provides a toolset for generating Ecological Limit Function (ELF) models and evaluating potential species loss resulting from flow change, based on the elfgen framework. ELFs describe the relation between aquatic species richness (fish or benthic macroinvertebrates) and stream size characteristics (streamflow or drainage area). Journal publications are available outlining framework methodology (Kleiner et al. (2020) <doi:10.1111/1752-1688.12876>) and application (Rapp et al. (2020) <doi:10.1111/1752-1688.12877>).
This package provides a function that uses a genetic algorithm to search for a subset of size k from the integers 1:n, such that a user-supplied objective function is minimized at that subset. The selection step is done by tournament selection based on ranks, and elitism may be used to retain a portion of the best solutions from one generation to the next. Population objective function values may optionally be evaluated in parallel.
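The search described above can be sketched in Python as follows (not the package's R interface). The function name `ga_subset`, its parameters, and the crossover and mutation operators are assumptions made for illustration; only rank-based tournament selection and elitism are taken from the description, and parallel evaluation is omitted.

```python
import random


def ga_subset(n, k, objective, pop_size=40, generations=60,
              elite=2, mutation_rate=0.2, seed=0):
    """Search for a size-k subset of 1..n that minimizes `objective` (illustrative sketch)."""
    rng = random.Random(seed)
    universe = range(1, n + 1)
    population = [tuple(sorted(rng.sample(universe, k))) for _ in range(pop_size)]

    def tournament(ranked):
        # `ranked` is sorted best-first, so the smaller position is the better rank.
        a, b = rng.sample(range(len(ranked)), 2)
        return ranked[min(a, b)]

    for _ in range(generations):
        ranked = sorted(population, key=objective)
        next_generation = ranked[:elite]  # elitism: carry the best solutions over unchanged
        while len(next_generation) < pop_size:
            parent1, parent2 = tournament(ranked), tournament(ranked)
            # Assumed crossover: draw k members from the union of both parents.
            pool = sorted(set(parent1) | set(parent2))
            child = set(rng.sample(pool, k))
            # Assumed mutation: swap one member for an element outside the subset.
            if rng.random() < mutation_rate:
                outside = [i for i in universe if i not in child]
                if outside:
                    child.remove(rng.choice(sorted(child)))
                    child.add(rng.choice(outside))
            next_generation.append(tuple(sorted(child)))
        population = next_generation

    return min(population, key=objective)


# Hypothetical objective: the sum of the selected integers.
best = ga_subset(n=20, k=3, objective=sum)
```

Because the elite solutions are copied forward unchanged, the best objective value in the population can never get worse from one generation to the next.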
L1 estimation for linear regression using the Barrodale and Roberts method <doi:10.1145/355616.361024> and the EM algorithm <doi:10.1023/A:1020759012226>. Estimation of the mean and covariance matrix using the multivariate Laplace distribution, together with the density, distribution function, quantile function and random number generation for the univariate and multivariate Laplace distributions <doi:10.1080/03610929808832115>. An implementation of Naik and Plungpongpun <doi:10.1007/0-8176-4487-3_7> for the generalized spatial median estimator is included.
This package provides a set of functions for some multivariate analyses utilizing a structural equation modeling (SEM) approach through the OpenMx package. These analyses include canonical correlation analysis (CANCORR), redundancy analysis (RDA), and multivariate principal component regression (MPCR). It implements procedures discussed in Gu and Cheung (2023) <doi:10.1111/bmsp.12301>, Gu, Yung, and Cheung (2019) <doi:10.1080/00273171.2018.1512847>, and Gu et al. (2023) <doi:10.1080/00273171.2022.2141675>.
This package implements variable selection procedures for low to moderate size generalized linear regression models. It includes the STOPES functions for linear regression (Capanu M, Giurcanu M, Begg C, Gonen M, Optimized variable selection via repeated data splitting, Statistics in Medicine, 2020, 19(6):2167-2184) as well as subsampling-based optimization methods for generalized linear regression models (Marinela Capanu, Mihai Giurcanu, Colin B Begg, Mithat Gonen, Subsampling based variable selection for generalized linear models).
This package provides functionality for image processing and shape analysis in the context of reconstructed medical images generated by deep learning-based methods or standard image processing algorithms and produced from different medical imaging types, such as X-ray, Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and pathology imaging. Specifically, it offers tools to segment regions of interest and to extract quantitative shape descriptors for applications in signal processing, statistical analysis and modeling, and machine learning.
Using any importation code designed for SAS users to read ASCII files into sas7bdat files, this package parses the INPUT block of a .sas syntax file to derive the parameters needed for a read.fwf() function call. This allows the user to specify the location of the ASCII (often a .dat) file and the location of the SAS syntax file, and then load the data frame directly into R in just one step.
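The parsing step can be illustrated with a small Python sketch (not the package's R interface). It assumes the simplest column-style INPUT layout, `NAME [$] start-end`, and the function name and sample syntax are hypothetical; real SAS INPUT statements allow many more forms (single columns, informats, pointer controls) that this sketch ignores. Negative widths follow the read.fwf() convention for skipped columns.

```python
import re


def parse_input_block(sas_text):
    """Derive read.fwf-style names, widths and column classes from a column-style INPUT block."""
    block = re.search(r"\bINPUT\b(.*?);", sas_text, flags=re.IGNORECASE | re.DOTALL)
    if block is None:
        raise ValueError("no INPUT statement found")

    # Matches entries such as `ID $ 1-5` or `AGE 6-7`.
    entry = re.compile(r"(\w+)\s*(\$)?\s*(\d+)\s*-\s*(\d+)")

    names, widths, classes = [], [], []
    position = 1  # next unread column (1-based)
    for name, dollar, start, end in entry.findall(block.group(1)):
        start, end = int(start), int(end)
        if start > position:
            widths.append(-(start - position))  # negative width: skip unused columns
        widths.append(end - start + 1)
        names.append(name)
        classes.append("character" if dollar else "numeric")
        position = end + 1
    return names, widths, classes


# Hypothetical SAS syntax file contents.
sas_syntax = """
DATA survey;
  INFILE 'survey.dat';
  INPUT ID $ 1-5
        AGE 6-7
        WEIGHT 10-14;
RUN;
"""
names, widths, classes = parse_input_block(sas_syntax)
```

The three returned lists correspond to the `col.names`, `widths` and `colClasses` arguments of read.fwf().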
The ta-test is a modified two-sample or two-group t-test of Gosset (1908). In small samples with fewer than 15 replicates, the ta-test significantly reduces the type I error rate while retaining almost the same power as the t-test, and hence can greatly enhance the reliability or reproducibility of discoveries in biology and medicine. The ta-test can test a single null hypothesis or multiple null hypotheses without needing to correct p-values.
R-msigdb provides the Molecular Signatures Database as R-accessible objects. Signatures are stored in GeneSet class objects from the GSEABase package and the entire database is stored in a GeneSetCollection object. These data are then hosted on the ExperimentHub. Data used in this package were obtained from the MSigDB of the Broad Institute. Metadata for each gene set is stored along with the gene set in the GeneSet class object.
Postprocessors refine predictions output by machine learning models to improve predictive performance or better satisfy distributional limitations. This package introduces tailor objects, which compose iterative adjustments to model predictions. A number of pre-written adjustments are provided with the package, such as calibration. See Lichtenstein, Fischhoff, and Phillips (1977) <doi:10.1007/978-94-010-1276-8_19>. Other methods and utilities to compose new adjustments are also included. Tailors are tightly integrated with the tidymodels framework.
With the functions in this package you can check the validity of the Greek Tax Identification Number (AFM) and the Greek Personal Number (PA) <https://pa.gov.gr>. The PA is a new universal ID for Greek citizens across all public services and it is to replace older numbers issued by various Greek state agencies. Its format is a 12-character ID consisting of three alphanumeric characters followed by the nine numerical digits of the AFM.
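For illustration, the widely documented AFM check-digit rule can be sketched in Python (not the package's R interface): the first eight digits are weighted by descending powers of two, and the sum modulo 11, then modulo 10, must equal the ninth digit. The function names are hypothetical, and the Personal Number check below only tests the layout described above (three alphanumeric characters plus a valid AFM); any further rules on the prefix are not covered and the package may differ.

```python
import re


def is_valid_afm(afm):
    """Check a 9-digit AFM string against the common modulo-11 check-digit rule."""
    if not re.fullmatch(r"[0-9]{9}", afm) or afm == "000000000":
        return False
    digits = [int(c) for c in afm]
    # Weights 2^8, 2^7, ..., 2^1 for the first eight digits.
    total = sum(d * 2 ** (8 - i) for i, d in enumerate(digits[:8]))
    return (total % 11) % 10 == digits[8]


def looks_like_pa(pa):
    """Layout check only: three alphanumeric characters followed by a valid AFM (assumed)."""
    if not re.fullmatch(r"[0-9A-Za-z]{3}[0-9]{9}", pa):
        return False
    return is_valid_afm(pa[3:])


checksum_ok = is_valid_afm("090000045")   # digits chosen to satisfy the checksum
checksum_bad = is_valid_afm("090000046")  # same digits with a wrong check digit
```

Rejecting the all-zero string is a common convention, since it passes the arithmetic check without being a meaningful number.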
This package implements the efficient estimator of bid-ask spreads from open, high, low, and close prices described in Ardia, Guidotti, & Kroencke (JFE, 2024) <doi:10.1016/j.jfineco.2024.103916>. It also provides an implementation of the estimators described in Roll (JF, 1984) <doi:10.1111/j.1540-6261.1984.tb03897.x>, Corwin & Schultz (JF, 2012) <doi:10.1111/j.1540-6261.2012.01729.x>, and Abdi & Ranaldo (RFS, 2017) <doi:10.1093/rfs/hhx084>.
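Of the estimators listed, the one from Roll (1984) is simple enough to sketch in a few lines of Python (not the package's R interface). It estimates the spread as twice the square root of the negated first-order autocovariance of price changes. The function name, the use of the sample covariance, and returning zero when the autocovariance is non-negative are assumptions for illustration.

```python
from math import sqrt


def roll_spread(prices):
    """Roll-style spread estimate: 2 * sqrt(-cov(dp_t, dp_{t-1})), or 0.0 if undefined."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    lagged, current = changes[:-1], changes[1:]
    if len(lagged) < 2:
        raise ValueError("need at least four prices")

    mean_lagged = sum(lagged) / len(lagged)
    mean_current = sum(current) / len(current)
    covariance = sum(
        (a - mean_lagged) * (b - mean_current) for a, b in zip(lagged, current)
    ) / (len(lagged) - 1)

    # The square root is undefined for a non-negative autocovariance;
    # returning zero in that case is an assumed convention.
    return 2 * sqrt(-covariance) if covariance < 0 else 0.0


# Hypothetical close prices bouncing between a bid of 99.5 and an ask of 100.5.
bouncing = [100.5, 99.5, 100.5, 99.5, 100.5, 99.5]
spread_estimate = roll_spread(bouncing)
```

The estimator relies on the bid-ask bounce inducing negative serial correlation in price changes, which is why a steadily trending series yields no usable estimate.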
Multilevel ecological data series (MEDS) are sequences of observations ordered according to temporal/spatial hierarchies that are defined by sample designs, with sample variability confined to ecological factors. Dendroclimatic MEDS of tree rings and climate are modeled into normalized fluctuations of tree growth and aridity. Modeled fluctuations (model frames) are compared with Mantel correlograms on multiple levels defined by sample design. The package implementation can be understood by running the examples in the modelFrame() and muleMan() functions.
Hardware-based support for the CRC32C cyclic redundancy checksum function is made available for x86_64 systems with SSE2 support as well as for arm64, and is detected at build time via cmake, with a software-based fallback. This functionality is exported at the C-language level for use by other packages. CRC32C is described in RFC 3720 at <https://datatracker.ietf.org/doc/html/rfc3720> and is based on Castagnoli et al <doi:10.1109/26.231911>.
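What a software-based fallback computes can be sketched in Python (this is not the package's C implementation). The sketch is a standard table-driven CRC using the reflected Castagnoli polynomial 0x82F63B78; the function and variable names are hypothetical.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def _build_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data, crc=0):
    """Return the CRC32C of `data`; pass a previous result as `crc` to continue a stream."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


checksum = crc32c(b"123456789")
```

The standard check value for the ASCII string `123456789` is 0xE3069283, which is a convenient way to confirm that a hardware path and a software path agree.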
This package provides a collection of ergonomic large language model assistants designed to help you complete repetitive, hard-to-automate tasks quickly. After selecting some code, press the keyboard shortcut you've chosen to trigger the package app, select an assistant, and watch your chore be carried out. While the package ships with a number of chore helpers for R package development, users can create custom helpers just by writing some instructions in a markdown file.