Allows the user to categorise a continuous predictor variable in a logistic or a Cox proportional hazards regression setting, by maximising the discriminative ability of the model. I Barrio, I Arostegui, MX Rodriguez-Alvarez, JM Quintana (2015) <doi:10.1177/0962280215601873>. I Barrio, MX Rodriguez-Alvarez, L Meira-Machado, C Esteban, I Arostegui (2017) <https://www.idescat.cat/sort/sort411/41.1.3.barrio-etal.pdf>.
This package provides methods and plotting functions for displaying categorical data on an interactive heatmap using plotly'. Provides functionality for strictly categorical heatmaps, heatmaps illustrating categorized continuous data and annotated heatmaps. Also, there are various options to interact with the x-axis to prevent overlapping axis labels, e.g. via simple sliders or range sliders. Besides the viewer pane, resulting plots can be saved as a standalone HTML file, embedded in R Markdown documents or in a Shiny app.
This package is Cytometry dATa anALYSis Tools (CATALYST). Mass cytometry like Cytometry by time of flight (CyTOF) uses heavy metal isotopes rather than fluorescent tags as reporters to label antibodies, thereby substantially decreasing spectral overlap and allowing for examination of over 50 parameters at the single cell level. While spectral overlap is significantly less pronounced in CyTOF than flow cytometry, spillover due to detection sensitivity, isotopic impurities, and oxide formation can impede data interpretability. CATALYST
was designed to provide a pipeline for preprocessing of cytometry data, including:
normalization using bead standards;
single-cell deconvolution;
bead-based compensation.
This package provides the tools to produce catseye plots, principally by catseyesplot()
function which calls R's standard plot()
function internally, or alternatively by the catseyes()
function to overlay the catseye plot onto an existing R plot window. Catseye plots illustrate the normal distribution of the mean (picture a normal bell curve reflected over its base and rotated 90 degrees), with a shaded confidence interval; they are an intuitive way of illustrating and comparing normally distributed estimates, and are arguably a superior alternative to standard confidence intervals, since they show the full distribution rather than fixed quantile bounds. The catseyesplot and catseyes functions require pre-calculated means and standard errors (or standard deviations), provided as numeric vectors; this allows the flexibility of obtaining this information from a variety of sources, such as direct calculation or prediction from a model. Catseye plots, as illustrations of the normal distribution of the means, are described in Cumming (2013 & 2014). Cumming, G. (2013). The new statistics: Why and how. Psychological Science, 27, 7-29. <doi:10.1177/0956797613504966> pmid:24220629.
This package provides functions for computing the one-sided p-values of the Cochran-Armitage trend test statistic for the asymptotic and the exact conditional test. The computation of the p-value for the exact test is performed using an algorithm following an idea by Mehta, et al. (1992) <doi:10.2307/1390598>.
To improve estimation accuracy and stability in statistical modeling, catalytic prior distributions are employed, integrating observed data with synthetic data generated from a simpler model's predictive distribution. This approach enhances model robustness, stability, and flexibility in complex data scenarios. The catalytic prior distributions are introduced by Huang et al. (2020, <doi:10.1073/pnas.1920913117>), Li and Huang (2023, <doi:10.48550/arXiv.2312.01411>
).
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. CatsCradle
allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.
Did you ever wish you could make scatter plots with cat shaped points? Now you can!
This package contains some commonly used categorical variable encoders, such as LabelEncoder
and OneHotEncoder
'. Inspired by the encoders implemented in Python sklearn.preprocessing package (see <http://scikit-learn.org/stable/modules/preprocessing.html>).
Calculates significant annotations (categories) in each of two (or more) feature (i.e. gene) lists, determines the overlap between the annotations, and returns graphical and tabular data about the significant annotations and which combinations of feature lists the annotations were found to be significant. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).
Datasets used in the book "Categorical Data Analysis" by Agresti (2012, ISBN:978-0-470-46363-5) but not printed in the book. Datasets and help pages were automatically produced from the source <https://users.stat.ufl.edu/~aa/cda/data.html> by the R script foo.R, which can be found in the GitHub
repository.
Simple, fast, and automatic encodings for category data using a data.table backend. Most of the methods are an implementation of "Sufficient Representation for Categorical Variables" by Johannemann, Hadad, Athey, Wager (2019) <arXiv:1908.09874>
, particularly their mean, sparse principal component analysis, low rank representation, and multinomial logit encodings.