Utility functions for the statistical analysis of corpus frequency data. This package is a companion to the open-source course "Statistical Inference: A Gentle Introduction for Computational Linguists and Similar Creatures" ('SIGIL').
An interactive document on the topic of classification tree analysis using rmarkdown and shiny packages. Runtime examples are provided in the package function as well as at <https://kartikeyab.shinyapps.io/CTShiny/>.
Package to analyze the clinical utility of a biomarker. It provides the clinical utility curve, clinical utility table, efficacy of a biomarker, clinical efficacy curve and tests to compare efficacy between markers.
This package contains one main function, deduped(), which speeds up slow, vectorized functions by performing computations only on the unique values of the input and expanding the results at the end.
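The unique-values idea described above can be sketched in a few lines. This is an illustrative Python sketch, not the package's R API: the names `deduped_apply` and `slow_upper` are hypothetical, and the sketch assumes the input values are hashable and that the wrapped function returns one result per input, in order.

```python
from typing import Callable, Hashable, List, Sequence


def deduped_apply(func: Callable[[List], Sequence], values: Sequence[Hashable]) -> List:
    """Apply a vectorized `func` to the unique values only, then expand.

    Hypothetical helper for illustration; assumes `func` returns one
    result per input element, in the same order.
    """
    # Unique values in first-seen order (dict keys preserve insertion order).
    unique = list(dict.fromkeys(values))
    # The expensive call sees each distinct value exactly once.
    results = func(unique)
    # Map each unique value to its result, then expand to the input length.
    lookup = dict(zip(unique, results))
    return [lookup[v] for v in values]


# Record what the "slow" function actually receives.
calls: List[str] = []


def slow_upper(xs: List[str]) -> List[str]:
    calls.extend(xs)
    return [x.upper() for x in xs]


out = deduped_apply(slow_upper, ["a", "b", "a", "a", "b"])
```

The saving grows with the share of repeated values; when every value is distinct, the wrapper only adds the cost of building the lookup.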
Build donut/pie charts with ggplot2 layer by layer, exploiting the advantages of polar symmetry. Leverage layouts to distribute labels effectively. Connect labels to donut segments using pins. Streamline annotation and highlighting.
This package provides methods for estimating multi-stage optimal dynamic treatment regimes for survival outcomes with dependent censoring. Cho, H., Holloway, S. T., and Kosorok, M. R. (2020) <arXiv:2012.03294>.
Work within the dplyr workflow to add random variates to your data frame. Variates can be added at any level of an existing column. Also, bounds can be specified for simulated variates.
This package contains a collection of examples of evidence factors in observational studies from the book Replication and Evidence Factors in Observational Studies by Paul R. Rosenbaum (2021) <doi:10.1201/9781003039648>.
Predicts cytoplasmic effector proteins from genomic data by searching for motifs of interest using regular expression searches and hidden Markov models (HMM), based on Haas et al. (2009) <doi:10.1038/nature08358>.
This SVG elements generator can easily generate SVG elements such as rect, line, circle, ellipse, polygon, polyline, text and group. Also, it can combine and output SVG elements into a SVG file.
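The build-elements-then-combine workflow described above might look like the following. This is a minimal Python sketch using only the standard library; the helper names `svg_element` and `svg_document` are hypothetical and do not reflect the package's own functions.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def svg_element(tag: str, **attrs) -> ET.Element:
    """Create one SVG element (hypothetical helper).

    Underscores in keyword names become hyphens, so `stroke_width=2`
    is written as the attribute `stroke-width="2"`.
    """
    return ET.Element(tag, {k.replace("_", "-"): str(v) for k, v in attrs.items()})


def svg_document(width: int, height: int, elements: Iterable[ET.Element]) -> str:
    """Combine elements under a root <svg> node and return the markup."""
    root = ET.Element(
        "svg",
        {"xmlns": "http://www.w3.org/2000/svg", "width": str(width), "height": str(height)},
    )
    root.extend(elements)
    return ET.tostring(root, encoding="unicode")


# A group containing a circle, plus a standalone rectangle.
group = svg_element("g", fill="none", stroke="black")
group.append(svg_element("circle", cx=50, cy=50, r=20))
rect = svg_element("rect", x=0, y=0, width=100, height=100, stroke_width=2)

doc = svg_document(100, 100, [rect, group])

# Writing the combined markup to a file would then be one more step, e.g.:
# with open("out.svg", "w", encoding="utf-8") as fh:
#     fh.write(doc)
```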
This hybrid model combines cointegration with a Time Delay Neural Network, allowing the researcher to use the information extracted by the cointegrating vector as an input to the neural network model.
Obtain Formula 1 data via the Jolpica API <https://jolpi.ca> and the unofficial API <https://www.formula1.com/en/timing/f1-live> via the fastf1 Python library <https://docs.fastf1.dev/>.
Finds the critical sample size ("critical point of stability") for a correlation to stabilize in Schoenbrodt and Perugini's definition of sequential stability (see <doi:10.1016/j.jrp.2013.05.009>).
This package provides ggplot2 geoms that allow groups of data points to be outlined or highlighted for emphasis. This is particularly useful when working with dense datasets that are prone to overplotting.
This package contains data sets, programmes and illustrations discussed in the book, "Introduction to Probability, Statistics and R: Foundations for Data-Based Sciences." Sahu (2024, isbn:9783031378645) describes the methods in detail.
Multivariate Expectation-Maximization (EM) based imputation framework that offers several different algorithms. These include regularisation methods like Lasso and Ridge regression, tree-based models and dimensionality reduction methods like PCA and PLS.
Various functions and a Shiny app to enrich the results of Multiple Correspondence Analysis with interpretive axes and planes (see Moschidis, Markos, and Thanopoulos, 2022; <doi:10.1108/ACI-07-2022-0191>).
This package provides a graph, proposed by Rosenbaum, that is useful for checking some properties of various sorts of latent scale. The program generates commands to obtain the graph using dot from graphviz.
Computes metabolite set enrichment analysis (MSEA) (Yamamoto, H. et al. (2014) <doi:10.1186/1471-2105-15-51>) and single sample enrichment analysis (SSEA) (Yamamoto, H. (2023) <doi:10.51094/jxiv.262>).
Fit multi-level models with possibly correlated random effects using Markov Chain Monte Carlo simulation. Such models allow smoothing over space and time and are useful in, for example, small area estimation.
This package provides a simple n-gram (contiguous sequences of n items from a given sequence of text) tokenizer to be used with the tm package, with no rJava/RWeka dependency.
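To make "contiguous sequences of n items" concrete, here is a minimal Python sketch of word-level n-gram tokenization. The function name `ngrams` and the whitespace-splitting rule are assumptions for illustration; the package's own tokenizer and its integration with tm may behave differently.

```python
from typing import List


def ngrams(text: str, n: int = 2) -> List[str]:
    """Return all contiguous word n-grams of `text` (hypothetical helper).

    Words are separated on whitespace, which is a simplifying assumption.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.split()
    # Slide a window of width n across the token list; the range is empty
    # when the text has fewer than n tokens.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = ngrams("the quick brown fox", 2)
trigrams = ngrams("the quick brown fox", 3)
```

A text shorter than n tokens yields an empty list rather than an error, which keeps the function safe to map over documents of varying length.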
Estimates win ratio or Mann-Whitney parameter for two group comparisons using ordered composite endpoints with right censoring as described in Follmann, Fay, Hamasaki, and Evans (2020) <doi:10.1002/sim.7890>.
Given a certain coverage level, obtains simultaneous confidence bands for the survival and cumulative hazard functions such that the area between is minimized. Produces an approximate solution based on local time arguments.
This wrapper houses PathLit API endpoints for R. Use of these endpoints requires an API key, which can be obtained at <https://www.pathlit.io/docs/cli/>.