Efficient computation of likelihoods in design-based choice response time models, including the Decision Diffusion Model, is supported. The package enables rapid evaluation of likelihood functions for both single- and multi-subject models across trial-level data. It also offers fast initialisation of starting parameters for genetic sampling with many Markov chains, facilitating estimation in complex models typically found in experimental psychology and behavioural science. These optimisations help reduce computational overhead in large-scale model fitting tasks.
This package provides Shiny gadgets to search, type, and insert IPA symbols into documents or scripts, requiring only knowledge of phonetics or X-SAMPA. It also provides functions to facilitate the rendering of IPA symbols in LaTeX and PDF format, so that IPA symbols are rendered properly in all output formats. A minimal R Markdown template for authoring Linguistics related documents is also bundled with the package. Some helper functions to facilitate authoring with R Markdown are also provided.
My PhD supervisor once told me that everyone doing newspaper analysis starts by writing code to read in files from the LexisNexis newspaper archive (retrieved e.g., from <https://www.lexisnexis.com/> or any of the partner sites). However, while this is a nice exercise I do recommend, not everyone has the time. This package takes files downloaded from the newspaper archive of LexisNexis, reads them into R and offers functions for further processing.
This package provides a set of functions useful when evaluating the results of presence-absence models. Package includes functions for calculating threshold dependent measures such as confusion matrices, pcc, sensitivity, specificity, and Kappa, and produces plots of each measure as the threshold is varied. It will calculate optimal threshold choice according to a choice of optimization criteria. It also includes functions to plot the threshold independent ROC curves along with the associated AUC (area under the curve).
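The threshold-dependent measures named above can be sketched as follows. This is a minimal illustration in Python, not the package's own code: the function name `threshold_measures`, the convention that a predicted probability at or above the threshold counts as a presence, and the example data are all assumptions made for this sketch. It also assumes both presences and absences occur in the data.

```python
def threshold_measures(observed, predicted, threshold):
    """Confusion-matrix measures for presence/absence predictions at one threshold.

    observed:  iterable of 0/1 values (absence/presence)
    predicted: iterable of predicted probabilities in [0, 1]
    Assumption: a probability >= threshold is classified as presence.
    """
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted):
        pred = prob >= threshold
        if pred and obs:
            tp += 1
        elif pred and not obs:
            fp += 1
        elif not pred and obs:
            fn += 1
        else:
            tn += 1

    n = tp + fp + tn + fn
    pcc = (tp + tn) / n                # percent correctly classified
    sensitivity = tp / (tp + fn)       # true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    # Cohen's kappa: agreement beyond what chance alone would produce
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (pcc - expected) / (1 - expected)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "pcc": pcc,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical example data
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
result = threshold_measures(observed, predicted, threshold=0.5)
```

Evaluating the function over a grid of thresholds gives the values needed to plot each measure as the threshold is varied.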
The RIT font collection provides versions of ten font families in Malayalam (the language spoken in the southern Indian state of Kerala) script in TrueType and WOFF2 formats. The fonts are: RIT Rachana, RIT Panmana, RIT MeeraNew, RIT TN Joy, RIT Karuna, RIT Keralayeeam, RIT Sundar, RIT Uroob, RIT Ezhuthu, and RIT Kutty.
A LaTeX package that will help users to make use of these Unicode-compliant fonts in LaTeX documents with XeTeX or LuaTeX is also provided.
Package to assess the calibration of probabilistic classifiers using confidence bands for monotonic functions. Besides testing the classical goodness-of-fit null hypothesis of perfect calibration, the confidence bands calculated within this package facilitate inverted goodness-of-fit tests whose rejection allows for a sought-after conclusion of a sufficiently well-calibrated model. The package creates flexible graphical tools to perform these tests. For construction details see also Dimitriadis, Dümbgen, Henzi, Puke, Ziegel (2022) <arXiv:2203.04065>.
Online data collection tools like Google Forms often export multiple-response questions with data concatenated in cells. The concat.split (cSplit) family of functions provided by this package splits such data into separate cells. This package also includes functions to stack groups of columns and to reshape wide data, even when the data are "unbalanced"---something which reshape (from base R) does not handle, and which melt and dcast from reshape2 do not easily handle.
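The idea of splitting concatenated cells can be illustrated with a short Python sketch. It is not the package's R implementation: the function name `split_concatenated`, the `_1`, `_2`, ... column suffixes, and the sample rows are assumptions chosen for this example. Rows with fewer responses are padded with `None`, which is one simple way to cope with unbalanced data.

```python
def split_concatenated(rows, column, sep=","):
    """Split a delimiter-concatenated column into separate columns.

    rows:   list of dicts, one per record
    column: name of the column holding concatenated values
    Rows with fewer values than the widest row are padded with None.
    """
    # Width of the widest cell determines how many new columns are needed
    width = max(len(row[column].split(sep)) for row in rows)
    result = []
    for row in rows:
        parts = [part.strip() for part in row[column].split(sep)]
        parts += [None] * (width - len(parts))
        # Keep every other column unchanged
        new_row = {key: value for key, value in row.items() if key != column}
        for index, part in enumerate(parts, start=1):
            new_row[f"{column}_{index}"] = part
        result.append(new_row)
    return result


# Hypothetical survey export with a multiple-response question
survey = [
    {"id": 1, "likes": "tea, coffee, juice"},
    {"id": 2, "likes": "coffee"},
]
wide = split_concatenated(survey, "likes")
```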
A Shiny app can disconnect for a variety of reasons: an unrecoverable error occurred in the app, the server went down, the user lost internet connection, or any other reason that might cause the Shiny app to lose connection to its server. With shinydisconnect, you can call disconnectMessage anywhere in a Shiny app's UI to add a nice message when this happens. It works locally (running Shiny apps within RStudio) and on Shiny servers.
This package provides tools for extracting word and phrase frequencies from the Child Language Data Exchange System (CHILDES) database via the childesr API. Supports type-level word counts, token-mode searches with simple wildcard patterns and part-of-speech filters, optional stemming, and Zipf-scaled frequencies. Provides normalization per number of tokens or utterances, speaker-role breakdowns, dataset summaries, and export to Excel workbooks for reproducible child language research. The CHILDES database is maintained at <https://talkbank.org/childes/>.
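Normalization per number of tokens and Zipf scaling can be sketched as below. This is an illustration only, assuming the commonly used definition of the Zipf scale as log10 of the frequency per million tokens plus 3; the package may apply a different variant (for example with smoothing). The function names and the counts are hypothetical.

```python
import math


def per_million(count, total_tokens):
    """Raw count normalized to occurrences per million tokens."""
    return count / total_tokens * 1_000_000


def zipf_scale(count, total_tokens):
    """Zipf-scaled frequency: log10(frequency per million tokens) + 3.

    Assumes count > 0; no smoothing is applied in this sketch.
    """
    return math.log10(per_million(count, total_tokens)) + 3.0


# Hypothetical counts: a word seen 1,000 times in a corpus of 1,000,000 tokens
frequency = per_million(1000, 1_000_000)
zipf = zipf_scale(1000, 1_000_000)
```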
Modular and unified R6-based interface for counterfactual explanation methods. The following methods are currently implemented: Burghmans et al. (2022) <doi:10.48550/arXiv.2104.07411>, Dandl et al. (2020) <doi:10.1007/978-3-030-58112-1_31> and Wexler et al. (2019) <doi:10.1109/TVCG.2019.2934619>. Optional extensions allow these methods to be applied to a variety of models and use cases. Once generated, the counterfactuals can be analyzed and visualized by provided functionalities.
This package provides a flexible tool for enrichment analysis based on user-defined sets. It allows users to perform over-representation analysis of the custom sets among any specified ranked feature list, hence making enrichment analysis applicable to various types of data from different scientific fields. EnrichIntersect also enables an interactive means to visualize identified associations based on, for example, the mix-lasso model (Zhao et al., 2022 <doi:10.1016/j.isci.2022.104767>) or similar methods.
Find topics in texts which are semantically embedded using techniques like word2vec or GloVe. This topic modelling technique models each word with a categorical distribution whose natural parameter is the inner product between a word embedding and an embedding of its assigned topic. The techniques are explained in detail in the paper Topic Modeling in Embedding Spaces by Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei (2019), available at <doi:10.48550/arXiv.1907.04907>.
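The per-topic word distribution described above can be sketched in a few lines of Python: the inner products between the word embeddings and a topic embedding act as natural parameters, and a softmax turns them into categorical probabilities over the vocabulary. This is a toy illustration, not the package's implementation; the function name `word_distribution` and the two-dimensional embeddings are made up for the example.

```python
import math


def word_distribution(word_embeddings, topic_embedding):
    """Categorical distribution over words for one topic.

    The natural parameter of each word is the inner product between
    its embedding and the topic embedding; a softmax normalizes them.
    """
    logits = [
        sum(w * t for w, t in zip(vector, topic_embedding))
        for vector in word_embeddings
    ]
    # Subtract the maximum before exponentiating for numerical stability
    highest = max(logits)
    weights = [math.exp(logit - highest) for logit in logits]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical 2-dimensional embeddings for a three-word vocabulary
word_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
topic_embedding = [2.0, 0.0]
probabilities = word_distribution(word_embeddings, topic_embedding)
```

Words whose embeddings point in the same direction as the topic embedding receive the highest probability under that topic.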
Conduct post-selection inference for regression coefficients in linear models after they have been selected by adjusted R squared. The p-values and confidence intervals are valid after model selection with the same data. This allows the user to use all data for both model selection and inference without losing control over the type I error rate. The provided tests are more powerful than data splitting, which bases inference on less data since it discards all information used for selection.
Sequences sampled at different time points can be used to infer molecular phylogenies on natural time scales, but if the sequences are labelled with sampling times that differ from the actual sampling times, the molecular phylogenetic analysis will be affected. This Shiny application helps explore temporal characteristics of evolutionary trees through linear regression analysis, with the ability to identify and remove incorrect labels. The method was extended to support exploring other phylogenetic signals under strict and relaxed models.
This package contains a collection of 9 datasets: andrews and bakulski cord blood, blood gse35069, blood gse35069 chen, blood gse35069 complete, combined cord blood, cord blood gse68456, gervin and lyle cord blood, guintivano dlpfc and saliva gse48472. The data are used to estimate cell counts using the Extrinsic epigenetic age acceleration (EEAA) method. It also contains a collection of 12 datasets to use with the MethylClock package to estimate chronological and gestational DNA methylation with estimators to use with different methylation clocks.
Fit growth curves to various known microbial growth models automatically to estimate growth parameters. Growth curves can be plotted with their uncertainty band. Growth models are: modified Gompertz model (Zwietering et al. (1990) <doi:10.1128/aem.56.6.1875-1881.1990>), Baranyi model (Baranyi and Roberts (1994) <doi:10.1016/0168-1605%2894%2990157-0>), Rosso model (Rosso et al. (1993) <doi:10.1006/jtbi.1993.1099>) and linear model (Dantigny (2005) <doi:10.1016/j.ijfoodmicro.2004.10.013>).
This package implements sentiment analysis using Hugging Face <https://huggingface.co> transformer zero-shot classification model pipelines for text and image data. The default text pipeline is Cross-Encoder's DistilRoBERTa <https://huggingface.co/cross-encoder/nli-distilroberta-base> and the default image/video pipeline is OpenAI's CLIP <https://huggingface.co/openai/clip-vit-base-patch32>. All other zero-shot classification model pipelines can be implemented using their model name from <https://huggingface.co/models?pipeline_tag=zero-shot-classification>.
This package provides the data that were used in the bedtools tutorial at <http://quinlanlab.org/tutorials/bedtools/bedtools.html>. It includes a subset of the DnaseI hypersensitivity data from "Maurano et al. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Science. 2012. Vol. 337 no. 6099 pp. 1190-1195." The rest of the tracks were originally downloaded from the UCSC table browser. See the HelloRanges vignette for a port of the bedtools tutorial to R.
The currentSurvival package contains functions for the estimation of the current cumulative incidence (CCI) and the current leukaemia-free survival (CLFS). The CCI is the probability that a patient is alive and in any disease remission (e.g. complete cytogenetic remission in chronic myeloid leukaemia) after initiating his or her therapy (e.g. tyrosine kinase therapy for chronic myeloid leukaemia). The CLFS is the probability that a patient is alive and in any disease remission after achieving the first disease remission.
Test the marginal correlation between a scalar response variable and a vector of explanatory variables using the max-type test with bootstrap. The test is based on the max-type statistic and its asymptotic distribution under the null hypothesis of no marginal correlation. The bootstrap procedure is used to approximate the null distribution of the test statistic. The package provides a function for performing the test. For more technical details, refer to Zhang and Laber (2014) <doi:10.1080/01621459.2015.1106403>.
Volcano plots represent a useful way to visualise the results of differential expression analyses. This package provides a highly-configurable function that produces publication-ready volcano plots. EnhancedVolcano will attempt to fit as many point labels in the plot window as possible, thus avoiding clogging up the plot with labels that could not otherwise have been read. Other functionality allows the user to identify up to 4 different types of attributes in the same plot space via color, shape, size, and shade parameter configurations.
This package provides functions to calculate the requisite sample size for studies where ICC is the primary outcome. It can also be used for calculation of power. In both cases it allows the user to test the impact of changing input variables by calculating the outcome for several different values of input variables. Based on the work of Zou: Zou, G. Y. (2012). Sample size formulas for estimating intraclass correlation coefficients with precision and assurance. Statistics in medicine, 31(29), 3972-3981.
Values different types of assets and calibrates discount curves for quantitative financial analysis. It covers fixed coupon assets, floating note assets, and interest and cross currency swaps with different payment frequencies. It enables the calibration of spot, instantaneous forward and basis curves, making it a powerful tool for accurate and flexible bond valuation and curve generation. The valuation and calibration techniques presented here are consistent with industry standards and incorporate the author's own calculations. Tuckman, B., Serrat, A. (2022, ISBN: 978-1-119-83555-4).
Programs for detecting and cleaning outliers in single time series and in time series from homogeneous and heterogeneous databases using an Orthogonal Greedy Algorithm (OGA) for saturated linear regression models. The programs implement the procedures presented in the paper entitled "Efficient Outlier Detection for Large Time Series Databases" by Pedro Galeano, Daniel Peña and Ruey S. Tsay (2025), working paper, Universidad Carlos III de Madrid. Version 1.1.1 contains some improvements in parallelization with respect to version 1.0.1.