Imports conversation transcripts into R, concatenates them into a single dataframe while appending event identifiers, cleans and formats the text, then yokes user-specified psycholinguistic database values to each word. ConversationAlign then computes alignment indices between two interlocutors across each transcript for more than 40 possible semantic, lexical, and affective dimensions. In addition to alignment, ConversationAlign also produces a summary table of analytics (e.g., token count, type-token ratio) describing your particular text corpus.
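As an illustration of one of the summary analytics named above, the type-token ratio is the number of distinct words divided by the total number of words. The sketch below is not ConversationAlign's implementation; the function name and the lowercase, whitespace-based tokenization are assumptions made only for the example.

```python
def type_token_ratio(text):
    """Distinct tokens divided by total tokens (0.0 for empty text).

    Assumption: tokens are lowercase, whitespace-separated words; a real
    pipeline would also strip punctuation and apply its own cleaning rules.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    # "the" occurs twice, so there are 4 types among 5 tokens.
    print(type_token_ratio("The cat saw the dog"))
```

A ratio near 1 indicates little word repetition, while a low ratio indicates a small vocabulary reused many times.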
Implementation of the Future API <doi:10.32614/RJ-2021-048> on top of the batchtools package. This allows you to process futures, as defined by the future package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute ('HPC') job schedulers such as LSF, OpenLava, Slurm, SGE, and TORQUE / PBS, e.g. y <- future.apply::future_lapply(files, FUN = process).
Inference of ligand-receptor (L-R) interactions from single-cell expression (transcriptomics/proteomics) data. SingleCellSignalR v2 inferences rely on the statistical model we introduced in the BulkSignalR package as well as the original SingleCellSignalR LR-score (both are available). SingleCellSignalR v2 can be regarded as a wrapper around the fundamental BulkSignalR classes. This also enables v2 users to work with any species, whereas only Mus musculus and Homo sapiens were available in SingleCellSignalR v1.
Testing for trajectory presence and heterogeneity on multivariate data. Two statistical methods (Tenha & Song 2022) <doi:10.1371/journal.pcbi.1009829> are implemented. The tree dimension test quantifies the statistical evidence for trajectory presence. The subset specificity measure summarizes pattern heterogeneity using the minimum subtree cover. There are no user-tunable parameters for either method. Examples are included to illustrate how to use the methods on single-cell data for studying gene and pathway expression dynamics and pathway expression specificity.
The package takes data.frame inputs and uses methods, such as recursive feature engineering, to remove features. Unlike other packages, it gives you the choice of removing variables manually or automating the process. Feature selection is a concept in machine learning and statistical pipelines whereby unimportant or less predictive variables are eliminated from the analysis; see Boughaci (2018) <doi:10.1007/s40595-018-0107-y>.
Implements a tableGrob object as a clickable image map. The clickableImageMap package is designed to be more convenient and more configurable than the edit() function. Limitations that I have encountered with edit() are that it cannot control (1) positioning, (2) size, or (3) appearance and formatting of fonts. In contrast, when the table is implemented as a tableGrob, all of these features are controllable. In particular, the ggplot2 grid system allows exact positioning of the table relative to other graphics.
The XCB util module provides a number of libraries which sit on top of libxcb, the core X protocol library, and some of the extension libraries. These experimental libraries provide convenience functions and interfaces which make the raw X protocol more usable. Some of the libraries also provide client-side code which is not strictly part of the X protocol but which has traditionally been provided by Xlib.
The XCB util-renderutil module provides the following library:
- renderutil: Convenience functions for the Render extension.
The arithmetic operations on LR fuzzy numbers (scalar multiplication, addition, subtraction, multiplication and division), which are based on the extension principle, have a complicated form for use in fuzzy statistics, fuzzy mathematics, machine learning, fuzzy data analysis, etc. The Calculator for LR Fuzzy Numbers package helps applied users obtain a simple, closed form for some complicated operators on LR fuzzy numbers, and lets the user easily draw the membership function of the obtained result.
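To make the idea of a closed form concrete, here is a minimal sketch for the special case of triangular LR fuzzy numbers, assuming both shape functions are L(x) = R(x) = max(0, 1 - x). In that case addition, subtraction and scalar multiplication reduce to arithmetic on a (value, left spread, right spread) triple. The class and function names are hypothetical and are not the package's API. Multiplication and division are omitted because the product or quotient of two triangular numbers is in general not triangular.

```python
from typing import NamedTuple


class LRFuzzy(NamedTuple):
    """Triangular LR fuzzy number: core value m, left spread, right spread."""
    m: float
    left: float
    right: float


def add(a, b):
    # Cores and matching spreads simply add.
    return LRFuzzy(a.m + b.m, a.left + b.left, a.right + b.right)


def subtract(a, b):
    # The spreads of b swap sides because -b mirrors b around zero.
    return LRFuzzy(a.m - b.m, a.left + b.right, a.right + b.left)


def scale(k, a):
    # A negative factor mirrors the number, so left and right spreads swap.
    if k >= 0:
        return LRFuzzy(k * a.m, k * a.left, k * a.right)
    return LRFuzzy(k * a.m, -k * a.right, -k * a.left)


def membership(a, x):
    """Membership degree of x, assuming the triangular shape function."""
    if x == a.m:
        return 1.0
    spread = a.left if x < a.m else a.right
    if spread == 0:
        return 0.0
    return max(0.0, 1.0 - abs(x - a.m) / spread)


if __name__ == "__main__":
    print(add(LRFuzzy(2, 1, 1), LRFuzzy(3, 0.5, 2)))
    print(membership(LRFuzzy(2, 1, 1), 1.5))
```

Evaluating membership() over a grid of x values gives the points needed to draw the membership function of a result.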
The package provides position-specific weight matrices (PWMs) for 303 human serine/threonine and 93 tyrosine kinases originally published in Johnson et al. 2023 (doi:10.1038/s41586-022-05575-3) and Yaron-Barir et al. 2024 (doi:10.1038/s41586-024-07407-y). The package includes basic functionality to score user-provided phosphosites. It also includes pre-computed PWM scores ("background scores") for a large collection of curated human phosphosites, which can be used to rank PWM scores relative to the background scores ("percentile rank").
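The percentile-rank step can be sketched as follows. This is an illustration only, not the package's code: the function name is hypothetical, the background values are made up, and counting ties as "at or below" is an assumption, since the description does not say how ties are handled.

```python
from bisect import bisect_right


def percentile_rank(score, background_scores):
    """Percentage of background scores that are less than or equal to score.

    Assumption: ties count as "at or below"; other conventions exist.
    """
    if not background_scores:
        raise ValueError("background_scores must not be empty")
    ordered = sorted(background_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


if __name__ == "__main__":
    # Illustrative numbers only; real background scores come from the package.
    print(percentile_rank(2.5, [1.0, 2.0, 3.0, 4.0]))
```

Ranking against a background makes scores from different kinases comparable, because each raw PWM score is expressed relative to the same collection of reference sites.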
With this package you can build a Storable instance of a record type from Storable instances of its elements in an elegant way. It does not do any magic, just a bit of arithmetic to compute the right offsets, which would otherwise be done manually or by a preprocessor like C2HS. There is no guarantee that the generated memory layout is compatible with that of a corresponding C struct. However, the module generates the smallest layout that is possible with respect to the alignment of the record elements.
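The offset arithmetic mentioned above can be sketched in a few lines. The example is written in Python rather than Haskell and is not the package's code: it assumes fields are placed in declaration order, each rounded up to its own alignment, and that the total size is padded to the largest field alignment so that arrays of records stay aligned. Whether the library pads the total size this way is an assumption.

```python
def round_up(offset, alignment):
    """Smallest multiple of alignment that is >= offset."""
    return (offset + alignment - 1) // alignment * alignment


def record_layout(fields):
    """Compute field offsets for a record.

    fields: list of (size, alignment) pairs in declaration order.
    Returns (offsets, total_size, record_alignment).
    """
    offsets = []
    offset = 0
    record_alignment = 1
    for size, alignment in fields:
        offset = round_up(offset, alignment)  # insert padding if needed
        offsets.append(offset)
        offset += size
        record_alignment = max(record_alignment, alignment)
    # Assumption: trailing padding so consecutive records stay aligned.
    total_size = round_up(offset, record_alignment)
    return offsets, total_size, record_alignment


if __name__ == "__main__":
    # A 1-byte field, a 4-byte field aligned to 4, and a 2-byte field aligned to 2.
    print(record_layout([(1, 1), (4, 4), (2, 2)]))
```

In this example three bytes of padding follow the first field, which shows why field order affects the size of a layout.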
Inference methods for factor copula models for continuous data in Krupskii and Joe (2013) <doi:10.1016/j.jmva.2013.05.001>, Krupskii and Joe (2015) <doi:10.1016/j.jmva.2014.11.002>, Fan and Joe (2024) <doi:10.1016/j.jmva.2023.105263>, one-factor truncated vine models in Joe (2018) <doi:10.1002/cjs.11481>, and Gaussian oblique factor models. Functions for computing the tail-weighted dependence measures in Lee, Joe and Krupskii (2018) <doi:10.1080/10485252.2017.1407414> and for estimating the tail dependence parameter.
Generative Adversarial Networks are applied to generate generative data for a data source. A generative model consisting of a generator and a discriminator network is trained. During iterative training the distribution of the generated data converges to that of the data source. The package provides functions that apply generative data directly to data evaluation, missing data completion and data classification. A software service for accelerated training of generative models on graphics processing units is available. Reference: Goodfellow et al. (2014) <doi:10.48550/arXiv.1406.2661>.
This package provides an implementation of simplicial complexes for Topological Data Analysis (TDA). The package includes functions to compute faces, boundary operators, Betti numbers, Euler characteristic, and to construct simplicial complexes. It also implements persistent homology, from building filtrations to computing persistence diagrams, with the aim of helping readers understand the core concepts of computational topology. Methods are based on standard references in persistent homology such as Zomorodian and Carlsson (2005) <doi:10.1007/s00454-004-1146-y> and Chazal and Michel (2021) <doi:10.3389/frai.2021.667963>.
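For readers who want to see the core computations spelled out, here is a minimal sketch of faces, the Euler characteristic and Betti numbers. It is not the package's code, and all function names are hypothetical. One assumption matters for the results: boundary ranks are computed with coefficients in GF(2), so the Betti numbers are those over Z/2, which can differ from rational Betti numbers for complexes with torsion.

```python
from itertools import combinations


def faces_by_dimension(maximal_simplices):
    """All nonempty faces of the given simplices, grouped by dimension."""
    faces = set()
    for simplex in maximal_simplices:
        vertices = sorted(simplex)
        for k in range(1, len(vertices) + 1):
            faces.update(combinations(vertices, k))
    by_dim = {}
    for face in faces:
        by_dim.setdefault(len(face) - 1, []).append(face)
    return {d: sorted(fs) for d, fs in by_dim.items()}


def euler_characteristic(by_dim):
    """Alternating sum of the number of faces in each dimension."""
    return sum((-1) ** d * len(fs) for d, fs in by_dim.items())


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are ints used as bit vectors."""
    pivots = {}  # highest set bit -> reduced row
    rank = 0
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                rank += 1
                break
            row ^= pivots[high]  # eliminate the leading bit
    return rank


def betti_numbers(by_dim):
    """Betti numbers over GF(2): dim C_d - rank(d_d) - rank(d_{d+1})."""
    top = max(by_dim)
    ranks = {0: 0, top + 1: 0}
    for d in range(1, top + 1):
        index = {face: i for i, face in enumerate(by_dim[d - 1])}
        rows = []
        for simplex in by_dim[d]:
            bits = 0
            for i in range(len(simplex)):
                # Dropping one vertex gives a codimension-1 face.
                bits |= 1 << index[simplex[:i] + simplex[i + 1:]]
            rows.append(bits)
        ranks[d] = rank_gf2(rows)
    return [len(by_dim[d]) - ranks[d] - ranks[d + 1] for d in range(top + 1)]


if __name__ == "__main__":
    hollow_triangle = faces_by_dimension([(0, 1), (1, 2), (0, 2)])
    print(euler_characteristic(hollow_triangle), betti_numbers(hollow_triangle))
```

The hollow triangle has one connected component and one loop, and filling it in with the 2-simplex (0, 1, 2) removes the loop.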
Interactive visualization of effects, response functions and marginal effects for different kinds of regression models. In this version linear regression models, generalized linear models, generalized additive models and linear mixed-effects models are supported. Major features are the interactive approach and the handling of the effects of categorical covariates: if two or more factors are used as covariates every combination of the levels of each factor is treated separately. The automatic calculation of marginal effects and a number of possibilities to customize the graphical output are useful features as well.
Nested Partially Balanced Bipartite Block (NPBBB) designs involve two levels of blocking: (i) The block design (ignoring sub-block classification) serves as a partially balanced bipartite block (PBBB) design, and (ii) The sub-block design (ignoring block classification) also serves as a PBBB design. More details on constructions of the PBBB designs and their characterization properties are available in Vinayaka et al. (2023) <doi:10.1080/03610926.2023.2251623>. This package calculates A-efficiency values for both block and sub-block structures, along with all parameters of a given NPBBB design.
This package implements Surprisal analysis for gene expression data such as RNA-seq or microarray experiments. Surprisal analysis is an information-theoretic method that decomposes gene expression data into a baseline state and constraint-associated deviations, capturing coordinated gene expression patterns under different biological conditions. References: Kravchenko-Balasha N. et al. (2014) <doi:10.1371/journal.pone.0108549>. Zadran S. et al. (2014) <doi:10.1073/pnas.1414714111>. Su Y. et al. (2019) <doi:10.1371/journal.pcbi.1007034>. Bogaert K. A. et al. (2018) <doi:10.1371/journal.pone.0195142>.
This package provides a set of tools to analyze and visualize the relationships between host-associated microbiomes of hybrid organisms and those of their progenitor species. Though not necessary, installing the microViz package is recommended as a check for phyloseq objects. To install microViz from R Universe use the following command: install.packages("microViz", repos = c(davidbarnett = "https://david-barnett.r-universe.dev", getOption("repos"))). To install microViz from GitHub use the following commands: install.packages("devtools") followed by devtools::install_github("david-barnett/microViz").
This package implements the methodology of Huling, Smith, and Chen (2020) <doi:10.1080/01621459.2020.1801449>, which allows for subgroup identification for semi-continuous outcomes by estimating individualized treatment rules. It uses a two-part modeling framework to handle semi-continuous data by separately modeling the positive part of the outcome and an indicator of whether each outcome is positive, but still results in a single treatment rule. High dimensional data is handled with a cooperative lasso penalty, which encourages the coefficients in the two models to have the same sign.
This package implements the methods for assessing heterogeneous cluster-specific treatment effects in partially nested designs as described in Liu (2024) <doi:10.1037/met0000723>. The estimation uses the multiply robust method, allowing for the use of machine learning methods in model estimation (e.g., random forest, neural network, and the super learner ensemble). Partially nested designs (also known as partially clustered designs) are designs where individuals in the treatment arm are assigned to clusters (e.g., teachers, tutoring groups, therapists), whereas individuals in the control arm have no such clustering.
This package provides influence function-based methods to evaluate a longitudinal surrogate marker in a censored time-to-event outcome setting, with plug-in and targeted maximum likelihood estimation options. Details are described in: Agniel D and Parast L (2025). "Robust Evaluation of Longitudinal Surrogate Markers with Censored Data." Journal of the Royal Statistical Society: Series B <doi:10.1093/jrsssb/qkae119>. A tutorial for this package can be found at <https://www.laylaparast.com/survivalsurrogate> and a Shiny App implementing the package can be found at <https://parastlab.shinyapps.io/survivalsurrogateApp/>.
This package provides a metric expressing the quality of a UMAP layout. It contains the Saturn_coefficient() function, which reads an input matrix and its dimensionality reduction produced by UMAP, and evaluates the quality of this dimensionality reduction by producing a real value in the [0, 1] interval. We call this real value the Saturn coefficient. A higher value means better dimensionality reduction; a lower value means worse dimensionality reduction. Reference: Davide Chicco et al. "The Saturn coefficient for evaluating the quality of UMAP dimensionality reduction results" (2025, in preparation).
This is an R package to compute the automorphisms between pairwise aligned DNA sequences represented as elements of a genomic Abelian group. In a general scenario, anything from genomic regions to whole genomes of a given population (from any species or closely related species) can be algebraically represented as a direct sum of cyclic groups or, more specifically, Abelian p-groups. Basically, we propose representing multiple sequence alignments of length N bp as elements of a finite Abelian group created by the direct sum of homocyclic Abelian groups of prime-power order.
This package provides a basic implementation of the change-in-mean detection method outlined in: Taylor, Wayne A. (2000) <https://variation.com/wp-content/uploads/change-point-analyzer/change-point-analysis-a-powerful-new-tool-for-detecting-changes.pdf>. The package recursively uses the mean-squared error change point calculation to identify candidate change points. The candidate change points are then re-estimated, and Taylor's backwards elimination process is employed to arrive at a final set of change points. Many of the underlying functions are written in C++ for improved performance.
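The mean-squared error change point calculation for a single split can be sketched as below. This is a hedged illustration with a hypothetical function name, not the package's C++ code: it tries every split of the series into two segments and keeps the one whose segment means fit the data with the smallest mean-squared error. The recursion over sub-segments, the re-estimation step and Taylor's backwards elimination are not shown.

```python
def best_mse_split(values):
    """Return (index, mse) for the best single change in mean.

    index is the position where the second segment starts, so the segments
    are values[:index] and values[index:].
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    best_index, best_error = None, float("inf")
    for m in range(1, n):
        left, right = values[:m], values[m:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        # Squared deviations of each point from its own segment mean.
        squared_error = sum((v - mean_left) ** 2 for v in left)
        squared_error += sum((v - mean_right) ** 2 for v in right)
        error = squared_error / n
        if error < best_error:
            best_index, best_error = m, error
    return best_index, best_error


if __name__ == "__main__":
    print(best_mse_split([1, 1, 1, 5, 5, 5]))
```

This direct version recomputes the segment means for every split and is therefore quadratic in the series length; running sums would make a single pass linear.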
Implementation of selected Tidyverse functions within DataSHIELD, an open-source federated analysis solution in R. Currently, DataSHIELD contains very limited tools for data manipulation, so the aim of this package is to improve the researcher experience by implementing essential functions for data manipulation, including subsetting, filtering, grouping, and renaming variables. This is the clientside package, which should be installed locally and is used in conjunction with the serverside package dsTidyverse, which is installed on the remote server holding the data. For more information, see <https://tidyverse.org/> and <https://datashield.org/>.