The CEMiTool package unifies the discovery and analysis of coexpression gene modules in a fully automatic manner, while providing a user-friendly HTML report with high-quality graphs. The tool evaluates whether modules contain genes that are over-represented in specific pathways or that are altered in a specific sample group. Additionally, CEMiTool can integrate transcriptomic data with interactome information, identifying the potential hubs in each network.
This package provides reference data required for EWCE. Expression Weighted Celltype Enrichment (EWCE) is used to determine which cell types are enriched within gene lists. The package provides tools for testing enrichments within simple gene lists (such as human disease-associated genes) and those resulting from differential expression studies. The package does not depend upon any particular Single Cell Transcriptome dataset, and user-defined datasets can be loaded and used in the analyses.
This package implements Bayesian dynamic factor analysis with Stan. Dynamic factor analysis is a dimension reduction tool for multivariate time series. bayesdfa extends conventional dynamic factor models in several ways. First, extreme events may be estimated in the latent trend by modeling process error with a Student-t distribution. Second, alternative constraints (including proportions) are allowed. Third, the estimated dynamic factors can be analyzed with hidden Markov models to evaluate support for latent regimes.
In the context of high-throughput genetic data, CoDaCoRe identifies a set of sparse biomarkers that are predictive of a response variable of interest (Gordon-Rodriguez et al., 2021) <doi:10.1093/bioinformatics/btab645>. More generally, CoDaCoRe can be applied to any regression problem where the independent variable is compositional (CoDa), to derive a set of scale-invariant log-ratios (ILR or SLR) that are maximally associated with a dependent variable.
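As an illustration of why such log-ratios are scale-invariant, the following minimal Python sketch computes an ILR-style balance and a summed log-ratio for one sample. It is not CoDaCoRe's API or its selection algorithm: the function names, the toy counts, and the fixed numerator/denominator index sets are assumptions chosen for the example.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance_log_ratio(sample, numerator_idx, denominator_idx):
    """ILR-style balance: scaled log-ratio of two geometric means."""
    r, s = len(numerator_idx), len(denominator_idx)
    num = geometric_mean([sample[i] for i in numerator_idx])
    den = geometric_mean([sample[i] for i in denominator_idx])
    return math.sqrt(r * s / (r + s)) * math.log(num / den)

def summed_log_ratio(sample, numerator_idx, denominator_idx):
    """Summed log-ratio: log of the ratio of two amalgamated (summed) parts."""
    num = sum(sample[i] for i in numerator_idx)
    den = sum(sample[i] for i in denominator_idx)
    return math.log(num / den)

# Hypothetical toy sample: raw counts and the same sample as proportions.
counts = [120.0, 30.0, 50.0, 800.0]
proportions = [c / sum(counts) for c in counts]

# Hypothetical choice of parts; a real analysis would learn these from data.
numerator_idx, denominator_idx = [0, 1], [2, 3]

ilr_counts = balance_log_ratio(counts, numerator_idx, denominator_idx)
ilr_props = balance_log_ratio(proportions, numerator_idx, denominator_idx)
slr_counts = summed_log_ratio(counts, numerator_idx, denominator_idx)
slr_props = summed_log_ratio(proportions, numerator_idx, denominator_idx)
```

Rescaling the sample (counts versus proportions) leaves both log-ratios unchanged, which is the property that makes them suitable features for compositional data.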
Easy visualization for datasets with more than two categorical variables and additional continuous variables. The package is particularly useful for exploring complex categorical data in the context of pathway analysis across multiple conditions. This package is now in maintenance-only mode and kept for legacy compatibility; for new projects and active development, please use the successor package ggdiceplot (see <https://github.com/maflot/ggdiceplot> and <https://dice-and-domino-plot.readthedocs.io/en/latest/>).
Easily import multi-frequency acoustic data stored in HAC files (see <doi:10.17895/ices.pub.5482> for more information on the format), and produce echogram visualisations with predefined or customized color palettes. It is also possible to merge consecutive echograms; mask or delete unwanted echogram areas; model and subtract background noise; and, more importantly, develop, test and interpret different combinations of frequencies in order to perform acoustic filtering of the echogram's data.
Interface to Eurostat's API (SDMX 2.1) with fast data.table-based import of data, labels, and metadata. On top of the core functionality, data search and data description/comparison functions are also provided. Use <https://github.com/alekrutkowski/eurodata_codegen>, a point-and-click app for rapid and easy generation of richly-commented R code, to import a Eurostat dataset or its subset (based on the eurodata::importData() function).
Decision curve analysis is a method for evaluating and comparing prediction models that incorporates clinical consequences, requires only the data set on which the models are tested, and can be applied to models that have either continuous or dichotomous results. The ggscidca package adds coloured bars of discriminant relevance to the traditional decision curve, improving its practicality and aesthetics. This method was described by Balachandran VP (2015) <doi:10.1016/S1470-2045(14)71116-7>.
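The quantity plotted on a traditional decision curve is the net benefit at each threshold probability. The Python sketch below computes it using the standard definition (true positives minus threshold-weighted false positives, per patient); it is not the ggscidca API and does not reproduce the coloured bars. The function names and the toy outcomes and predicted risks are assumptions made for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy data: observed outcomes and model-predicted risks.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.8]

# A decision curve is net benefit evaluated over a grid of thresholds.
thresholds = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
curve = [(t, net_benefit(y_true, y_prob, t)) for t in thresholds]
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.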
Used for analyzing immune responses and predicting vaccine efficacy using machine learning and advanced data processing techniques. Immunaut integrates both unsupervised and supervised learning methods, managing outliers and capturing immune response variability. It performs multiple rounds of predictive model testing to identify robust immunogenicity signatures that can predict vaccine responsiveness. The platform is designed to handle high-dimensional immune data, enabling researchers to uncover immune predictors and refine personalized vaccination strategies across diverse populations.
Collection of functions for fast manipulation, handling, and analysis of large-scale networks based on family and social data. The functions are utilities for manipulating data in three "formats": sparse adjacency matrices, pedigree trio family data, and pedigree family data. When possible, the functions should be able to handle millions of data points quickly for use in combination with data from large public national registers and databases. See Kenneth Lange (2003, ISBN:978-8181281135).
This package provides an interface to the PubChem database via the PUG REST <https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest> and PUG View <https://pubchem.ncbi.nlm.nih.gov/docs/pug-view> services. It allows users to automatically access chemical and biological data from PubChem, including compounds, substances, assays, and various other data types. Functions are available to retrieve data in different formats, perform searches, and access detailed annotations.
Procrustes matching of the posterior samples of person and item latent positions from latent space item response models. The methods implemented in this package are based on work by Borg, I., Groenen, P. (1997, ISBN:978-0-387-94845-4), Jeon, M., Jin, I. H., Schweinberger, M., Baugh, S. (2021) <doi:10.1007/s11336-021-09762-5>, and Martin, A. D., Quinn, K. M., Park, J. H. (2011) <doi:10.18637/jss.v042.i09>.
This package performs canonical correlation for survey data, including multiple tests of significance for secondary canonical correlations. A key feature of this package is that it incorporates survey data structure directly in a novel test of significance via a sequence of simple linear regression models on the canonical variates. See Cruz-Cano, Cohen, and Mead-Morse (2024), "Canonical Correlation Analysis of Survey data: the SurveyCC R package", The R Journal (under review).
This package implements the Smoothness-Penalized Deconvolution method of Kent and Ruppert (2023) <doi:10.1080/01621459.2023.2259028> for estimating a probability density under measurement error. The estimator is formed by computing a histogram of the error-contaminated data, and then finding an estimate that minimizes a reconstruction error plus a smoothness-inducing penalty term. The primary function, sped(), takes the data and error distribution, and returns the estimator as a function.
This package provides a set of methods to implement Generalized Method of Moments and Maximum Likelihood methods for Random Utility Models. These methods are meant to provide inference on rank comparison data. They accept full, partial, and pairwise rankings, and provide ways to break down full or partial rankings into their pairwise components. Please see "Generalized Method-of-Moments for Rank Aggregation" from NIPS 2013 for a description of some of our methods.
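Breaking a ranking into its pairwise components can be sketched in a few lines of Python. This shows only the simplest "every implied pair" breaking under stated assumptions; it is not the package's R interface, and the function names and toy rankings are hypothetical.

```python
from collections import Counter
from itertools import combinations

def break_into_pairs(ranking):
    """Return every (winner, loser) pair implied by a ranking.

    `ranking` lists items from most to least preferred; it may be a full
    ranking, a partial ranking over a subset of items, or a single pair.
    """
    # combinations() preserves input order, so the first element of each
    # pair is always ranked above the second.
    return list(combinations(ranking, 2))

def pairwise_win_counts(rankings):
    """Aggregate (winner, loser) counts over a collection of rankings."""
    counts = Counter()
    for ranking in rankings:
        counts.update(break_into_pairs(ranking))
    return counts

# Hypothetical data: one full ranking and two pairwise comparisons.
rankings = [["a", "b", "c"], ["b", "a"], ["a", "c"]]
wins = pairwise_win_counts(rankings)
```

The resulting win counts are the kind of summary statistic that moment-based rank aggregation methods can be built on.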
This package provides spatio-dynamic modelling of three characteristic wetland plant communities in a semiarid Mediterranean wetland in response to hydrological pressures from the catchment. The package includes the data on watershed hydrological pressure and the initial raster maps of plant communities, but also allows for a random initial distribution of plant communities. For more detailed information see Martinez-Lopez et al. (2015) <doi:10.1016/j.ecolmodel.2014.11.024>.
Spectral analysis of discrete data is performed by Fourier and Hilbert transforms as well as by a model-based analysis, the Lomb-Scargle method. Fragmented and irregularly spaced data can be processed by almost all methods. Both the FFT and the Lomb-Scargle methods take multivariate data and return a standardized PSD. For didactic reasons, an analytical approach for deconvolution of noise spectra and sampling function is provided. A user-friendly interface helps to interpret the results.
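To show how the Lomb-Scargle method handles irregularly spaced samples, here is a minimal Python sketch of the classical (unnormalised) Lomb-Scargle periodogram. It is not this package's interface and does not apply the package's standardization; the function name, the synthetic signal, and the frequency grid are assumptions made for the example.

```python
import math
import random

def lomb_scargle(times, values, frequencies):
    """Classical Lomb-Scargle periodogram (unnormalised) for uneven sampling."""
    mean = sum(values) / len(values)
    centred = [v - mean for v in values]
    power = []
    for f in frequencies:
        omega = 2.0 * math.pi * f
        # The time offset tau makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * omega * t) for t in times)
        c2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * omega)
        cos_terms = [math.cos(omega * (t - tau)) for t in times]
        sin_terms = [math.sin(omega * (t - tau)) for t in times]
        yc = sum(y * c for y, c in zip(centred, cos_terms))
        ys = sum(y * s for y, s in zip(centred, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        power.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return power

# Hypothetical test signal: a 0.1 Hz sine sampled at 60 irregular times.
random.seed(1)
times = sorted(random.uniform(0.0, 100.0) for _ in range(60))
values = [math.sin(2.0 * math.pi * 0.1 * t) for t in times]

frequencies = [i * 0.005 for i in range(2, 101)]  # 0.01 to 0.5 Hz
power = lomb_scargle(times, values, frequencies)
peak_frequency = frequencies[power.index(max(power))]
```

Because the method fits sinusoids directly at the observed sample times, no resampling onto a regular grid is needed before estimating the spectrum.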
Fits a wide variety of multivariate spatio-temporal models with simultaneous and lagged interactions among variables (including vector autoregressive spatio-temporal ('VAST') dynamics) for areal, continuous, or network spatial domains. It includes time-variable, space-variable, and space-time-variable interactions using dynamic structural equation models ('DSEM') as an expressive interface, and uses the mgcv package to specify splines via the formula interface. See Thorson et al. (2025) <doi:10.1111/geb.70035> for more details.
BEAST2 (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. Tracer (<https://github.com/beast-dev/tracer/>) is a GUI tool to parse and analyze the files generated by BEAST2. This package provides a way to parse and analyze BEAST2 input files without active user input, but using R function calls instead.
Extendable R6 file comparison classes, including a shiny app for combining the comparison functionality into a file comparison application. The package idea originates from pharma companies' drug development processes, where statisticians and statistical programmers need to review and compare different versions of the same outputs and datasets. The package implementation itself is not tied to any specific industry and can be used in any context for easy file comparisons between different file version sets.
This package facilitates the analysis of single-cell RNA-seq UMI matrices. It does this by computing partitions of a cell similarity graph into small homogeneous groups of cells, which are defined as metacells (MCs). The derived MCs are then used for building different representations of the data, allowing matrix or 2D graph visualization and forming a basis for the analysis of cell types, subtypes, transcriptional gradients, cell-cycle variation, gene modules and their regulatory models, and more.
Package to predict protein-protein interaction (PPI) networks in target organisms for which only little information about PPIs is available. Path2PPI predicts PPI networks based on sets of proteins which can belong to a certain pathway from well-established model organisms. It helps to combine and transfer information on a certain pathway or biological process from several reference organisms to one target organism. Path2PPI only depends on the sequence similarity of the involved proteins.
Large data files can be difficult to work with in R, where data generally resides in memory. This package encourages a style of programming where data is streamed from disk into R via a "producer" and through a series of "consumers" that typically reduce the original data to a manageable size. The package provides useful Producer and Consumer stream components for operations such as data input, sampling, indexing, and transformation; see package?Streamer for details.
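The producer/consumer streaming style can be sketched with Python generators. This is an analogy under stated assumptions, not the Streamer package's R classes: the function names are hypothetical, and an in-memory `io.StringIO` stands in for a large file on disk.

```python
import io

def produce_chunks(handle, chunk_size):
    """Producer: yield lists of at most `chunk_size` lines from a file handle."""
    chunk = []
    for line in handle:
        chunk.append(line.rstrip("\n"))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk

def parse_values(chunks):
    """Consumer: convert each chunk of text lines to floats."""
    for chunk in chunks:
        yield [float(x) for x in chunk]

def chunk_sums(chunks):
    """Consumer: reduce each chunk to a single number."""
    for chunk in chunks:
        yield sum(chunk)

# Stand-in for a large file: the integers 1..10, one per line.
source = io.StringIO("\n".join(str(i) for i in range(1, 11)))

# Chain producer and consumers; only one chunk is held in memory at a time.
pipeline = chunk_sums(parse_values(produce_chunks(source, chunk_size=4)))
partial_sums = list(pipeline)
total = sum(partial_sums)
```

Each stage pulls data lazily from the one before it, so the full data set never needs to fit in memory, only the current chunk and the reduced results.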
Uniquorn enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. Identification is therefore vital and, in this package, is based on the locations (loci) of somatic and germline mutations/variations. The input format is vcf/vcf.gz, and each file has to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the vcf file).