This package is a Shiny app for interactively analyzing and visualizing Nanostring GeoMX Whole Transcriptome Atlas data. Users can load a sample data set to explore the app's functionality. Regions of interest (ROIs) can be filtered based on any user-provided metadata. Once two or more groups of interest are selected, all pairwise and ANOVA-like tests are performed automatically. Available outputs include PCA, volcano plots, tables and heatmaps. The aesthetics of each output are highly customizable.
The package contains functions to infer and visualize the cell cycle process using single-cell RNA-seq data. It exploits the idea of transfer learning, projecting new data onto a previously learned, biologically interpretable space. The tricycle package provides a pre-learned cell cycle space, which can be used to infer the cell cycle time of human and mouse single-cell samples. In addition, it offers functions to visualize cell cycle time on different embeddings and functions to build a new reference.
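The projection idea can be sketched in a few lines of Python. This is only an illustration, not the tricycle API: the gene names and weights are made up, and reading the cell cycle position off as the polar angle of the projected point is an assumption of this sketch.

```python
import math

# Hypothetical pre-learned reference: one (w1, w2) weight pair per gene.
REFERENCE_WEIGHTS = {"geneA": (1.0, 0.0), "geneB": (0.0, 1.0), "geneC": (-0.5, 0.5)}


def project(expression, weights=REFERENCE_WEIGHTS):
    """Project one cell's centered expression values onto the two reference axes."""
    # Genes that the cell does not report are simply skipped.
    shared = [g for g in weights if g in expression]
    x = sum(expression[g] * weights[g][0] for g in shared)
    y = sum(expression[g] * weights[g][1] for g in shared)
    return x, y


def cycle_position(expression, weights=REFERENCE_WEIGHTS):
    """Angle of the projected cell, in [0, 2*pi) (assumed meaning: cell cycle position)."""
    x, y = project(expression, weights)
    return math.atan2(y, x) % (2 * math.pi)
```

Because the weights stay fixed, new cells can be placed in the same space without refitting anything, which is the transfer-learning step the description refers to.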
Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.
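As a rough illustration of the idea (not the annotatr API), intersecting regions with annotations and summarizing the result might look like the Python sketch below. The records, the half-open coordinate convention and the function names are assumptions made for this example.

```python
from collections import Counter

# Hypothetical records: (chromosome, start, end, label), half-open [start, end).
regions = [
    ("chr1", 100, 200, "peak1"),
    ("chr1", 950, 1050, "peak2"),
    ("chr2", 10, 20, "peak3"),
]
annotations = [
    ("chr1", 0, 150, "promoter"),
    ("chr1", 150, 400, "exon"),
    ("chr1", 1000, 2000, "CpG island"),
]


def overlaps(a, b):
    """True if two records share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def annotate(regions, annotations):
    """Pair every region with each annotation it intersects."""
    return [(r[3], a[3]) for r in regions for a in annotations if overlaps(r, a)]


def summarize(pairs):
    """Count region-annotation intersections per annotation label."""
    return Counter(label for _, label in pairs)


pairs = annotate(regions, annotations)
summary = summarize(pairs)
```

The nested loop is quadratic and only suitable for small inputs; interval trees or sorted sweeps are the usual choice for genome-scale data.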
This package provides a small collection of interesting and educational machine learning data sets which are used as examples in the mlr3 book Applied machine learning using mlr3 in R https://mlr3book.mlr-org.com, the use case gallery https://mlr3gallery.mlr-org.com, or in other examples. All data sets are properly preprocessed and ready to be analyzed by most machine learning algorithms. Data sets are automatically added to the dictionary of tasks if mlr3 is loaded.
This package provides portable tools to run system processes in the background. It can check if a background process is running; wait on a background process to finish; get the exit status of finished processes; kill background processes and their children; restart processes. It can read the standard output and error of the processes, using non-blocking connections. processx can poll a process for standard output or error, with a timeout. It can also poll several processes at once.
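The same ideas exist in other languages. The sketch below uses Python's standard subprocess module, not the processx API, to show a few of the operations listed above: starting a background process, checking whether it is running, collecting its output and exit status, and killing it. The child commands are arbitrary examples.

```python
import subprocess
import sys

# Start a short-lived child process in the background (here: another Python).
child = subprocess.Popen(
    [sys.executable, "-c", "print('hello from child')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# communicate() waits for the child to finish (raising TimeoutExpired if the
# timeout passes) and returns what it wrote to standard output and error.
out, err = child.communicate(timeout=30)

# After the process has finished, returncode holds its exit status.
exit_status = child.returncode

# A second child that would run for a minute is killed instead of awaited.
sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
was_running = sleeper.poll() is None  # poll() returns None while still running
sleeper.kill()
sleeper.wait(timeout=30)
```

This sketch does not cover non-blocking reads or polling several processes at once.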
This package provides a set of user-friendly functions to aid in organizing, plotting and analyzing event-related potential (ERP) data. Provides an easy-to-learn method to explore ERP data. Should be useful to those without a background in computer programming, and to those who are new to ERPs (or new to the more advanced ERP software available). Emphasis has been placed on highly automated processes using functions with as few arguments as possible. Expects processed (cleaned) data.
Group SLOPE (Group Sorted L1 Penalized Estimation) is a penalized linear regression method that is used for adaptive selection of groups of significant predictors in a high-dimensional linear model. The Group SLOPE method can control the (group) false discovery rate at a user-specified level (i.e., control the expected proportion of irrelevant groups among all selected groups of predictors). For additional information about the implemented methods please see Brzyski, Gossmann, Su, Bogdan (2018) <doi:10.1080/01621459.2017.1411269>.
It can be necessary to limit the rate of execution of a loop or repeated function call, e.g., to show or gather data only at particular intervals. This package includes two methods for limiting this execution rate: speed governors and timers. A speed governor inserts pauses during execution to meet a user-specified loop time. Timers are alarm clocks that indicate whether a certain time has passed. These mechanisms are implemented in C to minimize processing overhead.
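The two mechanisms can be sketched in Python as follows. This illustrates the concepts only and is not the package's interface; the class names, method names and the injectable `clock` and `sleep` parameters are assumptions chosen to keep the example testable.

```python
import time


class Governor:
    """Pads each loop iteration with a pause so it lasts about `loop_time` seconds."""

    def __init__(self, loop_time, clock=time.monotonic, sleep=time.sleep):
        self.loop_time = loop_time
        self.clock = clock
        self.sleep = sleep
        self.last = None

    def wait(self):
        """Call once per iteration; sleeps for whatever is left of the loop time."""
        now = self.clock()
        if self.last is not None:
            remaining = self.loop_time - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()


class Timer:
    """Alarm clock: `expired()` reports whether `duration` seconds have passed."""

    def __init__(self, duration, clock=time.monotonic):
        self.duration = duration
        self.clock = clock
        self.start = clock()

    def expired(self):
        return self.clock() - self.start >= self.duration
```

A loop would call `governor.wait()` at the end of each iteration, while a timer is checked with `timer.expired()` wherever the code wants to know if an interval is over. If an iteration already took longer than the loop time, the governor adds no pause.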
Duct tape the quanteda ecosystem (Benoit et al., 2018) <doi:10.21105/joss.00774> to modern Transformer-based text classification models (Wolf et al., 2020) <doi:10.18653/v1/2020.emnlp-demos.6>, in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of quanteda.textmodels and provides a function to set up the Python environment to use the pretrained models from Hugging Face <https://huggingface.co/>. More information: <doi:10.5117/CCR2023.1.003.CHAN>.
Supports modeling health outcomes using Bayesian hierarchical spatio-temporal models with complex covariate effects (e.g., linear, non-linear, interactions, distributed lag linear and non-linear models) in the INLA framework. It is designed to help users identify key drivers and predictors of disease risk by enabling streamlined model exploration, comparison, and visualization of complex covariate effects. See an application of the modelling framework in Lowe, Lee, O'Reilly et al. (2021) <doi:10.1016/S2542-5196(20)30292-8>.
This package provides a fragmentation spectra detection pipeline for high-throughput LC/HRMS data processing using peaklists generated by the IDSL.IPA workflow <doi:10.1021/acs.jproteome.2c00120>. The IDSL.CSA package can deconvolute fragmentation spectra from Composite Spectra Analysis (CSA), Data Dependent Acquisition (DDA) analysis, and various Data-Independent Acquisition (DIA) methods such as MS^E, All-Ion Fragmentation (AIF) and SWATH-MS analysis. The IDSL.CSA package was introduced in <doi:10.1021/acs.analchem.3c00376>.
Routines to handle family data with a pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the pedigree object with various criteria, and kinship for the X chromosome.
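For readers unfamiliar with kinship coefficients, the standard recursion can be sketched in Python. This is an illustration of the textbook definition for autosomal loci, not the package's implementation; the dictionary layout and the requirement that parents be listed before their children are assumptions of the sketch.

```python
def kinship_matrix(pedigree):
    """Kinship coefficients for a pedigree given as {id: (father, mother)}.

    Parents are None for founders. Every parent must be listed before its
    children (dicts keep insertion order), so each row can be filled from
    rows that are already complete.
    """
    ids = list(pedigree)
    k = {a: {} for a in ids}

    def get(a, b):
        # An unknown parent is treated as unrelated to everyone.
        return 0.0 if a is None or b is None else k[a][b]

    for i, child in enumerate(ids):
        father, mother = pedigree[child]
        # Kinship with every earlier individual is the average over the two parents.
        for other in ids[:i]:
            value = 0.5 * (get(father, other) + get(mother, other))
            k[child][other] = k[other][child] = value
        # Self-kinship: 1/2 plus half the kinship between the parents.
        k[child][child] = 0.5 * (1.0 + get(father, mother))
    return k


family = {
    "dad": (None, None),
    "mom": (None, None),
    "kid1": ("dad", "mom"),
    "kid2": ("dad", "mom"),
}
k = kinship_matrix(family)
```

In this example the parents are unrelated founders, so parent-child and full-sibling pairs both get a kinship of 0.25 and every self-kinship is 0.5.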
Estimation of multi-group count regression models (i.e., Poisson, negative binomial) with latent covariates. This package provides two extensions compared to ordinary count regression models based on a generalized linear model: First, measurement models for the predictors can be specified, which makes it possible to account for measurement error. Second, the count regression can be simultaneously estimated in multiple groups with stochastic group weights. The marginal maximum likelihood estimation is described in Kiefer & Mayer (2020) <doi:10.1080/00273171.2020.1751027>.
Simulate a (bivariate) multivariate renewal Hawkes (MRHawkes) self-exciting process, with given immigrant hazard rate functions and offspring density function. Calculate the likelihood of an MRHawkes process with given hazard rate functions and offspring density function for an (increasing) sequence of event times. Calculate the Rosenblatt residuals of the event times. Predict future event times based on observed event times up to a given time. For details see Stindl and Chen (2018) <doi:10.1016/j.csda.2018.01.021>.
This package implements the American Heart Association Predicting Risk of cardiovascular disease EVENTs (PREVENT) equations from Khan SS, Matsushita K, Sang Y, and colleagues (2023) <doi:10.1161/CIRCULATIONAHA.123.067626>, with optional comparison with their de facto predecessor, the Pooled Cohort Equations from the American Heart Association and American College of Cardiology (2013) <doi:10.1161/01.cir.0000437741.48606.98> and the revision to the Pooled Cohort Equations from Yadlowsky and colleagues (2018) <doi:10.7326/M17-3011>.
LIONESS, or Linear Interpolation to Obtain Network Estimates for Single Samples, can be used to reconstruct single-sample networks (https://arxiv.org/abs/1505.06440). This code implements the LIONESS equation in the lioness function in R to reconstruct single-sample networks. The default network reconstruction method we use is based on Pearson correlation. However, lionessR can run with any network reconstruction algorithm that returns a complete, weighted adjacency matrix. lionessR works for both unipartite and bipartite networks.
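The LIONESS equation estimates the network of sample q from the network built on all N samples and the network built without q: N * (all - without_q) + without_q, applied edge by edge. The Python sketch below illustrates this with a plain Pearson correlation network. It is not the lionessR code; the function names and the list-of-lists data layout are assumptions of this example.

```python
import math


def pearson_network(samples):
    """Gene-by-gene Pearson correlation matrix; `samples` holds one expression vector per sample."""
    n_genes = len(samples[0])
    cols = [[s[g] for s in samples] for g in range(n_genes)]
    means = [sum(c) / len(c) for c in cols]
    centered = [[x - m for x in c] for c, m in zip(cols, means)]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(n_genes)]
        for i in range(n_genes)
    ]


def lioness(samples, network=pearson_network):
    """One network per sample: N * (all - without_q) + without_q for every edge."""
    n = len(samples)
    full = network(samples)
    result = []
    for q in range(n):
        # Rebuild the aggregate network with sample q left out.
        rest = network(samples[:q] + samples[q + 1:])
        result.append([
            [n * (f - r) + r for f, r in zip(full_row, rest_row)]
            for full_row, rest_row in zip(full, rest)
        ])
    return result
```

Any function that maps a list of samples to a complete weighted matrix can be passed as `network`. The sketch assumes that no gene becomes constant when a sample is removed, since the correlation would then be undefined.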
The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides:
low-level functionality meant to help developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and
a framework that facilitates block processing of array-like objects (typically on-disk objects).
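Block processing means walking over an array in chunks so that only one chunk is in memory at a time. The Python sketch below illustrates the idea with column sums. It is not the S4Arrays framework; `read_rows` is a hypothetical stand-in for reading rows from an on-disk object.

```python
def row_blocks(n_rows, block_size):
    """Yield (start, stop) row ranges that cover 0..n_rows in order."""
    for start in range(0, n_rows, block_size):
        yield start, min(start + block_size, n_rows)


def block_col_sums(read_rows, n_rows, n_cols, block_size):
    """Column sums computed one block at a time.

    `read_rows(start, stop)` stands in for loading rows from disk; only one
    block is requested at any moment.
    """
    totals = [0] * n_cols
    for start, stop in row_blocks(n_rows, block_size):
        for row in read_rows(start, stop):
            for j, value in enumerate(row):
                totals[j] += value
    return totals


# Stand-in for an on-disk array: a list that is only ever sliced block-wise.
data = [[i * 3 + j for j in range(3)] for i in range(10)]
sums = block_col_sums(lambda a, b: data[a:b], n_rows=10, n_cols=3, block_size=4)
```

The block size trades memory use against the number of read operations, and the result does not depend on it.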
This is an R package dedicated to the analysis of (multiplexed) 4C sequencing data. r-fourcseq provides a pipeline to detect specific interactions between DNA elements and identify differential interactions between conditions. The statistical analysis in R starts with individual bam files for each sample as inputs. To obtain these files, the package contains a Python script to demultiplex libraries and trim off primer sequences. The required bam files can then be generated with standard alignment software.
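The demultiplexing and trimming step can be pictured with a short Python sketch. This is not the script shipped with the package: the primer sequences and library names are invented, and matching is by exact prefix with no tolerance for sequencing errors.

```python
# Hypothetical primer sequences keyed by library name (not taken from the package).
PRIMERS = {
    "viewpoint_A": "ACGTAC",
    "viewpoint_B": "TTGCAA",
}


def demultiplex(reads, primers=PRIMERS):
    """Assign each read to the library whose primer it starts with, trimming the primer.

    Reads that match no primer are collected, untrimmed, under the key None.
    """
    libraries = {name: [] for name in primers}
    libraries[None] = []
    for read in reads:
        for name, primer in primers.items():
            if read.startswith(primer):
                libraries[name].append(read[len(primer):])
                break
        else:
            # The loop ended without a break: no primer matched this read.
            libraries[None].append(read)
    return libraries


reads = ["ACGTACGGGTTT", "TTGCAACCCAAA", "GGGGGGGGGGGG", "ACGTACAAAA"]
libs = demultiplex(reads)
```

Each per-library list of trimmed reads would then be written out and aligned separately to produce one bam file per sample.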
This package provides tools to compare k samples using the Anderson-Darling test, Kruskal-Wallis type tests with different rank score criteria, Steel's multiple comparison test, and the Jonckheere-Terpstra (JT) test. It computes asymptotic, simulated or (limited) exact P-values, all valid under randomization, with or without ties, or conditionally under random sampling from populations, given the observed tie pattern. Except for Steel's test and the JT test it also combines these tests across several blocks of samples.
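To make the notion of a simulated P-value under randomization concrete, here is a Python sketch for the plain Kruskal-Wallis statistic. It is an illustration, not the package's code: the function names are assumptions, and the usual tie correction factor is left out because it is constant across relabelings and so does not change the simulated P-value.

```python
import random


def midranks(values):
    """Ranks 1..N, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Kruskal-Wallis H statistic on midranks (without tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    n = len(pooled)
    total, pos = 0.0, 0
    for g in groups:
        r = sum(ranks[pos:pos + len(g)])
        total += r * r / len(g)
        pos += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def simulated_p_value(groups, n_sim=2000, seed=0):
    """Share of random relabelings whose H is at least the observed H."""
    rng = random.Random(seed)
    observed = kruskal_wallis(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)
        parts, pos = [], 0
        for s in sizes:
            parts.append(pooled[pos:pos + s])
            pos += s
        if kruskal_wallis(parts) >= observed - 1e-12:
            hits += 1
    return hits / n_sim
```

Shuffling the pooled observations keeps the observed values, and therefore the tie pattern, fixed, which is what makes the resulting P-value conditional on that pattern.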
To help researchers quickly and comprehensively acquire disease genes and thereby understand disease mechanisms, we developed this program to retrieve disease-related genes. The data are integrated from three public databases: eDGAR, DrugBank and MalaCards. eDGAR is a comprehensive database containing data on the relationship between diseases and genes. DrugBank contains information on 13443 drugs and 5157 targets. MalaCards integrates human disease information, including disease-related genes.
Automatic model selection for structural time series decomposition into trend, cycle, and seasonal components, with optional structural interpolation, using the Kalman filter. Koopman, Siem Jan and Marius Ooms (2012) "Forecasting Economic Time Series Using Unobserved Components Time Series Models" <doi:10.1093/oxfordhb/9780195398649.013.0006>. Kim, Chang-Jin and Charles R. Nelson (1999) "State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications" <doi:10.7551/mitpress/6444.001.0001> <http://econ.korea.ac.kr/~cjkim/>.
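The simplest structural model, a local level (random-walk trend plus noise), shows how the Kalman filter is used in this setting. The Python sketch below is an illustration under that assumption and not the package's implementation; the function name, the diffuse-style initial variance and the handling of missing values are choices made for this example.

```python
def local_level_filter(y, obs_var, level_var, init_level=0.0, init_var=1e6):
    """Kalman filter for y[t] = level[t] + noise, level[t] = level[t-1] + shock.

    Returns the filtered level after each observation. A value of None in `y`
    is treated as missing: the update step is skipped and the prediction is kept.
    """
    level, var = init_level, init_var
    filtered = []
    for obs in y:
        # Prediction: the level carries over, its uncertainty grows.
        var += level_var
        if obs is not None:
            # Update: move toward the observation in proportion to the Kalman gain.
            gain = var / (var + obs_var)
            level += gain * (obs - level)
            var *= 1.0 - gain
        filtered.append(level)
    return filtered
```

Cycle and seasonal components extend the state vector beyond a single level, but the predict and update steps keep the same shape.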
BEAST2 (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAUti 2 (which is part of BEAST2) is a GUI tool that allows users to specify the many possible setups and generates the XML file BEAST2 needs to run. This package provides a way to create BEAST2 input files without active user input, using R function calls instead.
Provide early termination phase II trial designs with a decreasingly informative prior (DIP) or a regular Bayesian prior chosen by the user. The program can determine the minimum planned sample size necessary to achieve the user-specified admissible designs. The program can also perform power and expected sample size calculations for the tests in early termination Phase II trials. See Wang C and Sabo RT (2022) <doi:10.18203/2349-3259.ijct20221110>; Sabo RT (2014) <doi:10.1080/10543406.2014.888441>.
This package provides functions to perform the following analyses: i) inferring epistasis from RNAi double knockdown data; ii) identifying gene pairs of multiple mutation patterns; iii) assessing association between gene pairs and survival; and iv) calculating the smallworldness of a graph (e.g., a gene interaction network). Data and analyses are described in Wang, X., Fu, A. Q., McNerney, M. and White, K. P. (2014). Widespread genetic epistasis among breast cancer genes. Nature Communications. 5 4828. <doi:10.1038/ncomms5828>.