Estimate population average treatment effects from a primary data source with borrowing from supplemental sources. Causal estimation is done with either a Bayesian linear model or with Bayesian additive regression trees (BART) to adjust for confounding. Borrowing is done with multisource exchangeability models (MEMs). For information on BART, see Chipman, George, & McCulloch (2010) <doi:10.1214/09-AOAS285>. For information on MEMs, see Kaizer, Koopmeiners, & Hobbs (2018) <doi:10.1093/biostatistics/kxx031>.
The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina.
This package provides a Bayesian meta-analysis method for studying cross-phenotype genetic associations. It uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. CPBayes is based on a spike and slab prior. The methodology is available from: A Majumdar, T Haldar, S Bhattacharya, JS Witte (2018) <doi:10.1371/journal.pgen.1007139>.
An open, multi-algorithmic pipeline for easy, fast and efficient analysis of cellular sub-populations and the molecular signatures that characterize them. The pipeline consists of four successive steps: data pre-processing, cellular clustering with pseudo-temporal ordering, identification of differentially expressed genes, and biomarker identification. More details are in Ghannoum et al. (2021) <doi:10.3390/ijms22031399>. This package implements extensions of the work published by Ghannoum et al. (2019) <doi:10.1101/700989>.
This package provides functions to estimate the mortality attributable to influenza and temperature, using distributed-lag nonlinear models (DLNMs), as first implemented in Lytras et al. (2019) <doi:10.2807/1560-7917.ES.2019.24.14.1800118>. Full descriptions of the underlying DLNM methodology are given in Gasparrini et al. <doi:10.1002/sim.3940> (DLNMs), <doi:10.1186/1471-2288-14-55> (attributable risk from DLNMs) and <doi:10.1002/sim.5471> (multivariate meta-analysis).
Makes the Genepop software available in R. This software implements a mixture of traditional population genetic methods and some more focused developments: it computes exact tests for Hardy-Weinberg equilibrium, for population differentiation and for genotypic disequilibrium among pairs of loci; it computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc.; and it performs analyses of isolation by distance from pairwise comparisons of individuals or population samples.
Performs high-dimensional feature selection in the presence of a survival outcome. Using feature selection methods and several survival analyses, it identifies the best markers with optimal threshold levels according to their effect on disease progression and produces the most consistent level according to those threshold values. The methodology is based on Sonabend et al. (2021) <doi:10.1093/bioinformatics/btab039> and Bhattacharjee et al. (2021) <arXiv:2012.02102>.
Provides a mean partition for ensemble clustering by optimal transport alignment (OTA), uncertainty measures for both partition-wise and cluster-wise assessment, and multiple visualization functions to show uncertainty, for instance, a membership heat map and a plot of the covering point set. A partition refers to an overall clustering result. See Jia Li, Beomseok Seo, and Lin Lin (2019) <doi:10.1002/sam.11418> and Lixiang Zhang, Lin Lin, and Jia Li (2020) <doi:10.1093/bioinformatics/btaa165>.
This package provides a computational model for predicting proteins encoded by circadian genes. A support vector machine with a Laplace kernel is used to predict circadian proteins, with compositional, transitional and physico-chemical features as numeric inputs. Users can predict on a test dataset using the proposed computational model. They can also build their own model from their own training dataset and then predict on the test set.
An environment to simulate the development of annual plant populations with regard to population dynamics and genetics, especially herbicide resistance. It combines genetics on the individual level (Renton et al. 2011) with a stochastic development on the population level (Daedlow, 2015). References: Renton, M, Diggle, A, Manalil, S and Powles, S (2011) <doi:10.1016/j.jtbi.2011.05.010>; Daedlow, Daniel (2015, doctoral dissertation, University of Rostock, Faculty of Agriculture and Environmental Sciences).
Estimating causal effects in the presence of post-treatment confounding using principal stratification. PStrata allows for customized monotonicity assumptions and exclusion restriction assumptions, with automatic full Bayesian inference supported by Stan. The main function to use in this package is PStrata(), which provides posterior estimates of principal causal effect with uncertainty quantification. Visualization tools are also provided for diagnosis and interpretation. See Liu and Li (2023) <arXiv:2304.02740> for details.
Fit, summarize, and predict for a variety of spatial statistical models applied to point-referenced and areal (lattice) data. Parameters are estimated using various methods. Additional modeling features include anisotropy, non-spatial random effects, partition factors, big data approaches, and more. Model-fit statistics are used to summarize, visualize, and compare models. Predictions at unobserved locations are readily obtainable. For additional details, see Dumelle et al. (2023) <doi:10.1371/journal.pone.0282524>.
Newly developed methods for the estimation of several probabilities in an illness-death model. The package can be used to obtain nonparametric and semiparametric estimates for: transition probabilities, occupation probabilities, cumulative incidence function and the sojourn time distributions. Additionally, it is possible to fit proportional hazards regression models in each transition of the Illness-Death Model. Several auxiliary functions are also provided which can be used for marginal estimation of the survival functions.
This package provides a collection of functions to deal with spatial and spatiotemporal autoregressive conditional heteroscedasticity (spatial ARCH and GARCH models) by Otto, Schmid, Garthoff (2018, Spatial Statistics) <doi:10.1016/j.spasta.2018.07.005>: simulation of spatial ARCH-type processes (spARCH, log/exponential-spARCH, complex-spARCH); quasi-maximum-likelihood estimation of the parameters of spARCH models and spatial autoregressive models with spARCH disturbances, diagnostic checks, visualizations.
This package provides some code to run simulations of state-space models, and then use these in the Approximate Bayesian Computation Sequential Monte Carlo (ABC-SMC) algorithm of Toni et al. (2009) <doi:10.1098/rsif.2008.0172> and a bootstrap particle filter based particle Markov chain Monte Carlo (PMCMC) algorithm (Andrieu et al., 2010 <doi:10.1111/j.1467-9868.2009.00736.x>). Also provides functions to plot and summarise the outputs.
Utility functions for scale-dependent and alternative hyperpriors. The distribution parameters may capture location, scale, shape, etc. and every parameter may depend on complex additive terms (fixed, random, smooth, spatial, etc.) similar to a generalized additive model. Hyperpriors for all effects can be elicited within the package, including complex tensor product interaction terms and variable selection priors. The basic model is explained in Klein and Kneib (2016) <doi:10.1214/15-BA983>.
This package provides functions to create and manage research compendiums for data analysis. Research compendiums are a standard and intuitive folder structure for organizing the digital materials of a research project, which can significantly improve reproducibility. The package offers several compendium structure options that fit different research projects, as well as the ability to duplicate the folder structure of existing projects or implement custom structures. It also simplifies the use of version control.
The base tools union(), intersect(), etc., follow the algebraic definition that each element of a set must be unique. Since it's often helpful to compare all elements of two vectors, this toolset treats every element as unique for counting purposes. For ease of use, all functions in vecsets have an argument multiple which, when set to FALSE, reverts them to the functionality of the base::sets tools (base::sets is an alias for all of these functions).
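To make the counting idea concrete, here is a minimal Python sketch, not the vecsets R API. The function names `multiset_intersect` and `multiset_union` are hypothetical, and the convention used (minimum of the counts for intersection, maximum for union) is an assumption about one reasonable way to treat repeated elements. The `multiple` flag mirrors the argument described above.

```python
from collections import Counter


def multiset_intersect(x, y, multiple=True):
    """Intersection that keeps repeats (hypothetical helper, not vecsets)."""
    if not multiple:
        # Classic set semantics: every element counted once.
        return sorted(set(x) & set(y))
    # Counter & Counter keeps each element min(count_x, count_y) times.
    return sorted((Counter(x) & Counter(y)).elements())


def multiset_union(x, y, multiple=True):
    """Union that keeps repeats (hypothetical helper, not vecsets)."""
    if not multiple:
        return sorted(set(x) | set(y))
    # Counter | Counter keeps each element max(count_x, count_y) times.
    return sorted((Counter(x) | Counter(y)).elements())


x = [1, 1, 2, 3]
y = [1, 1, 1, 3, 3]
print(multiset_intersect(x, y))                  # [1, 1, 3]
print(multiset_union(x, y))                      # [1, 1, 1, 2, 3, 3]
print(multiset_intersect(x, y, multiple=False))  # [1, 3]
```

With `multiple=False` the helpers fall back to ordinary set behaviour, which is the role the description assigns to that argument.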
This package helps identify mRNAs that are overexpressed in subsets of tumors relative to normal tissue. Ideal inputs would be paired tumor-normal data from the same tissue from many patients (>15 pairs). This unsupervised approach relies on the observation that oncogenes are characteristically overexpressed in only a subset of tumors in the population, and may help identify oncogene candidates purely based on differences in mRNA expression between previously unknown subtypes.
ScreenR is a package suitable to perform hit identification in loss of function High Throughput Biological Screenings performed using barcoded shRNA-based libraries. ScreenR combines the computing power of software such as edgeR with the simplicity of use of the Tidyverse metapackage. ScreenR executes a pipeline able to find candidate hits from barcode counts, and integrates a wide range of visualization modes for each step of the analysis.
Offers functions for plotting split (or implicit) networks (unrooted, undirected) and explicit networks (rooted, directed) with reticulations, extending ggtree and using functions from ape and phangorn. It extends the ggtree package [@Yu2017] to allow the visualization of phylogenetic networks using the ggplot2 syntax. It offers an alternative to the plot functions already available in ape, Paradis and Schliep (2019) <doi:10.1093/bioinformatics/bty633>, and phangorn, Schliep (2011) <doi:10.1093/bioinformatics/btq706>.
This package implements a variety of methods for combining p-values in differential analyses of genome-scale datasets. Functions can combine p-values across different tests in the same analysis (e.g., genomic windows in ChIP-seq, exons in RNA-seq) or for corresponding tests across separate analyses (e.g., replicated comparisons, effect of different treatment conditions). Support is provided for handling log-transformed input p-values, missing values and weighting where appropriate.
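As an illustration of what combining p-values means, the sketch below implements Fisher's method in Python using only the standard library. The description does not say which methods the package offers, so Fisher's method is chosen here as a well-known example rather than as the package's own algorithm. The function name `fisher_combine` is hypothetical, and the sketch assumes the tests are independent.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative only)."""
    ps = list(p_values)
    if not ps or any(not (0.0 < p <= 1.0) for p in ps):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    # Fisher's statistic is X = -2 * sum(log p); we work with X / 2.
    half_stat = -sum(math.log(p) for p in ps)
    # Under the null, X follows a chi-square law with 2k degrees of freedom,
    # whose upper tail has the closed form
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, len(ps)):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Several moderately small p-values combine into stronger evidence.
print(fisher_combine([0.04, 0.06, 0.05]))
```

A single p-value is returned unchanged by this formula, which is a quick sanity check on the closed-form tail probability.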
The package ABarray is designed to work with the Applied Biosystems whole genome microarray platform, as well as any other platform whose data can be transformed into an expression data matrix. Functions include data preprocessing, filtering, control probe analysis, and statistical analysis in one single function. A graphical user interface (GUI) is also provided. The raw data, processed data, graphics output and statistical results are organized into folders according to the analysis settings used.
Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.