This package provides a method to infer modules of co-expressed genes and the dependencies among the modules from multiple expression datasets that may contain different sets of genes. Please refer to: Extracting a low-dimensional description of multiple gene expression datasets reveals a potential driver for tumor-associated stroma in ovarian cancer, Safiye Celik, Benjamin A. Logsdon, Stephanie Battle, Charles W. Drescher, Mara Rendi, R. David Hawkins and Su-In Lee (2016) <DOI:10.1186/s13073-016-0319-7>.
The iterLap (iterated Laplace approximation) algorithm approximates a general (possibly non-normalized) probability density on R^p by repeated Laplace approximations to the difference between the current approximation and the true density (on the log scale). The final approximation is a mixture of multivariate normal distributions and can be used, for example, as a proposal distribution for importance sampling (e.g. in Bayesian applications). The algorithm can be seen as a computational generalization of the Laplace approximation suitable for skewed or multimodal densities.
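As a rough illustration of the building block that iterLap repeats, the following Python sketch computes a single one-dimensional Laplace approximation: it locates the mode of a log density with Newton steps and uses the curvature there as the variance of the approximating normal. It is not the iterLap implementation; the function names, the finite-difference step and the example density are assumptions made for illustration only.

```python
import math


def laplace_approximation(log_density, start, step=1e-4, iterations=50):
    """Return (mode, variance) of a normal approximation to exp(log_density).

    Hypothetical helper: derivatives are taken by central finite differences,
    and the log density is assumed to be concave near the starting point.
    """
    def curvature(x):
        return (log_density(x + step) - 2.0 * log_density(x)
                + log_density(x - step)) / step ** 2

    x = start
    for _ in range(iterations):
        grad = (log_density(x + step) - log_density(x - step)) / (2.0 * step)
        curv = curvature(x)
        if curv >= 0:
            raise ValueError("log density is not concave at the current point")
        x_new = x - grad / curv  # Newton step towards the mode
        converged = abs(x_new - x) < 1e-10
        x = x_new
        if converged:
            break
    # Variance of the approximating normal: -1 / (second derivative at mode).
    return x, -1.0 / curvature(x)


def log_gamma_kernel(x):
    """Non-normalized log density of a Gamma(shape=5, rate=2) variable."""
    return 4.0 * math.log(x) - 2.0 * x


mode, variance = laplace_approximation(log_gamma_kernel, start=1.0)
```

For this skewed example the single normal is centred at the mode rather than the mean, which is the kind of mismatch that further mixture components are meant to correct.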
For high-dimensional correlated observations, this package carries out the L_1 penalized maximum likelihood estimation of the precision matrix (network) and the correlation parameters. The correlated data can be longitudinal data (possibly irregularly spaced) with dampening correlation or clustered data with uniform correlation. For details of the algorithms, see Jie Zhou et al., Identifying Microbial Interaction Networks Based on Irregularly Spaced Longitudinal 16S rRNA sequence data <doi:10.1101/2021.11.26.470159>.
This package provides tools for processing and analyzing data from the O-GlcNAcAtlas database <https://oglcnac.org/>, as described in Ma (2021) <doi:10.1093/glycob/cwab003>. It integrates UniProt <https://www.uniprot.org/> API calls to retrieve additional information. It is specifically designed for research workflows involving O-GlcNAcAtlas data, providing a flexible and user-friendly interface for customizing and downloading processed results. Interactive elements allow users to easily adjust parameters and handle various biological datasets.
Observational studies are limited in that there could be an unmeasured variable related to both the response variable and the primary predictor. If this unmeasured variable were included in the analysis, it would change the estimated relationship (and possibly the conclusions). Sensitivity analysis is a way to see how strong a relationship with the unmeasured variable needs to be before the conclusions change. This package provides tools for performing a sensitivity analysis for regression-style models (linear, logistic, and Cox).
In bulk epigenome/transcriptome experiments, molecular expression is measured in a tissue, which is a mixture of multiple types of cells. This package tests association of a disease/phenotype with a molecular marker for each cell type. The proportion of cell types in each sample needs to be given as input. The package is applicable to epigenome-wide association study (EWAS) and differential gene expression analysis. Takeuchi and Kato (submitted) "omicwas: cell-type-specific epigenome-wide and transcriptome association study".
An assortment of helper functions for managing data (e.g., rotating values in matrices by a user-defined angle, switching from row- to column-indexing), dates (e.g., intuiting year from messy date strings), handling missing values (e.g., removing elements/rows across multiple vectors or matrices if any have an NA), text (e.g., flushing reports to the console in real-time); and combining data frames with different schema (copying, filling, or concatenating columns or applying functions before combining).
Analysis and filtering of phylogenomics datasets. It takes as input either a collection of gene trees (which are then transformed to matrices) or directly a collection of gene matrices, and performs an iterative process to identify which species in which genes are outliers and whose elimination significantly improves the concordance between the input matrices. The method builds upon the Distatis approach (Abdi et al. (2005) <doi:10.1101/2021.09.08.459421>), a generalization of classical multidimensional scaling to multiple distance matrices.
Utility functions that help with common base-R problems relating to lists. Lists in base-R are very flexible. This package provides functions to quickly and easily characterize types of lists. That is, to identify if all elements in a list are null, data.frames, lists, or fully named lists. Other functionality is provided for the handling of lists, such as the easy splitting of lists into equally sized groups, and the unnesting of data.frames within fully named lists.
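The kinds of helpers described above can be sketched in Python as follows. These functions are hypothetical illustrations and not the package's API; in particular, treating "equally sized groups" as fixed-size chunks whose last group may be shorter is an assumption.

```python
def all_none(items):
    """True if every element of the list is None (the analogue of R's NULL)."""
    return all(item is None for item in items)


def all_of_type(items, kind):
    """True if every element of the list is an instance of `kind`."""
    return all(isinstance(item, kind) for item in items)


def split_into_groups(items, size):
    """Split a list into consecutive groups of `size` elements.

    Assumption: the final group may be shorter when the length of the
    list is not a multiple of `size`.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    return [items[i:i + size] for i in range(0, len(items), size)]


groups = split_into_groups([1, 2, 3, 4, 5], 2)
```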
We propose a novel two-step procedure to combine epidemiological data obtained from diverse sources with the aim of quantifying risk factors affecting the probability that an individual develops a certain disease such as cancer. See Hui Huang, Xiaomei Ma, Rasmus Waagepetersen, Theodore R. Holford, Rong Wang, Harvey Risch, Lloyd Mueller & Yongtao Guan (2014) A New Estimation Approach for Combining Epidemiological Data From Multiple Sources, Journal of the American Statistical Association, 109:505, 11-23, <doi:10.1080/01621459.2013.870904>.
This package provides a lightweight toolkit to reduce the size of a list object. The object is minimized by recursively removing elements from the object one by one. The process is constrained by a reference function call specified by the user, where the target object is given as an argument. The procedure does not allow the removal of any element that would cause the result of the function call to diverge from the result obtained with the original object.
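The removal procedure described above can be sketched in Python with nested dictionaries standing in for R lists. This is a minimal greedy sketch under stated assumptions, not the package's implementation: the names `minimise` and `call` are hypothetical, and a removal that makes the reference call raise an error is treated as changing the result.

```python
import copy


def minimise(obj, call):
    """Return a reduced copy of `obj` for which call(copy) == call(obj)."""
    reference = call(obj)
    result = copy.deepcopy(obj)  # never modify the caller's object

    def shrink(node):
        for key in list(node.keys()):
            saved = node.pop(key)  # tentatively remove one element
            try:
                unchanged = call(result) == reference
            except Exception:
                unchanged = False  # assumption: an error counts as a change
            if unchanged:
                continue  # the element is not needed, keep it removed
            node[key] = saved  # the element is needed, put it back
            if isinstance(saved, dict):
                shrink(saved)  # try to remove its children instead

    shrink(result)
    return result


def total(x):
    """Hypothetical reference call that only uses two entries of the object."""
    return x["a"] + x["b"]["c"]


config = {"a": 1, "b": {"c": 2, "d": 3}, "e": 4}
small = minimise(config, total)
```

Because elements are tested one at a time, the result depends on the reference call only and the original object is left untouched.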
MetCirc comprises a workflow to interactively explore high-resolution MS/MS metabolomics data. MetCirc uses the Spectra object infrastructure defined in the package Spectra that stores MS/MS spectra. MetCirc offers functionality to calculate similarity between precursors based on the normalised dot product, neutral losses or user-defined functions and visualise similarities in a circular layout. Within the interactive framework the user can annotate MS/MS features based on their similarity to (known) related MS/MS features.
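To illustrate the idea of a normalised dot product between two spectra, the Python sketch below computes the cosine form of the score on two intensity vectors. It assumes the peaks have already been aligned into common m/z bins, and it does not reproduce the exact weighting MetCirc applies, which is not stated here; the function name is hypothetical.

```python
import math


def normalised_dot_product(intensities_a, intensities_b):
    """Cosine-style similarity in [0, 1] for non-negative intensity vectors.

    Assumption: both vectors refer to the same m/z bins, in the same order.
    """
    if len(intensities_a) != len(intensities_b):
        raise ValueError("spectra must be aligned to the same bins")
    dot = sum(a * b for a, b in zip(intensities_a, intensities_b))
    norm = math.sqrt(sum(a * a for a in intensities_a)
                     * sum(b * b for b in intensities_b))
    return dot / norm if norm > 0 else 0.0


same_shape = normalised_dot_product([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
no_overlap = normalised_dot_product([1.0, 0.0], [0.0, 5.0])
```

Scaling one spectrum by a constant does not change the score, so the measure compares peak patterns rather than absolute intensities.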
Scale4C is an R/Bioconductor package for scale-space transformation and visualization of 4C-seq data. The scale-space transformation is a multi-scale visualization technique that transforms a 2D signal (e.g. 4C-seq reads on a genomic interval of choice) into a tessellation in the scale space (2D, genomic position x scale factor) by applying different smoothing kernels (Gaussian, with increasing sigma). This transformation allows for explorative analysis and comparison of the data's structure with other samples.
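The smoothing step of a scale-space transformation can be sketched in Python as below: the same signal is convolved with Gaussian kernels of increasing sigma, giving one smoothed row per scale. This is only an illustration of the general technique, not Scale4C's code; the kernel truncation at three sigma and the edge clamping are assumptions, and the subsequent tessellation step is not shown.

```python
import math


def gaussian_kernel(sigma):
    """Discrete Gaussian weights, truncated at 3 sigma and normalised to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve the signal with a Gaussian kernel, clamping at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # repeat the edge values
            acc += weight * signal[j]
        smoothed.append(acc)
    return smoothed


def scale_space(signal, sigmas):
    """One smoothed copy of the signal per sigma (rows = scales)."""
    return [smooth(signal, sigma) for sigma in sigmas]


reads = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]  # hypothetical read counts
rows = scale_space(reads, [0.5, 1.0, 2.0])
```

As sigma grows, narrow peaks flatten first, which is what lets features be tracked across scales.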
Guile RDF is an implementation of the RDF (Resource Description Framework) format defined by the W3C for GNU Guile. RDF structures include triples (facts with a subject, a predicate and an object), graphs which are sets of triples, and datasets, which are collections of graphs.
RDF specifications include the specification of concrete syntaxes and of operations on graphs. This library implements some basic functionalities, such as parsing and producing turtle and nquads syntax, as well as manipulating graphs and datasets.
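The data model described above (triples, graphs as sets of triples, datasets as collections of graphs) can be sketched in Python as follows. This is not Guile RDF's API; the type and function names, the prefixed identifiers and the use of a dictionary for a dataset are assumptions made for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A single fact: subject, predicate and object."""
    subject: str
    predicate: str
    obj: str


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples of a graph that agree with every given field."""
    return {
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    }


# A graph is a set of triples; a dataset maps graph names to graphs.
people = {
    Triple("ex:alice", "foaf:knows", "ex:bob"),
    Triple("ex:alice", "foaf:name", "Alice"),
}
dataset = {"ex:people": people}

acquaintances = match(dataset["ex:people"], predicate="foaf:knows")
```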
The concept of reliable and clinically significant change (Jacobson & Truax, 1991) helps you answer the following questions for a sample with two measurements at different points in time (pre and post): What proportion of my sample shows a difference between pre- and post-scores that, considering the reliability of the instrument, is probably not just due to chance? What proportion of my sample changes not only in a statistically significant way (see question one), but also in a clinically significant way (e.g. from a test score regarded as "dysfunctional" to a score regarded as "functional")? This package allows you to easily create a scatterplot of your sample in which the x-axis maps to the pre-scores, the y-axis maps to the post-scores, and several graphical elements (lines, colors) give a quick overview of reliable changes in these scores. An example of this kind of plot is Figure 2 of Jacobson & Truax (1991). Referenced article: Jacobson, N. S., & Truax, P. (1991) <doi:10.1037/0022-006X.59.1.12>.
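The first question can be made concrete with the reliable change index, which divides the pre-to-post difference by the standard error of the difference between two scores. The Python sketch below follows the usual formulation (standard error of measurement SD * sqrt(1 - reliability), standard error of the difference sqrt(2) times that). It is not the package's code; the function names, the 1.96 cut-off as a default and the example scores are assumptions.

```python
import math


def reliable_change_index(pre, post, sd_pre, reliability):
    """Difference between post and pre in units of its standard error."""
    se_measurement = sd_pre * math.sqrt(1.0 - reliability)
    se_difference = math.sqrt(2.0 * se_measurement ** 2)
    return (post - pre) / se_difference


def proportion_reliably_changed(pairs, sd_pre, reliability, cutoff=1.96):
    """Share of (pre, post) pairs whose absolute index exceeds the cutoff."""
    changed = sum(
        1 for pre, post in pairs
        if abs(reliable_change_index(pre, post, sd_pre, reliability)) > cutoff
    )
    return changed / len(pairs)


# Hypothetical scores: instrument SD 7.5, test-retest reliability 0.80.
rci = reliable_change_index(32.5, 47.75, sd_pre=7.5, reliability=0.80)
share = proportion_reliably_changed(
    [(32.5, 47.75), (30.0, 33.0), (40.0, 28.0), (35.0, 36.0)],
    sd_pre=7.5, reliability=0.80,
)
```

In the scatterplot described above, cases with a large absolute index correspond to points far from the diagonal where the pre-score equals the post-score.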
C++ classes to embed R in C++ (and C) applications. A C++ class providing the R interpreter is offered by this package, making it easier to have "R inside" your C++ application. As R itself is embedded into your application, a shared library build of R is required. This works on Linux, OS X and even on Windows provided you use the same tools used to build R itself. Numerous examples are provided in the nine subdirectories of the examples/ directory of the installed package: standard, mpi (for parallel computing), qt (showing how to embed RInside inside a Qt GUI application), wt (showing how to build a "web-application" using the Wt toolkit), armadillo (for RInside use with RcppArmadillo), eigen (for RInside use with RcppEigen), and c_interface for a basic C interface and Ruby illustration. The examples use GNUmakefile(s) with GNU extensions, so GNU make is required (and will use the GNUmakefile automatically). Doxygen-generated documentation of the C++ classes is available at the RInside website as well.
This package contains functions for estimating above-ground biomass/carbon and its uncertainty in tropical forests. These functions allow users to (1) retrieve and correct taxonomy, (2) estimate wood density and its uncertainty, (3) build height-diameter models, (4) manage tree and plot coordinates, (5) estimate above-ground biomass/carbon at stand level with associated uncertainty. To cite BIOMASS, please use citation("BIOMASS"). For more information, see Réjou-Méchain et al. (2017) <doi:10.1111/2041-210X.12753>.
Narrow down the number of models to look at in model selection using the confidence envelopes based on the minimum ZIC (Generalized Information Criteria) values for regression and time series data. Functions involve the computation of multivariate normal probabilities with covariance matrices based on the minimum ZIC, and inversion of the CDF of the minimum ZIC. This involves the computation of both singular and non-singular probabilities as described in Genz (1992) <doi:10.2307/1390838>.
Fits dose-response models utilizing a Bayesian model averaging approach as outlined in Gould (2019) <doi:10.1002/bimj.201700211> for both continuous and binary responses. Longitudinal dose-response modeling is also supported in a Bayesian model averaging framework as outlined in Payne, Ray, and Thomann (2024) <doi:10.1080/10543406.2023.2292214>. Functions for plotting and calculating various posterior quantities (e.g. posterior mean, quantiles, probability of minimum efficacious dose, etc.) are also implemented. Copyright Eli Lilly and Company (2019).
This package provides a collection of functions for calculating Floristic Quality Assessment (FQA) metrics using regional FQA databases that have been approved or approved with reservations as ecological planning models by the U.S. Army Corps of Engineers (USACE). For information on FQA see Spyreas (2019) <doi:10.1002/ecs2.2825>. These databases are stored in a sister R package, fqadata. Both packages were developed for the USACE by the U.S. Army Engineer Research and Development Center's Environmental Laboratory.
Conducts hierarchical partitioning to calculate individual contributions of each predictor (fixed effects) towards marginal R2 for generalized linear mixed-effect models (including lm, glm and glmm) based on the output of r.squaredGLMM() in MuMIn, applying the algorithm of Lai J., Zou Y., Zhang S., Zhang X., Mao L. (2022) glmm.hp: an R package for computing individual effect of predictors in generalized linear mixed models. Journal of Plant Ecology, 15(6), 1302-1307 <doi:10.1093/jpe/rtac096>.
Mainly contains a plotting function, ggseg3d(), and data of two standard brain atlases (Desikan-Killiany and aseg). By far the largest part of the package is the data for each of the atlases. The functions and data enable users to plot tri-surface mesh plots of brain atlases, and customise these by projecting colours onto the brain segments based on values in their own data sets. Functions are wrappers for plotly. Mowinckel & Vidal-Piñeiro (2020) <doi:10.1177/2515245920928009>.
When a network is partially observed (here, NAs in the adjacency matrix rather than 1 or 0 due to missing information between node pairs), it is possible to account for the underlying process that generates those NAs. missSBM, presented in Barbillon, Chiquet and Tabouy (2022) <doi:10.18637/jss.v101.i12>, adjusts the popular stochastic block model from network data sampled under various missing data conditions, as described in Tabouy, Barbillon and Chiquet (2019) <doi:10.1080/01621459.2018.1562934>.
Multivariate analysis, with functions that perform simple correspondence analysis (CA) and multiple correspondence analysis (MCA), principal component analysis (PCA), canonical correlation analysis (CCA), factor analysis (FA), multidimensional scaling (MDS), linear (LDA) and quadratic (QDA) discriminant analysis, hierarchical and non-hierarchical cluster analysis, simple and multiple linear regression, multiple factor analysis (MFA) for quantitative, qualitative, frequency (MFACT) and mixed data, biplot, scatter plot, projection pursuit (PP), grand tour and other useful functions for multivariate analysis.