This package provides a new batch effect correction method based on Projection to Latent Structures Discriminant Analysis named “PLSDA-batch” to correct data prior to any downstream analysis. PLSDA-batch estimates latent components related to treatment and batch effects to remove batch variation. The method is multivariate, non-parametric and performs dimension reduction. Combined with centered log ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far.
The Racket CS implementation, which uses ``Chez Scheme'' as its core compiler and runtime system, has been the default Racket VM implementation since Racket 8.0. It performs better than the Racket BC implementation for most programs. On systems for which Racket CS cannot generate machine code, this package uses a variant of its ``portable bytecode'' backend specialized for word size and endianness.
Using the Racket VM packages directly is not recommended: instead, install the racket-minimal or racket packages.
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. CatsCradle allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.
Supporting data for the ELMER package. It includes: - elmer.data.example.promoter: mae.promoter - elmer.data.example: data - EPIC.hg38.manifest - EPIC.hg19.manifest - hm450.hg38.manifest - hm450.hg19.manifest - hocomoco.table - human.TF - LUSC_meth_refined: Meth - LUSC_RNA_refined: GeneExp - Probes.motif.hg19.450K - Probes.motif.hg19.EPIC - Probes.motif.hg38.450K - Probes.motif.hg38.EPIC - TF.family - TF.subfamily - Human_genes__GRCh37_p13 - Human_genes__GRCh38_p12 - Human_genes__GRCh37_p13__tss - Human_genes__GRCh38_p12__tss.
This package provides tools for conducting Bayesian analyses and Bayesian model averaging (Kass and Raftery, 1995, <doi:10.1080/01621459.1995.10476572>, Hoeting et al., 1999, <doi:10.1214/ss/1009212519>). The package contains functions for creating a wide range of prior distribution objects, mixing posterior samples from JAGS and Stan models, plotting posterior distributions, and etc... The tools for working with prior distribution span from visualization, generating JAGS and bridgesampling syntax to basic functions such as rng, quantile, and distribution functions.
Given the hypothesis of a bi-modal distribution of cells for each marker, the algorithm constructs a binary tree, the nodes of which are subpopulations of cells. At each node, observed cells and markers are modeled by both a family of normal distributions and a family of bi-modal normal mixture distributions. Splitting is done according to a normalized difference of AIC between the two families. Method is detailed in: Commenges, Alkhassim, Gottardo, Hejblum & Thiebaut (2018) <doi: 10.1002/cyto.a.23601>.
Estimation of the parameters in a model for symmetric relational data (e.g., the above-diagonal part of a square matrix), using a model-based eigenvalue decomposition and regression. Missing data is accommodated, and a posterior mean for missing data is calculated under the assumption that the data are missing at random. The marginal distribution of the relational data can be arbitrary, and is fit with an ordered probit specification. See Hoff (2007) <arXiv:0711.1146> for details on the model.
This package provides a toolbox for constructing potential landscapes for Ising networks. The parameters of the networks can be directly supplied by users or estimated by the IsingFit package by van Borkulo and Epskamp (2016) <https://CRAN.R-project.org/package=IsingFit> from empirical data. The Ising model's Boltzmann distribution is preserved for the potential landscape function. The landscape functions can be used for quantifying and visualizing the stability of network states, as well as visualizing the simulation process.
This package performs multilevel matches for data with cluster- level treatments and individual-level outcomes using a network optimization algorithm. Functions for checking balance at the cluster and individual levels are also provided, as are methods for permutation-inference-based outcome analysis. Details in Pimentel et al. (2018) <doi:10.1214/17-AOAS1118>. The optmatch package, which is useful for running many of the provided functions, may be downloaded from Github at <https://github.com/markmfredrickson/optmatch> if not available on CRAN.
This package provides a switch-case construct for R', as it is known from other programming languages. It allows to test multiple, similar conditions in an efficient, easy-to-read manner, so nested if-else constructs can be avoided. The switch-case construct is designed as an R function that allows to return values depending on which condition is met and lets the programmer flexibly decide whether or not to leave the switch-case construct after a case block has been executed.
This package provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results.
Apply and visualize conditional formatting to data frames in R. It renders a data frame with cells formatted according to criteria defined by rules, using a tidy evaluation syntax. The table is printed either opening a web browser or within the RStudio viewer if available. The conditional formatting rules allow to highlight cells matching a condition or add a gradient background to a given column. This package supports both HTML and LaTeX outputs in knitr reports, and exporting to an xlsx file.
Whole-buffer DEFLATE-based compression and decompression of raw vectors using the libdeflate library (see <https://github.com/ebiggers/libdeflate>). Provides the user with additional control over the speed and the quality of DEFLATE compression compared to the fixed level of compression offered in R's memCompress() function. Also provides the libdeflate static library and C headers along with a CMake target and packageâ config file that ease linking of libdeflate in packages that compile and statically link bundled libraries using CMake'.
This package provides functions to support compatibility between Maelstrom R packages and Opal environment. Opal is the OBiBa core database application for biobanks. It is used to build data repositories that integrates data collected from multiple sources. Opal Maelstrom is a specific implementation of this software. This Opal client is specifically designed to interact with Opal Maelstrom distributions to perform operations on the R server side. The user must have adequate credentials. Please see <https://opaldoc.obiba.org/> for complete documentation.
Allows the user to perform ANOVA tests (in a strict sense: continuous and normally-distributed Y variable and 1 or more factorial/categorical X variable(s)), with the possibility to specify the type of sum of squares (1, 2 or 3), the types of variables (Fixed or Random) and their relationships (crossed or nested) with the sole function of the package (FullyParamANOVA()). The resulting outputs are the same as in SAS software. A dataset (Butterfly) to test the function is also joined.
An implementation of split-population duration regression models. Unlike regular duration models, split-population duration models are mixture models that accommodate the presence of a sub-population that is not at risk for failure, e.g. cancer patients who have been cured by treatment. This package implements Weibull and Loglogistic forms for the duration component, and focuses on data with time-varying covariates. These models were originally formulated in Boag (1949) and Berkson and Gage (1952), and extended in Schmidt and Witte (1989).
Acquire hourly meteorological data from stations located all over the world. There is a wealth of data available, with historic weather data accessible from nearly 30,000 stations. The available data is automatically downloaded from a data repository and processed into a tibble for the exact range of years requested. A relative humidity approximation is provided using the August-Roche-Magnus formula, which was adapted from Alduchov and Eskridge (1996) <doi:10.1175%2F1520-0450%281996%29035%3C0601%3AIMFAOS%3E2.0.CO%3B2>.
Provide a range of functions with multiple criteria for cutting phylogenetic trees at any evolutionary depth. It enables users to cut trees in any orientation, such as rootwardly (from root to tips) and tipwardly (from tips to its root), or allows users to define a specific time interval of interest. It can also be used to create multiple tree pieces of equal temporal width. Moreover, it allows the assessment of novel temporal rates for various phylogenetic indexes, which can be quickly displayed graphically.
BayesPrism includes deconvolution and embedding learning modules. The deconvolution module models a prior from cell type-specific expression profiles from scRNA-seq to jointly estimate the posterior distribution of cell type composition and cell type-specific gene expression from bulk RNA-seq expression of tumor samples. The embedding learning module uses Expectation-maximization (EM) to approximate the tumor expression using a linear combination of malignant gene programs while conditional on the inferred expression and fraction of non-malignant cells estimated by the deconvolution module.
Genomic selection is a specialized form of marker assisted selection. The package contains functions to select important genetic markers and predict phenotype on the basis of fitted training data using integrated model framework (Guha Majumdar et. al. (2019) <doi:10.1089/cmb.2019.0223>) developed by combining one additive (sparse additive models by Ravikumar et. al. (2009) <doi:10.1111/j.1467-9868.2009.00718.x>) and one non-additive (hsic lasso by Yamada et. al. (2014) <doi:10.1162/NECO_a_00537>) model.
Function to identify haplotypes within QTL (Quantitative Trait Loci). One haplotype is a combination of SNP (Single Nucleotide Polymorphisms) within the QTL. This function groups together all individuals of a population with the same haplotype. Each group contains individual with the same allele in each SNP, whether or not missing data. Thus, haplotyper groups individuals, that to be imputed, have a non-zero probability of having the same alleles in the entire sequence of SNP's. Moreover, haplotyper calculates such probability from relative frequencies.
This package provides functions complementary to packages nicheROVER and SIBER allowing the user to extract Bayesian estimates from data objects created by the packages nicheROVER and SIBER'. Please see the following publications for detailed methods on nicheROVER and SIBER Hansen et al. (2015) <doi:10.1890/14-0235.1>, Jackson et al. (2011) <doi:10.1111/j.1365-2656.2011.01806.x>, and Layman et al. (2007) <doi:10.1890/0012-9658(2007)88[42:CSIRPF]2.0.CO;2>, respectfully.
This package provides functions for Bayesian analysis of data from randomized experiments with non-compliance. The functions are based on the models described in Imbens and Rubin (1997) <doi:10.1214/aos/1034276631>. Currently only two types of outcome models are supported: binary outcomes and normally distributed outcomes. Models can be fit with and without the exclusion restriction and/or the strong access monotonicity assumption. Models are fit using the data augmentation algorithm as described in Tanner and Wong (1987) <doi:10.2307/2289457>.
Proteins reside in either the cell plasma or in the cell membrane. A membrane protein goes through the membrane at least once. Given the amino acid sequence of a membrane protein, the tool PureseqTM (<https://github.com/PureseqTM/pureseqTM_package>, as described in "Efficient And Accurate Prediction Of Transmembrane Topology From Amino acid sequence only.", Wang, Qing, et al (2019), <doi:10.1101/627307>), can predict the topology of a membrane protein. This package allows one to use PureseqTM from R.