This package provides tools for estimate (joint) cumulants and (joint) products of cumulants of a random sample using (multivariate) k-statistics and (multivariate) polykays, unbiased estimators with minimum variance. Tools for generating univariate and multivariate Faa di Bruno's formula and related polynomials, such as Bell polynomials, generalized complete Bell polynomials, partition polynomials and generalized partition polynomials. For more details see Di Nardo E., Guarino G., Senato D. (2009) <arXiv:0807.5008>, <arXiv:1012.6008>.
L-systems or Lindenmayer systems are parallel rewriting systems which can be used to simulate biological forms and certain kinds of fractals. Briefly, in an L-system a series of symbols in a string are replaced iteratively according to rules to give a more complex string. Eventually, the symbols are translated into turtle graphics for plotting. Wikipedia has a very good introduction: en.wikipedia.org/wiki/L-system This package provides basic functions for exploring L-systems.
The mlrMBO package can ordinarily not be used for optimization within mlr3', because of incompatibilities of their respective class systems. mlrintermbo offers a compatibility interface that provides mlrMBO as an mlr3tuning Tuner object, for tuning of machine learning algorithms within mlr3', as well as a bbotk Optimizer object for optimization of general objective functions using the bbotk black box optimization framework. The control parameters of mlrMBO are faithfully reproduced as a paradox ParamSet'.
This package provides quality control (QC), normalization, and batch effect correction operations for NanoString nCounter data, Talhouk et al. (2016) <doi:10.1371/journal.pone.0153844>. Various metrics are used to determine which samples passed or failed QC. Gene expression should first be normalized to housekeeping genes, before a reference-based approach is used to adjust for batch effects. Raw NanoString data can be imported in the form of Reporter Code Count (RCC) files.
This package provides functions for solving systems of delay differential equations by interfacing with numerical routines written by Simon N. Wood, including contributions from Benjamin J. Cairns. These numerical routines first appeared in Simon Wood's solv95 program. This package includes a vignette and a complete user's guide. PBSddesolve originally appeared on CRAN under the name ddesolve'. That version is no longer supported. The current name emphasizes a close association with other PBS packages, particularly PBSmodelling'.
This package provides methods for spatial risk calculations, focusing on efficient determination of the sum of observations within a circle of a given radius. These methods are particularly relevant for applications such as insurance, where recent European Commission regulations require the calculation of the maximum insured value of fire risk policies for all buildings that are partly or fully located within a 200 m radius. The underlying problem is described by Church (1974) <doi:10.1007/BF01942293>.
(guix-science-nonfree packages bioconductor)This package is used for the detection of differentially expressed genes (DEGs) from the comparison of two biological conditions (treated vs. untreated, diseased vs. normal, mutant vs. wild-type) among different levels of gene expression (transcriptome ,translatome, proteome), using several statistical methods: Rank Product, Translational Efficiency, t-test, Limma, ANOTA, DESeq, edgeR. It also provides the possibility to plot the results with scatterplots, histograms, MA plots, standard deviation (SD) plots, coefficient of variation (CV) plots.
This package provides a class and subclasses for storing non-scalar objects in matrix entries. This is akin to a ragged array but the raggedness is in the third dimension, much like a bumpy surface--hence the name. Of particular interest is the BumpyDataFrameMatrix, where each entry is a Bioconductor data frame. This allows us to naturally represent multivariate data in a format that is compatible with two-dimensional containers like the SummarizedExperiment and MultiAssayExperiment objects.
Function-oriented Make-like declarative pipelines for statistics and data science are supported in the targets R package. As an extension to targets, the tarchetypes package provides convenient user-side functions to make targets easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the drake R package by Will Landau (2018) <doi:10.21105/joss.00550>.
This package provides a pipeline to discern RNA structure at and proximal to the site of protein binding within regions of the transcriptome defined by the user. CLIP protein-binding data can be input as either aligned BAM or peak-called bedGraph files. RNA structure can either be predicted internally from sequence or users have the option to input their own RNA structure data. RNA structure binding profiles can be visually and quantitatively compared across multiple formats.
Estimate fish length-at-age models using MCMC analysis with rstan models. This package allows a multimodel approach to growth fitting to be applied to length-at-age data and is supported by further analyses to determine model selection and result presentation. The core methods of this package are presented in Smart and Grammer (2021) "Modernising fish and shark growth curves with Bayesian length-at-age models". PLOS ONE 16(2): e0246734 <doi:10.1371/journal.pone.0246734>.
Evolutionary black box optimization algorithms building on the bbotk package. miesmuschel offers both ready-to-use optimization algorithms, as well as their fundamental building blocks that can be used to manually construct specialized optimization loops. The Mixed Integer Evolution Strategies as described by Li et al. (2013) <doi:10.1162/EVCO_a_00059> can be implemented, as well as the multi-objective optimization algorithms NSGA-II by Deb, Pratap, Agarwal, and Meyarivan (2002) <doi:10.1109/4235.996017>.
Useful functions for one-sample (individual level data) Mendelian randomization and instrumental variable analyses. The package includes implementations of; the Sanderson and Windmeijer (2016) <doi:10.1016/j.jeconom.2015.06.004> conditional F-statistic, the multiplicative structural mean model Hernán and Robins (2006) <doi:10.1097/01.ede.0000222409.00878.37>, and two-stage predictor substitution and two-stage residual inclusion estimators explained by Terza et al. (2008) <doi:10.1016/j.jhealeco.2007.09.009>.
This package provides a collection of functions that primarily produce graphics to aid in a Propensity Score Analysis (PSA). Functions include: cat.psa and box.psa to test balance within strata of categorical and quantitative covariates, circ.psa for a representation of the estimated effect size by stratum, loess.psa that provides a graphic and loess based effect size estimate, and various balance functions that provide measures of the balance achieved via a PSA in a categorical covariate.
Routines for state estimate in a linear Gaussian state space model and a simple stochastic volatility model using particle filtering. Parameter inference is also carried out in these models using the particle Metropolis-Hastings algorithm that includes the particle filter to provided an unbiased estimator of the likelihood. This package is a collection of minimal working examples of these algorithms and is only meant for educational use and as a start for learning to them on your own.
This package provides a wrapped LASSO approach by integrating an ensemble learning strategy to help select efficient, stable, and high confidential variables from omics-based data. Using a bagging strategy in combination of a parametric method or inflection point search method for cut-off threshold determination. This package can integrate and vote variables generated from multiple LASSO models to determine the optimal candidates. Luo H, Zhao Q, et al (2020) <doi:10.1126/scitranslmed.aax7533> for more details.
Finds the most likely originating tissue(s) and developmental stage(s) of tissue-specific RNA sequencing data. The package identifies both pure transcriptomes and mixtures of transcriptomes. The most likely identity is found through comparisons of the sequencing data with high-throughput in situ hybridisation patterns. Typical uses are the identification of cancer cell origins, validation of cell culture strain identities, validation of single-cell transcriptomes, and validation of identity and purity of flow-sorting and dissection sequencing products.
This package implements fast Monte Carlo simulations for goodness-of-fit (GOF) tests for discrete distributions. This includes tests based on the Chi-squared statistic, the log-likelihood-ratio (G^2) statistic, the Freeman-Tukey (Hellinger-distance) statistic, the Kolmogorov-Smirnov statistic, the Cramer-von Mises statistic as described in Choulakian, Lockhart and Stephens (1994) <doi:10.2307/3315828>, and the root-mean-square statistic, see Perkins, Tygert, and Ward (2011) <doi:10.1016/j.amc.2011.03.124>.
An implementation of several functions for feature extraction in ordinal time series datasets. Specifically, some of the features proposed by Weiss (2019) <doi:10.1080/01621459.2019.1604370> can be computed. These features can be used to perform inferential tasks or to feed machine learning algorithms for ordinal time series, among others. The package also includes some interesting datasets containing financial time series. Practitioners from a broad variety of fields could benefit from the general framework provided by otsfeatures'.
This package provides tools to conduct interpretable sensitivity analyses for weighted estimators, introduced in Huang (2024) <doi:10.1093/jrsssa/qnae012> and Hartman and Huang (2024) <doi:10.1017/pan.2023.12>. The package allows researchers to generate the set of recommended sensitivity summaries to evaluate the sensitivity in their underlying weighting estimators to omitted moderators or confounders. The tools can be flexibly applied in causal inference settings (i.e., in external and internal validity contexts) or survey contexts.
Using Gaussian graphical models we propose a novel approach to perform pathway analysis using gene expression. Given the structure of a graph (a pathway) we introduce two statistical tests to compare the mean and the concentration matrices between two groups. Specifically, these tests can be performed on the graph and on its connected components (cliques). The package is based on the method described in Massa M.S., Chiogna M., Romualdi C. (2010) <doi:10.1186/1752-0509-4-121>.
This package contains functions for a variational Bayesian method for sparse PCA proposed by Ning (2020) <arXiv:2102.00305>. There are two algorithms: the PX-CAVI algorithm (if assuming the loadings matrix is jointly row-sparse) and the batch PX-CAVI algorithm (if without this assumption). The outputs of the main function, VBsparsePCA(), include the mean and covariance of the loadings matrix, the score functions, the variable selection results, and the estimated variance of the random noise.
A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides. Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. In order to use the program, the user submits a sequence in FASTA format. The output consists of two files: a repeat table file and an alignment file. Submitted sequences may be of arbitrary length. Repeats with pattern size in the range from 1 to 2000 bases are detected.
The package CellBarcode performs Cellular DNA Barcode analysis. It can handle all kinds of DNA barcodes, as long as the barcode is within a single sequencing read and has a pattern that can be matched by a regular expression. \codeCellBarcode can handle barcodes with flexible lengths, with or without UMI (unique molecular identifier). This tool also can be used for pre-processing some amplicon data such as CRISPR gRNA screening, immune repertoire sequencing, and metagenome data.