Code for a variety of nonlinear conditional independence tests: Kernel conditional independence test (Zhang et al., UAI 2011, <arXiv:1202.3775>), Residual Prediction test (based on Shah and Buehlmann, <arXiv:1511.03334>), Invariant environment prediction, Invariant target prediction, Invariant residual distribution test, Invariant conditional quantile prediction (all from Heinze-Deml et al., <arXiv:1706.08576>).
This package provides data on countries and their main city or agglomeration, together with several distance measures and dummy variables indicating whether two countries are contiguous, share a common language or have a colonial relationship. The reference article for these datasets is Mayer and Zignago (2011) <http://www.cepii.fr/CEPII/en/publications/wp/abstract.asp?NoDoc=3877>.
Query data hosted in Microsoft Fabric. Provides helpers to open DBI connections to SQL endpoints of Lakehouse and Data Warehouse items; submit Data Analysis Expressions ('DAX') queries to semantic model datasets in Microsoft Fabric and Power BI; read Delta Lake tables stored in OneLake ('Azure Data Lake Storage Gen2'); and execute Spark code via the Livy API.
Frequentist assisted by Bayes (FAB) p-values and confidence interval construction. See Hoff (2019) <arXiv:1907.12589> "Smaller p-values via indirect information", Hoff and Yu (2019) <doi:10.1214/18-EJS1517> "Exact adaptive confidence intervals for linear regression coefficients", and Yu and Hoff (2018) <doi:10.1093/biomet/asy009> "Adaptive multigroup confidence intervals with constant coverage".
This package provides methods for quantifying the information gain contributed by individual modalities in multimodal regression models. Information gain is measured using Expected Relative Entropy (ERE) or pseudo-R² metrics, with corresponding p-values and confidence intervals. It currently supports linear and logistic regression models, with plans for extension to additional generalized linear models and the Cox proportional hazards model.
This package provides utility functions and custom probability distributions for Bayesian analyses of radiocarbon dates within the nimble modelling framework. It includes various population growth models, nimbleFunction objects, as well as a suite of functions for prior and posterior predictive checks for demographic inference (Crema and Shoda (2021) <doi:10.1371/journal.pone.0251695>) and other analyses.
Visualizes the relationship between allele frequency and effect size in genetic association studies. The input is a data frame containing association results. The output is a plot with the effect size of risk variants on the y-axis and the allele frequency spectrum on the x-axis. Corte et al. (2023) <doi:10.1101/2023.04.21.23288923>.
inf-ruby provides a Read Eval Print Loop (REPL) buffer, allowing for easy interaction with a Ruby subprocess. Features include support for detecting specific uses of Ruby, e.g., when using Rails, and using an appropriate console.
If you are using Guix shell with manifest.scm, the inf-ruby-wrapper-command customization variable could be helpful.
The objective of this package is to efficiently create scatterplots where groups can be distinguished by color and texture. Visualizations in computational biology tend to have many groups, making it difficult to distinguish between groups by color alone. Thus, this package is useful for increasing the accessibility of scatterplot visualizations to those with visual impairments such as color blindness.
This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.
Application of an empirical mode decomposition based artificial neural network model for nonlinear and non-stationary univariate time series forecasting. For method details see (i) Choudhury (2019) <https://www.indianjournals.com/ijor.aspx?target=ijor:ijee3&volume=55&issue=1&article=013>; (ii) Das (2020) <https://www.indianjournals.com/ijor.aspx?target=ijor:ijee3&volume=56&issue=2&article=002>.
Reverse engineer a regular expression pattern for the characters contained in an R object. Individual characters can be categorised into digits, letters, punctuation or spaces and encoded into run-lengths. This can be used to summarise the structure of a dataset or identify non-standard entries. Many non-character inputs such as numeric vectors and data frames are supported.
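The categorise-then-run-length idea in the entry above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's API: the names `char_class`, `infer_pattern` and `summarise_patterns` are hypothetical, only ASCII digits and letters get their own class, and every other non-space character is lumped into one "punctuation or other" class.

```python
import re
from collections import Counter
from itertools import groupby
from string import ascii_letters, digits

# Hypothetical character classes (an assumption of this sketch).
DIGIT = "[0-9]"
LETTER = "[A-Za-z]"
SPACE = r"\s"
OTHER = r"[^A-Za-z0-9\s]"  # punctuation and anything else


def char_class(ch):
    """Map a single character to the regex class that matches it."""
    if ch in digits:
        return DIGIT
    if ch in ascii_letters:
        return LETTER
    if ch.isspace():
        return SPACE
    return OTHER


def infer_pattern(text):
    """Run-length encode consecutive characters of the same class."""
    parts = []
    for cls, run in groupby(text, key=char_class):
        parts.append(f"{cls}{{{len(list(run))}}}")
    return "".join(parts)


def summarise_patterns(values):
    """Count how often each inferred pattern occurs in a collection."""
    return Counter(infer_pattern(str(v)) for v in values)


values = ["AB-1234", "CD-5678", "EF 9012"]
summary = summarise_patterns(values)
# Each value is matched in full by its own inferred pattern.
all_match = all(re.fullmatch(infer_pattern(v), v) for v in values)
```

Entries whose pattern is rare in `summary` are candidates for non-standard values, which is the use case the description mentions.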
This package provides affine-invariant ensemble samplers for Markov chain Monte Carlo, which allow faster convergence for badly scaled estimation problems. Two samplers are provided: the differential.evolution sampler from ter Braak and Vrugt (2008) <doi:10.1007/s11222-008-9104-9> and the stretch sampler from Goodman and Weare (2010) <doi:10.2140/camcos.2010.5.65>.
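For readers unfamiliar with the stretch move mentioned in the entry above, the following is a minimal pure-Python sketch of it, written from the general description in Goodman and Weare (2010). It is not the package's implementation: the function name `stretch_sampler`, its arguments, and the target density are assumptions made for illustration, and the scale parameter `a = 2` is simply a common choice.

```python
import math
import random


def stretch_sampler(log_density, walkers, n_steps, a=2.0, seed=0):
    """Affine-invariant ensemble sampling with the stretch move (sketch)."""
    rng = random.Random(seed)
    walkers = [list(w) for w in walkers]
    dim = len(walkers[0])
    logp = [log_density(w) for w in walkers]
    samples = []
    for _ in range(n_steps):
        for k in range(len(walkers)):
            # Pick a different walker j uniformly at random.
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1
            # Draw z on [1/a, a] with density proportional to 1/sqrt(z).
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            # Stretch walker k along the line through walker j.
            proposal = [xj + z * (xk - xj)
                        for xk, xj in zip(walkers[k], walkers[j])]
            logp_new = log_density(proposal)
            log_accept = (dim - 1) * math.log(z) + logp_new - logp[k]
            if rng.random() < math.exp(min(0.0, log_accept)):
                walkers[k], logp[k] = proposal, logp_new
            samples.append(list(walkers[k]))
    return samples


def log_density(x):
    """Badly scaled Gaussian: standard deviations 1 and 100 (an assumption)."""
    return -0.5 * (x[0] ** 2 + (x[1] / 100.0) ** 2)


init_rng = random.Random(1)
start = [[init_rng.gauss(0, 1), init_rng.gauss(0, 100)] for _ in range(10)]
samples = stretch_sampler(log_density, start, n_steps=2000)
```

Because each proposal is built from the positions of other walkers, the move adapts to the scale of the target without any tuning of per-coordinate step sizes, which is why such samplers suit badly scaled problems.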
Analysis of risk through liability matrices. Contains a Gibbs sampler for network reconstruction, where only row and column sums of the liabilities matrix as well as some other fixed entries are observed, following the methodology of Gandy and Veraart (2016) <doi:10.1287/mnsc.2016.2546>. It also incorporates models that use a power law for the degree distribution.
This package provides functions to implement group sequential procedures that allow for early stopping to declare efficacy using a surrogate marker and the possibility of futility stopping. More details are available in: Parast, L. and Bartroff, J. (2024) <doi:10.1093/biomtc/ujae108>. A tutorial for this package can be found at <https://laylaparast.com/home/SurrogateSeq.html>.
This package provides an SQL-based mass spectrometry (MS) data backend that also supports storage and handling of very large data sets. Objects from this package are intended to be used with the Spectra Bioconductor package. Through the MsBackendSql with its minimal memory footprint, this package provides an alternative MS data representation for very large or remote MS data sets.
The r-abhgenotyper package provides simple imputation, error-correction and plotting capabilities for genotype data. The package is intended to serve as an intermediate but independent analysis tool between the TASSEL GBS pipeline and the r-qtl package. It provides functionality not found in either TASSEL or r-qtl, in addition to visualization of genotypes as "graphical genotypes".
SpaceTrooper performs quality control analysis of imaging-based spatial data using data-driven GLMs, providing exploration plots, QC metric computation, and outlier detection. It implements a GLM strategy for detecting low-quality cells in imaging-based spatial data (transcriptomics and proteomics). It also implements several plots for visualizing imaging-based polygons through the ggplot2 package.
Generate project files and directories following a pre-made template. You can specify variables to customize file names and content, and flexibly adapt the template to your needs. cookiecutter for R implements a subset of the excellent cookiecutter package for the Python programming language (<https://github.com/cookiecutter/>), and aims to be largely compatible with the original cookiecutter template format.
Uses inverse probability weighting methods to estimate the treatment effect under a marginal structural model for the cause-specific hazard of competing risk events. Also estimates the cumulative incidence function (i.e. risk) of the potential outcomes, and provides inference on the risk difference and risk ratio. Reference: Kalbfleisch & Prentice (2002) <doi:10.1002/9781118032985>; Hernan et al. (2001) <doi:10.1198/016214501753168154>.
Reads water network simulation data in Epanet text-based .inp and .rpt formats into R. Also reads results from Epanet-msx. Provides basic summary information and plots. The README file has a quick introduction. See <http://www2.epa.gov/water-research/epanet> for more information on the Epanet software for modeling hydraulic and water quality behavior of water piping systems.
Application of an empirical mode decomposition based support vector regression model for nonlinear and non-stationary univariate time series forecasting. For method details see (i) Choudhury (2019) <http://krishi.icar.gov.in/jspui/handle/123456789/44873>; (ii) Das (2020) <http://krishi.icar.gov.in/jspui/handle/123456789/43174>; (iii) Das (2023) <http://krishi.icar.gov.in/jspui/handle/123456789/77772>.
Create forecasts from multiple predictions using ensemble Bayesian model averaging (EBMA). EBMA models can be estimated using an expectation maximization (EM) algorithm or as fully Bayesian models via Gibbs sampling. The methods in this package are described in Montgomery, Hollenbach, and Ward (2015) <doi:10.1016/j.ijforecast.2014.08.001> and Montgomery, Hollenbach, and Ward (2012) <doi:10.1093/pan/mps002>.
Compares two independent or paired groups across a range of descriptive statistics, enabling the evaluation of potential differences in central tendency (mean, median), dispersion (variance, interquartile range), shape (skewness, kurtosis), and distributional characteristics (various quantiles). The analytical framework incorporates parametric t-tests, non-parametric Wilcoxon tests, permutation tests, and bootstrap resampling techniques to assess the statistical significance of observed differences.
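As an illustration of one of the techniques named in the entry above, here is a minimal sketch of a two-sided permutation test for the difference in a descriptive statistic between two independent groups. It uses only the Python standard library and is not the package's interface: the function name `permutation_test`, its parameters, and the example data are hypothetical.

```python
import random
import statistics


def permutation_test(x, y, stat=statistics.median, n_perm=2000, seed=0):
    """Two-sided permutation test for stat(x) - stat(y), independent groups."""
    rng = random.Random(seed)
    observed = stat(x) - stat(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = stat(pooled[:n_x]) - stat(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


group_a = list(range(1, 11))     # hypothetical data: 1, ..., 10
group_b = list(range(101, 111))  # hypothetical data: 101, ..., 110
observed, p_value = permutation_test(group_a, group_b)
```

Passing a different `stat`, such as `statistics.mean` or `statistics.variance`, applies the same procedure to another descriptive statistic; paired designs would need sign-flipping of within-pair differences instead of label shuffling.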