This is a supporting data package for the software package gage. However, the data supplied here are also useful for gene set or pathway analysis, or for microarray data analysis in general. In this package, we provide two demo microarray datasets: GSE16873 (a breast cancer dataset from GEO) and BMP6 (originally published as a demo dataset for GAGE, also registered as GSE13604 in GEO). This package also includes commonly used gene set data based on KEGG pathways and GO terms for major research species, including human, mouse, rat and budding yeast. Mapping data between common gene IDs for budding yeast are also included.
GARFIELD is a non-parametric functional enrichment analysis approach described in the paper GARFIELD: GWAS analysis of regulatory or functional information enrichment with LD correction. Briefly, it is a method that leverages GWAS findings with regulatory or functional annotations (primarily from ENCODE and Roadmap epigenomics data) to find features relevant to a phenotype of interest. It performs greedy pruning of GWAS SNPs (LD r2 > 0.1) and then annotates them based on functional information overlap. Next, it quantifies Fold Enrichment (FE) at various GWAS significance cutoffs and assesses them by permutation testing, while matching for minor allele frequency, distance to nearest transcription start site and number of LD proxies (r2 > 0.8).
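The fold enrichment step can be illustrated with a minimal Python sketch. It assumes FE at a cutoff is the fraction of annotated SNPs among those passing the cutoff, divided by the fraction of annotated SNPs among all LD-pruned SNPs; the function name and the input layout are hypothetical, and the permutation testing and feature matching that GARFIELD performs are not shown.

```python
import math

def fold_enrichment(snps, threshold):
    """Fold enrichment of an annotation among SNPs with p-value below threshold.

    snps: list of (p_value, is_annotated) pairs for LD-pruned SNPs
          (hypothetical input layout, chosen for illustration only).
    """
    total = len(snps)
    annotated_total = sum(1 for _, annotated in snps if annotated)
    significant = [(p, annotated) for p, annotated in snps if p < threshold]
    if not significant or annotated_total == 0:
        return math.nan  # FE is undefined without significant or annotated SNPs
    annotated_sig = sum(1 for _, annotated in significant if annotated)
    frac_sig = annotated_sig / len(significant)   # annotated share among significant SNPs
    frac_all = annotated_total / total            # annotated share among all pruned SNPs
    return frac_sig / frac_all

# Toy data: 10 pruned SNPs, 4 annotated overall, 2 significant (both annotated).
toy_snps = [(1e-9, True), (1e-8, True), (0.2, True), (0.5, True)] + [(0.3, False)] * 6
fe_strict = fold_enrichment(toy_snps, 1e-5)
```

Repeating the call over a range of thresholds gives FE at each GWAS significance cutoff.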
Calculate and plot the configuration of Jupiter's four largest satellites (known as the Galilean satellites) for a given date and time (UTC, Coordinated Universal Time). The galsat() function returns numerical values of the satellites' positions: x, the apparent rectangular coordinate of the satellite with respect to the center of Jupiter's disk in the equatorial plane, in units of Jupiter's equatorial radius (x is positive toward the west); and y, the apparent rectangular coordinate of the satellite with respect to the center of Jupiter's disk measured from the equatorial plane, in units of Jupiter's equatorial radius (y is positive toward the north). For more details see Meeus (1988, ISBN 0-943396-22-0) "Astronomical Formulae for Calculators". The function delta_t() returns the value of delta-T in units of seconds.
The standard linear regression theory, whether frequentist or Bayesian, is based on an "assumed (revealed?) truth" (John Tukey) attitude to models. This is reflected in the language of statistical inference, which involves a concept of truth, for example in confidence intervals, hypothesis testing and consistency. The motivation behind this package was to remove the word "true" from the theory and practice of linear regression and to replace it by "approximation". The approximations considered are the least squares approximations. An approximation is called valid if it contains no irrelevant covariates. This is operationalized using the concept of a Gaussian P-value, which is the probability that pure Gaussian noise is better in terms of least squares than the covariate. The precise definition is given in the paper "An Approximation Based Theory of Linear Regression". Only four simple equations are required. Moreover, the Gaussian P-values can be simply derived from standard F P-values. Furthermore, they are exact and valid whatever the data; in contrast, F P-values are only valid for specially designed simulations. A valid approximation is one where all the Gaussian P-values are less than a threshold p0 specified by the statistician, in this package with the default value 0.01. This approximation approach is not only much simpler, it is overwhelmingly better than the standard model-based approach. This will be demonstrated using high-dimensional regression and vector autoregression real data sets. The goal is to find valid approximations. The search function is f1st, a greedy forward selection procedure that returns either a single approximation or none; the approximation returned may, however, not be valid. If the size is less than a threshold with default value 21, then an all-subset procedure is called which returns the best valid subset. A good default start is f1st(y,x,kmn=15). The best function for returning multiple approximations is f3st, which repeatedly calls f1st. For more information see the papers: L. Davies and L. Duembgen, "Covariate Selection Based on a Model-free Approach to Linear Regression with Exact Probabilities", <doi:10.48550/arXiv.2202.01553>, and L. Davies, "An Approximation Based Theory of Linear Regression", 2024, <doi:10.48550/arXiv.2402.09858>.
This package provides functions for greenhouse gas flux calculation from chamber measurements.
Implementation of various inference and simulation tools to apply generalized additive models to bivariate dependence structures and non-simplified vine copulas.
This package provides a collection of functions to perform Gaussian quadrature with different weight functions corresponding to the orthogonal polynomials in package orthopolynom. Examples verify the orthogonality and inner products of the polynomials.
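As an illustration of the kind of check described, the following Python sketch builds a Gauss-Legendre rule (weight function 1 on [-1, 1]) and uses it to verify orthogonality and a squared norm of Legendre polynomials. It uses only the standard library; the function names are illustrative assumptions and do not reflect the package's R interface.

```python
import math

def legendre(k, x):
    """Evaluate the Legendre polynomial P_k(x) by the three-term recurrence."""
    p_prev, p_curr = 1.0, x
    if k == 0:
        return p_prev
    for j in range(2, k + 1):
        p_prev, p_curr = p_curr, ((2 * j - 1) * x * p_curr - (j - 1) * p_prev) / j
    return p_curr

def legendre_derivative(n, x):
    """P_n'(x) for |x| < 1, from n * (x * P_n - P_{n-1}) / (x^2 - 1)."""
    return n * (x * legendre(n, x) - legendre(n - 1, x)) / (x * x - 1.0)

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule via Newton iteration."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard initial guess
        for _ in range(50):
            step = legendre(n, x) / legendre_derivative(n, x)
            x -= step
            if abs(step) < 1e-15:
                break
        dp = legendre_derivative(n, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def inner_product(j, k, n_points=5):
    """Approximate the integral of P_j * P_k over [-1, 1]; exact if j + k <= 2n - 1."""
    nodes, weights = gauss_legendre(n_points)
    return sum(w * legendre(j, x) * legendre(k, x) for x, w in zip(nodes, weights))

orthogonal = inner_product(2, 3)   # should vanish up to rounding
norm_sq = inner_product(2, 2)      # should equal 2 / (2 * 2 + 1) = 0.4
```

A five-point rule integrates polynomials up to degree 9 exactly, which is why both inner products above are reproduced to rounding error.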
An excerpt of the data available at Gapminder.org. For each of 142 countries, the package provides values for life expectancy, GDP per capita, and population, every five years, from 1952 to 2007.
The main purpose of this package is to allow fitting of mixture distributions with generalised additive models for location, scale and shape (GAMLSS); see Chapter 7 of Stasinopoulos et al. (2017) <doi:10.1201/b21973-4>.
This is a dataset package for GANPA, which implements a network-based gene weighting approach to pathway analysis. This package includes data useful for GANPA, such as a functional association network, pathways, an expression dataset and multi-subunit proteins.
This package provides a collection of difference measures for multivariate Gaussian probability density functions, such as the Euclidean mean, the Mahalanobis distance, the Kullback-Leibler divergence, the J-Coefficient, the Minkowski L2-distance, the Chi-square divergence and the Hellinger Coefficient.
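Two of these measures have simple closed forms, sketched below in Python for the special case of diagonal covariance matrices. The restriction to diagonal covariances, the use of a single shared covariance for the Mahalanobis distance, and the function names are assumptions made for illustration; they are not the package's R interface or necessarily its conventions.

```python
import math

def kl_divergence_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for Gaussians with diagonal covariances (variances given as lists)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        # Per-dimension terms: trace ratio, squared mean shift, and log-determinant ratio.
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total

def mahalanobis_diag(mean_a, mean_b, var):
    """Mahalanobis distance between two means under a shared diagonal covariance."""
    return math.sqrt(sum((a - b) ** 2 / v for a, b, v in zip(mean_a, mean_b, var)))

# Two unit-variance Gaussians in 2D whose means differ by 1 along the first axis.
kl_shifted = kl_divergence_diag([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0])
dist = mahalanobis_diag([0.0, 0.0], [3.0, 4.0], [1.0, 1.0])
```

With identity covariance the Mahalanobis distance reduces to the Euclidean distance between the means, and the KL divergence reduces to half the squared distance.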
This is an add-on package to GAMLSS. The purpose of this package is to allow users to define truncated distributions in GAMLSS models. The main function gen.trun() generates a truncated version of an existing GAMLSS family distribution.
Uses slice sampling-based Markov chain Monte Carlo to conduct Bayesian fitting and inference for generalized additive mixed models. Generalized linear mixed models and generalized additive models are also handled as special cases of generalized additive mixed models. The methodology and software are described in Pham, T.H. and Wand, M.P. (2018). Australian and New Zealand Journal of Statistics, 60, 279-330 <DOI:10.1111/ANZS.12241>.
Implementation of a common set of punctual solutions for Cooperative Game Theory.
An R interface to the Galvanize Highbond API <https://docs-apis.highbond.com>.
Interface for extra smooth functions including tensor products, neural networks and decision trees.
Allows user to choose/gate a region on the plot and returns points within it.
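The selection step can be sketched in Python as a point-in-polygon test. This minimal example assumes the gated region is a polygon given as a list of vertices and uses the ray-casting rule; the function names are hypothetical and the interactive drawing of the region is not shown.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        if (y1 > y) != (y2 > y):       # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def gate_points(points, polygon):
    """Return the indices of the points that fall inside the gated polygon."""
    return [i for i, (px, py) in enumerate(points) if point_in_polygon(px, py, polygon)]

# A square gate and four candidate points.
gate = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
points = [(1.0, 1.0), (3.0, 1.0), (-1.0, 0.5), (0.5, 1.5)]
selected = gate_points(points, gate)
```

Points lying exactly on the boundary may be classified either way by this rule, so a production implementation would need an explicit convention for them.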
Display a random fact about Carl Friedrich Gauss based on the collection curated by Mike Cavers via the <http://gaussfacts.com> site.
Given a vector of cluster memberships for a cell population, identifies a sequence of gates (polygon filters on 2D scatter plots) for isolation of that cell type.
Density, distribution function, quantile function and random generation for the bimodal skew symmetric normal distribution of Hassan and El-Bassiouni (2016) <doi:10.1080/03610926.2014.882950>.
This package provides functions to fit two-dimensional Gaussian functions, predict values from fits, and produce plots of predicted data via either ggplot2 or base R plotting.
The GA4GHshiny package provides an easy way to interact with data servers based on the Global Alliance for Genomics and Health (GA4GH) genomics API through a Shiny application. It also integrates with the Beacon Network.
This package provides functions to estimate the disparities across categories (e.g. Black and white) that persist if a treatment variable (e.g. college) is equalized. Makes estimates by treatment modeling, outcome modeling, and doubly-robust augmented inverse probability weighting estimation, with standard errors calculated by a nonparametric bootstrap. Cross-fitting is supported. Survey weights are supported for point estimation but not for standard error estimation; those applying this package with complex survey samples should consult the data distributor to select an appropriate approach for standard error construction, which may involve calling the functions repeatedly for many sets of replicate weights provided by the data distributor. The methods in this package are described in Lundberg (2021) <doi:10.31235/osf.io/gx4y3>.
This is an add-on package to gamlss. The purpose of this package is to allow users to fit GAMLSS (Generalised Additive Models for Location Scale and Shape) models when the response variable is defined either on the intervals [0,1), (0,1] or [0,1] (inflated at zero and/or one distributions), or on the positive real line including zero (zero-adjusted distributions). The mass points at zero and/or one are treated as extra parameters with the possibility to include a linear predictor for both. The package also allows transformed or truncated distributions from the GAMLSS family to be used for the continuous part of the distribution. Standard methods and GAMLSS diagnostics can be used with the resulting fitted object.