Implementation of the classifier described in the paper Ali HR et al (2014) <doi:10.1186/s13059-014-0431-1>. It uses copy number and/or expression form breast cancer data, trains a Tibshirani's pamr classifier with the features available and predicts the iC10 group.
This package provides a unified software package simultaneously implemented in Python', R', and Matlab providing a uniform and internally-consistent way of calculating stoichiometric equilibrium constants in modern and palaeo seawater as a function of temperature, salinity, pressure and the concentration of magnesium, calcium, sulphate, and fluorine.
"Learning with Subset Stacking" is a supervised learning algorithm that is based on training many local estimators on subsets of a given dataset, and then passing their predictions to a global estimator. You can find the details about LESS in our manuscript at <arXiv:2112.06251>.
Bandwidth selection for kernel density estimators of 2-d level sets and highest density regions. It applies a plug-in strategy to estimate the asymptotic risk function and minimize to get the optimal bandwidth matrix. See Doss and Weng (2018) <arXiv:1806.00731> for more detail.
Estimates multivariate subgaussian stable densities and probabilities as well as generates random variates using product distribution theory. A function for estimating the parameters from data to fit a distribution to data is also provided, using the method from Nolan (2013) <doi:10.1007/s00180-013-0396-7>.
Count the occurrence of sequences of values in a vector that meets certain conditions of length and magnitude. The method is based on the Run Length Encoding algorithm, available with base R, inspired by A. H. Robinson and C. Cherry (1967) <doi:10.1109/PROC.1967.5493>.
An implementation for computing Optimal B-Robust Estimators of two-parameter distribution. The procedure is composed of some equations that are evaluated alternatively until the solution is reached. Some tools for analyzing the estimates are included. The most relevant is covariance matrix computation using a closed formula.
Recursive algorithms for computing various relatedness coefficients, including pairwise kinship, kappa and identity coefficients. Both autosomal and X-linked coefficients are computed. Founders are allowed to be inbred, which enables construction of any given kappa coefficients, as described in Vigeland (2020) <doi:10.1007/s00285-020-01505-x>. In addition to the standard coefficients, ribd also computes a range of lesser-known coefficients, including generalised kinship coefficients, multi-person coefficients and two-locus coefficients (Vigeland, 2023, <doi:10.1093/g3journal/jkac326>). Many features of ribd are available through the online app QuickPed at <https://magnusdv.shinyapps.io/quickped>; see Vigeland (2022) <doi:10.1186/s12859-022-04759-y>.
This package contains implementations of recurrent event data analysis routines including (1) survival and recurrent event data simulation from stochastic process point of view by the thinning method proposed by Lewis and Shedler (1979) <doi:10.1002/nav.3800260304> and the inversion method introduced in Cinlar (1975, ISBN:978-0486497976), (2) the mean cumulative function (MCF) estimation by the Nelson-Aalen estimator of the cumulative hazard rate function, (3) two-sample recurrent event responses comparison with the pseudo-score tests proposed by Lawless and Nadeau (1995) <doi:10.2307/1269617>, (4) gamma frailty model with spline rate function following Fu, et al. (2016) <doi:10.1080/10543406.2014.992524>.
This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang, respectively, for measuring semantic similarities among Disease ontology (DO) terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.
This package contains an implementation of AIMS -- Absolute Intrinsic Molecular Subtyping. It contains necessary functions to assign the five intrinsic molecular subtypes (Luminal A, Luminal B, Her2-enriched, Basal-like, Normal-like). Assignments could be done on individual samples as well as on dataset of gene expression data.
Remind allows you to remind yourself of upcoming events and appointments. Each reminder or alarm can consist of a message sent to standard output, or a program to be executed. It also features: sophisticated date calculation, moon phases, sunrise/sunset, Hebrew calendar, alarms, PostScript output and proper handling of holidays.
This package provides an implementation of the Uniform Manifold Approximation and Projection dimensionality reduction by McInnes et al. (2018). It also provides means to transform new data and to carry out supervised dimensionality reduction. An implementation of the related LargeVis method of Tang et al. (2016) is also provided.
This is a package for analysis of case-control data in genetic epidemiology. It provides a set of statistical methods for evaluating gene-environment (or gene-genes) interactions under multiplicative and additive risk models, with or without assuming gene-environment (or gene-gene) independence in the underlying population.
GSCA takes as input several lists of activated and repressed genes. GSCA then searches through a compendium of publicly available gene expression profiles for biological contexts that are enriched with a specified pattern of gene expression. GSCA provides both traditional R functions and interactive, user-friendly user interface.
Bayesian fitting and sensitivity analysis methods for adaptive spline surfaces described in <doi:10.18637/jss.v094.i08>. Built to handle continuous and categorical inputs as well as functional or scalar output. An extension of the methodology in Denison, Mallick and Smith (1998) <doi:10.1023/A:1008824606259>.
Density, distribution function, quantile function random generation and estimation of bimodal GEV distribution given in Otiniano et al. (2023) <doi:10.1007/s10651-023-00566-7>. This new generalization of the well-known GEV (Generalized Extreme Value) distribution is useful for modeling heterogeneous bimodal data from different areas.
This package provides a C++ library for Bayesian modeling, with an emphasis on Markov chain Monte Carlo. Although boom contains a few R utilities (mainly plotting functions), its primary purpose is to install the BOOM C++ library on your system so that other packages can link against it.
This package implements the combined cluster and discriminant analysis method for finding homogeneous groups of data with known origin as described in Kovacs et. al (2014): Classification into homogeneous groups using combined cluster and discriminant analysis (CCDA). Environmental Modelling & Software. <doi:10.1016/j.envsoft.2014.01.010>.
The issue of overlapping regions in multidimensional data arises when different classes or clusters share similar feature representations, making it challenging to delineate distinct boundaries between them accurately. This package provides methods for detecting and visualizing these overlapping regions using partitional clustering techniques based on nearest neighbor distances.
Covariate Assisted Principal Regression (CAPR) for multiple covariance-matrix outcomes. The method identifies (principal) projection directions that maximize the log-likelihood of a log-linear regression model of the covariates. See Zhao et al. (2021), "Covariate Assisted Principal Regression for Covariance Matrix Outcomes" <doi:10.1093/biostatistics/kxz057>.
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" by Rob J Hyndman and George Athanasopoulos <https://OTexts.com/fpp3/>. All packages required to run the examples are also loaded. Additional data sets not used in the book are also included.
Fits geographically weighted regression (GWR) models and has tools to diagnose and remediate collinearity in the GWR models. Also fits geographically weighted ridge regression (GWRR) and geographically weighted lasso (GWL) models. See Wheeler (2009) <doi:10.1068/a40256> and Wheeler (2007) <doi:10.1068/a38325> for more details.
Implementation of S4 class of sets and multisets of numbers. The implementation is based on the hash table from the package hash'. Quick operations are allowed when the set is a dynamic object. The implementation is discussed in detail in Ceoldo and Wit (2023) <arXiv:2304.09809>.