This package provides a uniform and internally consistent way of calculating stoichiometric equilibrium constants in modern and palaeo seawater as a function of temperature, salinity, pressure and the concentration of magnesium, calcium, sulphate, and fluorine. It is implemented simultaneously in Python, R, and Matlab.
"Learning with Subset Stacking" (LESS) is a supervised learning algorithm that is based on training many local estimators on subsets of a given dataset, and then passing their predictions to a global estimator. You can find the details about LESS in our manuscript at <arXiv:2112.06251>.
Bandwidth selection for kernel density estimators of 2-d level sets and highest density regions. It applies a plug-in strategy to estimate the asymptotic risk function and minimizes it to obtain the optimal bandwidth matrix. See Doss and Weng (2018) <arXiv:1806.00731> for more detail.
Estimates multivariate subgaussian stable densities and probabilities, and generates random variates using product distribution theory. A function for estimating the distribution parameters from data is also provided, using the method from Nolan (2013) <doi:10.1007/s00180-013-0396-7>.
Count the occurrences of sequences of values in a vector that meet certain conditions of length and magnitude. The method is based on the Run Length Encoding algorithm, available in base R, inspired by A. H. Robinson and C. Cherry (1967) <doi:10.1109/PROC.1967.5493>.
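The idea of counting runs by run-length encoding can be sketched in Python as follows. This is only an illustration, not the package's own interface: the function name `count_runs`, its parameters, and the sample data are assumptions made for this example.

```python
from itertools import groupby


def count_runs(values, predicate, min_length=1):
    """Count maximal runs of consecutive values that satisfy `predicate`
    and are at least `min_length` long (hypothetical helper)."""
    count = 0
    # groupby merges consecutive items with the same key, which amounts to
    # run-length encoding the True/False outcome of the predicate.
    for meets_condition, run in groupby(values, key=predicate):
        run_length = sum(1 for _ in run)
        if meets_condition and run_length >= min_length:
            count += 1
    return count


# Made-up daily temperatures: count spells of at least 3 days above 30.
temps = [31, 33, 35, 20, 32, 36, 34, 33, 18, 40]
hot_spells = count_runs(temps, lambda t: t > 30, min_length=3)
```

Here the magnitude condition is the predicate and the length condition is `min_length`; the final single value of 40 is not counted because its run is too short.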
An implementation for computing Optimal B-Robust Estimators of two-parameter distributions. The procedure consists of a set of equations that are evaluated alternately until the solution is reached. Some tools for analyzing the estimates are included. The most relevant is covariance matrix computation using a closed formula.
Water resources system simulator (WRSS) is a tool for simulation and analysis of large-scale water resources systems. WRSS provides functions and methods for construction, simulation and analysis of primary storage and hydropower water resources features (e.g. reservoirs and aquifers) based on Standard Operating Policy (SOP).
This package contains functionality to run differential gene co-expression analysis across two different conditions. The algorithm is inspired by Voigt et al. 2017 and finds Conserved, Specific and Differentiated genes (hence the name CSD). This package includes efficient variance calculation by bootstrapping and Welford's algorithm.
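Welford's algorithm computes a running mean and variance in a single pass without storing all values, which is what makes it attractive for bootstrap replicates. A minimal Python sketch follows; the function name `welford_variance` is an assumption for illustration and does not reflect how the package itself is organised.

```python
def welford_variance(samples):
    """Return (mean, sample variance) computed in one pass with
    Welford's online update (illustrative helper)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in samples:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the mean
        m2 += delta * (x - mean)  # uses both the old and the new mean
    if n < 2:
        raise ValueError("at least two samples are required")
    return mean, m2 / (n - 1)


mean, variance = welford_variance([2, 4, 4, 4, 5, 5, 7, 9])
```

Because each value is consumed once, the same update can be applied to bootstrap estimates as they are generated rather than after collecting them all.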
Statistical methods for multiple testing with covariate information. Traditional multiple testing methods only consider a list of test statistics, such as p-values. Our methods incorporate the auxiliary information, such as the lengths of gene coding regions or the minor allele frequencies of SNPs, to improve power.
PAST takes GWAS output and assigns SNPs to genes, uses those genes to find pathways associated with the genes, and plots pathways based on significance. Implements methods for reading GWAS input data, finding genes associated with SNPs, calculating enrichment score and significance of pathways, and plotting pathways.
Raw vectors in R are useful for storing a single binary object. What if you want to put a vector of them in a data frame? The blob package provides the blob object, a list of raw vectors, suitable for use as a column in a data frame.
R-coop offers implementations of covariance, correlation and cosine similarity. The implementations are fast and memory-efficient and their use is resolved automatically based on the input data, handled by R's S3 methods. Full descriptions of the algorithms and benchmarks are available in the package vignettes.
ranger is a console file manager with Vi key bindings. It provides a minimalistic and nice curses interface with a view on the directory hierarchy. It ships with rifle, a file launcher that is good at automatically finding out which program to use for what file type.
Implementation of Nelson rules for control charts in R. The Rspc package implements some Statistical Process Control methods, namely the Levey-Jennings type of I (individuals) chart, the Shewhart C (count) chart and Nelson rules (as described in Montgomery, D. C. (2013) Introduction to statistical quality control. Hoboken, NJ: Wiley.). The typical workflow is to take a time series, specify the control limits, and list the Nelson rules you want to evaluate. There are several options for modifying the rules (one-sided limits, numerical parameters of rules, etc.). The package can also calculate the control limits from the data (so far implemented only for the i-chart and c-chart).
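To make the workflow concrete, the first two Nelson rules can be sketched in Python. This is a sketch under stated assumptions, not the Rspc interface: the function names, parameters, and data are hypothetical, and the centre line and sigma are supplied by the caller rather than estimated from the data.

```python
def nelson_rule_1(series, center, sigma):
    """Rule 1: indices of points more than 3 sigma from the centre line."""
    return [i for i, x in enumerate(series) if abs(x - center) > 3 * sigma]


def nelson_rule_2(series, center, run_length=9):
    """Rule 2: indices at which `run_length` or more consecutive points
    lie on the same side of the centre line."""
    flagged = []
    streak = 0
    prev_side = 0
    for i, x in enumerate(series):
        side = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == prev_side:
            streak += 1
        else:
            streak = 1 if side != 0 else 0
        prev_side = side
        if streak >= run_length:
            flagged.append(i)
    return flagged


# Made-up measurements around a centre line of 10 with sigma 1.
outliers = nelson_rule_1([10.0, 10.5, 14.0, 9.8], center=10.0, sigma=1.0)
shifted = nelson_rule_2([10.1] * 9 + [9.9], center=10.0)
```

The `run_length` argument illustrates the kind of numerical rule parameter that the description says can be modified.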
This package provides functions to interact with the Google DoubleClick for Publishers (DFP) API <https://developers.google.com/ad-manager/api/start> (recently renamed to Google Ad Manager). This package is automatically compiled from the API WSDL (Web Service Description Language) files, which dictate how the API is structured. Theoretically, all API actions are possible using this package; however, care must be taken to format the inputs correctly and parse the outputs correctly. Please see the Google Ad Manager API reference <https://developers.google.com/ad-manager/api/rel_notes> and this package's website <https://stevenmmortimer.github.io/rdfp/> for more information, documentation, and examples.
Tu & Zhou (1999) <doi:10.1002/(SICI)1097-0258(19991030)18:20%3C2749::AID-SIM195%3E3.0.CO;2-C> showed that comparing the means of populations whose data-generating distributions are non-negative with excess zero observations is a problem of great importance in the analysis of medical cost data. In the same study, Tu & Zhou note that it can be difficult to control type-I error rates of general-purpose statistical tests when comparing the means of such data sets. This package allows users to perform a modified bootstrap-based t-test that aims to better control type-I error rates in these situations.
Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Candes, E. J., Li, X., Ma, Y., & Wright, J. (2011), "Robust principal component analysis?", Journal of the ACM (JACM), 58(3), 11, prove that we can recover each component individually under some suitable assumptions. It is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit: among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This package implements this decomposition algorithm, resulting in a Robust PCA approach.
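Written out, the Principal Component Pursuit program described above takes the following form. The symbols are notation introduced here for illustration: M is the observed data matrix, L the low-rank component, S the sparse component, and lambda the weight between the two norms.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \,\|S\|_{1}
\quad \text{subject to} \quad L + S = M
```

Here \|L\|_{*} is the nuclear norm (the sum of the singular values of L) and \|S\|_{1} is the entrywise L1 norm. For an n_1 \times n_2 matrix, the cited paper analyses the choice \lambda = 1/\sqrt{\max(n_1, n_2)}; whether this package uses that value by default is not stated here.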
This package provides a C++ library for Bayesian modeling, with an emphasis on Markov chain Monte Carlo. Although boom contains a few R utilities (mainly plotting functions), its primary purpose is to install the BOOM C++ library on your system so that other packages can link against it.
Density, distribution function, quantile function, random generation and estimation for the bimodal GEV distribution given in Otiniano et al. (2023) <doi:10.1007/s10651-023-00566-7>. This new generalization of the well-known GEV (Generalized Extreme Value) distribution is useful for modeling heterogeneous bimodal data from different areas.
Bayesian fitting and sensitivity analysis methods for adaptive spline surfaces described in <doi:10.18637/jss.v094.i08>. Built to handle continuous and categorical inputs as well as functional or scalar output. An extension of the methodology in Denison, Mallick and Smith (1998) <doi:10.1023/A:1008824606259>.
This package implements the combined cluster and discriminant analysis method for finding homogeneous groups of data with known origin as described in Kovacs et al. (2014): Classification into homogeneous groups using combined cluster and discriminant analysis (CCDA). Environmental Modelling & Software. <doi:10.1016/j.envsoft.2014.01.010>.
The issue of overlapping regions in multidimensional data arises when different classes or clusters share similar feature representations, making it challenging to delineate distinct boundaries between them accurately. This package provides methods for detecting and visualizing these overlapping regions using partitional clustering techniques based on nearest neighbor distances.
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" by Rob J Hyndman and George Athanasopoulos <https://OTexts.com/fpp3/>. All packages required to run the examples are also loaded. Additional data sets not used in the book are also included.
Fits geographically weighted regression (GWR) models and has tools to diagnose and remediate collinearity in the GWR models. Also fits geographically weighted ridge regression (GWRR) and geographically weighted lasso (GWL) models. See Wheeler (2009) <doi:10.1068/a40256> and Wheeler (2007) <doi:10.1068/a38325> for more details.