This package implements the generalized propensity score cumulative distribution function proposed by Greene (2017) <https://digitalcommons.library.tmc.edu/dissertations/AAI10681743/>. A single scalar balancing score is calculated for any generalized propensity score vector with three or more treatments. This balancing score is used for propensity score matching and stratification in outcome analyses when analyzing either ordinal or multinomial treatments.
Sequential strategies for finding a game equilibrium are proposed in a black-box setting (expensive pay-off evaluations, no derivatives). The algorithm handles noiseless or noisy evaluations. Two acquisition functions are available. Graphical outputs can be generated automatically. V. Picheny, M. Binois, A. Habbal (2018) <doi:10.1007/s10898-018-0688-0>. M. Binois, V. Picheny, P. Taillandier, A. Habbal (2020) <arXiv:1902.06565v2>.
Offers a generalization of the scatterplot matrix based on the recognition that most datasets include both categorical and quantitative information. Traditional grids of scatterplots often obscure important features of the data when one or more variables are categorical but coded as numerical. The generalized pairs plot offers a range of displays of paired combinations of categorical and quantitative variables. Emerson et al. (2013) <DOI:10.1080/10618600.2012.694762>.
This package provides a non-parametric Bayesian framework based on Gaussian process priors for estimating causal effects of a continuous exposure and detecting change points in the causal exposure response curves using observational data. Ren, B., Wu, X., Braun, D., Pillai, N., & Dominici, F. (2021). "Bayesian modeling for exposure response curve via gaussian processes: Causal effects of exposure to air pollution on health outcomes." arXiv preprint <doi:10.48550/arXiv.2105.03454>.
This package provides various R programming tools for plotting data, including:
calculating and plotting locally smoothed summary function
enhanced versions of standard plots
manipulating colors
calculating and plotting two-dimensional data summaries
enhanced regression diagnostic plots
formula-enabled interface to the stats::lowess function
displaying textual data in plots
balloon plots
plotting "Venn" diagrams
displaying Open-Office style plots
plotting multiple data on same region, with separate axes
plotting means and confidence intervals
spacing points in an x-y plot so they don't overlap
This package provides an environment representing the GP53.CDF file.
Convert the chip ID of GPL2025 <https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL2025> to GenBank Accession and ENTREZID <http://www.ncbi.nlm.nih.gov/gene>.
Parameter estimation and prediction of Gaussian Process Classifier models as described in Bachoc et al. (2020) <doi:10.1007/S10898-020-00920-0>. Important functions: gpcm(), predict.gpcm(), update.gpcm().
This package performs statistical data analysis of various plant breeding experiments. Contains functions for Line by Tester analysis as per Arunachalam, V. (1974) <http://repository.ias.ac.in/89299/> and Diallel analysis as per Griffing, B. (1956) <https://www.publish.csiro.au/bi/pdf/BI9560463>.
Gaussian process regression models, a.k.a. Kriging models, are applied to global multi-objective optimization of black-box functions. Multi-objective Expected Improvement and Step-wise Uncertainty Reduction sequential infill criteria are available. A quantification of uncertainty on Pareto fronts is provided using conditional simulations.
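The notion of a Pareto front mentioned above can be illustrated with a minimal sketch. It assumes all objectives are minimized and filters a finite set of evaluated points down to the non-dominated ones. The function names and data are hypothetical and do not reflect the package's interface.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Assumption: every objective is to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, keeping their input order (naive O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical evaluations of two objectives at five designs.
evaluations = [(1, 4), (2, 2), (4, 1), (3, 3), (2, 5)]
front = pareto_front(evaluations)
```

Here `(3, 3)` is dominated by `(2, 2)` and `(2, 5)` by `(1, 4)`, so only three points remain on the front.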
An R package that allows for combining tree-boosting with Gaussian process and mixed effects models. It also allows for independently doing tree-boosting as well as inference and prediction for Gaussian process and mixed effects models. See <https://github.com/fabsig/GPBoost> for more information on the software and Sigrist (2022, JMLR) <https://www.jmlr.org/papers/v23/20-322.html> and Sigrist (2023, TPAMI) <doi:10.1109/TPAMI.2022.3168152> for more information on the methodology.
This package provides tools to build and work with bilateral generalized-mean price indexes (and by extension quantity indexes), and indexes composed of generalized-mean indexes (e.g., superlative quadratic-mean indexes, GEKS). Covers the core mathematical machinery for making bilateral price indexes, computing price relatives, detecting outliers, and decomposing indexes, with wrappers for all common (and many uncommon) index-number formulas. Implements and extends many of the methods in Balk (2008, <doi:10.1017/CBO9780511720758>), von der Lippe (2007, <doi:10.3726/978-3-653-01120-3>), and the CPI manual (2020, <doi:10.5089/9781484354841.069>).
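As a rough illustration of what a generalized-mean price index is, the sketch below computes a weighted generalized mean of price relatives in plain Python. The function name and the data are hypothetical and are not the package's interface. With base-period expenditure weights, order 1 gives the Laspeyres index, order 0 the geometric analogue, and order -1 a harmonic-mean index.

```python
import math


def generalized_mean(values, weights, r):
    """Weighted generalized mean of order r; weights are normalized to sum to 1."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    if r == 0:
        # The limit r -> 0 is the weighted geometric mean.
        return math.exp(sum(wi * math.log(v) for wi, v in zip(w, values)))
    return sum(wi * v ** r for wi, v in zip(w, values)) ** (1.0 / r)


# Hypothetical data: prices of three goods in two periods, base-period quantities.
p0 = [1.0, 2.0, 4.0]
p1 = [1.1, 2.5, 4.0]
q0 = [10.0, 5.0, 2.0]

relatives = [new / old for old, new in zip(p0, p1)]    # price relatives p1 / p0
base_expenditure = [p * q for p, q in zip(p0, q0)]     # base-period weights

laspeyres = generalized_mean(relatives, base_expenditure, 1)   # arithmetic mean
geometric = generalized_mean(relatives, base_expenditure, 0)   # geometric mean
harmonic = generalized_mean(relatives, base_expenditure, -1)   # harmonic mean
```

For fixed weights the generalized mean is non-decreasing in its order, so the three indexes satisfy `harmonic <= geometric <= laspeyres`.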
We implement and extend the Dividing Local Gaussian Process algorithm by Lederer et al. (2020) <doi:10.48550/arXiv.2006.09446>. Its main use case is in online learning where it is used to train a network of local GPs (referred to as a tree) by cleverly partitioning the input space. In contrast to a single GP, GPTreeO is able to deal with larger amounts of data. The package includes methods to create the tree and set its parameters, incorporating data points from a data stream as well as making joint predictions based on all relevant local GPs.
Gaussian processes ('GPs') have been widely used to model spatial data, spatio-temporal data, and computer experiments in diverse areas of statistics including spatial statistics, spatio-temporal statistics, uncertainty quantification, and machine learning. This package creates basic tools for fitting and prediction based on GPs with spatial data, spatio-temporal data, and computer experiments. Key characteristics for this GP tool include: (1) the comprehensive implementation of various covariance functions including the Matérn family and the Confluent Hypergeometric family with isotropic form, tensor form, and automatic relevance determination form, where the isotropic form is widely used in spatial statistics, the tensor form is widely used in design and analysis of computer experiments and uncertainty quantification, and the automatic relevance determination form is widely used in machine learning; (2) implementations via Markov chain Monte Carlo ('MCMC') algorithms and optimization algorithms for GP models with all the implemented covariance functions. The methods for fitting and prediction are mainly implemented in a Bayesian framework; (3) model evaluation via Fisher information and predictive metrics such as predictive scores; (4) built-in functionality for simulating GPs with all the implemented covariance functions; (5) unified implementation to allow easy specification of various GPs.
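The three covariance forms named above can be sketched for the Matérn family as follows. This is a minimal illustration under stated assumptions: it uses the common closed forms for smoothness 0.5, 1.5 and 2.5, and the function names and parameterization are hypothetical and may differ from the package's own.

```python
import math


def matern(d, variance=1.0, length_scale=1.0, nu=1.5):
    """Matérn covariance at distance d, for nu in {0.5, 1.5, 2.5} only."""
    if d < 0:
        raise ValueError("distance must be non-negative")
    s = d / length_scale
    if nu == 0.5:
        return variance * math.exp(-s)
    if nu == 1.5:
        a = math.sqrt(3.0) * s
        return variance * (1.0 + a) * math.exp(-a)
    if nu == 2.5:
        a = math.sqrt(5.0) * s
        return variance * (1.0 + a + a * a / 3.0) * math.exp(-a)
    raise ValueError("this sketch supports nu = 0.5, 1.5 or 2.5 only")


def isotropic_cov(x, y, length_scale, nu=1.5):
    """Isotropic form: one length scale applied to the Euclidean distance."""
    return matern(math.dist(x, y), length_scale=length_scale, nu=nu)


def ard_cov(x, y, length_scales, nu=1.5):
    """ARD form: each coordinate is rescaled by its own length scale."""
    scaled = math.sqrt(sum(((xi - yi) / li) ** 2
                           for xi, yi, li in zip(x, y, length_scales)))
    return matern(scaled, length_scale=1.0, nu=nu)


def tensor_cov(x, y, length_scales, nu=1.5):
    """Tensor form: product of one-dimensional covariances, one per coordinate."""
    result = 1.0
    for xi, yi, li in zip(x, y, length_scales):
        result *= matern(abs(xi - yi), length_scale=li, nu=nu)
    return result
```

With equal length scales in every coordinate the ARD form reduces to the isotropic form, and in one dimension all three forms coincide.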
An R interface to the GPTZero API (<https://gptzero.me/docs>). Allows users to classify text into human and computer written with probabilities. Formats the data into data frames where each sentence is an observation. Paragraph-level and document-level predictions are organized to align with the sentences.
The package aims to help users write OpenCL code with little or no effort. It is able to compile a user-defined R function and run it on a device such as a CPU or a GPU. The user can also write and run their OpenCL code directly by calling the .kernel function.
This package provides tools for functional enrichment analysis, gene identifier conversion and mapping homologous genes across related organisms via the g:Profiler toolkit.
Focused on extracting important data, such as speed, distance, elevation difference and azimuth, from track points (Plaza, J. et al., 2022) <doi:10.1016/j.applanim.2022.105643>.
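The quantities listed above can be derived from two consecutive track points as in the following sketch. It assumes a spherical Earth (haversine distance, initial great-circle bearing) and a hypothetical point layout of latitude, longitude, elevation in metres and time in seconds. It is not the package's implementation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def azimuth_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def segment_metrics(a, b):
    """Metrics for the segment a -> b; each point has keys lat, lon, ele, t."""
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    dt = b["t"] - a["t"]
    return {
        "distance_m": dist,
        "speed_m_s": dist / dt if dt > 0 else float("nan"),
        "elevation_diff_m": b["ele"] - a["ele"],
        "azimuth_deg": azimuth_deg(a["lat"], a["lon"], b["lat"], b["lon"]),
    }


# Hypothetical pair of track points recorded 60 seconds apart.
p_start = {"lat": 0.0, "lon": 0.0, "ele": 100.0, "t": 0.0}
p_end = {"lat": 0.0, "lon": 1.0, "ele": 130.0, "t": 60.0}
metrics = segment_metrics(p_start, p_end)
```

The haversine formula ignores elevation and the Earth's flattening, which is usually acceptable for short segments between consecutive fixes.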
Large language models are readily accessible via API. This package lowers the barrier to use the API inside of your development environment. For more on the API, see <https://platform.openai.com/docs/introduction>.
Fast scalable Gaussian process approximations, particularly well suited to spatial (aerial, remote-sensed) and environmental data, described in more detail in Katzfuss and Guinness (2017) <arXiv:1708.06302>. Package also contains a fast implementation of the incomplete Cholesky decomposition (IC0), based on Schaefer et al. (2019) <arXiv:1706.02205> and MaxMin ordering proposed in Guinness (2018) <arXiv:1609.05372>.
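The idea behind a MaxMin ordering can be sketched with a naive greedy loop: each newly selected location is the one farthest from all locations selected so far. The version below is a quadratic-time illustration only. Starting from the point closest to the centroid is an assumption, the function name is hypothetical, and the package's own implementation is stated to be fast and will differ.

```python
import math


def maxmin_order(points):
    """Return indices of `points` in a greedy MaxMin order (naive O(n^2) sketch)."""
    n = len(points)
    if n == 0:
        return []
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    # Assumption: start from the point closest to the centroid.
    first = min(range(n), key=lambda i: math.dist(points[i], centroid))
    order = [first]
    # min_dist[i] is the distance from point i to its nearest selected point;
    # selected points are marked with -1 so they are never chosen again.
    min_dist = [math.dist(p, points[first]) for p in points]
    min_dist[first] = -1.0
    for _ in range(n - 1):
        nxt = max(range(n), key=lambda i: min_dist[i])
        order.append(nxt)
        min_dist[nxt] = -1.0
        for i in range(n):
            if min_dist[i] >= 0:
                min_dist[i] = min(min_dist[i], math.dist(points[i], points[nxt]))
    return order


# Five equally spaced locations on a line (hypothetical data).
locations = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
ordering = maxmin_order(locations)
```

On this example the ordering starts in the middle, jumps to the two ends, and only then fills in the remaining locations, so early locations in the ordering are spread across the whole domain.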
Applies sequential clustering algorithm to animal location data based on user-defined parameters. Plots interactive cluster maps and provides a summary dataframe with attributes for each cluster commonly used as covariates in subsequent modeling efforts. Additional functions provide individual keyhole markup language plots for quick assessment, and export of global positioning system exchange format files for navigation purposes. Methods can be found at <doi:10.1111/2041-210X.13572>.
This package provides a framework to detect Differential Item Functioning (DIF) in Generalized Partial Credit Models (GPCM) and special cases of the GPCM as proposed by Schauberger and Mair (2019) <doi:10.3758/s13428-019-01224-2>. A joint model is set up where DIF is explicitly parametrized and penalized likelihood estimation is used for parameter selection. The big advantage of the method called GPCMlasso is that several variables can be treated simultaneously and that both continuous and categorical variables can be used to detect DIF.
GPUs are great resources for data analysis, especially in statistics and linear algebra. Unfortunately, very few packages connect R to the GPU, and none of them are transparent enough to run the computations on the GPU without substantial changes to the code. The maintenance of these packages is cumbersome: several of the earlier attempts have been removed from their respective repositories. It would be desirable to have a properly maintained R package that takes advantage of the GPU with minimal changes to the existing code. We have developed the GPUmatrix package (available on CRAN). GPUmatrix mimics the behavior of the Matrix package and extends R to use the GPU for computations. It includes single (FP32) and double (FP64) precision data types, and provides support for sparse matrices. It is easy to learn, and requires very few code changes to perform the operations on the GPU. GPUmatrix relies on either the Torch or Tensorflow R packages to perform the GPU operations. We have demonstrated its usefulness for several statistical and machine learning applications: non-negative matrix factorization, logistic regression and general linear models. We have also included a comparison of GPU and CPU performance on different matrix operations.
Example data for the GPA package, consisting of the p-values of 1,219,805 SNPs for five psychiatric disorder GWAS from the psychiatric GWAS consortium (PGC), with the annotation data using genes preferentially expressed in the central nervous system (CNS).