This package provides geometric- and regression-based forecast combination methods under a unified user interface for the packages ForecastCombinations and GeomComb. Additionally, updated tools and convenience functions for data pre-processing are available in order to deal with common problems in forecast combination (missingness, collinearity). For method details see Hsiao C, Wan SK (2014) <doi:10.1016/j.jeconom.2013.11.003>, Hansen BE (2007) <doi:10.1111/j.1468-0262.2007.00785.x>, Elliott G, Gargano A, Timmermann A (2013) <doi:10.1016/j.jeconom.2013.04.017>, and Clemen RT (1989) <doi:10.1016/0169-2070(89)90012-5>.
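As a generic illustration of the two families of methods (not the package's own interface), an equal-weights combination and a regression-based combination of two hypothetical forecasts can be written in a few lines of base R:

    # Illustration only: equal-weights and OLS-based forecast combination.
    set.seed(1)
    y    <- rnorm(100)                    # observed series
    f1   <- y + rnorm(100, sd = 0.5)      # forecast 1
    f2   <- y + rnorm(100, sd = 0.8)      # forecast 2
    fmat <- cbind(f1, f2)
    comb_avg <- rowMeans(fmat)            # equal weights
    comb_ols <- fitted(lm(y ~ fmat))      # OLS combination weights
    c(avg = mean((y - comb_avg)^2), ols = mean((y - comb_ols)^2))  # in-sample MSE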
Calculate clinical scores for hidradenitis suppurativa (HS), a dermatologic disease. The scores are typically used for evaluation of efficacy in clinical trials and are not commonly used in clinical practice. The specific scores implemented are the Hidradenitis Suppurativa Clinical Response (HiSCR) (Kimball et al. (2015) <doi:10.1111/jdv.13216>), the Hidradenitis Suppurativa Area and Severity Index Revised (HASI-R) (Goldfarb et al. (2020) <doi:10.1111/bjd.19565>), the Hidradenitis Suppurativa Physician Global Assessment (HS PGA) (Marzano et al. (2020) <doi:10.1111/jdv.16328>), and the International Hidradenitis Suppurativa Severity Score System (IHS4) (Zouboulis et al. (2017) <doi:10.1111/bjd.15748>).
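As a concrete example, the IHS4 is a weighted lesion count; the helper below is purely illustrative (not the package's exported interface) and applies the published formula and severity cut-offs from Zouboulis et al. (2017):

    # Illustrative IHS4 calculation:
    # IHS4 = nodules + 2 * abscesses + 4 * draining tunnels (fistulae);
    # severity: mild <= 3, moderate 4-10, severe >= 11.
    ihs4 <- function(nodules, abscesses, draining_tunnels) {
      score <- nodules + 2 * abscesses + 4 * draining_tunnels
      severity <- cut(score, breaks = c(-Inf, 3, 10, Inf),
                      labels = c("mild", "moderate", "severe"))
      data.frame(score = score, severity = severity)
    }
    ihs4(nodules = 3, abscesses = 1, draining_tunnels = 1)  # score 9, moderate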
This package provides a collection of tools for detecting influential cases in generalized mixed effects models. It analyses models that were estimated using lme4. The basic rationale behind identifying influential data is that when single units are omitted from the data, models based on these data should not produce substantially different estimates. To standardize the assessment of how influential a (single group of) observation(s) is, several measures of influence are common practice, such as Cook's distance. In addition, we provide a measure of percentage change of the fixed-effects point estimates and a simple procedure to detect changing levels of significance.
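The leave-one-group-out rationale can be illustrated with plain lme4 (a generic sketch, not this package's own functions): refit the model with each grouping unit omitted and inspect the percentage change in the fixed effects.

    # Generic illustration of the leave-one-group-out idea using lme4's sleepstudy data.
    library(lme4)
    fit  <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
    full <- fixef(fit)
    pct_change <- t(sapply(levels(sleepstudy$Subject), function(s) {
      refit <- update(fit, data = subset(sleepstudy, Subject != s))
      100 * (fixef(refit) - full) / full   # percentage change when subject s is dropped
    }))
    head(pct_change)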
As a sequel to iNEXT, the iNEXT.beta3D package provides functions to compute standardized taxonomic, phylogenetic, and functional diversity (3D) estimates with a common sample size (for alpha and gamma diversity) or sample coverage (for alpha, beta, and gamma diversity as well as dissimilarity or turnover indices). Hill numbers and their generalizations are used to quantify 3D and to make a multiplicative decomposition (gamma = alpha x beta). The package also features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of beta diversity across datasets. See Chao et al. (2023) <doi:10.1002/ecm.1588> for more details.
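The underlying Hill-number framework is compact; the sketch below is a generic illustration of Hill numbers and the multiplicative decomposition, not the package's standardized (size- or coverage-based) estimators.

    # Hill number of order q for a vector of relative abundances p.
    hill <- function(p, q) {
      p <- p[p > 0]
      if (q == 1) exp(-sum(p * log(p))) else sum(p^q)^(1 / (1 - q))
    }
    p <- c(0.5, 0.3, 0.2)
    sapply(0:2, function(q) hill(p, q))   # q = 0, 1, 2: richness, Shannon, Simpson diversity
    # Multiplicative decomposition: beta = gamma / alpha
    # (pooled-assemblage diversity divided by mean within-assemblage diversity).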
The function install_load checks the local R library(ies) to see if the required package(s) is/are installed or not. If the package(s) is/are not installed, then the package(s) will be installed along with the required dependency(ies). This function pulls source or binary packages from the Posit/RStudio-sponsored CRAN mirror. Lastly, the chosen package(s) is/are loaded. The function load_package simply loads the provided package(s). If this package does not fit your needs, then you may want to consider these other R packages: needs, easypackages, pacman, pak, anyLib, and/or librarian.
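A minimal usage sketch, assuming install_load() and load_package() each accept one or more package names given as character strings:

    # With this package attached:
    install_load("data.table", "ggplot2")   # install if missing, then load
    load_package("stats", "utils")          # just load already-installed packages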
Taxonomic dictionaries, formative element lists, and functions related to the maintenance, development and application of U.S. Soil Taxonomy. Data and functionality are based on official U.S. Department of Agriculture sources including the latest edition of the Keys to Soil Taxonomy. Descriptions and metadata are obtained from the National Soil Information System or Soil Survey Geographic databases. Other sources are referenced in the data documentation. Provides tools for understanding and interacting with concepts in the U.S. Soil Taxonomic System. Most of the current utilities are for working with taxonomic concepts at the "higher" taxonomic levels: Order, Suborder, Great Group, and Subgroup.
adverSCarial is an R package designed for generating adversarial attacks and analyzing the vulnerability of scRNA-seq classifiers to them. The package is versatile and provides a format for integrating any type of classifier. It offers functions for studying and generating two types of attacks: the single-gene attack and the max-change attack. The single-gene attack involves making a small modification to the input to alter the classification. The max-change attack involves making a large modification to the input without changing its classification. The package provides a comprehensive solution for evaluating the robustness of scRNA-seq classifiers against adversarial attacks.
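The single-gene attack can be pictured with a purely hypothetical classifier interface; every name below (the expression matrix layout, classify(), new_value) is a placeholder and not part of this package's API.

    # Hypothetical sketch of a single-gene attack: perturb one gene at a time in the
    # target cells and record which perturbations flip the predicted label.
    single_gene_attack <- function(expr, target_cells, classify, new_value = 0) {
      baseline <- classify(expr)[target_cells]
      flips <- character(0)
      for (g in rownames(expr)) {
        perturbed <- expr
        perturbed[g, target_cells] <- new_value      # small, single-gene modification
        if (any(classify(perturbed)[target_cells] != baseline)) flips <- c(flips, g)
      }
      flips   # genes whose modification alters the classification
    }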
IsoCorrectoR performs the correction of mass spectrometry data from stable isotope labeling/tracing metabolomics experiments with regard to natural isotope abundance and tracer impurity. Data from both MS and MS/MS measurements can be corrected (with any tracer isotope: 13C, 15N, 18O, ...), as well as ultra-high-resolution MS data from multiple-tracer experiments (e.g. 13C and 15N used simultaneously). See the Bioconductor package IsoCorrectoRGUI for a graphical user interface to IsoCorrectoR. NOTE: With R version 4.0.0, writing correction results to Excel files may currently not work on Windows. However, writing results to csv works as before.
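A hedged call sketch follows; the entry point IsoCorrection() and its file arguments are assumptions based on the package vignette and should be verified against the current documentation.

    # Assumed interface; check the IsoCorrectoR vignette for the exact argument names.
    library(IsoCorrectoR)
    res <- IsoCorrection(MeasurementFile = "measurement.csv",
                         ElementFile     = "element.csv",
                         MoleculeFile    = "molecule.csv",
                         UltraHighRes    = FALSE)   # TRUE for multiple-tracer high-resolution data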
Easy function for text-mining the PubMed repository based on defined sets of terms. The relationship between fix-terms (related to your research topic) and pub-terms (terms which pivot around your research focus) is calculated using the pointwise mutual information (PMI) algorithm (Church and Hanks (1990) <https://www.aclweb.org/anthology/J90-1003/>). A text file is generated with the PMI scores for each fix-term. Then, for each collocation pair (a fix-term plus a pub-term), a text file is generated with related article titles and publishing years. An additional Author section will follow in the next version updates.
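The PMI score itself is straightforward to compute from co-occurrence counts; a generic sketch (not the package's internal code):

    # Pointwise mutual information: PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ),
    # with probabilities estimated from counts over N articles.
    pmi <- function(n_xy, n_x, n_y, N) log2((n_xy / N) / ((n_x / N) * (n_y / N)))
    pmi(n_xy = 30, n_x = 200, n_y = 150, N = 10000)   # fix-term / pub-term collocation, approx. 3.32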
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The crew.cluster package extends the mirai-powered crew package with worker launcher plugins for traditional high-performance computing systems. Inspiration also comes from the packages mirai by Gao (2023) <https://github.com/r-lib/mirai>, future by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, rrq by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, clustermq by Schubert (2019) <doi:10.1093/bioinformatics/btz284>, and batchtools by Lang, Bischl, and Surmann (2017) <doi:10.21105/joss.00135>.
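A hedged usage sketch for a SLURM cluster, assuming the crew_controller_slurm() launcher and the standard crew controller methods; the exact launcher arguments should be checked against the package documentation for your scheduler.

    # Assumed launcher and controller methods; verify arguments for your cluster.
    library(crew.cluster)
    controller <- crew_controller_slurm(name = "demo", workers = 2)
    controller$start()
    controller$push(name = "task1", command = sqrt(4))   # deploy a long-running task
    controller$wait()
    controller$pop()                                     # retrieve the finished task
    controller$terminate()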
This package provides tools for multivariate analyses of morphological data, wrapped in one package, to make the workflow convenient and fast. Statistical and graphical tools provide a comprehensive framework for checking and manipulating input data, statistical analyses, and visualization of results. Several methods are provided for the analysis of raw data, to make the dataset ready for downstream analyses. Integrated statistical methods include hierarchical classification, principal component analysis, principal coordinates analysis, non-metric multidimensional scaling, and multiple discriminant analyses: canonical, stepwise, and classificatory (linear, quadratic, and the non-parametric k-nearest neighbours). The philosophy of the package is described in Šlenker et al. 2022.
This is an add-on package to the monobin package that simplifies its use. It provides a shiny-based user interface (UI) that is especially handy for less experienced R users, as well as for those who intend to perform quick scanning of numeric risk factors when building credit rating models. The additional functions implemented in monobinShiny that do not exist in the monobin package are: descriptive statistics, special case imputation, and outlier imputation. The descriptive statistics function is exported and can be used in R sessions independently of the user interface, while the special case and outlier imputation functions are written to be used with the shiny UI.
This is a sparklyr extension integrating VariantSpark and R. VariantSpark is a framework based on Scala and Spark to analyze genome datasets; see <https://bioinformatics.csiro.au/>. It was tested on datasets with 3000 samples, each containing 80 million features, in both unsupervised clustering approaches and supervised applications such as classification and regression. The genome datasets are usually written in VCF, a specific text file format used in bioinformatics for storing gene sequence variations. VariantSpark is therefore a great tool for genome research, because it can read VCF files, run analyses, and return the output in a Spark data frame.
This package implements Bayesian Distribution Regression methods. It contains functions for three estimators (non-asymptotic, semi-asymptotic, and asymptotic) and related routines for Bayesian Distribution Regression in Huang and Tsyawo (2018) <doi:10.2139/ssrn.3048658>, which is also the recommended reference to cite for this package. The functions fall into three categories. The first computes the logit likelihood function and posterior densities under uniform and normal priors. The second contains Independence and Random Walk Metropolis-Hastings Markov chain Monte Carlo (MCMC) algorithms, and the third category of functions is useful for semi-asymptotic and asymptotic Bayesian distribution regression inference.
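The structure underlying distribution regression can be illustrated without the Bayesian machinery (a generic frequentist sketch, not the package's MCMC routines): for each threshold y0 on a grid, a logit model is fit to the indicator 1(y <= y0), tracing out the conditional distribution function.

    # Generic illustration of the distribution-regression structure with plain logit fits.
    set.seed(1)
    dat   <- data.frame(x = rnorm(500))
    dat$y <- 1 + 0.5 * dat$x + rnorm(500)
    grid  <- quantile(dat$y, probs = seq(0.1, 0.9, by = 0.1))
    cdf_at_x0 <- sapply(grid, function(y0) {
      fit <- glm(I(y <= y0) ~ x, family = binomial, data = dat)
      predict(fit, newdata = data.frame(x = 0), type = "response")   # F(y0 | x = 0)
    })
    cdf_at_x0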
Maximum likelihood estimation of nonlinear mixed effects models of epidemic growth using Template Model Builder ('TMB'). Enables joint estimation for collections of disease incidence time series, including time series that describe multiple epidemic waves. Supports a set of widely used phenomenological models: exponential, logistic, Richards (generalized logistic), subexponential, and Gompertz. Provides methods for interrogating model objects and several auxiliary functions, including one for computing basic reproduction numbers from fitted values of the initial exponential growth rate. Preliminary versions of this software were applied in Ma et al. (2014) <doi:10.1007/s11538-013-9918-2> and in Earn et al. (2020) <doi:10.1073/pnas.2004904117>.
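One standard relation between the initial exponential growth rate r and the basic reproduction number uses the generation-interval distribution (Wallinga and Lipsitch): R0 = 1 / E[exp(-r T)], with T the generation interval. The sketch below illustrates that relation generically and is not necessarily the package's exact implementation.

    # R0 from a fitted growth rate r and an assumed, discretized generation-interval distribution.
    r <- 0.2                                  # initial growth rate per day (e.g. a fitted value)
    a <- 1:20                                 # days since infection
    g <- dgamma(a, shape = 4, rate = 0.8)     # assumed generation-interval density
    g <- g / sum(g)                           # normalize after discretization
    R0 <- 1 / sum(exp(-r * a) * g)
    R0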
One of the strengths of R is its vast package ecosystem. Indeed, R packages extend from visualization to Bayesian inference and from spatial analyses to pharmacokinetics (<https://cran.r-project.org/web/views/>). There is probably not an area of quantitative research that isn't represented by at least one R package. At the time of this writing, there are more than 10,000 active CRAN packages. Because of this massive ecosystem, it is important to have tools to search and learn about packages related to your personal R needs. For this reason, we developed an RStudio addin capable of searching available CRAN packages directly within RStudio.
The aim of the package is to provide some basic functions for doing statistics with trapezoidal fuzzy numbers. In particular, the package contains several functions for simulating trapezoidal fuzzy numbers, as well as for calculating some central tendency measures (the mean and two types of median), some scale measures (the variance, ADD, MDD, Sn, Qn, Tn, and some M-estimators), one diversity index, and one inequality index. Moreover, functions for calculating the 1-norm distance, the mid/spr distance, and the (phi,theta)-wabl/ldev/rdev distance between fuzzy numbers are included, as well as a function to calculate the phi-wabl value given a sample of trapezoidal fuzzy numbers.
This package provides a comprehensive toolkit for clinical Human Leukocyte Antigen (HLA) informatics, built on tidyverse <https://tidyverse.tidyverse.org/> principles and making use of genotype list string (GL string, Mack et al. (2023) <doi:10.1111/tan.15126>) for storing and computing HLA genotype data. Specific functionalities include: coercion of HLA data in tabular format to and from GL string; calculation of matching and mismatching in all directions, with multiple output formats; automatic formatting of HLA data for searching within a GL string; truncation of molecular HLA data to a specific number of fields; and reading HLA genotypes in HML files and extracting the GL string.
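Two of the listed operations are easy to picture with plain string handling (a generic illustration, not the package's exported functions): splitting a GL string on its '^' (locus) and '+' (genotype) delimiters, and truncating a molecular allele to two fields.

    # Generic GL string handling; not this package's API.
    gl <- "HLA-A*02:01:01:01+HLA-A*68:01^HLA-B*07:02+HLA-B*44:02"
    loci      <- strsplit(gl, "^", fixed = TRUE)[[1]]   # one element per locus
    genotypes <- strsplit(loci, "+", fixed = TRUE)      # the two alleles at each locus
    # Truncate to two fields, e.g. "HLA-A*02:01:01:01" -> "HLA-A*02:01"
    truncate2 <- function(allele) sub("^([^*]+\\*[^:]+:[^:]+).*$", "\\1", allele)
    truncate2("HLA-A*02:01:01:01")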
This package provides software to perform wombling, or boundary analysis, using the nimble Bayesian hierarchical modeling environment. Wombling is widely used to track regions of rapid change within the spatial reference domain. Specific functions in the package implement Gaussian process models for point-referenced spatial data, followed by predictive inference on rates of change over curves using line integrals. We demonstrate model-based Bayesian inference using posterior distributions featuring simple analytic forms while offering uncertainty quantification over curves. For more details on wombling, please see Banerjee and Gelfand (2006) <doi:10.1198/016214506000000041> and Halder, Banerjee and Dey (2024) <doi:10.1080/01621459.2023.2177166>.
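The target of inference can be stated compactly: for a spatial surface Y(s) and a curve C with unit normal n(s), the total and average wombling measures of Banerjee and Gelfand (2006) are, roughly,

    \Gamma(C) = \int_C \nabla Y(s)^\top n(s) \, d\nu(s), \qquad
    \bar{\Gamma}(C) = \Gamma(C) / \nu(C),

where nu denotes arc-length measure along C; large values flag C as a zone of rapid change.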
This package provides a collection of functions to perform the Application Programming Interface (API) calls associated with the Walk Score website (www.walkscore.com) within the R environment. These functions can be used to query the Walk Score and Transit Score database for a wide variety of information using R scripts. This package includes the simple Walk Score and Transit Score API calls, which return the scores associated with an input location, as well as calls which return some data used to calculate the scores. These functions are especially useful for mass data collection and gathering Walk Score and Transit Score values for large lists of locations.
The package provides methods that combine graph structure learning and generalized least squares (GLS) regression to improve the regression estimation. The main function sparsenetgls() provides solutions for multivariate regression with Gaussian-distributed dependent variables and explanatory variables, utilizing multiple well-known graph structure learning approaches to estimate the precision matrix, and uses a penalized variance-covariance matrix with a distance tuning parameter of the graph structure in deriving the sandwich estimators in GLS regression. This package also provides functions for assessing a Gaussian graphical model which uses the penalized approach. It uses the receiver operating characteristic (ROC) curve as a visualization tool in the assessment.
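The GLS step the package builds on has a familiar closed form; as a generic illustration (not the sparsenetgls() interface), with a weight matrix W standing in for the estimated precision structure:

    # Generic GLS estimator: beta_gls = (X' W X)^{-1} X' W y.
    # In the package, the precision structure is estimated by graph structure learning
    # and penalization; here W is just a placeholder identity matrix.
    set.seed(1)
    n <- 50
    X <- cbind(1, matrix(rnorm(n * 2), n))
    y <- X %*% c(1, 2, -1) + rnorm(n)
    W <- diag(n)
    beta_gls <- solve(t(X) %*% W %*% X, t(X) %*% W %*% y)
    beta_gls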
This package implements an algorithm which increases the number of simultaneously measurable markers and in this way helps with the study of immune responses. The algorithm, named CytoBackBone, allows combining phenotypic information of cells from different cytometric profiles obtained from different cytometry panels. This computational approach is based on the principle that each cell has its own phenotypic and functional characteristics that can be used as an identification card. CytoBackBone uses a set of predefined markers, which we call the backbone, to define this identification card. The phenotypic information of cells with similar identification cards in the different cytometric profiles is then merged.
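The matching step can be sketched with generic R (a hypothetical data layout, not the package's implementation): cells from two profiles are paired by nearest-neighbour distance on the shared backbone markers, and the panel-specific markers of the paired cells are then merged.

    # Hypothetical illustration of backbone-based matching.
    # profile_a, profile_b: numeric matrices with cells in rows and markers in columns;
    # `backbone` names the markers shared by both panels.
    merge_profiles <- function(profile_a, profile_b, backbone) {
      d  <- as.matrix(dist(rbind(profile_a[, backbone], profile_b[, backbone])))
      na <- nrow(profile_a)
      cross   <- d[seq_len(na), na + seq_len(nrow(profile_b)), drop = FALSE]
      match_b <- apply(cross, 1, which.min)   # nearest cell in profile B for each cell in A
      cbind(profile_a,
            profile_b[match_b, setdiff(colnames(profile_b), backbone), drop = FALSE])
    }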
PCG is a family of simple, fast, space-efficient, statistically good algorithms for random number generation. Unlike many general-purpose RNGs, they are also hard to predict. This library implements bindings to the standard C implementation. This includes the standard, unique, fast, and single variants in the PCG family. There is a pure implementation that can be used as a generator with the random package, as well as a faster primitive API that includes functions for generating common types. The generators in this module are suitable for use in parallel, but make sure threads don't share the same generator or things will go horribly wrong.
This package provides functions for revealing what happens when effect size estimates from previous studies are taken into account when evaluating each new dataset in a study sequence. The analyses can be conducted for cumulative meta-analyses and for Bayesian data analyses. The package contains sample data for a wide selection of research topics. Jointly considering previous findings along with new data is more likely to result in correct conclusions than the traditional practice of not incorporating previous findings, which often results in a back-and-forth ping-pong of conclusions when evaluating a sequence of studies. See O'Connor and Ermacora (2021) <doi:10.1037/cbs0000259>.