Summarizes the taxonomic composition and diversity contribution of the rare and abundant communities using an OTU (operational taxonomic unit) table generated by the QIIME or mothur analysis pipeline. The rare biosphere in this package is defined by a relative abundance threshold (for details about the rare biosphere, see Lynch and Neufeld (2015) <doi:10.1038/nrmicro3400>).
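The threshold-based split described above can be sketched in a few lines. This is a minimal Python illustration, not the package's R interface: the function name, the sample counts, and the 0.1% default threshold are all assumptions chosen for the example.

```python
def split_by_relative_abundance(counts, threshold=0.001):
    """Split the OTUs of one sample into (rare, abundant) lists.

    counts: dict mapping OTU id -> read count (hypothetical input layout).
    threshold: relative abundance cutoff; 0.001 (0.1%) is an assumed default.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    rare, abundant = [], []
    for otu, n in counts.items():
        if n == 0:
            continue  # OTU absent from this sample: neither rare nor abundant
        if n / total < threshold:
            rare.append(otu)
        else:
            abundant.append(otu)
    return rare, abundant


# Made-up counts for a single sample (10,000 reads in total).
sample = {"OTU_1": 9500, "OTU_2": 495, "OTU_3": 5}
rare, abundant = split_by_relative_abundance(sample, threshold=0.001)
```

Here OTU_3 has a relative abundance of 0.0005 and falls below the assumed cutoff, so it is the only OTU assigned to the rare subset.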
Perform simultaneous estimation and variable selection for correlated bivariate mixed outcomes (one continuous outcome and one binary outcome per cluster) using penalized generalized estimating equations. In addition, clustered Gaussian and binary outcomes can also be modeled. The SCAD, MCP, and LASSO penalties are supported. Cross-validation can be performed to find the optimal regularization parameter(s).
This package implements an extension of the Chacko chi-square test for ordered vectors (Chacko, 1966, <https://www.jstor.org/stable/25051572>). Our extension brings the Chacko test to the computer age by implementing a permutation test to offer a numeric estimate of the p-value, which is particularly useful when the analytic solution is not available.
This package provides essential checklists for R package developers, whether you're creating your first package or beginning a new project. This tool guides you through each step of the development process, including specific considerations for submitting your package to the Comprehensive R Archive Network (CRAN). Simplify your workflow and ensure adherence to best practices with packagepal.
Uses provenance post-execution to help the user understand and debug their script by providing functions to look at intermediate steps and data values, their forwards and backwards lineage, and to understand the steps leading up to warning and error messages. provDebugR uses provenance produced by rdtLite (available on CRAN), stored in PROV-JSON format.
Quasi-Cauchy quantile regression, proposed by de Oliveira, Ospina, Leiva, Figueroa-Zuniga and Castro (2023) <doi:10.3390/fractalfract7090667>. This regression model is useful when you want to model data bounded on the intervals [0,1], (0,1], [0,1) or (0,1) using a quantile approach.
Uses the optimal test design approach by Birnbaum (1968, ISBN:9781593119348) and van der Linden (2018) <doi:10.1201/9781315117430> to construct fixed, adaptive, and parallel tests. Supports the following mixed-integer programming (MIP) solver packages: Rsymphony, highs, gurobi, lpSolve, and Rglpk. The gurobi package is not available from CRAN; see <https://www.gurobi.com/downloads/>.
An integrated R interface to several United States Census Bureau APIs (<https://www.census.gov/data/developers/data-sets.html>) and the US Census Bureau's geographic boundary files. Allows R users to return Census and ACS data as tidyverse-ready data frames, and optionally returns a list-column with feature geometry for mapping and spatial analysis.
This package provides functions to support economic modelling in R based on the methods of the Dutch guideline for economic evaluations in healthcare <https://www.zorginstituutnederland.nl/publicaties/publicatie/2024/01/16/richtlijn-voor-het-uitvoeren-van-economische-evaluaties-in-de-gezondheidszorg>, CBS data <https://www.cbs.nl/>, and OECD data <https://www.oecd.org/en.html>.
Detect binding sites using motif IUPAC sequences or BED coordinates and ChIP-seq experiments in BED or BAM format. Combine/compare binding sites across experiments, tissues, or conditions. All normalization and differential steps are done using the TMM-GLM method. Signal decomposition is done by setting motifs as the centers of a mixture of normal distribution curves.
The package is usable with Affymetrix GeneChip short oligonucleotide arrays, and it can be adapted or extended to other platforms. It is able to modify or replace the grouping of probes in the probe sets. Also, the package contains simple functions to read R connections in the FASTA format and it can create an alternative mapping from sequences.
HDCytoData contains a set of high-dimensional cytometry benchmark datasets. These datasets are formatted into SummarizedExperiment and flowSet Bioconductor object formats, including all required metadata. Row metadata includes sample IDs, group IDs, patient IDs, reference cell population or cluster labels, and labels identifying spiked-in cells. Column metadata includes channel names, protein marker names, and protein marker classes.
This package implements multitaper spectral estimation techniques using prolate spheroidal sequences (Slepians) and sine tapers for time series analysis. It includes an adaptive weighted multitaper spectral estimate, a coherence estimate, Thomson's Harmonic F-test, and complex demodulation. The Slepian sequences are generated efficiently using a tridiagonal matrix solution, and jackknifed confidence intervals are available for most estimates.
In order to smoothly animate the transformation of polygons and paths, many aspects need to be taken into account, such as differing numbers of control points, changing center of rotation, etc. The transformr package provides an extensive framework for manipulating the shapes of polygons and paths and can be seen as the spatial brother to the tweenr package.
R's default conflict management system gives the most recently loaded package precedence. This can make it hard to detect conflicts, particularly when they arise because a package update creates ambiguity that did not previously exist. The conflicted package takes a different approach, making every conflict an error and forcing you to choose which function to use.
Random Jungle is an implementation of Random Forests, a powerful machine learning method. It is designed to analyse high-dimensional data. In genetics, it can be used for analysing big Genome Wide Association (GWA) data. Its most interesting features are variable selection, missing value imputation, classifier creation, generalization error estimation and sample proximities between pairs of cases.
Mixedpower uses pilot data and a linear mixed model fitted with lme4 to simulate new data sets. Power is computed separately for every effect in the model output as the proportion of significant simulations among all simulations. More conservative simulations as a protection against a bias in the pilot data are available, as well as methods for plotting the results.
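The idea of power as the share of significant simulations can be illustrated with a much simpler model than a linear mixed model. The Python sketch below swaps in a one-sample two-sided z-test with known standard deviation; it is not the package's method or interface, and the function name, parameters, and default values are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean


def simulated_power(effect, sd, n, n_sim=2000, alpha=0.05, seed=1):
    """Estimate power as (number of significant simulations) / (all simulations).

    Each simulated data set holds n normal draws with mean `effect` and
    known standard deviation `sd`; the test is a two-sided z-test of mean 0.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(n_sim):
        sample = [rng.gauss(effect, sd) for _ in range(n)]
        z = mean(sample) / (sd / n ** 0.5)
        if abs(z) > z_crit:
            significant += 1
    return significant / n_sim


power = simulated_power(effect=0.5, sd=1.0, n=50)
false_positive_rate = simulated_power(effect=0.0, sd=1.0, n=50)
```

With a true effect of zero, the same ratio estimates the type I error rate, which should land near alpha.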
This package implements functions for comparing strings, sequences and numeric vectors for clustering and record linkage applications. Supported comparison functions include: generalized edit distances for comparing sequences/strings, Monge-Elkan similarity for fuzzy comparison of token sets, and L-p distances for comparing numeric vectors. Where possible, comparison functions are implemented in C/C++ to ensure good performance.
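Two of the comparison families named above can be sketched briefly. The Python code below is an illustration only and not the package's interface: it shows the unit-cost Levenshtein distance, which is the simplest special case of a generalized edit distance, and an L-p distance for numeric vectors. Both function names are hypothetical.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lp_distance(x, y, p=2.0):
    """L-p distance between two equal-length numeric vectors (p >= 1)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1.0 / p)


d_edit = levenshtein("kitten", "sitting")          # 3 edits
d_l2 = lp_distance([0.0, 0.0], [3.0, 4.0], p=2.0)  # Euclidean distance
d_l1 = lp_distance([0.0, 0.0], [3.0, 4.0], p=1.0)  # Manhattan distance
```

A generalized edit distance would replace the constant costs of 1 with per-operation weights.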
Data sets for the chapter "Ensemble Postprocessing with R" of the book Stephane Vannitsem, Daniel S. Wilks, and Jakob W. Messner (2018) "Statistical Postprocessing of Ensemble Forecasts", Elsevier, 362pp. These data sets contain temperature and precipitation ensemble weather forecasts and corresponding observations at Innsbruck/Austria. Additionally, a demo with the full code of the book chapter is provided.
Causal mediation analysis for a single exposure/treatment and a single mediator, both allowed to be either continuous or binary. The package implements the difference method and provides point and interval estimates as well as testing for the natural direct and indirect effects and the mediation proportion. Nevo, Xiao and Spiegelman (2017) <doi:10.1515/ijb-2017-0006>.
Boxplots adapted to the happenstance of missing observations where drop-out probabilities can be given by the practitioner or modelled using auxiliary covariates. Zhang, Z., Chen, Z., Troendle, J. F. and Zhang, J. (2012) <doi:10.1111/j.1541-0420.2011.01712.x> propose estimators of marginal quantiles based on the inverse probability weighting method.
This package contains the function mice.impute.midastouch(). Technically this function is to be run from within the mice package (van Buuren et al. 2011), type ??mice. It substitutes the method pmm within mice by midastouch. The authors have shown that midastouch is superior to the default pmm. Many ideas are based on the MIDAS approach of Siddique and Belin (2008).
Facilitates the incorporation of biological processes in biogeographical analyses. It offers conveniences in fitting, comparing and extrapolating models of biological processes such as physiology and phenology. These spatial extrapolations can be informative by themselves, but also complement traditional correlative species distribution models by mixing environmental and process-based predictors. Caetano et al. (2020) <doi:10.1111/oik.07123>.
All the methods in this package generate a vector of uniform order statistics using a beta distribution and then apply the inverse cumulative distribution function of the target distribution to obtain a vector of random order statistics from that distribution. This is much more efficient than using a loop since it samples directly from the order statistic distribution.
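The technique rests on a standard fact: the k-th smallest of n independent Uniform(0,1) draws follows a Beta(k, n - k + 1) distribution, so pushing a beta draw through an inverse CDF gives the k-th order statistic of the target distribution. The Python sketch below illustrates this for the exponential distribution; it is not the package's interface, and the function name and parameters are assumptions.

```python
import math
import random


def exp_order_stat(draws, k, n, rate=1.0, seed=None):
    """Draw `draws` realisations of the k-th order statistic (1 <= k <= n)
    of n i.i.d. Exponential(rate) variables, without simulating the n values."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    rng = random.Random(seed)
    out = []
    for _ in range(draws):
        # k-th uniform order statistic: U_(k) ~ Beta(k, n - k + 1)
        u = rng.betavariate(k, n - k + 1)
        # Inverse CDF of Exponential(rate): F^{-1}(u) = -ln(1 - u) / rate
        out.append(-math.log1p(-u) / rate)
    return out


# Minimum of 5 unit-rate exponentials; its distribution is Exponential(5).
minima = exp_order_stat(draws=20000, k=1, n=5, rate=1.0, seed=42)
sample_mean = sum(minima) / len(minima)
```

Each realisation costs one beta draw, whereas the naive approach draws n values and sorts them to obtain a single order statistic.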