This package provides tools to solve real-world multi-class classification problems by computing the areas under the ROC and PR curves via micro-averaging and macro-averaging. The vignettes of this package can be found via <https://github.com/WandeRum/multiROC>. The methodology is described in V. Van Asch (2013) <https://www.clips.uantwerpen.be/~vincent/pdf/microaverage.pdf> and Pedregosa et al. (2011) <http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html>.
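The difference between the two averaging schemes can be sketched in a few lines. This is an illustrative Python sketch, not the package's R interface: the function names are hypothetical, the AUC is computed with the rank (Mann-Whitney) formulation, and macro-averaging is taken here as the plain mean of one-vs-rest AUCs, which is only one common convention.

```python
def binary_auc(labels, scores):
    """ROC AUC as the Mann-Whitney statistic; tied scores count as 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_micro_auc(y_true, prob, n_classes):
    """Return (macro, micro) one-vs-rest AUC for class labels 0..n_classes-1.

    y_true: list of integer class labels.
    prob:   list of rows, prob[i][k] = predicted score of class k for sample i.
    """
    # One-vs-rest indicator vector and score column for every class.
    indicators = [[1 if y == k else 0 for y in y_true] for k in range(n_classes)]
    columns = [[row[k] for row in prob] for k in range(n_classes)]

    # Macro: compute one AUC per class, then average with equal class weights.
    macro = sum(binary_auc(indicators[k], columns[k])
                for k in range(n_classes)) / n_classes

    # Micro: pool every (sample, class) decision into one binary problem.
    pooled_labels = [v for col in indicators for v in col]
    pooled_scores = [s for col in columns for s in col]
    micro = binary_auc(pooled_labels, pooled_scores)
    return macro, micro
```

Macro-averaging weights every class equally, while micro-averaging weights every individual decision equally, so the two can diverge when classes are imbalanced or when scores are poorly calibrated across classes.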
Implementations of a large number of tests for symmetry and their bootstrap variants, which can be used for testing the symmetry of random samples around a known or unknown mean. Functions are also provided for testing the symmetry of model residuals around zero. Currently, the supported models are linear models and generalized autoregressive conditional heteroskedasticity (GARCH) models (fitted with the fGarch package). All tests are implemented using the Rcpp package, which ensures great performance of the code.
This package provides a method to visualize pharmacometric analyses which are impacted by covariate effects. Variability-aligned covariate harmonized-effects and time-transformation equivalent ('vachette') facilitates intuitive overlays of data and model predictions, allowing for comprehensive comparison without dilution effects. vachette improves upon previous methods (Lommerse et al. (2021) <doi:10.1002/psp4.12679>), enabling its application to all pharmacometric models and enhancing Visual Predictive Checks (VPC) by integrating data into cohesive plots that can highlight model misspecification.
This package assists in demultiplexing scRNAseq data using both cell hashing and SNP data. The SNP profile of each group is learned using high-confidence assignments from the cell hashing data. Cells which cannot be assigned with high confidence from the cell hashing data are assigned to their most similar group based on their SNPs. We also provide some helper functions to optimise SNP selection, create training data and merge SNP data into the SingleCellExperiment framework.
The missRows package implements the MI-MFA method to deal with missing individuals ('biological units') in multi-omics data integration. The MI-MFA method generates multiple imputed datasets from a Multiple Factor Analysis model; the results are then combined into a single consensus solution. The package provides functions for estimating coordinates of individuals and variables, imputing missing individuals, and various diagnostic plots to inspect the pattern of missingness and visualize the uncertainty due to missing values.
This package provides several methods for aggregating probabilistic forecasts. You have a group of people who have made probabilistic forecasts for the same event. You want to take advantage of the "wisdom of the crowd" and combine these forecasts in some sensible way. This package provides implementations of several strategies, including geometric mean of odds, an extremized aggregate (Neyman, Roughgarden (2021) <doi:10.1145/3490486.3538243>), and "high-density trimmed mean" (Powell et al. (2022) <doi:10.1037/dec0000191>).
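As an illustration of the simplest of these strategies, the geometric mean of odds can be written as an average of log-odds mapped back to a probability. The following Python sketch is an assumption-laden illustration, not the package's own interface; the function name is hypothetical.

```python
import math


def geometric_mean_of_odds(probs):
    """Pool forecasts for one event via the geometric mean of their odds.

    Averaging log-odds is equivalent to taking the geometric mean of the
    odds p / (1 - p); the pooled odds are then converted to a probability.
    """
    if not probs or any(not 0.0 < p < 1.0 for p in probs):
        raise ValueError("forecasts must be a non-empty list of values in (0, 1)")
    mean_log_odds = sum(math.log(p / (1.0 - p)) for p in probs) / len(probs)
    return 1.0 / (1.0 + math.exp(-mean_log_odds))


# Two forecasters with odds 1:1 and 4:1 pool to odds 2:1.
pooled = geometric_mean_of_odds([0.5, 0.8])
```

Unlike the arithmetic mean of probabilities, this pooling treats forecasts symmetrically on the odds scale, so a confident forecast near 0 or 1 moves the aggregate more than it would under simple averaging.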
Proposed by Harrell, the C index, or concordance C, is considered an overall measure of discrimination in survival analysis between a survival outcome that is possibly right censored and a predictive-score variable, which can represent a measured biomarker or a composite-score output from an algorithm that combines multiple biomarkers. This package aims to statistically compare two C indices with right-censored survival outcomes, which commonly arise from a paired design and thus result in two correlated C indices.
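For readers unfamiliar with the C index itself, a minimal Python sketch of a single Harrell-type C index under right censoring follows. It makes several assumptions that are not taken from the package: a higher score is read as higher risk (shorter survival), pairs with tied times are skipped, tied scores count as half-concordant, and the function name is hypothetical. It does not implement the comparison of two correlated C indices.

```python
def harrell_c(times, events, scores):
    """Concordance between risk scores and right-censored survival times.

    times:  observed follow-up times.
    events: 1 if the event was observed, 0 if the time is right censored.
    scores: predictive scores; higher is assumed to mean higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only when the shorter time is an observed event;
            # otherwise censoring hides which subject failed first.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of the comparable pairs.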
This package provides methods and plotting functions for displaying categorical data on an interactive heatmap using plotly. Provides functionality for strictly categorical heatmaps, heatmaps illustrating categorized continuous data and annotated heatmaps. Also, there are various options to interact with the x-axis to prevent overlapping axis labels, e.g. via simple sliders or range sliders. Besides the viewer pane, resulting plots can be saved as a standalone HTML file, embedded in R Markdown documents or in a Shiny app.
Density function and generation of random variables from the Generalized Inverse Normal (GIN) distribution from Robert (1991) <doi:10.1016/0167-7152(91)90174-P>. Also provides density functions and generation from the GIN distribution truncated to positive or negative reals. Theoretical guarantees supporting the sampling algorithms and an application to Bayesian estimation of network formation models can be found in the working paper Ding, Estrada and Montoya-Blandón (2023) <https://www.smontoyablandon.com/publication/networks/network_externalities.pdf>.
This package provides a lightweight fork of gMCP with functions for graphically described multiple test procedures introduced in Bretz et al. (2009) <doi:10.1002/sim.3495> and Bretz et al. (2011) <doi:10.1002/bimj.201000239>. Implements a flexible function using ggplot2 to create multiplicity graph visualizations. Contains instructions for multiplicity graphs and graphical testing for group sequential designs, described in Maurer and Bretz (2013) <doi:10.1080/19466315.2013.807748>, with necessary unit testing using testthat.
An implementation of randomization-based hypothesis testing for three different estimands in a cluster-randomized encouragement experiment. The three estimands include (1) testing a cluster-level constant proportional treatment effect (Fisher's sharp null hypothesis), (2) pooled effect ratio, and (3) average cluster effect ratio. To test the third estimand, the user needs to install the Gurobi (>= 9.0.1) optimizer via its R API. Please refer to <https://www.gurobi.com/documentation/9.0/refman/ins_the_r_package.html>.
This package provides functions for the design process of survey sampling, with specific tools for multi-wave and multi-phase designs. Perform optimum allocation using Neyman (1934) <doi:10.2307/2342192> or Wright (2012) <doi:10.1080/00031305.2012.733679> allocation, split strata based on quantiles or values of known variables, randomly select samples from strata, allocate sampling waves iteratively, and organize a complex survey design. Also includes a Shiny application for observing the effects of different strata splits.
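The classical Neyman allocation mentioned above assigns each stratum a share of the total sample proportional to its population size times its standard deviation. The Python sketch below is illustrative only: the function name and arguments are hypothetical, and it returns fractional sizes. Turning these into integer sample sizes is the step that exact methods such as Wright's address.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n_total):
    """Fractional Neyman allocation: n_h = n * N_h * S_h / sum_k(N_k * S_k).

    stratum_sizes: population size N_h of each stratum.
    stratum_sds:   standard deviation S_h of the study variable per stratum.
    n_total:       total sample size n to distribute.
    """
    if len(stratum_sizes) != len(stratum_sds):
        raise ValueError("sizes and standard deviations must align")
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum needs positive size and spread")
    return [n_total * w / total for w in weights]


# A small, highly variable stratum can receive the same share as a large,
# homogeneous one: both strata here have N_h * S_h = 200.
allocation = neyman_allocation([100, 200], [2.0, 1.0], 40)
```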
Fitting dimension reduction methods to data lying on a two-dimensional sphere. This package provides principal geodesic analysis, principal circle, principal curves proposed by Hauberg, and spherical principal curves. Moreover, it offers the method of locally defined principal geodesics, which is still under development. The detailed procedures are described in Lee, J., Kim, J.-H. and Oh, H.-S. (2021) <doi:10.1109/TPAMI.2020.3025327>. Also see Kim, J.-H., Lee, J. and Oh, H.-S. (2020) <arXiv:2003.02578>.
Geostatistical modeling and kriging with gridded data using spatially separable covariance functions (Kronecker covariances). Kronecker products in these models provide shortcuts for solving large matrix problems in likelihood and conditional mean, making snapKrig computationally efficient with large grids. The package supplies its own S3 grid object class, and a host of methods including plot, print, Ops, square bracket replace/assign, and more. Our computational methods are described in Koch, Lele, Lewis (2020) <doi:10.7939/r3-g6qb-bq70>.
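The kind of shortcut that Kronecker structure allows can be illustrated with the standard identity (A ⊗ B) vec(X) = vec(A X Bᵀ), using row-major vectorization. The Python sketch below demonstrates that general identity with hypothetical helper names; it is not a description of snapKrig's internal implementation.

```python
def kron(a, b):
    """Explicit Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def kron_matvec(a, b, v):
    """Compute (a kron b) applied to v without forming the big matrix.

    v is reshaped row-major into X with len(a[0]) rows and len(b[0]) columns,
    and the result is the row-major flattening of a X b^T.
    """
    n_a, n_b = len(a[0]), len(b[0])
    if len(v) != n_a * n_b:
        raise ValueError("vector length must equal cols(a) * cols(b)")
    x = [v[r * n_b:(r + 1) * n_b] for r in range(n_a)]
    y = matmul(matmul(a, x), transpose(b))
    return [val for row in y for val in row]
```

For square factors of size p and q, the explicit product is a pq × pq matrix, whereas the shortcut only ever multiplies by the p × p and q × q factors, which is what makes separable covariance models tractable on large grids.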
Calculates maximum likelihood estimate, exact and asymptotic confidence intervals, and exact and asymptotic goodness of fit p-values for concentration of infectious units from serial limiting dilution assays. This package uses the likelihood equation, exact goodness of fit p-values, and exact confidence intervals described in Meyers et al. (1994) <http://jcm.asm.org/content/32/3/732.full.pdf>. This software is also implemented as a web application through the Shiny R package <https://iupm.shinyapps.io/sldassay/>.
The main function is icweib(), which fits a stratified Weibull proportional hazards model for left censored, right censored, interval censored, and non-censored survival data. We parameterize the Weibull regression model so that it allows a stratum-specific baseline hazard function, but where the effects of other covariates are assumed to be constant across strata. Please refer to Xiangdong Gu, David Shapiro, Michael D. Hughes and Raji Balasubramanian (2014) <doi:10.32614/RJ-2014-003> for more details.
Semi-distance and mean-variance (MV) index are proposed to measure the dependence between a categorical random variable and a continuous variable. Test of independence and feature screening for classification problems can be implemented via the two dependence measures. For the details of the methods, see Zhong et al. (2023) <doi:10.1080/01621459.2023.2284988>; Cui and Zhong (2019) <doi:10.1016/j.csda.2019.05.004>; Cui, Li and Zhong (2015) <doi:10.1080/01621459.2014.920256>.
The United Nations' Sustainable Development Goals (SDGs) have become an important guideline for organisations to monitor and plan their contributions to social, economic, and environmental transformations. The text2sdg package is an open-source analysis package that identifies SDGs in text using scientifically developed query systems, opening up the opportunity to monitor any type of text-based data, such as scientific output or corporate publications. For more information regarding the methodology see Meier, Mata & Wulff (2022) <arXiv:2110.05856>.
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
This package implements disease ontology (DO) enrichment analysis using a double weighted model based on the latest annotations of the human genome with DO terms, integrating the DO graph topology on a global scale. It achieves high accuracy in that it can identify more specific DO terms, which alleviates the over-enrichment problem. The package includes various statistical models and visualization schemes for discovering the associations between genes and diseases from biological big data.
Perform large-scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, RNA, coding sequence (CDS), GFF, and metagenome retrieval from NCBI RefSeq, NCBI Genbank, ENSEMBL, and UniProt databases. Furthermore, an interface to the BioMart database allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as NCBI RefSeq, NCBI nr, NCBI nt, NCBI Genbank, etc., with a single command.
This package provides a simple way of fitting detection functions to distance sampling data for both line and point transects. Adjustment term selection, left and right truncation as well as monotonicity constraints and binning are supported. Abundance and density estimates can also be calculated (via a Horvitz-Thompson-like estimator) if survey area information is provided. See Miller et al. (2019) <doi:10.18637/jss.v089.i01> for more information on methods and <https://examples.distancesampling.org/> for example analyses.
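The idea behind a Horvitz-Thompson-like abundance estimate can be sketched as follows: each detected object is weighted by the inverse of its estimated detection probability, and the total is scaled from the surveyed (covered) area to the whole study area. This Python sketch is a simplified illustration with hypothetical names, assuming each detection is a single individual; it is not the package's interface.

```python
def ht_abundance(detection_probs, covered_area, study_area):
    """Horvitz-Thompson-like abundance estimate for a study area.

    detection_probs: estimated detection probability for each detected object,
                     e.g. derived from a fitted detection function.
    covered_area:    area actually surveyed by the transects.
    study_area:      total area to which the estimate is extrapolated.
    """
    if any(not 0.0 < p <= 1.0 for p in detection_probs):
        raise ValueError("detection probabilities must lie in (0, 1]")
    if covered_area <= 0 or study_area <= 0:
        raise ValueError("areas must be positive")
    # Each detection stands in for 1/p objects within the covered area.
    n_in_covered = sum(1.0 / p for p in detection_probs)
    return n_in_covered * study_area / covered_area


def ht_density(detection_probs, covered_area):
    """Estimated number of objects per unit of covered area."""
    return ht_abundance(detection_probs, covered_area, covered_area) / covered_area
```

This is why survey area information is needed: without the covered and total areas, the inverse-probability weights give only a count relative to the surveyed strip or circle, not an abundance or density.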
This package provides a set of user-friendly functions to aid in organizing, plotting and analyzing event-related potential (ERP) data. Provides an easy-to-learn method to explore ERP data. Should be useful to those without a background in computer programming, and to those who are new to ERPs (or new to the more advanced ERP software available). Emphasis has been placed on highly automated processes using functions with as few arguments as possible. Expects processed (cleaned) data.
Group SLOPE (Group Sorted L1 Penalized Estimation) is a penalized linear regression method that is used for adaptive selection of groups of significant predictors in a high-dimensional linear model. The Group SLOPE method can control the (group) false discovery rate at a user-specified level (i.e., control the expected proportion of irrelevant groups among all selected groups of predictors). For additional information about the implemented methods please see Brzyski, Gossmann, Su, Bogdan (2018) <doi:10.1080/01621459.2017.1411269>.