Joint mean and dispersion effects models fit the mean and dispersion parameters of a response variable by two separate linear models, the mean and dispersion submodels, simultaneously. It also allows the users to choose either the deviance or the Pearson residuals as the response variable of the dispersion submodel. Furthermore, the package provides the possibility to nest the submodels in one another, if one of the parameters has significant explanatory power on the other. Wu & Li (2016) <doi:10.1016/j.csda.2016.04.015>.
This package provides a C++ header library for using the libsoda-cxx library with R. The C++ header reimplements the lsoda function from the ODEPACK library for solving initial value problems for first order ordinary differential equations (Hindmarsh, 1982; <https://computing.llnl.gov/sites/default/files/ODEPACK_pub1_u88007.pdf>). The C++ header can be used by other R packages by linking against this package. The C++ functions can be called inline using Rcpp'. Finally, the package provides an ode function to call from R.
This package provides an interface to the Mapbox GL JS (<https://docs.mapbox.com/mapbox-gl-js/guides>) and the MapLibre
GL JS (<https://maplibre.org/maplibre-gl-js/docs/>) interactive mapping libraries to help users create custom interactive maps in R. Users can create interactive globe visualizations; layer sf objects to create filled maps, circle maps, heatmaps', and three-dimensional graphics; and customize map styles and views. The package also includes utilities to use Mapbox and MapLibre
maps in Shiny web applications.
Asymptotic efficient closed-form estimators (MLEces) are provided in this package for three multivariate distributions(gamma, Weibull and Dirichlet) whose maximum likelihood estimators (MLEs) are not in closed forms. Closed-form estimators are strong consistent, and have the similar asymptotic normal distribution like MLEs. But the calculation of MLEces are much faster than the corresponding MLEs. Further details and explanations of MLEces can be found in. Jang, et al. (2023) <doi:10.1111/stan.12299>. Kim, et al. (2023) <doi:10.1080/03610926.2023.2179880>.
Calculate a multivariate functional principal component analysis for data observed on different dimensional domains. The estimation algorithm relies on univariate basis expansions for each element of the multivariate functional data (Happ & Greven, 2018) <doi:10.1080/01621459.2016.1273115>. Multivariate and univariate functional data objects are represented by S4 classes for this type of data implemented in the package funData
'. For more details on the general concepts of both packages and a case study, see Happ-Kurz (2020) <doi:10.18637/jss.v093.i05>.
This package implements methods for estimating generalized estimating equations (GEE) with advanced options for flexible modeling and handling missing data. This package provides tools to fit and analyze GEE models for longitudinal data, allowing users to address missingness using a variety of imputation techniques. It supports both univariate and multivariate modeling, visualization of missing data patterns, and facilitates the transformation of data for efficient statistical analysis. Designed for researchers working with complex datasets, it ensures robust estimation and inference in longitudinal and clustered data settings.
This ONEST software implements the method of assessing the pathologist agreement in reading PD-L1 assays (Reisenbichler et al. (2020 <doi:10.1038/s41379-020-0544-x>)), to determine the minimum number of evaluators needed to estimate agreement involving a large number of raters. Input to the program should be binary(1/0) pathology data, where â 0â may stand for negative and â 1â for positive. Additional examples were given using the data from Rimm et al. (2017 <doi:10.1001/jamaoncol.2017.0013>).
An implementation of the unified framework for assessing partial association between ordinal variables after adjusting for a set of covariates (Dungang Liu, Shaobo Li, Yan Yu and Irini Moustaki (2020), accepted by the Journal of the American Statistical Association). This package provides a set of tools to quantify, visualize, and test partial associations between multiple ordinal variables. It can produce a number of $phi$ measures, partial regression plots, 3-D plots, and $p$-values for testing $H_0: phi=0$ or $H_0: phi <= delta$.
This software is useful for loading .fasta or .gbk files, and for retrieving sequences from GenBank
dataset <https://www.ncbi.nlm.nih.gov/genbank/>. This package allows to detect differences or asymmetries based on nucleotide composition by using local linear kernel smoothers. Also, it is possible to draw inference about critical points (i. e. maximum or minimum points) related with the derivative curves. Additionally, bootstrap methods have been used for estimating confidence intervals and speed computational techniques (binning techniques) have been implemented in seq2R'.
Augmenting a matched data set by generating multiple stochastic, matched samples from the data using a multi-dimensional histogram constructed from dropping the input matched data into a multi-dimensional grid built on the full data set. The resulting stochastic, matched sets will likely provide a collectively higher coverage of the full data set compared to the single matched set. Each stochastic match is without duplication, thus allowing downstream validation techniques such as cross-validation to be applied to each set without concern for overfitting.
The objective of AGDEX
is to evaluate whether the results of a pair of two-group differential expression analysis comparisons show a level of agreement that is greater than expected if the group labels for each two-group comparison are randomly assigned. The agreement is evaluated for the entire transcriptome and (optionally) for a collection of pre-defined gene-sets. Additionally, the procedure performs permutation-based differential expression and meta analysis at both gene and gene-set levels of the data from each experiment.
Comprehensive set of tools for performing system identification of both linear and nonlinear dynamical systems directly from data. The Automatic Regression for Governing Equations (ARGOS) simplifies the complex task of constructing mathematical models of dynamical systems from observed input and output data, supporting various types of systems, including those described by ordinary differential equations. It employs optimal numerical derivatives for enhanced accuracy and employs formal variable selection techniques to help identify the most relevant variables, thereby enabling the development of predictive models for system behavior analysis.
Builds on gpuR
and utilizes the clRNG
('OpenCL
') library to provide efficient tools to generate independent random numbers in parallel on a GPU and save the results as R objects, ensuring high-quality random numbers even when R is used interactively or in an ad-hoc manner. Includes Fisher's simulation method adapted from Patefield, William M (1981) <doi:10.2307/2346669> and MRG31k3p Random Number Generator from clRNG
library by Advanced Micro Devices, Inc. (2015) <https://github.com/clMathLibraries/clRNG>
.
This package provides coefficients of interrater reliability that are generalized to cope with randomly incomplete (i.e. unbalanced) datasets without any imputation of missing values or any (row-wise or column-wise) omissions of actually available data. Applied to complete (balanced) datasets, these generalizations yield the same results as the common procedures, namely the Intraclass Correlation according to McGraw
& Wong (1996) \doi10.1037/1082-989X.1.1.30 and the Coefficient of Concordance according to Kendall & Babington Smith (1939) \doi10.1214/aoms/1177732186.
Designed for the curation and analysis of data generated from real-time quaking-induced conversion (RT-QuIC
) assays first described by Atarashi et al. (2011) <doi:10.1038/nm.2294>. quicR
calculates useful metrics such as maxpoint ratio: Rowden et al. (2023) <doi:10.1099/vir.0.069906-0>; time-to-threshold: Shi et al. (2013) <doi:10.1186/2051-5960-1-44>; and maximum slope. Integration with the output from plate readers allows for seamless input of raw data into the R environment.
This package implements nonlinear autoregressive (AR) time series models. For univariate series, a non-parametric approach is available through additive nonlinear AR. Parametric modeling and testing for regime switching dynamics is available when the transition is either direct (TAR: threshold AR) or smooth (STAR: smooth transition AR, LSTAR). For multivariate series, one can estimate a range of TVAR or threshold cointegration TVECM models with two or three regimes. Tests can be conducted for TVAR as well as for TVECM (Hansen and Seo 2002 and Seo 2006).
This package provides a comprehensive suite of statistical tools for analyzing, simulating, and computing properties of the Topp-Leone Cauchy Rayleigh (TLCAR) distribution, a versatile distribution amalgamating features of the Topp-Leone, Cauchy, and Rayleigh distributions, ideal for modeling intricate, heterogeneous data across scientific domains. See Atchadé, M.N., Bogninou, M.J., and Djibril, A.M. (2023) <doi:10.1007/s44199-023-00066-4> and Atchadé, M.N., Bogninou, M.J., and Djibril, A.M. (2024) <doi:10.1007/s44199-023-00069-1> for further insights.
The tdROC
package facilitates the estimation of time-dependent ROC (Receiver Operating Characteristic) curves and the Area Under the time-dependent ROC Curve (AUC) in the context of survival data, accommodating scenarios with right censored data and the option to account for competing risks. In addition to the ROC/AUC estimation, the package also estimates time-dependent Brier score and survival difference. Confidence intervals of various estimated quantities can be obtained from bootstrap. The package also offers plotting functions for visualizing time-dependent ROC curves.
This tool provides functions to load, segment and classify zooplankton images. The image processing algorithms and the machine learning classifiers in this package are (will be, since these have not been added yet) direct ports of an early python implementation that can be found at <https://github.com/arickGrootveld/ZooID>
. The model weights and datasets (also not added yet) that are a part of this package can also be found at Arick Grootveld, Eva R. Kozak, Carmen Franco-Gordo (2023) <doi:10.5281/zenodo.7979996>.
XBSeq is a novel algorithm for testing RNA-seq differential expression (DE), where a statistical model was established based on the assumption that observed signals are the convolution of true expression signals and sequencing noises. The mapped reads in non-exonic regions are considered as sequencing noises, which follows a Poisson distribution. Given measurable observed signal and background noise from RNA-seq data, true expression signals, assuming governed by the negative binomial distribution, can be delineated and thus the accurate detection of differential expressed genes.
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVIL package provides end-user and developer functionality. AnVIL provides fast binary package installation, utilities for working with Terra/AnVIL table and data resources, and convenient functions for file movement to and from Google cloud storage. For developers, AnVIL provides programmatic access to the Terra, Leonardo, Rawls, Dockstore, and Gen3 RESTful programming interface, including helper functions to transform JSON responses to formats more amenable to manipulation in R.
This package provides various R programming tools for data manipulation, including:
medical unit conversions
combining objects
character vector operations
factor manipulation
obtaining information about R objects
generating fixed-width format files
extricating components of date and time objects
operations on columns of data frames
matrix operations
operations on vectors and data frames
value of last evaluated expression
wrapper for
sample
that ensures consistent behavior for both scalar and vector arguments
This package implements an efficient algorithm for solving sparse-penalized support vector machines with kernel density convolution. This package is designed for high-dimensional classification tasks, supporting lasso (L1) and elastic-net penalties for sparse feature selection and providing options for tuning kernel bandwidth and penalty weights. The dcsvm is applicable to fields such as bioinformatics, image analysis, and text classification, where high-dimensional data commonly arise. Learn more about the methodology and algorithm at Wang, Zhou, Gu, and Zou (2023) <doi:10.1109/TIT.2022.3222767>.
Decomposition of (income) inequality by population sub groups. For a decomposition on a single variable the mean log deviation can be used (see Mookherjee Shorrocks (1982) <DOI:10.2307/2232673>). For a decomposition on multiple variables a regression based technique can be used (see Fields (2003) <DOI:10.1016/s0147-9121(03)22001-x>). Recentered influence function regression for marginal effects of the (income or wealth) distribution (see Firpo et al. (2009) <DOI:10.3982/ECTA6822>). Some extensions to inequality functions to handle weights and/or missings.