This package provides functions for planning and conducting a clinical trial with adaptive sample size determination. Maximal statistical efficiency is exploited even when dramatic or multiple adaptations are made. Such a trial consists of adaptive determination of the sample size at an interim analysis and implementation of a frequentist statistical test at the interim and final analyses with a prefixed significance level. The required assumptions for the stage-wise test statistics are independent and stationary increments and normality. Predetermination of the adaptation rule is not required.
Designed for the development and application of hidden Markov models and profile HMMs for biological sequence analysis. Contains functions for multiple and pairwise sequence alignment, model construction and parameter optimization, file import/export, implementation of the forward, backward and Viterbi algorithms for conditional sequence probabilities, tree-based sequence weighting, and sequence simulation. Features a wide variety of potential applications including database searching, gene-finding and annotation, phylogenetic analysis and sequence classification. Based on the models and algorithms described in Durbin et al (1998, ISBN: 9780521629713).
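As a rough illustration of the Viterbi algorithm named above, the following Python sketch finds the most probable state path of a toy two-state HMM, working in log space. It is not the package's API: the function name `viterbi`, the state labels, and every probability below are assumptions invented for this example.

```python
import math


def viterbi(obs, states, start, trans, emit):
    """Return (log-probability, path) of the most probable state path."""
    # best[s]: log-probability of the best path that ends in state s
    best = {s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}
    back = []  # one dict of back-pointers per step after the first
    for symbol in obs[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            # Best predecessor of s, given the scores of the previous step
            p = max(states, key=lambda r: prev[r] + math.log(trans[r][s]))
            best[s] = prev[p] + math.log(trans[p][s]) + math.log(emit[s][symbol])
            pointers[s] = p
        back.append(pointers)
    # Trace the back-pointers from the best final state
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


# Hypothetical toy model: an AT-rich and a GC-rich region
states = ("AT", "GC")
start = {"AT": 0.5, "GC": 0.5}
trans = {"AT": {"AT": 0.9, "GC": 0.1}, "GC": {"AT": 0.1, "GC": 0.9}}
emit = {
    "AT": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
    "GC": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
}
log_prob, path = viterbi("AATTGGCC", states, start, trans, emit)
print(path)
```

With these made-up parameters the decoded path switches once, from the AT-rich to the GC-rich state, after the fourth symbol.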
This package implements a likelihood-based method for genome polarization, identifying which alleles of SNV markers belong to either side of a barrier to gene flow. The approach co-estimates individual assignment, barrier strength, and divergence between sides, with direct application to studies of hybridization. Includes VCF-to-diem conversion and input checks, support for mixed ploidy and parallelization, and tools for visualization and diagnostic outputs. Based on diagnostic index expectation maximization as described in Baird et al. (2023) <doi:10.1111/2041-210X.14010>.
Distances on dual-weighted directed graphs using priority-queue shortest paths (Padgham (2019) <doi:10.32866/6945>). Weighted directed graphs have weights from A to B which may differ from those from B to A. Dual-weighted directed graphs have two sets of such weights. A canonical example is a street network to be used for routing in which routes are calculated by weighting distances according to the type of way and mode of transport, yet lengths of routes must be calculated from direct distances.
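The dual-weighting idea can be sketched in a few lines of Python: a priority-queue (Dijkstra) search picks the route that minimises the routing weights, while the direct lengths are accumulated along that same route. This is only an illustration under assumed names (`dual_weighted_distance`, the toy graph and its numbers are hypothetical), not the package's interface.

```python
import heapq


def dual_weighted_distance(graph, source, target):
    """Route on the first weight, report the second.

    graph[u] is a list of (v, routing_weight, length) edges out of u.
    Returns (routing cost, length along the chosen route), or None if
    the target cannot be reached.
    """
    best = {source: 0.0}
    queue = [(0.0, 0.0, source)]  # (routing cost, length so far, node)
    while queue:
        cost, length, u = heapq.heappop(queue)
        if u == target:
            return cost, length
        if cost > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, weight, dist in graph.get(u, []):
            new_cost = cost + weight
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(queue, (new_cost, length + dist, v))
    return None


# Hypothetical street network: the direct way A -> B is short but heavily
# penalised for the chosen mode of transport; the reverse way B -> A is not.
graph = {
    "A": [("B", 300.0, 100.0), ("C", 90.0, 80.0)],
    "C": [("B", 90.0, 80.0)],
    "B": [("A", 100.0, 100.0)],
}
print(dual_weighted_distance(graph, "A", "B"))  # routed via C
print(dual_weighted_distance(graph, "B", "A"))  # direct
```

In this toy network the route from A to B detours through C because its weighted cost is lower, so the reported length is that of the detour rather than of the direct edge.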
This package provides methods and tools designed to improve the forecast accuracy for a linearly constrained multiple time series, while fulfilling the linear/aggregation relationships linking the components (Girolimetto and Di Fonzo, 2024 <doi:10.48550/arXiv.2412.03429>). FoCo2 offers multi-task forecast combination and reconciliation approaches leveraging input from multiple forecasting models or experts and ensuring that the resulting forecasts satisfy specified linear constraints. In addition, linear inequality constraints (e.g., non-negativity of the forecasts) can be imposed, if needed.
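To make the idea of combining forecasts and then enforcing a linear constraint concrete, here is a deliberately simple Python sketch: it averages the experts' forecasts with equal weights and then projects the result orthogonally onto a single constraint. Both steps are simplifying assumptions chosen for illustration; they are not claimed to be the methods implemented in the package, and the function name and numbers are hypothetical.

```python
def combine_and_reconcile(expert_forecasts, constraint):
    """Equal-weight combination followed by orthogonal projection.

    expert_forecasts: one list of forecasts per expert (same variable order).
    constraint: coefficients c such that coherent forecasts satisfy c . y = 0.
    """
    n_experts = len(expert_forecasts)
    # Step 1: combine the experts variable by variable
    combined = [sum(col) / n_experts for col in zip(*expert_forecasts)]
    # Step 2: remove the component that violates the constraint
    residual = sum(c * y for c, y in zip(constraint, combined))
    norm_sq = sum(c * c for c in constraint)
    return [y - c * residual / norm_sq for c, y in zip(constraint, combined)]


# Hypothetical forecasts for (total, part a, part b) from two experts
experts = [[100.0, 40.0, 50.0], [104.0, 44.0, 54.0]]
constraint = [1.0, -1.0, -1.0]  # total - a - b = 0
reconciled = combine_and_reconcile(experts, constraint)
print(reconciled)
```

The combined forecasts (102, 42, 52) miss the aggregation constraint by 8; the projection spreads that discrepancy evenly so that the reconciled total equals the sum of its parts.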
Computes the power and sample size (PASS) required to test for the difference in the mean function between two groups under a repeatedly measured longitudinal or sparse functional design. See the manuscript by Koner and Luo (2023) <https://salilkoner.github.io/assets/PASS_manuscript.pdf> for details of the PASS formula and computational details. The details of the testing procedure for univariate and multivariate response are presented in Wang (2021) <doi:10.1214/21-EJS1802> and Koner and Luo (2023) <arXiv:2302.05612> respectively.
This package implements methods developed by Ding, Feller, and Miratrix (2016) <doi:10.1111/rssb.12124> <arXiv:1412.5000>, and Ding, Feller, and Miratrix (2018) <doi:10.1080/01621459.2017.1407322> <arXiv:1605.06566> for testing whether there is unexplained variation in treatment effects across observations, and for characterizing the extent of the explained and unexplained variation in treatment effects. The package includes wrapper functions implementing the proposed methods, as well as helper functions for analyzing and visualizing the results of the test.
Joint mean and dispersion effects models fit the mean and dispersion parameters of a response variable by two separate linear models, the mean and dispersion submodels, simultaneously. It also allows the users to choose either the deviance or the Pearson residuals as the response variable of the dispersion submodel. Furthermore, the package provides the possibility to nest the submodels in one another, if one of the parameters has significant explanatory power on the other. Wu & Li (2016) <doi:10.1016/j.csda.2016.04.015>.
This package provides a C++ header library for using the libsoda-cxx library with R. The C++ header reimplements the lsoda function from the ODEPACK library for solving initial value problems for first order ordinary differential equations (Hindmarsh, 1982; <https://computing.llnl.gov/sites/default/files/ODEPACK_pub1_u88007.pdf>). The C++ header can be used by other R packages by linking against this package. The C++ functions can be called inline using 'Rcpp'. Finally, the package provides an ode function to call from R.
Calculate a multivariate functional principal component analysis for data observed on different dimensional domains. The estimation algorithm relies on univariate basis expansions for each element of the multivariate functional data (Happ & Greven, 2018) <doi:10.1080/01621459.2016.1273115>. Multivariate and univariate functional data objects are represented by S4 classes for this type of data implemented in the package 'funData'. For more details on the general concepts of both packages and a case study, see Happ-Kurz (2020) <doi:10.18637/jss.v093.i05>.
This package implements methods for estimating generalized estimating equations (GEE) with advanced options for flexible modeling and handling missing data. This package provides tools to fit and analyze GEE models for longitudinal data, allowing users to address missingness using a variety of imputation techniques. It supports both univariate and multivariate modeling, visualization of missing data patterns, and facilitates the transformation of data for efficient statistical analysis. Designed for researchers working with complex datasets, it ensures robust estimation and inference in longitudinal and clustered data settings.
This package provides an interface to the Mapbox GL JS (<https://docs.mapbox.com/mapbox-gl-js/guides>) and the MapLibre GL JS (<https://maplibre.org/maplibre-gl-js/docs/>) interactive mapping libraries to help users create custom interactive maps in R. Users can create interactive globe visualizations; layer sf objects to create filled maps, circle maps, heatmaps, and three-dimensional graphics; and customize map styles and views. The package also includes utilities to use Mapbox and MapLibre maps in Shiny web applications.
Asymptotically efficient closed-form estimators (MLEces) are provided in this package for three multivariate distributions (gamma, Weibull and Dirichlet) whose maximum likelihood estimators (MLEs) are not available in closed form. The closed-form estimators are strongly consistent and have the same asymptotic normal distribution as the MLEs, but they are much faster to compute than the corresponding MLEs. Further details and explanations of MLEces can be found in Jang et al. (2023) <doi:10.1111/stan.12299> and Kim et al. (2023) <doi:10.1080/03610926.2023.2179880>.
This ONEST software implements the method of assessing pathologist agreement in reading PD-L1 assays (Reisenbichler et al. (2020) <doi:10.1038/s41379-020-0544-x>), to determine the minimum number of evaluators needed to estimate agreement involving a large number of raters. Input to the program should be binary (1/0) pathology data, where "0" may stand for negative and "1" for positive. Additional examples are given using the data from Rimm et al. (2017) <doi:10.1001/jamaoncol.2017.0013>.
Augmenting a matched data set by generating multiple stochastic, matched samples from the data using a multi-dimensional histogram constructed from dropping the input matched data into a multi-dimensional grid built on the full data set. The resulting stochastic, matched sets will likely provide a collectively higher coverage of the full data set compared to the single matched set. Each stochastic match is without duplication, thus allowing downstream validation techniques such as cross-validation to be applied to each set without concern for overfitting.
This software is useful for loading .fasta or .gbk files, and for retrieving sequences from the GenBank database <https://www.ncbi.nlm.nih.gov/genbank/>. The package detects differences or asymmetries in nucleotide composition using local linear kernel smoothers. It also supports inference about critical points (i.e. maxima or minima) related to the derivative curves. Additionally, bootstrap methods are used to estimate confidence intervals, and binning techniques are implemented in 'seq2R' to speed up computation.
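A local linear kernel smoother of the kind mentioned above fits a weighted straight line around each evaluation point; the intercept estimates the curve and the slope estimates its first derivative, which is what makes critical points detectable. The Python sketch below shows the textbook estimator with a Gaussian kernel. The function name, bandwidth, and data are assumptions for illustration and do not reproduce the package's implementation, binning, or bootstrap steps.

```python
import math


def local_linear(x, y, x0, bandwidth):
    """Return (fit, slope) of a Gaussian-kernel local linear fit at x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    # Closed-form solution of the 2x2 weighted least-squares system
    det = s0 * s2 - s1 * s1
    fit = (s2 * t0 - s1 * t1) / det
    slope = (s0 * t1 - s1 * t0) / det
    return fit, slope


# Hypothetical data on an evenly spaced grid over [0, 2]
x = [i / 10 for i in range(21)]
line = [2 * xi + 1 for xi in x]          # exactly linear signal
hump = [-(xi - 1.0) ** 2 for xi in x]    # single maximum at x = 1

fit_line, slope_line = local_linear(x, line, 0.75, 0.3)
fit_hump, slope_hump = local_linear(x, hump, 1.0, 0.3)
print(fit_line, slope_line, slope_hump)
```

On exactly linear data the smoother recovers the line and its slope, and for the symmetric hump the estimated derivative is zero at the maximum, which is the signature of a critical point.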
This package provides helper functions and wrappers to simplify authentication, data retrieval, and result processing from the VALD APIs. Designed to streamline integration for analysts and researchers working with VALD's external APIs. For further documentation on integrating with VALD APIs, see: <https://support.vald.com/hc/en-au/articles/23415335574553-How-to-integrate-with-VALD-APIs>. For a step-by-step guide to using this package, see: <https://support.vald.com/hc/en-au/articles/48730811824281-A-guide-to-using-the-valdr-R-package>.
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVIL package provides end-user and developer functionality. AnVIL provides fast binary package installation, utilities for working with Terra/AnVIL table and data resources, and convenient functions for file movement to and from Google cloud storage. For developers, AnVIL provides programmatic access to the Terra, Leonardo, Rawls, Dockstore, and Gen3 RESTful programming interface, including helper functions to transform JSON responses to formats more amenable to manipulation in R.
XBSeq is a novel algorithm for testing RNA-seq differential expression (DE), where a statistical model was established based on the assumption that observed signals are the convolution of true expression signals and sequencing noise. The mapped reads in non-exonic regions are considered sequencing noise, which follows a Poisson distribution. Given the measurable observed signal and background noise from RNA-seq data, the true expression signals, assumed to follow a negative binomial distribution, can be delineated, enabling accurate detection of differentially expressed genes.
This package provides various R programming tools for data manipulation, including:
- medical unit conversions 
- combining objects 
- character vector operations 
- factor manipulation 
- obtaining information about R objects 
- generating fixed-width format files 
- extricating components of date and time objects 
- operations on columns of data frames 
- matrix operations 
- operations on vectors and data frames 
- value of last evaluated expression 
- wrapper for 'sample' that ensures consistent behavior for both scalar and vector arguments
Comprehensive set of tools for performing system identification of both linear and nonlinear dynamical systems directly from data. The Automatic Regression for Governing Equations (ARGOS) simplifies the complex task of constructing mathematical models of dynamical systems from observed input and output data, supporting various types of systems, including those described by ordinary differential equations. It uses optimal numerical derivatives for enhanced accuracy and formal variable selection techniques to help identify the most relevant variables, thereby enabling the development of predictive models for system behavior analysis.
Builds on gpuR and utilizes the clRNG ('OpenCL') library to provide efficient tools to generate independent random numbers in parallel on a GPU and save the results as R objects, ensuring high-quality random numbers even when R is used interactively or in an ad-hoc manner. Includes Fisher's simulation method adapted from Patefield, William M (1981) <doi:10.2307/2346669> and MRG31k3p Random Number Generator from clRNG library by Advanced Micro Devices, Inc. (2015) <https://github.com/clMathLibraries/clRNG>.
This package provides coefficients of interrater reliability that are generalized to cope with randomly incomplete (i.e. unbalanced) datasets without any imputation of missing values or any (row-wise or column-wise) omissions of actually available data. Applied to complete (balanced) datasets, these generalizations yield the same results as the common procedures, namely the Intraclass Correlation according to McGraw & Wong (1996) <doi:10.1037/1082-989X.1.1.30> and the Coefficient of Concordance according to Kendall & Babington Smith (1939) <doi:10.1214/aoms/1177732186>.
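For reference, the classical Coefficient of Concordance for a complete dataset can be computed in a few lines. The Python sketch below covers only the balanced case without tied ranks, W = 12 S / (m^2 (n^3 - n)), where S is the sum of squared deviations of the rank sums from their mean. It does not implement the package's generalization to incomplete data; the function name and example rankings are hypothetical.

```python
def kendall_w(ranks):
    """Kendall's coefficient of concordance for complete, untied rankings.

    ranks[i][j] is the rank (1..n) that rater i assigns to object j.
    """
    m = len(ranks)       # number of raters
    n = len(ranks[0])    # number of objects
    rank_sums = [sum(col) for col in zip(*ranks)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of four objects by three raters
perfect = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
partial = [[1, 2, 3, 4], [1, 2, 3, 4], [2, 1, 4, 3]]
print(kendall_w(perfect), kendall_w(partial))
```

Identical rankings give W = 1, and the value drops below 1 as soon as one rater disagrees with the others.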
The tdROC package facilitates the estimation of time-dependent ROC (Receiver Operating Characteristic) curves and the Area Under the time-dependent ROC Curve (AUC) in the context of survival data, accommodating scenarios with right censored data and the option to account for competing risks. In addition to the ROC/AUC estimation, the package also estimates time-dependent Brier score and survival difference. Confidence intervals of various estimated quantities can be obtained from bootstrap. The package also offers plotting functions for visualizing time-dependent ROC curves.