The Multivariate Asymptotic Non-parametric Test of Association (MANTA) enables non-parametric, asymptotic P-value computation for multivariate linear models. MANTA relies on the asymptotic null distribution of the PERMANOVA test statistic. P-values are computed using a highly accurate approximation of the corresponding cumulative distribution function. Garrido-Martín et al. (2022) <doi:10.1101/2022.06.06.493041>.
This package provides functions for detecting multicollinearity. The test gives statistical support to two of the most widely used methods for detecting multicollinearity in applied work: Klein's rule and the Variance Inflation Factor (VIF). See the URL for the papers associated with this package, for instance Morales-Oñate and Morales-Oñate (2015) <doi:10.33333/rp.vol51n2.05>.
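As a rough illustration of the VIF mentioned above, the following Python sketch computes it for the simplest case of two predictors, where the auxiliary R² of one predictor regressed on the other equals their squared Pearson correlation, so VIF = 1 / (1 - r²). This is not the package's R implementation; the function names `pearson_r` and `vif_two_predictors` and the toy data are assumptions made for this example.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors (hypothetical helper).

    With two predictors, the auxiliary R^2 is the squared correlation,
    so VIF = 1 / (1 - r^2). A value of 1 means no collinearity.
    """
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)


# Toy data, invented for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif_correlated = vif_two_predictors(x1, x2)

# Two orthogonal predictors give the minimum possible VIF.
vif_orthogonal = vif_two_predictors([-1, 0, 1], [1, -2, 1])
```

With more than two predictors the auxiliary R² has to come from a full regression of each predictor on all the others, which this sketch does not attempt.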
Predictive multivariate modelling for metabolomics. Types: Classification and regression. Methods: Partial Least Squares, Random Forest and Elastic Net. Data structures: Paired and unpaired. Validation: repeated double cross-validation (Westerhuis et al. (2008) <doi:10.1007/s11306-007-0099-6>, Filzmoser et al. (2009) <doi:10.1002/cem.1225>). Variable selection: Performed internally, through tuning in the inner cross-validation loop.
Package for learning and evaluating (subgroup) policies via doubly robust loss functions. Policy learning methods include doubly robust blip/conditional average treatment effect learning and sequential policy tree learning. Methods for (subgroup) policy evaluation include doubly robust cross-fitting and online estimation/sequential validation. See Nordland and Holst (2022) <doi:10.48550/arXiv.2212.02335> for documentation and references.
Set of tools to automate the extraction of data on pests from EPPO Data Services and the EPPO Global Database and to put them into tables in a human-readable format. These functions use the EPPO database API, so you first need to register at <https://data.eppo.int> (free of charge). Additional helpers allow you to download, check and connect to the SQLite EPPO database.
Currently incorporates the generalized odds-rate model (a type of linear transformation model) for interval-censored data based on penalized monotonic B-splines. More methods under other semiparametric models, such as the cure model or the additive model, will be included in future versions. For more details see Lu, M., Liu, Y., Li, C. and Sun, J. (2019) <arXiv:1912.11703>.
Estimates the coefficients of the two-time centered autologistic regression model based on Gegout-Petit A., Guerin-Dubrana L., Li S. "A new centered spatio-temporal autologistic regression model. Application to local spread of plant diseases." 2019. <arXiv:1811.06782>, using a grid of binary variables to estimate the spread of a disease on the grid over the years.
Fast and efficient sampling from general univariate probability density functions. Implements a rejection sampling approach designed to take advantage of modern CPU caches and minimise evaluation of the target density for most samples. Many standard densities are internally implemented in C for high performance, with general user defined densities also supported. A paper describing the methodology will be released soon.
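For readers unfamiliar with the underlying idea, here is a minimal Python sketch of plain textbook rejection sampling with a uniform proposal. It is not the cache-optimised scheme this package implements. The function `rejection_sample`, its parameters, and the example density are all assumptions made for illustration.

```python
import random


def rejection_sample(density, lower, upper, bound, n, rng=None):
    """Draw n samples from `density` on [lower, upper] (hypothetical helper).

    Uses a uniform proposal on [lower, upper]; `bound` must be at least
    the maximum of `density` on that interval.
    """
    rng = rng or random.Random()
    samples = []
    while len(samples) < n:
        x = rng.uniform(lower, upper)   # candidate from the proposal
        u = rng.uniform(0.0, bound)     # uniform height under the envelope
        if u <= density(x):             # accept if the point lies under the density
            samples.append(x)
    return samples


def triangular_density(x):
    """Example target: f(x) = 2x on [0, 1], whose mean is 2/3."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


draws = rejection_sample(triangular_density, 0.0, 1.0, 2.0, 5000, random.Random(42))
sample_mean = sum(draws) / len(draws)
```

Every candidate here costs one evaluation of the target density. Per the description above, the package is designed to avoid that evaluation for most samples, which this sketch does not reproduce.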
This package provides a general spatiotemporal satellite image imputation method based on sparse functional data analytic techniques. The imputation method applies and extends the Functional Principal Analysis by Conditional Estimation (PACE). The underlying idea for the proposed procedure is to impute a missing pixel by borrowing information from temporally and spatially contiguous pixels based on the best linear unbiased prediction.
Simulation extrapolation and inverse probability weighted generalized estimating equations method for longitudinal data with missing observations and measurement error in covariates. References: Yi, G. Y. (2008) <doi:10.1093/biostatistics/kxm054>; Cook, J. R. and Stefanski, L. A. (1994) <doi:10.1080/01621459.1994.10476871>; Little, R. J. A. and Rubin, D. B. (2002, ISBN:978-0-471-18386-0).
Variable Penalty Dynamic Time Warping (VPdtw) for aligning chromatographic signals. With an appropriate penalty this method performs good alignment of chromatographic data without deforming the peaks (Clifford, D., Stone, G., Montoliu, I., Rezzi, S., Martin, F., Guy, P., Bruce, S., and Kochhar, S. (2009) <doi:10.1021/ac802041e>; Clifford, D. and Stone, G. (2012) <doi:10.18637/jss.v047.i08>).
Implementation of Azure DevOps <https://azure.microsoft.com/> API calls. It enables the extraction of information about repositories, build and release definitions and individual releases. It also helps create repositories and work items within a project without logging into Azure DevOps. There is the ability to use any API service with a shell for any non-predefined call.
Fits hierarchical regularized regression models to incorporate potentially informative external data, Weaver and Lewinger (2019) <doi:10.21105/joss.01761>. Utilizes coordinate descent to efficiently fit regularized regression models both with and without external information with the most common penalties used in practice (i.e. ridge, lasso, elastic net). Support for standard R matrices, sparse matrices and big.matrix objects.
Bayesian modelling of relative sea-level data using a comprehensive approach that incorporates various statistical models within a unifying framework. Details of each statistical model are given in the following references: linear regression (Ashe et al 2019) <doi:10.1016/j.quascirev.2018.10.032>, change point models (Cahill et al 2015) <doi:10.1088/1748-9326/10/8/084002>, integrated Gaussian process models (Cahill et al 2015) <doi:10.1214/15-AOAS824>, temporal splines (Upton et al 2023) <arXiv:2301.09556>, spatio-temporal splines (Upton et al 2023) <arXiv:2301.09556> and generalised additive models (Upton et al 2023) <arXiv:2301.09556>. This package facilitates data loading, model fitting and result summarisation. Notably, it accommodates the inherent measurement errors found in relative sea-level data across multiple dimensions, allowing for their inclusion in the statistical models.
T (extent of the primary tumor), N (absence or presence and extent of regional lymph node metastasis) and M (absence or presence of distant metastasis) are the three components used to describe the anatomical tumor extent. The TNM stage is important in treatment decision-making and outcome prediction. The existing oropharyngeal cancer (OPC) TNM stages do not distinguish between the two subsites of human papillomavirus positive (HPV+) and human papillomavirus negative (HPV-) disease. We developed novel criteria to assess the performance of TNM stage grouping schemes based on parametric modeling adjusting for important clinical factors. These criteria evaluate a TNM stage grouping scheme on five different measures: hazard consistency, hazard discrimination, explained variation, likelihood difference, and balance. The methods are described in Xu, W., et al. (2015) <https://www.austinpublishinggroup.com/biometrics/fulltext/biometrics-v2-id1014.php>.
This package provides resampling procedures to assess the stability of selected variables with additional finite sample error control for high-dimensional variable selection procedures such as Lasso or boosting. Both standard stability selection (Meinshausen & Buhlmann, 2010) and complementary pairs stability selection with improved error bounds (Shah & Samworth, 2013) are implemented. The package can be combined with arbitrary user-specified variable selection approaches.
This package contains various tools for working with and evaluating cross-validated area under the ROC curve (AUC) estimators. The primary functions of the package are ci.cvAUC and ci.pooled.cvAUC, which report cross-validated AUC and compute confidence intervals for cross-validated AUC estimates based on influence curves for i.i.d. and pooled repeated measures data, respectively.
Colour choice in information visualisation is important in order to avoid being misled by inherent bias in the colour palette used. This package provides access to the perceptually uniform and colour-blindness friendly palettes developed by Fabio Crameri and released under the "Scientific Colour-Maps" moniker. The package contains 24 different palettes and includes both diverging and sequential types.
This package provides methods to detect differential composition abundances between conditions in single-cell RNA-seq experiments, with or without replicates. It aims to correct bias introduced by misclassification and to enable control of confounding covariates. To avoid the influence of proportion changes in large cell types, DCATS can use either the total cell number or a specific reference group as the normalization term.
This package provides diagnostics for assessing genomic DNA contamination in RNA-seq data, as well as plots representing these diagnostics. Moreover, the package can be used to get an insight into the strand library protocol used and, in case of strand-specific libraries, the strandedness of the data. Furthermore, it provides functionality to filter out reads of potential gDNA origin.
This package provides methods to perform trajectory analysis based on a minimum spanning tree constructed from cluster centroids. Computes pseudotemporal cell orderings by mapping cells in each cluster (or new cells) to the closest edge in the tree. Uses linear modelling to identify differentially expressed genes along each path through the tree. Several plotting and interactive visualization functions are also implemented.
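The first step described above, building a minimum spanning tree over cluster centroids, can be sketched in Python as follows. This is a simplified illustration using Prim's algorithm on Euclidean distances, not the package's own implementation. The helper names and the toy two-dimensional clusters are assumptions, and the later steps (mapping cells to edges, testing for differential expression) are omitted.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length point tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def minimum_spanning_tree(nodes):
    """Prim's algorithm on the complete Euclidean graph over `nodes`.

    Returns a list of (i, j) index pairs, one per tree edge, in the
    order the edges were added, starting from node 0.
    """
    in_tree = {0}
    edges = []
    while len(in_tree) < len(nodes):
        # Cheapest edge joining a tree node to a node outside the tree.
        _, i, j = min(
            (math.dist(nodes[i], nodes[j]), i, j)
            for i in in_tree
            for j in range(len(nodes))
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Toy clusters in two dimensions, invented for illustration only.
clusters = {
    "A": [(0.0, 0.0), (0.0, 2.0)],
    "B": [(2.0, 0.0), (2.0, 2.0)],
    "C": [(6.0, 0.0), (6.0, 2.0)],
}
names = list(clusters)
centroids = [centroid(clusters[name]) for name in names]
tree_edges = [(names[i], names[j]) for i, j in minimum_spanning_tree(centroids)]
```

For these toy clusters the tree links A to B and B to C, giving a single linear path along which a pseudotime ordering could then be defined.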
Using a Bayesian estimation procedure, this package fits linear quantile regression models such as linear quantile models, linear quantile mixed models, and quantile regression joint models for time-to-event and longitudinal data. The estimation procedure is based on the asymmetric Laplace distribution and the JAGS software is used to get posterior samples (Yang, Luo, DeSantis (2019) <doi:10.1177/0962280218784757>).
Implementations of the family of map() functions with frequent saving of the intermediate results. The contained functions let you resume the evaluation of the iterations where you stopped (reading the already evaluated ones from cache), and work with the iterations evaluated so far while the remaining ones are running in a background job. Parallel computing is also easier with the workers parameter.
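The resume-from-cache idea can be sketched in Python as follows. This is only an illustration of the concept, not the package's R API: the function `cached_map`, its `cache_path` argument and the pickle-based cache format are assumptions, and the sketch assumes the same ordered list of items is passed on every run. Background jobs and parallel workers are not covered.

```python
import os
import pickle
import tempfile


def cached_map(func, items, cache_path):
    """Apply `func` to each item, saving all results after every step.

    Hypothetical helper: if `cache_path` already exists, the results
    stored there are reused and only the remaining items are evaluated.
    """
    results = []
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            results = pickle.load(fh)
    for item in items[len(results):]:
        results.append(func(item))
        with open(cache_path, "wb") as fh:   # persist progress immediately
            pickle.dump(results, fh)
    return results


calls = []  # records which inputs were actually evaluated


def square(x):
    calls.append(x)
    return x * x


cache_file = os.path.join(tempfile.mkdtemp(), "squares.pkl")
first = cached_map(square, [1, 2, 3], cache_file)      # evaluates 1, 2, 3
second = cached_map(square, [1, 2, 3, 4], cache_file)  # evaluates only 4
```

Rewriting the whole cache after each item keeps the sketch short; a real implementation would more likely store each result separately to avoid repeated serialisation of earlier results.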
Interact with the FRED API, <https://fred.stlouisfed.org/docs/api/fred/>, to fetch observations across economic series; find information about different economic sources, releases, series, etc.; conduct searches by series name, attributes, or tags; and determine the latest updates. Includes functions for creating panels of related variables with minimal effort and datasets containing data sources, releases, and popular FRED tags.