Performs the censored quantile regression of Huang (2010) <doi:10.1214/09-AOS771>, and restores monotonicity respecting via adaptive interpolation for dynamic regression, following Huang (2017) <doi:10.1080/01621459.2016.1149070>. The monotonicity-respecting restoration applies to general dynamic regression models, including the (uncensored or censored) quantile regression model, the additive hazards model, and the dynamic survival models of Peng and Huang (2007) <doi:10.1093/biomet/asm058>, among others.
Decorrelates a set of summary statistics (i.e., Z-scores or P-values per SNP) via the Decorrelation by Orthogonal Transformation (DOT) approach and performs gene-set analyses by combining the transformed statistic values; all operations rely only on the association summary results and the linkage disequilibrium (LD). For more details on DOT and its power, see Olga (2020) <doi:10.1371/journal.pcbi.1007819>.
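As a rough illustration of the decorrelation idea, the sketch below handles only two SNPs whose LD correlation is r, so the eigendecomposition of the LD matrix can be written in closed form. The function names are hypothetical and this is not the package's interface. It assumes the transformation is the symmetric inverse square root of the LD matrix and that the combined statistic is the sum of squared decorrelated values.

```python
import math

def dot_two_snps(z, r):
    """Decorrelate two Z-scores with LD correlation r (|r| < 1).

    Hypothetical helper: applies U diag(1/sqrt(lambda)) U^T to z, where
    U and lambda are the eigenvectors and eigenvalues of [[1, r], [r, 1]].
    """
    s = 1.0 / math.sqrt(2.0)
    # Closed-form eigendecomposition of the 2x2 correlation matrix.
    eigvecs = [(s, s), (s, -s)]
    eigvals = [1.0 + r, 1.0 - r]
    x = [0.0, 0.0]
    for (u0, u1), lam in zip(eigvecs, eigvals):
        coef = (u0 * z[0] + u1 * z[1]) / math.sqrt(lam)
        x[0] += coef * u0
        x[1] += coef * u1
    return x

def combined_statistic(x):
    """Sum of squared decorrelated values (assumed combination rule)."""
    return sum(v * v for v in x)

z = [2.0, 2.0]
x = dot_two_snps(z, r=0.5)
stat = combined_statistic(x)
# Under the null, a chi-square statistic with 2 degrees of freedom has
# survival function exp(-stat / 2).
p_value = math.exp(-stat / 2.0)
```

With more than two SNPs the same steps apply, but the eigendecomposition would come from a numerical linear algebra routine.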
Replication methods to compute basic statistical operations (means, standard deviations, frequency tables, percentiles, mean comparisons using weighted effect coding, generalized linear models, and linear multilevel models) in complex survey designs with multiply imputed or nested imputed variables and/or a clustered sampling structure, both of which require special procedures, at least for estimating standard errors. See the package documentation for a more detailed description along with references.
This package provides flexible odds ratio curves that enable modeling non-linear relationships between continuous predictors and binary outcomes. This package facilitates a deeper understanding of the impact of each continuous predictor on the outcome by presenting results in terms of odds ratio (OR) curves based on splines. These curves allow for comparison against a specified reference value, aiding in the interpretation of the predictor's effect.
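To make the reference-value idea concrete, here is a minimal sketch. The quadratic `log_odds` function is a made-up stand-in for a fitted spline and none of these names come from the package. The odds ratio at x relative to the reference is the exponential of the difference in log-odds.

```python
import math

def log_odds(x):
    # Hypothetical smooth log-odds curve standing in for a fitted spline.
    return -2.0 + 0.08 * x - 0.001 * x * x

def odds_ratio_curve(xs, ref):
    """Odds ratio at each x relative to the reference value ref."""
    base = log_odds(ref)
    return [math.exp(log_odds(x) - base) for x in xs]

grid = [20, 30, 40, 50, 60]
ors = odds_ratio_curve(grid, ref=40)
for x, value in zip(grid, ors):
    print(f"x = {x}: OR vs. 40 = {value:.3f}")
```

By construction the curve equals 1 at the reference value, so every other point reads directly as "odds relative to x = 40".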
Fair machine learning regression models which take sensitive attributes into account in model estimation. Currently implementing Komiyama et al. (2018) <http://proceedings.mlr.press/v80/komiyama18a/komiyama18a.pdf>, Zafar et al. (2019) <https://www.jmlr.org/papers/volume20/18-262/18-262.pdf> and my own approach from Scutari, Panero and Proissl (2022) <doi:10.1007/s11222-022-10143-w> that uses ridge regression to enforce fairness.
Fits a multivariate linear mixed effects model that uses a polygenic term, after Zhou & Stephens (2014) (<https://www.nature.com/articles/nmeth.2848>). Of particular interest is the estimation of variance components with restricted maximum likelihood (REML) methods. Genome-wide efficient mixed-model association (GEMMA), as implemented in the package gemma2, uses an expectation-maximization algorithm for variance components inference for use in quantitative trait locus studies.
Activate dark mode on your favorite ggplot2 theme with dark_mode() or use the dark versions of ggplot2 themes, including dark_theme_gray(), dark_theme_minimal(), and others. When a dark theme is applied, all geom color and geom fill defaults are changed to make them visible against a dark background. To restore the defaults to their original values, use invert_geom_defaults().
Launches a shiny-based application for Nuclear Magnetic Resonance (NMR) data import and Statistical TOtal Correlation SpectroscopY (STOCSY) analyses in a fully interactive approach. The theoretical background and applications of the STOCSY method can be found in Cloarec, O., Dumas, M. E., Craig, A., Barton, R. H., Trygg, J., Hudson, J., Blancher, C., Gauguier, D., Lindon, J. C., Holmes, E. & Nicholson, J. (2005) <doi:10.1021/ac048630x>.
This package provides a new method for clustering samples from multiple-modality data. The function M2SMjF() jointly factorizes multiple similarity matrices into a shared sub-matrix and several modality-private sub-matrices, which are then used for clustering. Along with this method, we also provide a function to calculate the similarity matrix and a function to evaluate the best cluster number from the original data.
This package provides a collection of statistical tests for the detection of differential item functioning (DIF) in multistage tests. Methods entail logistic regression, an adaptation of the simultaneous item bias test (SIBTEST), and various score-based tests. The presented tests provide itemwise tests for DIF along categorical, ordinal or metric covariates. Tests for uniform and non-uniform DIF effects are available, depending on which method is used.
This package performs nonparametric analysis of longitudinal data in factorial experiments. Longitudinal data are those which are collected from the same subjects over time, and they frequently arise in biological sciences. Nonparametric methods do not require distributional assumptions, and are applicable to a variety of data types (continuous, discrete, purely ordinal, and dichotomous). Such methods are also robust with respect to outliers and for small sample sizes.
Estimates DNA target concentration by classifying digital PCR (polymerase chain reaction) droplets as positive, negative, or rain, using Expectation-Maximization Clustering. The fitting is accomplished using the EMMIXskew R package (v. 1.0.3) by Kui Wang, Angus Ng, and Geoff McLachlan (2018) as based on their paper "Multivariate Skew t Mixture Models: Applications to Fluorescence-Activated Cell Sorting Data" <doi:10.1109/DICTA.2009.88>.
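Once droplets have been classified, a concentration can be derived from the positive fraction with the standard Poisson correction used in digital PCR. The sketch below shows only that generic step, not the package's own estimator. It assumes rain droplets have already been assigned or excluded, and the default droplet volume of 0.85 nL is an illustrative assumption. All function names are hypothetical.

```python
import math

def copies_per_droplet(n_positive, n_total):
    """Mean target copies per droplet via the Poisson correction."""
    if not 0 <= n_positive < n_total:
        raise ValueError("need 0 <= n_positive < n_total")
    p = n_positive / n_total
    # P(droplet is negative) = exp(-lambda), so lambda = -ln(1 - p).
    return -math.log(1.0 - p)

def concentration_per_ul(n_positive, n_total, droplet_volume_nl=0.85):
    """Copies per microlitre; the droplet volume is an assumed value."""
    lam = copies_per_droplet(n_positive, n_total)
    return lam / (droplet_volume_nl * 1e-3)  # convert nL to uL

lam = copies_per_droplet(5000, 10000)
conc = concentration_per_ul(5000, 10000)
```

The correction matters because a positive droplet may contain more than one copy, so the raw positive fraction underestimates the copy count at higher concentrations.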
Estimation of the Spatially Smoothed Minimum Regularized Determinant (ssMRCD) estimator and its usage in an ssMRCD-based outlier detection method as described in Puchhammer and Filzmoser (2023) <doi:10.1080/10618600.2023.2277875> and for sparse robust PCA for multi-source data described in Puchhammer, Wilms and Filzmoser (2024) <doi:10.48550/arXiv.2407.16299>. Included are also complementary visualization and parameter tuning tools.
An R API providing access to a relational database with macroeconomic time series data for South Africa, obtained from the South African Reserve Bank (SARB) and Statistics South Africa (STATSSA), and updated on a weekly basis via the EconData <https://www.econdata.co.za/> platform and automated scraping of the SARB and STATSSA websites. The database is maintained at the Department of Economics at Stellenbosch University.
Identifies the optimal subset of a desired number of items to retain in a short version of a psychometric instrument so that it assesses the "broadest" proportion of the construct-level content covered by the items of the original instrument. Expects a symmetric adjacency matrix as input (undirected weighted network model). Supports brute force and simulated annealing combinatorial search algorithms.
Estimation of group-based trajectory models, including finite mixture models for longitudinal data, supporting censored normal, zero-inflated Poisson, logit, and beta distributions, using expectation-maximization and quasi-Newton methods, with tools for model selection, diagnostics, and visualization of latent trajectory groups. See Nagin, D. (2005), Group-Based Modeling of Development, Cambridge, MA: Harvard University Press <doi:10.4159/9780674041318>, and Noel (2022), thesis <https://orbilu.uni.lu/>.
This package provides low-level access to GDAL functionality. GDAL is the Geospatial Data Abstraction Library, a translator for raster and vector geospatial data formats that presents a single raster abstract data model and a single vector abstract data model to the calling application for all supported formats <https://gdal.org/>. This package is focussed on providing exactly and only what GDAL does, to enable developing further tools.
Conditional distance correlation <doi:10.1080/01621459.2014.993081> is a novel conditional dependence measure for two multivariate random variables given a confounding variable. This package provides conditional distance correlation, performs the conditional distance correlation sure independence screening procedure for ultrahigh dimensional data <https://www3.stat.sinica.edu.tw/statistica/J28N1/J28N114/J28N114.html>, and conducts the conditional distance covariance test of the conditional independence assumption for two multivariate variables.
Threshold regression models are also called two-phase regression, broken-stick regression, split-point regression, structural change models, and regression kink models, with and without interaction terms. Methods for both continuous and discontinuous threshold models are included, but the support for the former is much greater. This package is described in Fong, Huang, Gilbert and Permar (2017) <DOI:10.1186/s12859-017-1863-x> and the package vignette.
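For intuition, a continuous threshold ("hinge") model can be written as y = b0 + b1 * max(x - e, 0) with an unknown threshold e. The sketch below fits it by a plain grid search over candidate thresholds with ordinary least squares at each candidate. It is an illustrative simplification, not the estimation procedure of the package, and all names are hypothetical.

```python
def fit_line(h, y):
    """Ordinary least squares of y on h; returns (b0, b1, rss) or None."""
    n = len(h)
    mh = sum(h) / n
    my = sum(y) / n
    sxx = sum((a - mh) ** 2 for a in h)
    if sxx == 0:
        return None  # regressor is constant; slope not identifiable
    b1 = sum((a - mh) * (b - my) for a, b in zip(h, y)) / sxx
    b0 = my - b1 * mh
    rss = sum((b - b0 - b1 * a) ** 2 for a, b in zip(h, y))
    return b0, b1, rss

def fit_hinge(x, y, candidates):
    """Grid search for the threshold that minimizes the residual sum of squares."""
    best = None
    for e in candidates:
        h = [max(xi - e, 0.0) for xi in x]  # hinge term (x - e)_+
        fit = fit_line(h, y)
        if fit is None:
            continue
        b0, b1, rss = fit
        if best is None or rss < best[3]:
            best = (e, b0, b1, rss)
    return best

# Noise-free toy data with a true threshold at x = 4.
x = [float(i) for i in range(11)]
y = [1.0 + 2.0 * max(xi - 4.0, 0.0) for xi in x]
e_hat, b0, b1, rss = fit_hinge(x, y, candidates=x[1:-1])
```

A discontinuous variant would add a jump term such as an indicator of x > e to the regression. The grid search over e stays the same.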
This package provides a variety of functions to analyze and model geostatistical count data with Gaussian copulas, including 1) data simulation and visualization; 2) correlation structure assessment (here also known as the Normal To Anything); 3) calculation of multivariate normal rectangle probabilities; 4) likelihood inference and parallel prediction at predictive locations. A description of the method is available in Han and De Oliveira (2018) <doi:10.18637/jss.v087.i13>.
This package provides a grammar of graphics approach for visualizing summary statistics from multiple Genome-wide Association Studies (GWAS). It offers geneticists, bioinformaticians, and researchers a powerful yet flexible tool for illustrating complex genetic associations using data from various GWAS datasets. The visualizations can be extensively customized, facilitating detailed comparative analysis across different genetic studies. Reference: Uffelmann, E. et al. (2021) <doi:10.1038/s43586-021-00056-9>.
This package provides tools for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically displaying results from a full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables, and functions to work with hierarchical/multilevel batches of parameters (Fernández-i-Marín, 2016 <doi:10.18637/jss.v070.i09>).
This package provides a Kriging method for functional datasets with spatial dependency. This functional Kriging method avoids the need to estimate the trace-variogram, and the curve is estimated by minimizing a quadratic form. The curves in the functional dataset are smoothed using Fourier series. The functional Kriging of this package is a modification of the method proposed by Giraldo (2011) <doi:10.1007/s10651-010-0143-y>.
The R function gawdis() produces multi-trait dissimilarity with more uniform contributions of different traits. de Bello et al. (2021) <doi:10.1111/2041-210X.13537> presented the approach based on minimizing the differences in the correlation between the dissimilarity of each trait, or groups of traits, and the multi-trait dissimilarity. This is done using either an analytic or a numerical solution, both available in the function.