Efficient implementations of cross-validation techniques for linear and ridge regression models, leveraging C++ code with the 'Rcpp', 'RcppParallel', and 'Eigen' libraries. It supports leave-one-out, generalized, and K-fold cross-validation methods, utilizing 'Eigen' matrices for high performance. Methodology references: Hastie, Tibshirani, and Friedman (2009) <doi:10.1007/978-0-387-84858-7>.
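The leave-one-out and generalized criteria named above have closed forms for least squares that avoid refitting the model n times. The following is a minimal sketch in plain Python, not the package's C++ code: the function names are hypothetical, and the model is restricted to one predictor plus an intercept so that the hat-matrix diagonal can be written out by hand.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b1 * xbar, b1


def loocv_shortcut(x, y):
    """Leave-one-out CV error from a single fit, using e_i / (1 - h_ii)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    total = 0.0
    for xi, yi in zip(x, y):
        leverage = 1.0 / n + (xi - xbar) ** 2 / sxx  # diagonal of the hat matrix
        total += ((yi - b0 - b1 * xi) / (1.0 - leverage)) ** 2
    return total / n


def loocv_brute_force(x, y):
    """Leave-one-out CV error by refitting n times (reference implementation)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - b0 - b1 * x[i]) ** 2
    return total / n


def gcv(x, y):
    """Generalized CV: replaces each leverage by the average trace(H) / n."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    trace_h = 2.0  # intercept + slope
    return (rss / n) / (1.0 - trace_h / n) ** 2
```

Calling `loocv_shortcut` and `loocv_brute_force` on the same lists should agree up to rounding, which is the property that makes the single-fit formula attractive for large data.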
Latent process embedding for functional network data with the Functional Adjacency Spectral Embedding. Fits smooth latent processes based on cubic spline bases. Also generates functional network data from three models, and evaluates a network generalized cross-validation criterion for dimension selection. For more information, see MacDonald, Zhu and Levina (2022+) <arXiv:2210.07491>.
Several group factor analysis algorithms are implemented, including Canonical Correlation-based Estimation by Choi et al. (2021) <doi:10.1016/j.jeconom.2021.09.008>, Generalised Canonical Correlation Estimation by Lin and Shin (2023) <doi:10.2139/ssrn.4295429>, Circularly Projected Estimation by Chen (2022) <doi:10.1080/07350015.2022.2051520>, and the Aggregated Projection method.
The half-weight index gregariousness (HWIG) is an association index used in social network analyses. It extends the half-weight association index (HWI), correcting for level of gregariousness in individuals. It is calculated using group by individual data according to methods described in Godde et al. (2013) <doi:10.1016/j.anbehav.2012.12.010>.
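To make the two indices above concrete, here is a small sketch in plain Python rather than the package's R interface. It assumes each observed group is a set of individual identifiers, that no pair is ever recorded in separate groups within the same sampling period, and that the correction multiplies each HWI by the sum of all pairwise HWIs divided by the product of the two individuals' summed HWIs. Counting each unordered pair once in that total is an assumption of this sketch; Godde et al. (2013) is the authority on the exact normalisation. All function names are hypothetical.

```python
from itertools import combinations


def half_weight_index(groups):
    """HWI for every pair: 2 * (times together) / (sightings of a + sightings of b)."""
    individuals = sorted(set().union(*groups))
    sightings = {ind: sum(ind in g for g in groups) for ind in individuals}
    hwi = {}
    for a, b in combinations(individuals, 2):
        together = sum(a in g and b in g for g in groups)
        hwi[(a, b)] = 2.0 * together / (sightings[a] + sightings[b])
    return hwi


def gregariousness_corrected(hwi):
    """Assumed HWIG form: HWI_ab * sum(HWI) / (sum(HWI_a) * sum(HWI_b))."""
    total = sum(hwi.values())  # assumption: each unordered pair counted once
    gregariousness = {}
    for (a, b), value in hwi.items():
        gregariousness[a] = gregariousness.get(a, 0.0) + value
        gregariousness[b] = gregariousness.get(b, 0.0) + value
    return {
        (a, b): value * total / (gregariousness[a] * gregariousness[b])
        for (a, b), value in hwi.items()
        if gregariousness[a] > 0 and gregariousness[b] > 0  # skip isolated individuals
    }
```

Under this form, a corrected value above 1 indicates a pair that associates more often than the two individuals' overall sociability would suggest.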
Creating effective colour palettes for figures is challenging. This package generates and plots palettes of optimally distinct colours in perceptually uniform colour space, based on iwanthue <http://tools.medialab.sciences-po.fr/iwanthue/>. This is done through k-means clustering of CIE Lab colour space, according to user-selected constraints on hue, chroma, and lightness.
This package implements an efficient algorithm to fit and tune penalized quantile regression models using the generalized coordinate descent algorithm. Designed to handle high-dimensional datasets effectively, with emphasis on precision and computational efficiency. This package implements the algorithms proposed in Tang, Q., Zhang, Y., & Wang, B. (2022) <https://openreview.net/pdf?id=RvwMTDYTOb>.
Calculates Model-Averaged Tail Area Wald (MATA-Wald) confidence intervals, and MATA-Wald confidence densities and distributions, which are constructed using single-model frequentist estimators and model weights. See Turek and Fletcher (2012) <doi:10.1016/j.csda.2012.03.002> and Fletcher et al. (2019) <doi:10.1007/s10651-019-00432-5> for details.
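The tail-area idea above can be illustrated with a short sketch in plain Python. It is not the package's interface: the function name is hypothetical, the weights are assumed to sum to one, and a standard normal reference distribution is assumed for every model in place of model-specific t distributions. Each limit is the parameter value at which the weighted average of the single-model tail areas equals the nominal tail probability, found here by bisection.

```python
from statistics import NormalDist


def mata_wald_interval(estimates, std_errors, weights, level=0.95):
    """Model-averaged tail area interval under a normal approximation (assumption)."""
    alpha = 1.0 - level
    cdf = NormalDist().cdf

    def averaged_tail(theta):
        # Weighted average of single-model lower-tail areas; increasing in theta.
        return sum(
            w * cdf((theta - est) / se)
            for est, se, w in zip(estimates, std_errors, weights)
        )

    def solve(target):
        # Bisection on a bracket wide enough that the tail areas are ~0 and ~1.
        lo = min(estimates) - 10.0 * max(std_errors)
        hi = max(estimates) + 10.0 * max(std_errors)
        for _ in range(200):
            mid = (lo + hi) / 2.0
            if averaged_tail(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    return solve(alpha / 2.0), solve(1.0 - alpha / 2.0)
```

With a single model and weight 1, the result reduces to the usual Wald interval, which gives a convenient sanity check for the sketch.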
This package provides functions for dimension reduction, using MAVE (Minimum Average Variance Estimation), OPG (Outer Product of Gradient) and KSIR (sliced inverse regression of kernel version). Methods for selecting the best dimension are also included. Xia (2002) <doi:10.1111/1467-9868.03411>; Xia (2007) <doi:10.1214/009053607000000352>; Wang (2008) <doi:10.1198/016214508000000418>.
This package implements an MCMC sampler for the posterior distribution of arbitrary time-homogeneous multivariate stochastic differential equation (SDE) models with possibly latent components. The package provides a simple entry point to integrate user-defined models directly with the sampler's C++ code, and parallelizes large portions of the calculations when compiled with 'OpenMP'.
The ntfy (pronounce: notify) service is a simple HTTP-based pub-sub notification service. It allows you to send notifications to your phone or desktop via scripts from any computer, entirely without signup, cost or setup. It's also open source if you want to run your own. Visit <https://ntfy.sh> for more details.
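Because the service described above is plain HTTP, a notification is just a POST of the message body to a topic URL. The sketch below uses only Python's standard library and is not the R package's interface; the function names and the topic are hypothetical, and the optional `Title` header follows the ntfy publishing convention. Building the request is kept separate from sending it so the request can be inspected without network access.

```python
import urllib.request


def build_ntfy_request(topic, message, title=None, server="https://ntfy.sh"):
    """Prepare a POST request that publishes `message` to `topic`."""
    headers = {}
    if title is not None:
        headers["Title"] = title  # optional notification title
    return urllib.request.Request(
        f"{server}/{topic}",
        data=message.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example (performs a real network call, so it is left commented out):
# send(build_ntfy_request("my-hypothetical-topic", "Backup finished", title="Cron"))
```

Pointing the `server` argument at a self-hosted instance would keep the same code working for private deployments.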
Miscellaneous R functions developed as collateral damage over the course of work in statistical and scientific computing for research. These include, for example, utilities that supplement existing idiosyncrasies of the R language, extend existing plotting functionality and aesthetics, help prepare data objects for imputation, and extend access to command line tools and systems-level information.
Fits a non-linear transformation model ('nltm') for analyzing survival data, see Tsodikov (2003) <doi:10.1111/1467-9868.00414>. The class of nltm includes the following currently supported models: Cox proportional hazard, proportional hazard cure, proportional odds, proportional hazard - proportional hazard cure, proportional hazard - proportional odds cure, Gamma frailty, and proportional hazard - proportional odds.
Calculate superior identification index and its extensions. Measure the performance of journals based on how well they could identify the top papers by any index (e.g. citation indices) according to Huang & Yang (2022) <doi:10.1007/s11192-022-04372-z>. These methods could be extended to evaluate other entities such as institutes, countries, etc.
The goal of SIHR is to provide inference procedures in the high-dimensional generalized linear regression setting for: (1) linear functionals <doi:10.48550/arXiv.1904.12891> <doi:10.48550/arXiv.2012.07133>, (2) conditional average treatment effects, (3) quadratic functionals <doi:10.48550/arXiv.1909.01503>, (4) inner product, (5) distance.
The zlib package for R aims to offer an R-based equivalent of Python's built-in zlib module for data compression and decompression. This package provides a suite of functions for working with zlib compression, including utilities for compressing and decompressing data streams, manipulating compressed files, and working with 'gzip', 'zlib', and 'deflate' formats.
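For orientation, this is what the three formats look like in the Python module that the package aims to mirror. The sketch shows the Python side only and makes no claim about the R function names; `round_trip` is a hypothetical helper. In Python's zlib, the `wbits` argument selects the container: 15 for a zlib stream, 31 for gzip, and -15 for raw deflate.

```python
import zlib

# wbits values understood by Python's zlib for each container format.
FORMATS = {"zlib": 15, "gzip": 31, "deflate": -15}


def round_trip(data, wbits):
    """Compress `data` with a streaming compressor, then decompress it again."""
    compressor = zlib.compressobj(level=6, wbits=wbits)
    packed = compressor.compress(data) + compressor.flush()
    return packed, zlib.decompress(packed, wbits=wbits)


payload = b"compress me " * 50
results = {name: round_trip(payload, wbits) for name, wbits in FORMATS.items()}
```

Each entry of `results` holds the compressed bytes and the restored payload; the gzip variant carries the familiar two-byte magic header, while raw deflate has no header or checksum at all.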
Discovery of genome-wide variable alternative splicing events from short-read RNA-seq data and visualizations of gene splicing information for publication-quality multi-panel figures in a population. (Warning: the visualization function has been removed because the 'Sushi' package it depends on is deprecated. To use it, please revert to an older version.)
Rlwrap is a 'readline wrapper', a small utility that uses the GNU readline library to allow the editing of keyboard input for any command. You should consider rlwrap especially when you need user-defined completion (by way of completion word lists) and persistent history, or if you want to program 'special effects' using the filter mechanism.
This package provides a method for automatic detection of peaks in noisy periodic and quasi-periodic signals. This method, called automatic multiscale-based peak detection (AMPD), is based on the calculation and analysis of the local maxima scalogram, a matrix comprising the scale-dependent occurrences of local maxima. For further information see <doi:10.3390/a5040588>.
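The scalogram idea above can be sketched in a few lines of plain Python. This is a simplified, deterministic variant and not the package's implementation: the published method fills non-maxima with random values and looks for zero column variance, whereas this sketch stores booleans and requires a sample to be a local maximum at every scale up to the one with the most maxima. The function names are hypothetical.

```python
import math


def detrend(x):
    """Remove the least-squares straight line from the signal."""
    n = len(x)
    tbar = (n - 1) / 2.0
    xbar = sum(x) / n
    stt = sum((t - tbar) ** 2 for t in range(n))
    slope = sum((t - tbar) * (v - xbar) for t, v in enumerate(x)) / stt
    return [v - xbar - slope * (t - tbar) for t, v in enumerate(x)]


def ampd_peaks(signal):
    """Indices that are local maxima at every scale up to the selected one."""
    x = detrend(signal)
    n = len(x)
    max_scale = math.ceil(n / 2) - 1
    # scalogram[k - 1][i] is True when x[i] exceeds both neighbours k steps away;
    # samples closer than k to either edge are never counted as maxima.
    scalogram = [
        [k <= i < n - k and x[i] > x[i - k] and x[i] > x[i + k] for i in range(n)]
        for k in range(1, max_scale + 1)
    ]
    counts = [sum(row) for row in scalogram]
    best_scale = counts.index(max(counts)) + 1  # scale holding the most local maxima
    return [i for i in range(n) if all(scalogram[k][i] for k in range(best_scale))]


# Demo signal: five periods of a 20-sample cycle with crests at 10, 30, 50, 70, 90.
demo_signal = [-math.cos(2 * math.pi * t / 20) for t in range(100)]
demo_peaks = ampd_peaks(demo_signal)
```

One consequence of the edge handling is that a crest lying closer to either end of the signal than the selected scale cannot be reported.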
Anytime-valid inference for linear models, namely, sequential t-tests, sequential F-tests, and confidence sequences with time-uniform Type-I error and coverage guarantees. This allows hypotheses to be continuously tested without sacrificing false positive guarantees. It is based on the methods documented in Lindon et al. (2022) <doi:10.48550/arXiv.2210.08589>.
Assists researchers and output checkers by distinguishing between research output that is safe to publish, output that requires further analysis, and output that cannot be published because of substantial disclosure risk. A paper about the tool was presented at the UNECE Expert Meeting on Statistical Data Confidentiality 2023; see <https://uwe-repository.worktribe.com/output/11060964>.
This package implements the Agnostic Fay-Herriot model, an extension of the traditional small area model. In place of normal sampling errors, the sampling error distribution is estimated with a Gaussian process to accommodate a broader class of distributions. This flexibility is most useful in the presence of bounded, multi-modal, or heavily skewed sampling errors.
Create correlation (or partial correlation) matrices. Correlation matrices are formatted with significance stars based on user preferences. Matrices of coefficients, p-values, and number of pairwise observations are returned. Send resultant formatted matrices to the clipboard to be pasted into Excel and other programs. A plot method allows users to visualize correlation matrices created with 'corx'.
Cobb's maximum likelihood method for cusp-catastrophe modeling (Grasman, van der Maas, and Wagenmakers (2009) <doi:10.18637/jss.v032.i08>; Cobb (1981), Behavioral Science, 26(1), 75-78). Includes a cusp() function for model fitting, and several utility functions for plotting, and for comparing the model to linear regression and logistic curve models.
Connect to the California Data Exchange Center (CDEC) Web Service <http://cdec.water.ca.gov/>. CDEC provides a centralized database to store, process, and exchange real-time hydrologic information gathered by various cooperators throughout California. The CDEC Web Service <http://cdec.water.ca.gov/dynamicapp/wsSensorData> provides a data download service for accessing historical records.