This package provides functions to Simultaneously Infer Causal Graphs and Genetic Architecture. Includes acyclic and cyclic graphs for data from an experimental cross with a modest number (<10) of phenotypes driven by a few genetic loci (QTL). Chaibub Neto E, Keller MP, Attie AD, Yandell BS (2010) Causal Graphical Models in Systems Genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Annals of Applied Statistics 4: 320-339. <doi:10.1214/09-AOAS288>.
Calculates the slope (longitudinal gradient or steepness) of linear geographic features such as roads (for more details, see Ariza-López et al. (2019) <doi:10.1038/s41597-019-0147-x>) and rivers (for more details, see Cohen et al. (2018) <doi:10.1016/j.jhydrol.2018.06.066>). It can use local Digital Elevation Model (DEM) data or download DEM data via the ceramic package. The package also provides functions to add elevation data to linestrings and visualize elevation profiles.
Based on the illness-death model a large number of clinical trials with oncology endpoints progression-free survival (PFS) and overall survival (OS) can be simulated, see Meller, Beyersmann and Rufibach (2019) <doi:10.1002/sim.8295>. The simulation set-up allows for random and event-driven censoring, an arbitrary number of treatment arms, staggered study entry and drop-out. Exponentially, Weibull and piecewise exponentially distributed survival times can be generated. The correlation between PFS and OS can be calculated.
This package provides functions for estimation (parametric, semi-parametric and non-parametric) of copula-based dependence coefficients between a finite collection of random vectors, including phi-dependence measures and Bures-Wasserstein dependence measures. An algorithm for agglomerative hierarchical variable clustering is also implemented. Following the articles De Keyser & Gijbels (2024) <doi:10.1016/j.jmva.2024.105336>, De Keyser & Gijbels (2024) <doi:10.1016/j.ijar.2023.109090>, and De Keyser & Gijbels (2024) <doi:10.48550/arXiv.2404.07141>
.
This package is for building isoscapes using mixed models and inferring the geographic origin of samples based on their isotopic ratios. This package is essentially a simplified interface to several other packages which implements a new statistical framework based on mixed models. It uses spaMM
for fitting and predicting isoscapes, and assigning an organism's origin depending on its isotopic ratio. IsoriX
also relies heavily on the package rasterVis
for plotting the maps produced with terra using lattice'.
This package contains an implementation of a function digest()
for the creation of hash digests of arbitrary R objects (using the md5, sha-1, sha-256, crc32, xxhash and murmurhash algorithms) permitting easy comparison of R language objects, as well as a function hmac()
to create hash-based message authentication code.
Please note that this package is not meant to be deployed for cryptographic purposes for which more comprehensive (and widely tested) libraries such as OpenSSL should be used.
This package provides functions and data sets for actuarial science: modeling of loss distributions; risk theory and ruin theory; simulation of compound models, discrete mixtures and compound hierarchical models; credibility theory. It boasts support for many additional probability distributions to model insurance loss amounts and loss frequency: 19 continuous heavy tailed distributions; the Poisson-inverse Gaussian discrete distribution; zero-truncated and zero-modified extensions of the standard discrete distributions. It also supports phase-type distributions commonly used to compute ruin probabilities.
Writing interfaces to command line software is cumbersome. The cmdfun package provides a framework for building function calls to seamlessly interface with shell commands by allowing lazy evaluation of command line arguments. It also provides methods for handling user-specific paths to tool installs or secrets like API keys. Its focus is to equally serve package builders who wish to wrap command line software, and to help analysts stay inside R when they might usually leave to execute non-R software.
Bisulfite-treated RNA non-conversion in a set of samples is analysed as follows : each sample's non-conversion distribution is identified to a Poisson distribution. P-values adjusted for multiple testing are calculated in each sample. Combined non-conversion P-values and standard errors are calculated on the intersection of the set of samples. For further details, see C Legrand, F Tuorto, M Hartmann, R Liebers, D Jakob, M Helm and F Lyko (2017) <doi:10.1101/gr.210666.116>.
The bootstrap ARDL tests for cointegration is the main functionality of this package. It also acts as a wrapper of the most commond ARDL testing procedures for cointegration: the bound tests of Pesaran, Shin and Smith (PSS; 2001 - <doi:10.1002/jae.616>) and the asymptotic test on the independent variables of Sam, McNown
and Goh (SMG: 2019 - <doi:10.1016/j.econmod.2018.11.001>). Bootstrap and bound tests are performed under both the conditional and unconditional ARDL models.
Computes solutions for linear and logistic regression models with potentially high-dimensional categorical predictors. This is done by applying a nonconvex penalty (SCOPE) and computing solutions in an efficient path-wise fashion. The scaling of the solution paths is selected automatically. Includes functionality for selecting tuning parameter lambda by k-fold cross-validation and early termination based on information criteria. Solutions are computed by cyclical block-coordinate descent, iterating an innovative dynamic programming algorithm to compute exact solutions for each block.
This package provides a variety of methods to identify data quality issues in process-oriented data, which are useful to verify data quality in a process mining context. Builds on the class for activity logs implemented in the package bupaR
'. Methods to identify data quality issues either consider each activity log entry independently (e.g. missing values, activity duration outliers,...), or focus on the relation amongst several activity log entries (e.g. batch registrations, violations of the expected activity order,...).
The purpose of this package is to tests whether a given moment of the distribution of a given sample is finite or not. For heavy-tailed distributions with tail exponent b, only moments of order smaller than b are finite. Tail exponent and heavy- tailedness are notoriously difficult to ascertain. But the finiteness of moments (including fractional moments) can be tested directly. This package does that following the test suggested by Trapani (2016) <doi:10.1016/j.jeconom.2015.08.006>.
Automatically performs desired statistical tests (e.g. wilcox.test()
, t.test()
) to compare between groups, and adds the resulting p-values to the plot with an annotation bar. Visualizing group differences are frequently performed by boxplots, bar plots, etc. Statistical test results are often needed to be annotated on these plots. This package provides a convenient function that works on ggplot2 objects, performs the desired statistical test between groups of interest and annotates the test results on the plot.
This package implements regression models for bounded continuous data in the open interval (0,1) using the five-parameter Generalized Kumaraswamy distribution. Supports modeling all distribution parameters (alpha, beta, gamma, delta, lambda) as functions of predictors through various link functions. Provides efficient maximum likelihood estimation via Template Model Builder ('TMB'), offering comprehensive diagnostics, model comparison tools, and simulation methods. Particularly useful for analyzing proportions, rates, indices, and other bounded response data with complex distributional features not adequately captured by simpler models.
Addons for the mice package to perform multiple imputation using chained equations with two-level data. Includes imputation methods dedicated to sporadically and systematically missing values. Imputation of continuous, binary or count variables are available. Following the recommendations of Audigier, V. et al (2018) <doi:10.1214/18-STS646>, the choice of the imputation method for each variable can be facilitated by a default choice tuned according to the structure of the incomplete dataset. Allows parallel calculation and overimputation for mice'.
This package offers three important components: (1) to construct a use-defined linear mixed model, (2) to employ one of linear mixed model approaches: minimum norm quadratic unbiased estimation (MINQUE) (Rao, 1971) for variance component estimation and random effect prediction; and (3) to employ a jackknife resampling technique to conduct various statistical tests. In addition, this package provides the function for model or data evaluations.This R package offers fast computations for large data sets analyses for various irregular data structures.
Recently, multiple marginal variable selection methods have been developed and shown to be effective in Gene-Environment interactions studies. We propose a novel marginal Bayesian variable selection method for Gene-Environment interactions studies. In particular, our marginal Bayesian method is robust to data contamination and outliers in the outcome variables. With the incorporation of spike-and-slab priors, we have implemented the Gibbs sampler based on Markov Chain Monte Carlo. The core algorithms of the package have been developed in C++'.
Segregation is a network-level property such that edges between predefined groups of vertices are relatively less likely. Network homophily is a individual-level tendency to form relations with people who are similar on some attribute (e.g. gender, music taste, social status, etc.). In general homophily leads to segregation, but segregation might arise without homophily. This package implements descriptive indices measuring homophily/segregation. It is a computational companion to Bojanowski & Corten (2014) <doi:10.1016/j.socnet.2014.04.001>.
This package implements a unified framework of parametric simplex method for a variety of sparse learning problems (e.g., Dantzig selector (for linear regression), sparse quantile regression, sparse support vector machines, and compressive sensing) combined with efficient hyper-parameter selection strategies. The core algorithm is implemented in C++ with Eigen3 support for portable high performance linear algebra. For more details about parametric simplex method, see Haotian Pang (2017) <https://papers.nips.cc/paper/6623-parametric-simplex-method-for-sparse-learning.pdf>.
The plsdof package provides Degrees of Freedom estimates for Partial Least Squares (PLS) Regression. Model selection for PLS is based on various information criteria (aic, bic, gmdl) or on cross-validation. Estimates for the mean and covariance of the PLS regression coefficients are available. They allow the construction of approximate confidence intervals and the application of test procedures (Kramer and Sugiyama 2012 <doi:10.1198/jasa.2011.tm10107>). Further, cross-validation procedures for Ridge Regression and Principal Components Regression are available.
This package provides functions to infer co-mapping trait hotspots and causal models. Chaibub Neto E, Keller MP, Broman AF, Attie AD, Jansen RC, Broman KW, Yandell BS (2012) Quantile-based permutation thresholds for QTL hotspots. Genetics 191 : 1355-1365. <doi:10.1534/genetics.112.139451>. Chaibub Neto E, Broman AT, Keller MP, Attie AD, Zhang B, Zhu J, Yandell BS (2013) Modeling causality for pairs of phenotypes in system genetics. Genetics 193 : 1003-1013. <doi:10.1534/genetics.112.147124>.
This software provides tools for quantitative trait mapping in populations such as advanced intercross lines where relatedness among individuals should not be ignored. It can estimate background genetic variance components, impute missing genotypes, simulate genotypes, perform a genome scan for putative quantitative trait loci (QTL), and plot mapping results. It also has functions to calculate identity coefficients from pedigrees, especially suitable for pedigrees that consist of a large number of generations, or estimate identity coefficients from genotypic data in certain circumstances.
This package implements the following approaches for multidimensional scaling (MDS) based on stress minimization using majorization (smacof): ratio/interval/ordinal/spline MDS on symmetric dissimilarity matrices, MDS with external constraints on the configuration, individual differences scaling (idioscal, indscal), MDS with spherical restrictions, and ratio/interval/ordinal/spline unfolding (circular restrictions, row-conditional). Various tools and extensions like jackknife MDS, bootstrap MDS, permutation tests, MDS biplots, gravity models, unidimensional scaling, drift vectors (asymmetric MDS), classical scaling, and Procrustes are implemented as well.