This package implements a kernel-based association test for copy number variation (CNV) aggregate analysis in a given genomic region (e.g., gene set, chromosome, or genome). The test is robust to within-locus and across-locus etiological heterogeneity, and bypasses the need to define a "locus" unit for CNVs. See Brucker, A., et al. (2020) <doi:10.1101/666875>.
This package contains functions for operations with fuzzy cognitive maps using t-norm and s-norm operators. T-norms and S-norms are described by Dov M. Gabbay and George Metcalfe (2007) <doi:10.1007/s00153-007-0047-1>. System indicators are described by Cox, Earl D. (1995) <isbn:1886801010>. Executable examples are provided in the "inst/examples" folder.
This package contains Rcpp and RcppEigen implementations of matrix operations useful for Gaussian process models, such as the inversion of a symmetric Toeplitz matrix, sampling from multivariate normal distributions, evaluation of the log-density of a multivariate normal vector, and Bayesian inference for latent variable Gaussian process models with elliptical slice sampling (Murray, Adams, and MacKay 2010).
This Rcpp-based package implements highly efficient functions for calculating the Jonckheere-Terpstra statistic. It can be used for a variety of applications, including feature selection in machine learning problems and genome-wide association studies (GWAS) with multiple quantitative phenotypes. The code leverages OpenMP directives for multi-core computing to reduce overall processing time.
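As a rough illustration of the statistic itself, the sketch below computes the textbook Jonckheere-Terpstra statistic by brute force: for every ordered pair of groups it counts the observation pairs that increase, with ties counted as one half. The function name `jonckheere_terpstra` is hypothetical, and this is not the package's Rcpp/OpenMP implementation, only a small reference version under the standard definition.

```python
from itertools import combinations


def jonckheere_terpstra(groups):
    """Brute-force Jonckheere-Terpstra statistic (hypothetical helper).

    `groups` is a list of samples, given in the hypothesized increasing order.
    """
    stat = 0.0
    # combinations() keeps the input order, so `earlier` always precedes `later`.
    for earlier, later in combinations(groups, 2):
        for x in earlier:
            for y in later:
                if x < y:
                    stat += 1.0   # pair agrees with the hypothesized ordering
                elif x == y:
                    stat += 0.5   # ties contribute one half
    return stat


print(jonckheere_terpstra([[1, 2], [3, 4], [5, 6]]))  # perfectly ordered groups
```

The nested loops cost time quadratic in the total sample size, which is why a compiled, parallel implementation matters when the statistic is evaluated for many features or phenotypes.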
Extends the capabilities of ggplot2 by providing grammatical elements and plot helpers designed for visualizing temporal patterns. The package implements a grammar of temporal graphics, which leverages calendar structures to highlight changes over time. The package also provides plot helper functions to quickly produce commonly used time series graphics, including time plots, season plots, and seasonal sub-series plots.
Approximate frequentist inference for generalized linear mixed model analysis with expectation propagation used to circumvent the need for multivariate integration. In this version, the random effects can be of any reasonable dimension. However, only probit mixed models with one level of nesting are supported. The methodology is described in Hall, Johnstone, Ormerod, Wand and Yu (2018) <arXiv:1805.08423v1>.
An RStudio Addin for Hippie Expand (AKA Hippie Code Completion or Cyclic Expand Word). This type of completion searches for matching tokens within the user's current source editor file, regardless of file type. By searching only within the current source file, hippie offers a fast way to identify and insert completions that appear around the user's cursor.
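The idea of searching the current file for matching tokens can be sketched as follows. This is a minimal illustration, not the addin's code: the function name `hippie_candidates`, the `\w+` token definition, and the choice to rank candidates by character distance from the cursor are all assumptions made for the example.

```python
import re


def hippie_candidates(text, cursor):
    """Return completions for the word fragment ending at `cursor` (hypothetical helper)."""
    match = re.search(r"\w+$", text[:cursor])
    if match is None:
        return []                      # no word fragment before the cursor
    prefix, prefix_start = match.group(0), match.start()

    scored = []
    for token in re.finditer(r"\w+", text):
        word = token.group(0)
        if token.start() == prefix_start:
            continue                   # skip the fragment being typed
        if word.startswith(prefix) and word != prefix:
            scored.append((abs(token.start() - cursor), word))

    # Nearest occurrences first; keep each word once.
    ordered = []
    for _, word in sorted(scored):
        if word not in ordered:
            ordered.append(word)
    return ordered


source = "alpha_beta = 1\nalphabet = 2\nalp"
print(hippie_candidates(source, len(source)))
```

Repeated invocations of a real hippie-expand command would cycle through such a list; the sketch only builds the list.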
The h-index and h-alpha are bibliometric indicators. This package provides functions to simulate how these indicators may develop over time for a given set of researchers and to visualize the simulation data. The implementation is based on the STATA ado h-index and is described in more detail in Bornmann et al. (2019) <arXiv:1905.11052>.
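For readers unfamiliar with the indicator, the h-index is the largest h such that h papers each have at least h citations. The sketch below computes it for a single list of citation counts; the function name `h_index` is hypothetical, and the package's simulation machinery and the h-alpha variant are not covered here.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here on
    return h


print(h_index([10, 8, 5, 4, 3]))
```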
This package provides a set of streamlined functions that allow easy generation of the linear regression diagnostic plots necessary for checking linear model assumptions. This package is meant for easy scheming of linear regression diagnostics, while preserving the merits of "The Grammar of Graphics" as implemented in ggplot2. See the ggplot2 website for more information regarding its specific graphics capabilities.
Analyses species distribution models and evaluates their performance. It includes functions for variation partitioning, extracting variable importance, computing several metrics of model discrimination and calibration performance, optimizing prediction thresholds based on a number of criteria, performing multivariate environmental similarity surface (MESS) analysis, and displaying various analytical plots. Initially described in Barbosa et al. (2013) <doi:10.1111/ddi.12100>.
This package provides a way to estimate and test marginal mediation effects for zero-inflated compositional mediators. Estimates of the natural indirect effect (NIE) and natural direct effect (NDE) of each taxon, as well as their standard errors and confidence intervals, are provided as outputs. Zeros are not imputed during analysis. See Wu et al. (2022) <doi:10.3390/genes13061049>.
This package provides tools for computing Monte Carlo standard errors (MCSE) in Markov chain Monte Carlo (MCMC) settings (survey in <doi:10.1201/b10905>, Chapter 7). MCSE computation for expectation and quantile estimators is supported as well as multivariate estimations. The package also provides functions for computing effective sample size and for plotting Monte Carlo estimates versus sample size.
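To make the notion of a Monte Carlo standard error concrete, here is a minimal sketch of the non-overlapping batch means estimator for the mean of a chain. It is one common MCSE estimator, shown under stated assumptions: the function name `mcse_batch_means` and the default batch size of floor(sqrt(n)) are choices made for this example, not a description of the package's interface.

```python
import math


def mcse_batch_means(chain, batch_size=None):
    """Estimate the mean of `chain` and its MCSE via non-overlapping batch means."""
    n = len(chain)
    b = batch_size if batch_size is not None else math.isqrt(n)  # assumed default
    if b < 1 or n // b < 2:
        raise ValueError("need at least two full batches")
    a = n // b                          # number of full batches
    used = chain[: a * b]               # drop the incomplete trailing batch

    batch_means = [sum(used[k * b:(k + 1) * b]) / b for k in range(a)]
    estimate = sum(batch_means) / a

    # Batch means variance estimate of the asymptotic variance.
    sigma2 = b * sum((m - estimate) ** 2 for m in batch_means) / (a - 1)
    return estimate, math.sqrt(sigma2 / (a * b))


est, se = mcse_batch_means([0, 2, 4, 6], batch_size=2)
print(est, se)
```

Because batches absorb the autocorrelation within them, this estimate is typically larger than the naive standard error that treats MCMC draws as independent.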
Maximum a posteriori (MAP) estimation for topic models (i.e., Latent Dirichlet Allocation) in text analysis, as described in Taddy (2012), "On estimation and selection for topic models". Previous versions of this code were included as part of the textir package. If you want to take advantage of OpenMP parallelization, uncomment the relevant flags in src/MAKEVARS before compiling.
Anomaly detection in dynamic, temporal networks. The package oddnet uses a feature-based method to identify anomalies. First, it computes many features for each network. Then it models the features using time series methods. Finally, it detects anomalies from the time series residuals. This way, the temporal dependencies are accounted for when identifying anomalies (Kandanaarachchi and Hyndman, 2022) <arXiv:2210.07407>.
Simulation of recurrent event data for non-constant baseline hazard in the total time model with risk-free intervals and possibly a competing event. The data can be cut to an interim data set and plotted. Details about the method can be found in Jahn-Eimermacher, A. et al. (2015) <doi:10.1186/s12874-015-0005-2>.
The goal of TailID is to detect sensitive points in the tail of a dataset using techniques from Extreme Value Theory (EVT). It uses the Generalized Pareto Distribution (GPD) to assess tail behavior and to detect points that are inconsistent with the identical distribution hypothesis of the tail. For more details see Manau (2025) <doi:10.4230/LIPIcs.ECRTS.2025.20>.
This package provides methods for representations (i.e. dimensionality reduction, preprocessing, feature extraction) of time series to support more accurate and effective time series data mining. Non-data adaptive, data adaptive, model-based and data dictated (clipped) representation methods are implemented. Various normalisation methods (min-max, z-score, Box-Cox, Yeo-Johnson) and forecasting accuracy measures are also implemented.
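As a small illustration of what such representations and normalisations look like, the sketch below implements min-max and z-score normalisation together with piecewise aggregate approximation (PAA), a standard non-data adaptive representation that replaces each equal-width segment by its mean. The function names are hypothetical and the PAA variant assumes the series length is divisible by the number of segments; none of this reflects the package's actual interface.

```python
import statistics


def min_max(series):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]    # constant series: no spread to rescale
    return [(x - lo) / (hi - lo) for x in series]


def z_score(series):
    """Centre on the mean and scale by the sample standard deviation."""
    mean = statistics.mean(series)
    sd = statistics.stdev(series)
    return [(x - mean) / sd for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-width segment."""
    if len(series) % segments != 0:
        raise ValueError("this sketch assumes len(series) is divisible by segments")
    width = len(series) // segments
    return [statistics.mean(series[i:i + width])
            for i in range(0, len(series), width)]


print(paa([1, 2, 3, 4, 5, 6], 3))
```

PAA reduces a series of length n to `segments` values, which is the kind of dimensionality reduction that makes downstream clustering or classification cheaper.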
MEDIPS was developed for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, MEDIPS provides functionalities for the analysis of any kind of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential coverage between groups of samples and saturation and correlation analysis.
This is an R package that makes it easier to import and store phylogenetic trees with associated data, and to link external data from different sources to a phylogeny. It also supports exporting phylogenetic trees with heterogeneous associated data to a single tree file, and can serve as a platform for merging trees with associated data and converting file formats.
The R package ggplot2 is a plotting system based on the grammar of graphics. GGally extends ggplot2 by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.
This package is a flexible and comprehensive R toolbox for model-based optimization. It implements the Efficient Global Optimization algorithm for single- and multi-objective optimization and supports mixed parameters. Regression learners are provided by the machine learning toolbox mlr. The package offers various infill criteria and features batch proposal, parallel execution, visualization, and logging. Its modular implementation allows easy customization by the user.
REDUCE is a portable general-purpose computer algebra system supporting scalar, vector, matrix and tensor algebra, symbolic differential and integral calculus, arbitrary precision numerical calculations and output in LaTeX format. REDUCE is based on Lisp and is available on the two dialects Portable Standard Lisp ('PSL') and Codemist Standard Lisp ('CSL'). The redcas package provides an interface for executing arbitrary REDUCE code interactively from R, returning output as character vectors. R code and REDUCE code can be interspersed. It also provides a specialized function for calling the REDUCE feature for solving systems of equations, returning the output as an R object designed for the purpose. A further specialized function uses REDUCE features to generate LaTeX output and post-processes this for direct use in LaTeX documents, e.g. using Sweave.
Sets the alpha level for coefficients in a regression model as a decreasing function of the sample size through the use of Jeffreys Approximate Bayes factor. You tell alphaN() your sample size, and it tells you to which value you must lower alpha to avoid Lindley's Paradox. For details, see Wulff and Taylor (2024) <doi:10.1177/14761270231214429>.
Survey systems and other third-party data sources commonly use non-standard representations of logical values when it comes to qualitative data - "Yes", "No" and "N/A", say. batman is a package designed to seamlessly convert these into logicals. It is highly localised, and contains equivalents to boolean values in languages including German, French, Spanish, Italian, Turkish, Chinese and Polish.
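The conversion described above amounts to a normalised dictionary lookup. The sketch below shows the idea in miniature; the function name `to_logical` and the tiny word lists are illustrative assumptions for this example and are not the package's actual vocabulary or interface.

```python
# Illustrative word lists only; a real localised table would be far larger.
TRUE_WORDS = {"yes", "y", "true", "ja", "oui"}
FALSE_WORDS = {"no", "n", "false", "nein", "non"}


def to_logical(value):
    """Map a free-text answer to True, False, or None (missing/unrecognised)."""
    key = value.strip().lower()
    if key in TRUE_WORDS:
        return True
    if key in FALSE_WORDS:
        return False
    return None    # "N/A", empty strings, and anything unrecognised


print([to_logical(v) for v in ["Yes", " NO ", "N/A", "Ja"]])
```

Returning a distinct missing value rather than False for "N/A" keeps non-responses from being counted as negative answers.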