Utilities for simple manipulation and quick plotting of time series data. These utilities use the tframe package, which provides a programming kernel for time series. Extensions to tframe provided in tframePlus can also be used. See the Guide vignette for examples.
This package provides a global-local approximation framework for large-scale Gaussian process modeling. Please see Vakayil and Joseph (2024) <doi:10.1080/00401706.2023.2296451> for details. This work is supported by U.S. NSF grants CMMI-1921646 and DMREF-1921873.
This package provides methods for calculating the variance scale exponent to identify memory patterns in time series data. Includes tests for white noise, short memory, and long memory. See Fu, H. et al. (2018) <doi:10.1016/j.physa.2018.06.092>.
This package provides functions for the analysis of whole-genome sequencing studies to simultaneously detect the existence, and estimate the locations, of association signals at genome-wide scale. The functions allow genome-wide association scans, candidate region scans and single window tests.
Manhattan plots and QQ plots are commonly used to visualize the end result of a genome-wide association study. The "ggmanh" package aims to keep the generation of these plots simple while maintaining customizability. Main functions include manhattan_plot, qqunif, and thinPoints.
This package enables regression and classification on high-dimensional data with different relative strengths of penalization for different feature groups, such as different assays or omic types. The optimal relative strengths are chosen adaptively. Optimisation is performed using a variational Bayes approach.
simPIC is a package for simulating single-cell ATAC-seq count data. It provides a user-friendly, well-documented interface for data simulation. Functions are provided for parameter estimation, realistic scATAC-seq data simulation, and comparing real and simulated datasets.
This package provides functions to analyze methylation data. Some functions are relevant only for single-cell methylation data, but most can be used for any methylation data. A highlight of this workflow is the comprehensive quality control report.
High-throughput single-cell measurements of DNA methylation allow studying inter-cellular epigenetic heterogeneity, but this task faces the challenges of sparsity and noise. We present vmrseq, a statistical method that overcomes these challenges and identifies variably methylated regions accurately and robustly.
This package provides a framework for the analysis and exploration of single-cell chromatin data. The Signac package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis.
This package provides type-stable rolling window functions over any R data type. Cumulative and expanding windows are also supported. For more advanced usage, an index can be used as a secondary vector that defines how sliding windows are to be created.
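The three window styles named above (sliding, cumulative/expanding, and index-based) can be sketched generically as follows. This is only an illustration of the concepts in Python, not the package's interface: the function names `slide` and `slide_index` and the `before`/`after` parameters are hypothetical and chosen for this example.

```python
def slide(xs, fn, before=0, after=0):
    """Apply fn to a window around each element of xs.

    before=None means "everything up to here", i.e. an expanding window.
    Windows are truncated at the edges rather than padded.
    """
    n = len(xs)
    out = []
    for i in range(n):
        lo = 0 if before is None else max(0, i - before)
        hi = min(n, i + after + 1)
        out.append(fn(xs[lo:hi]))
    return out


def slide_index(xs, index, fn, before):
    """Windows are defined by a secondary index vector, not by position.

    The window for element i holds every xs[j] whose index value lies in
    [index[i] - before, index[i]].
    """
    out = []
    for t in index:
        window = [x for x, s in zip(xs, index) if t - before <= s <= t]
        out.append(fn(window))
    return out


values = [1, 2, 3, 4, 5]
days = [1, 2, 5, 6, 7]  # irregularly spaced index (hypothetical data)

rolling_sum = slide(values, sum, before=1)          # current + previous element
cumulative_sum = slide(values, sum, before=None)    # expanding window
by_day = slide_index(values, days, sum, before=2)   # current day and the 2 days before it
```

With the irregular `days` index, the third element's window contains only itself, because no other observation falls within two days before day 5; a purely positional window would have included its neighbour.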
This package provides a tool for detecting reversions for a given pathogenic mutation from next-generation DNA sequencing data. It analyses reads aligned to the locus of the pathogenic mutation and reports reversion events where secondary mutations have restored or undone the deleterious effect of the original pathogenic mutation, e.g., secondary indels complementary to a frameshift pathogenic mutation that convert the original frameshift mutation into an in-frame mutation; deletions or SNVs that replace the original pathogenic mutation and restore the open reading frame; and SNVs that change the stop codon caused by the original nonsense SNV into an amino acid.
Implementation of estimators for inferring the mean of censored cost data, including the estimators BT from Bang and Tsiatis (2000) <doi:10.1093/biomet/87.2.329> and ZT from Zhao and Tian (2001) <doi:10.1111/j.0006-341X.2001.01002.x>.
This package provides a convenient R wrapper to the Comet API, which is a cloud platform allowing you to track, compare, explain and optimize machine learning experiments and models. Experiments can be viewed on the Comet online dashboard at <https://www.comet.com>.
DNA copy number data evaluation using both their initial form (copy number as a noisy function of genomic position) and their approximation by a piecewise-constant function (segmentation), for the purpose of identifying genomic regions where the copy number differs from the norm.
This package provides a wrapper for the DeepL API <https://developers.deepl.com/docs>, a web service for translating texts between different languages. A DeepL API developer account is required to use the service (see <https://www.deepl.com/pro#developer>).
Several tests for differential methylation in methylation array data, including one-sided differential mean and variance tests. Methods used in the package refer to Dai, J, Wang, X, Chen, H and others (2021) "Incorporating increased variability in discovering cancer methylation markers", Biostatistics, submitted.
Easy access to species distribution data for 6 regions in the world, for a total of 226 anonymised species. These data are described and made available by Elith et al. (2020) <doi:10.17161/bi.v15i2.13384> to compare species distribution modelling methods.
This package provides a set of tools for robust estimation and inference for probit models with endogenous covariates. The current version contains a robust two-step estimator. For technical details, see Naghi, Varadi and Zhelonkin (2022) <doi:10.1016/j.ecosta.2022.05.001>.
Matrix algebra using the Eigen C++ library: determinant, rank, inverse, pseudo-inverse, kernel and image, QR decomposition, Cholesky decomposition, Schur decomposition, Hessenberg decomposition, linear least-squares problems. Also provides matrix functions such as exponential, logarithm, power, sine and cosine. Complex matrices are supported.
Clustering algorithm developed for use with plot inventories of species. It groups plots by subsets of diagnostic species rather than overall species composition. There is an unsupervised and a supervised mode, the latter accepting suggestions for species with greater weight and cluster medoids.
This package provides an interface for image recognition using the Google Vision API <https://cloud.google.com/vision/>. Converts API data for features such as object detection and optical character recognition to data frames. The package also includes functions for analyzing image annotations.
This package provides a function for the estimation of mixtures of longitudinal factor analysis models using the iterative expectation-maximization algorithm (Ounajim, Slaoui, Louis, Billot, Frasca, Rigoard (2023) <doi:10.1002/sim.9804>) and several tools for visualizing and interpreting the model parameters.
This package provides a flexible framework for estimating the variance-covariance matrix of estimated parameters. Estimation relies on unbiased estimating functions to compute the empirical sandwich variance (i.e., M-estimation in the vein of Tsiatis et al. (2019) <doi:10.1201/9780429192692>).
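To make the empirical sandwich variance concrete, here is a minimal Python sketch for the simplest possible case: a scalar mean with estimating function psi(y, theta) = y - theta. It illustrates the general recipe only and does not reproduce the package's interface; the function name `sandwich_variance_mean` is hypothetical.

```python
def sandwich_variance_mean(ys):
    """Empirical sandwich variance for the sample mean.

    Estimating function (an assumption of this sketch): psi(y, theta) = y - theta.
    The estimate theta_hat solves sum(psi(y_i, theta)) = 0.
    """
    n = len(ys)
    theta_hat = sum(ys) / n

    # "Bread": A = mean of -d(psi)/d(theta), which is 1 for this psi.
    bread = 1.0

    # "Meat": B = mean of psi(y_i, theta_hat)^2.
    meat = sum((y - theta_hat) ** 2 for y in ys) / n

    # Sandwich: A^{-1} B A^{-1} / n (all scalars in this example).
    variance = meat / (bread * bread * n)
    return theta_hat, variance


theta_hat, variance = sandwich_variance_mean([2.0, 4.0, 4.0, 6.0])
```

For this choice of psi the result equals the squared standard error of the mean computed with a 1/n (rather than 1/(n-1)) variance. In the multi-parameter case the bread and meat become matrices and the scalar division becomes a matrix inverse.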