SOHPIE (pronounced as SOFIE) is a novel pseudo-value regression approach for differential co-abundance network analysis of microbiome data, which can include additional clinical covariates in the model. The full methodological details can be found in Ahn S and Datta S (2023) <arXiv:2303.13702v1>.
Algorithms for nonparametric sequential testing and online change-point detection for streams of univariate (sub-)Gaussian, binary, and bounded random variables, introduced in the following publications: Shin et al. (2024) <doi:10.48550/arXiv.2203.03532>, Shin et al. (2021) <doi:10.48550/arXiv.2010.08082>.
Implementation of the original Sequence Globally Unique Identifier (SEGUID) algorithm [Babnigg and Giometti (2006) <doi:10.1002/pmic.200600032>] and SEGUID v2 (<https://www.seguid.org>), which extends SEGUID v1 with support for linear, circular, single- and double-stranded biological sequences, e.g. DNA, RNA, and proteins.
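As a rough illustration of the original (v1) checksum only, the sketch below assumes the commonly described definition: the SHA-1 digest of the upper-cased sequence, Base64-encoded, with the trailing padding removed. The function name `seguid_v1` is hypothetical and is not the package's API; SEGUID v2 and its handling of circular and double-stranded sequences are not covered.

```python
import base64
import hashlib


def seguid_v1(sequence: str) -> str:
    """Hypothetical helper: SEGUID v1-style checksum of a sequence.

    Assumption: v1 is Base64(SHA-1(upper-cased sequence)) with the
    trailing '=' padding stripped.
    """
    digest = hashlib.sha1(sequence.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


# The checksum ignores letter case, so both calls give the same identifier.
print(seguid_v1("ACGTACGTACGT"))
print(seguid_v1("acgtacgtacgt"))
```

A SHA-1 digest is 20 bytes, so the unpadded Base64 string is always 27 characters long, whatever the length of the input sequence.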
This package provides functions to analyse weather data for agricultural purposes, including reading weather records in multiple formats and calculating extreme climate indices. The included demonstration data are SILO daily climate data (licensed under CC BY 4.0, <https://www.longpaddock.qld.gov.au/silo/>).
This package provides tools for robust regression model fitting using the RANSAC (Random Sample Consensus) algorithm. RANSAC is an iterative method to estimate parameters of a model from a dataset that contains outliers. This package allows fitting both linear (lm) and nonlinear (nls) models using RANSAC, helping users obtain more reliable models in the presence of noisy or corrupted data. The methods are particularly useful in contexts where traditional least squares regression fails due to the influence of outliers. Implementations include support for performance metrics such as RMSE, MAE, and R² based on the inlier subset. For further details, see Fischler and Bolles (1981) <doi:10.1145/358669.358692>.
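The RANSAC loop described above can be sketched for a straight-line model as follows. This is a minimal Python illustration, not the package's interface: the names `fit_line` and `ransac_line`, the residual threshold, and the iteration count are all assumptions chosen for the example.

```python
import math
import random


def fit_line(points):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def ransac_line(points, n_iter=200, threshold=1.0, seed=0):
    """Hypothetical RANSAC line fit: returns (a, b, inliers, rmse)."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        # 1. Fit a candidate model to a minimal random sample (two points).
        sample = rng.sample(points, 2)
        if sample[0][0] == sample[1][0]:
            continue  # a vertical pair cannot define y = a + b * x
        a, b = fit_line(sample)
        # 2. Collect the consensus set: points close to the candidate line.
        inliers = [(x, y) for x, y in points
                   if abs(y - (a + b * x)) <= threshold]
        # 3. Keep the largest consensus set seen so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if not best_inliers:
        raise ValueError("no valid sample was drawn")
    # 4. Refit on the inliers only and report RMSE on that subset.
    a, b = fit_line(best_inliers)
    sse = sum((y - (a + b * x)) ** 2 for x, y in best_inliers)
    return a, b, best_inliers, math.sqrt(sse / len(best_inliers))


# Points on y = 2 + 3x plus three gross outliers.
data = [(float(x), 2.0 + 3.0 * x) for x in range(20)]
data += [(3.0, 100.0), (7.0, -50.0), (12.0, 80.0)]
a, b, inliers, rmse = ransac_line(data)
print(a, b, len(inliers), rmse)
```

Computing the error metric on the inlier subset, as the description mentions, keeps the reported fit quality from being dominated by the same outliers the method is meant to ignore.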
Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.
Harman is a PCA and constrained optimisation based technique that maximises the removal of batch effects from datasets, with the constraint that the probability of overcorrection (i.e. removing genuine biological signal along with batch noise) is kept to a fraction which is set by the end-user.
CelliD is a clustering-free method for extracting per-cell gene signatures from scRNA-seq. CelliD allows unbiased cell identity recognition across different donors, tissues-of-origin, model organisms and single-cell omics protocols. The package can also be used to explore functional pathway enrichment in single-cell data.
With the dedicated fortify method implemented for flowSet, ncdfFlowSet and GatingSet classes, both raw and gated flow cytometry data can be plotted directly with ggplot. The ggcyto wrapper and some custom layers also make it easy to add gates and population statistics to the plot.
This package provides a set of little functions that have been found useful to do little odds and ends such as plotting the results of K-means clustering, substituting special text characters, viewing parts of a data.frame, constructing formulas from text and building design and response matrices.
The GNU readline library allows users to edit command lines as they are typed in. It can maintain a searchable history of previously entered commands, letting you easily recall, edit and re-enter past commands. It features both Emacs-like and vi-like keybindings, making its usage comfortable for anyone.
This package provides functions for implementing the Analysis-of-marginal-Tail-Means (ATM) method, a robust optimization method for discrete black-box optimization. Technical details can be found in Mak and Wu (2018+) <arXiv:1712.03589>. This work was supported by USARO grant W911NF-17-1-0007.
Prior transcription factor binding knowledge and target gene expression data are integrated in a Bayesian framework for functional cis-regulatory module inference. Using Gibbs sampling, we iteratively estimate transcription factor associations for each gene, regulation strength for each binding event and the hidden activity for each transcription factor.
Computes classification accuracy and consistency indices under Item Response Theory. Implements the total score IRT-based methods in Lee, Hanson & Brennan (2002) and Lee (2010), the IRT-based methods in Rudner (2001, 2005), and the total score nonparametric methods in Lathrop & Cheng (2014). For dichotomous and polytomous tests.
You can retrieve Spotify API information such as artists, albums, tracks, track features, recommendations or related artists. This package allows you to search all the information by name and also includes a distance-based algorithm to find similar songs. More information: <https://developer.spotify.com/documentation/web-api/>.
Access data related to the European Union from GISCO <https://ec.europa.eu/eurostat/web/gisco>, the Geographic Information System of the European Commission, via its REST API at <https://gisco-services.ec.europa.eu>. This package tries to make it easier to get these data into R.
Fits a variety of hidden Markov models, structured in an extended generalized linear model framework. See T. Rolf Turner, Murray A. Cameron, and Peter J. Thomson (1998) <doi:10.2307/3315677>, and Rolf Turner (2008) <doi:10.1016/j.csda.2008.01.029> and the references cited therein.
Connecting spatiotemporal exposure to individual and population-level risk via source-to-outcome continuum modeling. The package, methods, and case-studies are described in Messier, Reif, and Marvel (2024) <doi:10.1101/2024.09.23.24314096> and Eccles et al. (2023) <doi:10.1016/j.scitotenv.2022.158905>.
Shiny application for the analysis of groundwater monitoring data, designed to work with simple time-series data for solute concentration and groundwater elevation; it can also plot non-aqueous phase liquid (NAPL) thickness if required. It also supports importing a site basemap in GIS shapefile format.
This package simulates Hawkes processes in both univariate and multivariate settings. It provides functions to compute different moments of the number of jumps of the process on a given interval, such as the mean, the variance, or the autocorrelation of process jumps on time intervals separated by a lag.
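For readers unfamiliar with Hawkes processes, the sketch below simulates a univariate one by Ogata-style thinning. It assumes an exponential kernel, so the intensity is mu + sum over past jumps of alpha * exp(-beta * (t - t_i)); the kernel choice, the function name `simulate_hawkes`, and all parameter names are assumptions made for illustration and do not describe the package's own interface or method.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Hypothetical univariate Hawkes simulator (exponential kernel).

    Returns the sorted jump times in [0, horizon).
    """
    if mu <= 0 or alpha < 0 or beta <= 0:
        raise ValueError("need mu > 0, alpha >= 0, beta > 0")
    rng = random.Random(seed)
    events = []

    def intensity(t):
        return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events)

    t = 0.0
    while True:
        # Between jumps the intensity only decays, so its current value
        # is a valid upper bound until the next accepted jump.
        bound = intensity(t)
        t += rng.expovariate(bound)
        if t >= horizon:
            return events
        # Accept the candidate time with probability intensity(t) / bound.
        if rng.random() * bound <= intensity(t):
            events.append(t)


jumps = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=50.0, seed=1)
print(len(jumps))
```

Under this kernel the process is stationary when alpha / beta < 1, and the long-run mean number of jumps per unit time is mu / (1 - alpha / beta), which is one of the moments a simulation like this can be checked against.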
Generates a Graphviz graph of the most significant 3-way interaction gains (i.e. conditional information gains) based on a provided discrete data frame. Various output formats are supported ('Graphviz', SVG, PNG, PDF, PS). For references, see the webpage of Aleks Jakulin <http://stat.columbia.edu/~jakulin/Int/>.
This package provides classes and methods for objects whose indexing naturally starts from zero. Subsetting, indexing and mathematical operations are defined naturally between lagged objects and between lagged and base R objects. Recycling is not used, except for singletons. The single bracket operator doesn't drop dimensions by default.