The TopDom method identifies topological domains in genomes from Hi-C sequence data. The authors published an implementation of their method as an R script. This package originates from those original TopDom R scripts and provides help pages adapted from the original TopDom PDF documentation. It also provides a small number of bug fixes to the original code.
This package is a model building aid for nonlinear mixed-effects (population) model analysis using NONMEM, facilitating data set checkout, exploration and visualization, model diagnostics, candidate covariate identification and model comparison. The methods are described in Keizer et al. (2013) <doi:10.1038/psp.2013.24>, and Jonsson et al. (1999) <doi:10.1016/s0169-2607(98)00067-4>.
Allows the user to learn Bayesian networks from datasets containing thousands of variables. It focuses on score-based learning, mainly the BIC and the BDeu score functions. It provides state-of-the-art algorithms for the following tasks: (1) parent set identification - Mauro Scanagatta (2015) <http://papers.nips.cc/paper/5803-learning-bayesian-networks-with-thousands-of-variables>; (2) general structure optimization - Mauro Scanagatta (2018) <doi:10.1007/s10994-018-5701-9>, Mauro Scanagatta (2018) <http://proceedings.mlr.press/v73/scanagatta17a.html>; (3) bounded treewidth structure optimization - Mauro Scanagatta (2016) <http://papers.nips.cc/paper/6232-learning-treewidth-bounded-bayesian-networks-with-thousands-of-variables>; (4) structure learning on incomplete data sets - Mauro Scanagatta (2018) <doi:10.1016/j.ijar.2018.02.004>. Distributed under the LGPL-3 by IDSIA.
Data exploration and prediction with a focus on high-dimensional data and chemometrics. The package was initially designed around partial least squares regression and discrimination models and their variants, in particular locally weighted PLS models (LWPLS). It has since been expanded to many other methods for analyzing high-dimensional data. The name rchemo comes from the fact that the package is oriented toward chemometrics, but most of the provided methods are fully generic and apply to other domains. Functions such as transform(), predict(), coef() and summary() are available. Tuning the predictive models is facilitated by the generic functions gridscore() (validation dataset) and gridcv() (cross-validation). Faster versions are also available for models based on latent variables (LVs) (gridscorelv() and gridcvlv()) and ridge regularization (gridscorelb() and gridcvlb()).
The Resource Description Framework, or RDF, is a widely used data representation model that forms the cornerstone of the Semantic Web. RDF represents data as a graph rather than the familiar data table or rectangle of relational databases. The rdflib package provides a friendly and concise user interface for performing common tasks on RDF data, such as reading, writing and converting between the various serializations of RDF data, including rdfxml, turtle, nquads, ntriples, and json-ld; creating new RDF graphs; and performing graph queries using SPARQL. This package wraps the low-level redland R package, which provides direct bindings to the redland C library. Additionally, the package supports the newer and more developer-friendly JSON-LD format through the jsonld package. The package interface takes inspiration from the Python rdflib library.
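To make the "graph rather than table" idea concrete, the following is a minimal sketch in plain Python, not the rdflib API. It assumes a graph can be stored as a set of (subject, predicate, object) triples; the function name match_triples and the example identifiers are hypothetical.

```python
def match_triples(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return sorted(
        triple
        for triple in triples
        if all(want is None or want == have for want, have in zip(pattern, triple))
    )


# A tiny graph: every statement is one (subject, predicate, object) edge.
graph = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
}

# Everything stated about ex:alice, regardless of predicate.
about_alice = match_triples(graph, subject="ex:alice")
```

A SPARQL query generalizes this kind of pattern matching by allowing several triple patterns that share variables.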
Interact with Condor from R via an SSH connection. Files are first uploaded from the user's machine to the submitter machine, and the job is then submitted from the submitter machine to Condor. Functions are provided to submit, list, and download Condor jobs from R. Condor is an open source high-throughput computing software framework for distributed parallelization of computationally intensive tasks.
This package implements a kernel-based association test for copy number variation (CNV) aggregate analysis in a certain genomic region (e.g., gene set, chromosome, or genome) that is robust to within-locus and across-locus etiological heterogeneity, and bypasses the need to define a "locus" unit for CNVs. Brucker, A., et al. (2020) <doi:10.1101/666875>.
This package contains tools for working with data during statistical analysis, promoting flexible, intuitive, and reproducible workflows. There are functions designated for specific statistical tasks such as building a custom univariate descriptive table, computing pairwise association statistics, etc. These are built on a collection of data manipulation tools designed for general use that are motivated by the functional programming concept.
This package produces diversity estimates and species lists with associated global distribution for any vascular plant family and genus from the Plants of the World Online database <https://powo.science.kew.org/>, by interacting with the source code of each plant taxon page. It also creates global maps of species richness, and graphics of species discoveries and nomenclatural changes over time.
This package contains functions for operations with fuzzy cognitive maps using t-norm and s-norm operators. T-norms and S-norms are described by Dov M. Gabbay and George Metcalfe (2007) <doi:10.1007/s00153-007-0047-1>. System indicators are described by Cox, Earl D. (1995) <isbn:1886801010>. Executable examples are provided in the "inst/examples" folder.
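For readers unfamiliar with the operators, here is a minimal sketch of three standard t-norms and their dual s-norms on membership values in [0, 1]. This is plain Python for illustration only; the function names are hypothetical, and the set of operators the package actually offers may differ.

```python
def t_min(a, b):
    """Minimum (Goedel) t-norm."""
    return min(a, b)


def t_product(a, b):
    """Product t-norm."""
    return a * b


def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)


def s_max(a, b):
    """Maximum s-norm, dual of the minimum t-norm."""
    return max(a, b)


def s_probabilistic_sum(a, b):
    """Probabilistic sum, dual of the product t-norm."""
    return a + b - a * b


def s_bounded_sum(a, b):
    """Bounded sum, dual of the Lukasiewicz t-norm."""
    return min(1.0, a + b)
```

Each pair satisfies the duality S(a, b) = 1 - T(1 - a, 1 - b), which is a convenient sanity check when adding further operators.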
This Rcpp-based package implements highly efficient functions for the calculation of the Jonckheere-Terpstra statistic. It can be used for a variety of applications, including feature selection in machine learning problems, or to conduct genome-wide association studies (GWAS) with multiple quantitative phenotypes. The code leverages OpenMP directives for multi-core computing to reduce overall processing time.
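As a reference point for what is being computed, the sketch below is a naive, single-threaded Python version of the Jonckheere-Terpstra statistic: the sum, over every pair of groups taken in their hypothesized order, of the Mann-Whitney count of observation pairs that increase, with ties counted as one half. It is not the package's optimized implementation, and the function name is hypothetical.

```python
from itertools import combinations


def jonckheere_terpstra(groups):
    """Naive JT statistic for groups listed in their hypothesized order."""
    statistic = 0.0
    # combinations() preserves input order, so `lower` always precedes `higher`.
    for lower, higher in combinations(groups, 2):
        for x in lower:
            for y in higher:
                if x < y:
                    statistic += 1.0
                elif x == y:
                    statistic += 0.5  # ties contribute one half
    return statistic


# Three dose groups with a perfectly increasing trend.
perfect_trend = jonckheere_terpstra([[1, 2], [3, 4], [5, 6]])
```

The double loop is quadratic in the number of observations, which is why an optimized, multi-core implementation matters when the statistic is evaluated for many features or markers.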
This package contains Rcpp and RcppEigen implementations of matrix operations useful for Gaussian process models, such as the inversion of a symmetric Toeplitz matrix, sampling from multivariate normal distributions, evaluation of the log-density of a multivariate normal vector, and Bayesian inference for latent variable Gaussian process models with elliptical slice sampling (Murray, Adams, and MacKay 2010).
Extends the capabilities of ggplot2 by providing grammatical elements and plot helpers designed for visualizing temporal patterns. The package implements a grammar of temporal graphics, which leverages calendar structures to highlight changes over time. The package also provides plot helper functions to quickly produce commonly used time series graphics, including time plots, season plots, and seasonal sub-series plots.
Approximate frequentist inference for generalized linear mixed model analysis with expectation propagation used to circumvent the need for multivariate integration. In this version, the random effects can be of any reasonable dimension. However, only probit mixed models with one level of nesting are supported. The methodology is described in Hall, Johnstone, Ormerod, Wand and Yu (2018) <arXiv:1805.08423v1>.
An RStudio Addin for Hippie Expand (AKA Hippie Code Completion or Cyclic Expand Word). This type of completion searches for matching tokens within the user's current source editor file, regardless of file type. By searching only within the current source file, hippie offers a fast way to identify and insert completions that appear around the user's cursor.
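The idea can be sketched in a few lines of Python. This is an illustration rather than the addin's code: it assumes tokens are runs of word characters and that candidates are offered nearest-first by distance from the cursor, and the function name is hypothetical.

```python
import re


def hippie_candidates(text, cursor):
    """List completions for the word fragment that ends at `cursor`."""
    fragment = re.search(r"\w+$", text[:cursor])
    if fragment is None:
        return []  # the cursor is not at the end of a word fragment
    prefix = fragment.group(0)
    found = []
    for token in re.finditer(r"\w+", text):
        if token.start() == fragment.start():
            continue  # skip the word currently being typed
        word = token.group(0)
        if word.startswith(prefix) and word != prefix:
            found.append((abs(token.start() - cursor), word))
    ordered = []
    for _, word in sorted(found):  # nearest match first
        if word not in ordered:
            ordered.append(word)
    return ordered


source = "data_frame <- load()\ndata_table <- 1\nprint(dat"
completions = hippie_candidates(source, len(source))
```

Repeated invocation would then cycle through the returned list, which is where the name "cyclic expand word" comes from.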
H-index and h-alpha are bibliometric indicators. This package provides functions to simulate how these indicators may develop over time for a given set of researchers and to visualize the simulation data. The implementation is based on the STATA ado h-index and is described in more detail in Bornmann et al. (2019) <arXiv:1905.11052>.
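For orientation, the h-index of a researcher is the largest h such that h of their papers have at least h citations each. The sketch below computes only this indicator, in plain Python; it does not reproduce the package's simulation, and the function name and citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so no larger h exists
    return h


# Five papers, four of which have at least four citations.
example = h_index([10, 8, 5, 4, 3])
```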
This package provides a set of streamlined functions that allow easy generation of linear regression diagnostic plots necessary for checking linear model assumptions. This package is meant for easy scheming of linear regression diagnostics, while preserving the merits of "The Grammar of Graphics" as implemented in ggplot2. See the ggplot2 website for more information regarding the specific capability of graphics.
Maximum a posteriori (MAP) estimation for topic models (i.e., Latent Dirichlet Allocation) in text analysis, as described in Taddy (2012) "On estimation and selection for topic models". Previous versions of this code were included as part of the textir package. If you want to take advantage of OpenMP parallelization, uncomment the relevant flags in src/MAKEVARS before compiling.
This package provides a way to estimate and test marginal mediation effects for zero-inflated compositional mediators. Estimates of the Natural Indirect Effect (NIE) and Natural Direct Effect (NDE) of each taxon, as well as their standard errors and confidence intervals, are provided as outputs. Zeros are not imputed during analysis. See Wu et al. (2022) <doi:10.3390/genes13061049>.
Analyses species distribution models and evaluates their performance. It includes functions for variation partitioning, extracting variable importance, computing several metrics of model discrimination and calibration performance, optimizing prediction thresholds based on a number of criteria, performing multivariate environmental similarity surface (MESS) analysis, and displaying various analytical plots. Initially described in Barbosa et al. (2013) <doi:10.1111/ddi.12100>.
Anomaly detection in dynamic, temporal networks. The package oddnet uses a feature-based method to identify anomalies. First, it computes many features for each network. Then it models the features using time series methods. Using time series residuals it detects anomalies. This way, the temporal dependencies are accounted for when identifying anomalies (Kandanaarachchi, Hyndman 2022) <arXiv:2210.07407>.
Simulation of recurrent event data for non-constant baseline hazard in the total time model with risk-free intervals and possibly a competing event. Possibility to cut the data to an interim data set. Data can be plotted. Details about the method can be found in Jahn-Eimermacher, A. et al. (2015) <doi:10.1186/s12874-015-0005-2>.
This package provides methods for representations (i.e. dimensionality reduction, preprocessing, feature extraction) of time series to support more accurate and effective time series data mining. Non-data adaptive, data adaptive, model-based and data dictated (clipped) representation methods are implemented. Various normalisation methods (min-max, z-score, Box-Cox, Yeo-Johnson) and forecasting accuracy measures are also implemented.
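The two simplest normalisations in that list can be sketched as follows. This is plain Python for illustration, not the package's functions: the names are hypothetical, the z-score uses the sample standard deviation, and mapping a constant series to zeros is an assumption made here to avoid division by zero.

```python
import statistics


def min_max(series):
    """Rescale a series linearly onto [0, 1]."""
    low, high = min(series), max(series)
    if high == low:
        return [0.0 for _ in series]  # constant series: assumed convention
    return [(value - low) / (high - low) for value in series]


def z_score(series):
    """Centre a series on its mean and scale by its sample standard deviation."""
    if len(series) < 2:
        return [0.0 for _ in series]
    mean = statistics.mean(series)
    spread = statistics.stdev(series)
    if spread == 0:
        return [0.0 for _ in series]  # constant series: assumed convention
    return [(value - mean) / spread for value in series]


scaled = min_max([2, 4, 6])
standardised = z_score([2, 4, 6])
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range, whereas the z-score is preferable when series with different levels and variances must be compared.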
This is an R package that makes it easier to import and store phylogenetic trees with associated data, and to link external data from different sources to a phylogeny. It also supports exporting phylogenetic trees with heterogeneous associated data to a single tree file and can serve as a platform for merging trees with associated data and converting file formats.