Holds functions developed by the University of Ottawa's SAiVE
(Spatio-temporal Analysis of isotope Variations in the Environment) research group with the intention of facilitating the re-use of code, foster good code writing practices, and to allow others to benefit from the work done by the SAiVE
group. Contributions are welcome via the GitHub
repository <https://github.com/UO-SAiVE/SAiVE>
by group members as well as non-members.
Supports the calculation of meteorological characteristics in evapotranspiration research and reference crop evapotranspiration, and offers three models to simulate crop evapotranspiration and soil water balance in the field, including single crop coefficient and dual crop coefficient, as well as the Shuttleworth-Wallace model. These calculations main refer to Allen et al.(1998, ISBN:92-5-104219-5), Teh (2006, ISBN:1-58-112-998-X), and Liu et al.(2006) <doi:10.1016/j.agwat.2006.01.018>.
This package provides a system contains easy-to-use tools as a support for time series analysis courses. In particular, it incorporates a technique called Generalized Method of Wavelet Moments (GMWM) as well as its robust implementation for fast and robust parameter estimation of time series models which is described, for example, in Guerrier et al. (2013) <doi: 10.1080/01621459.2013.799920>. More details can also be found in the paper linked to via the URL below.
An extension for NetSurfP-2.0
(Klausen et al. (2019) <doi:10.1002/prot.25674>) which is specifically designed to analyze the results of bottom-up-proteomics that is primarily analyzed with MaxQuant
(Cox, J., Mann, M. (2008) <doi:10.1038/nbt.1511>). This tool is designed to process a large number of yeast peptides that produced as a results of whole yeast cell-proteome digestion and provide a coherent picture of secondary structure of proteins.
An R package that offers a workflow to predict condition-specific enhancers from ChIP-seq
data. The prediction of regulatory units is done in four main steps: Step 1 - the normalization of the ChIP-seq
counts. Step 2 - the prediction of active enhancers binwise on the whole genome. Step 3 - the condition-specific clustering of the putative active enhancers. Step 4 - the detection of possible target genes of the condition-specific clusters using RNA-seq counts.
BiFET identifies transcription factors (TFs) whose footprints are over-represented in target regions compared to background regions after correcting for the bias arising from the imbalance in read counts and GC contents between the target and background regions. For a given TF k, BiFET tests the null hypothesis that the target regions have the same probability of having footprints for the TF k as the background regions while correcting for the read count and GC content bias.
Random generation of survival data from a wide range of regression models, including accelerated failure time (AFT), proportional hazards (PH), proportional odds (PO), accelerated hazard (AH), Yang and Prentice (YP), and extended hazard (EH) models. The package rsurv also stands out by its ability to generate survival data from an unlimited number of baseline distributions provided that an implementation of the quantile function of the chosen baseline distribution is available in R. Another nice feature of the package rsurv lies in the fact that linear predictors are specified via a formula-based approach, facilitating the inclusion of categorical variables and interaction terms. The functions implemented in the package rsurv can also be employed to simulate survival data with more complex structures, such as survival data with different types of censoring mechanisms, survival data with cure fraction, survival data with random effects (frailties), multivariate survival data, and competing risks survival data. Details about the R package rsurv can be found in Demarqui (2024) <doi:10.48550/arXiv.2406.01750>
.
This package provides a program for analyzing vegetation time series, with two algorithms: 1) change detection algorithm that detects trend changes, determines their type (abrupt or non-abrupt), and estimates their timing, magnitude, number, and direction; 2) generalization algorithm that simplifies the temporal trend into main features. The user can set the number of major breakpoints or magnitude of greatest changes of interest for detection, and can control the generalization process by setting an additional parameter of generalization-percentage.
This package provides functions to create simulated time series of environmental exposures (e.g., temperature, air pollution) and health outcomes for use in power analysis and simulation studies in environmental epidemiology. This package also provides functions to evaluate the results of simulation studies based on these simulated time series. This work was supported by a grant from the National Institute of Environmental Health Sciences (R00ES022631) and a fellowship from the Colorado State University Programs for Research and Scholarly Excellence.
Diagnostic plots for optimisation, with a focus on projection pursuit. These show paths the optimiser takes in the high-dimensional space in multiple ways: by reducing the dimension using principal component analysis, and also using the tour to show the path on the high-dimensional space. Several botanical colour palettes are included, reflecting the name of the package. A paper describing the methodology can be found at <https://journal.r-project.org/archive/2021/RJ-2021-105/index.html>.
This package provides a fold change rank based method is presented to search for genes with changing expression and to detect recurrent chromosomal copy number aberrations. This method may be useful for high-throughput biological data (micro-array, sequencing, ...). Probabilities are associated with genes or probes in the data set and there is no problem of multiple tests when using this method. For array-based comparative genomic hybridization data, segmentation results are obtained by merging the significant probes detected.
This package provides tools for studying genotype-phenotype maps for bi-allelic loci underlying quantitative phenotypes. The 0.1 version is released in connection with the publication of Gjuvsland et al (2013) and implements basic line plots and the monotonicity measures for GP maps presented in the paper. Reference: Gjuvsland AB, Wang Y, Plahte E and Omholt SW (2013) Monotonicity is a key feature of genotype-phenotype maps. Frontier in Genetics 4:216 <doi:10.3389/fgene.2013.00216>.
Genomic biology is not limited to the confines of the canonical B-forming DNA duplex, but includes over ten different types of other secondary structures that are collectively termed non-B DNA structures. Of these non-B DNA structures, the G-quadruplexes are highly stable four-stranded structures that are recognized by distinct subsets of nuclear factors. This package provide functions for predicting intramolecular G quadruplexes. In addition, functions for predicting other intramolecular nonB
DNA structures are included.
Fits linear regression, logistic and multinomial regression models, Poisson regression, Cox model via Global Adaptive Generative Adjustment Algorithm. For more detailed information, see Bin Wang, Xiaofei Wang and Jianhua Guo (2022) <arXiv:1911.00658>
. This paper provides the theoretical properties of Gaga linear model when the load matrix is orthogonal. Further study is going on for the nonorthogonal cases and generalized linear models. These works are in part supported by the National Natural Foundation of China (No.12171076).
An implementation of k-means specifically design to cluster joint trajectories (longitudinal data on several variable-trajectories). Like kml', it provides facilities to deal with missing value, compute several quality criterion (Calinski and Harabatz, Ray and Turie, Davies and Bouldin, BIC,...) and propose a graphical interface for choosing the best number of clusters. In addition, the 3D graph representing the mean joint-trajectories of each cluster can be exported through LaTeX
in a 3D dynamic rotating PDF graph.
This package implements the multivariate adaptive shrinkage (mash) method of Urbut et al (2019) <DOI:10.1038/s41588-018-0268-8> for estimating and testing large numbers of effects in many conditions (or many outcomes). Mash takes an empirical Bayes approach to testing and effect estimation; it estimates patterns of similarity among conditions, then exploits these patterns to improve accuracy of the effect estimates. The core linear algebra is implemented in C++ for fast model fitting and posterior computation.
This package provides Scilab n1qn1'. This takes more memory than traditional L-BFGS. The n1qn1 routine is useful since it allows prespecification of a Hessian. If the Hessian is near enough the truth in optimization it can speed up the optimization problem. The algorithm is described in the Scilab optimization documentation located at <https://www.scilab.org/sites/default/files/optimization_in_scilab.pdf>. This version uses manually modified code from f2c to make this a C only binary.
Storm is a distributed real-time computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing real-time computation. . Storm includes a "Multi-Language" (or "Multilang") Protocol to allow implementation of Bolts and Spouts in languages other than Java. This R extension provides implementations of utility functions to allow an application developer to focus on application-specific functionality rather than Storm/R communications plumbing.
Generate data objects from XML versions of the Swiss Register of Plant Protection Products. An online version of the register can be accessed at <https://www.psm.admin.ch/de/produkte>. There is no guarantee of correspondence of the data read in using this package with that online version, or with the original registration documents. Also, the Federal Food Safety and Veterinary Office, coordinating the authorisation of plant protection products in Switzerland, does not answer requests regarding this package.
Non-negative Matrix Factorization(NMF) is a powerful tool for identifying the key features of microbial communities and a dimension-reduction method. When we are interested in the differences between the structures of two groups of communities, supervised NMF(Yun Cai, Hong Gu and Tobby Kenney (2017),<doi:10.1186/s40168-017-0323-1>) provides a better way to do this, while retaining all the advantages of NMF -- such as interpretability, and being based on a simple biological intuition.
Allows fitting of step-functions to univariate serial data where neither the number of jumps nor their positions is known by implementing the multiscale regression estimators SMUCE, simulataneous multiscale changepoint estimator, (K. Frick, A. Munk and H. Sieling, 2014) <doi:10.1111/rssb.12047> and HSMUCE, heterogeneous SMUCE, (F. Pein, H. Sieling and A. Munk, 2017) <doi:10.1111/rssb.12202>. In addition, confidence intervals for the change-point locations and bands for the unknown signal can be obtained.
Maximum likelihood estimation of univariate Gaussian Mixture Autoregressive (GMAR), Student's t Mixture Autoregressive (StMAR
), and Gaussian and Student's t Mixture Autoregressive (G-StMAR
) models, quantile residual tests, graphical diagnostics, forecast and simulate from GMAR, StMAR
and G-StMAR
processes. Leena Kalliovirta, Mika Meitz, Pentti Saikkonen (2015) <doi:10.1111/jtsa.12108>, Mika Meitz, Daniel Preve, Pentti Saikkonen (2023) <doi:10.1080/03610926.2021.1916531>, Savi Virolainen (2022) <doi:10.1515/snde-2020-0060>.
Imputes HLA classical alleles using GWAS SNP data, and it relies on a training set of HLA and SNP genotypes. HIBAG can be used by researchers with published parameter estimates instead of requiring access to large training sample datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging is a technique which improves the accuracy and stability of classifier ensembles using bootstrap aggregating and random variable selection.
The goal of `tpSVG`
is to detect and visualize spatial variation in the gene expression for spatially resolved transcriptomics data analysis. Specifically, `tpSVG`
introduces a family of count-based models, with generalizable parametric assumptions such as Poisson distribution or negative binomial distribution. In addition, comparing to currently available count-based model for spatially resolved data analysis, the `tpSVG`
models improves computational time, and hence greatly improves the applicability of count-based models in SRT data analysis.