This function performs genomic prediction of cross performance using genotype and phenotype data. It processes data in several steps including loading necessary software, converting genotype data, processing phenotype data, fitting mixed models, and predicting cross performance based on weighted marker effects. For more information, see Labroo et al. (2023) <doi:10.1007/s00122-023-04377-z>.
This package provides an implementation of the generic composite similarity measure (GCSM) described in Liu et al. (2020) <doi:10.1016/j.ecoinf.2020.101169>. The implementation is in C++ and uses RcppArmadillo. Additionally, implementations of the structural similarity (SSIM) and the composite similarity measure based on means, standard deviations, and correlation coefficient (CMSC) are included.
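As an illustration of the kind of measure mentioned above, the following is a minimal sketch of a global SSIM between two equally long numeric vectors, written in plain Python rather than the package's C++ code. The function name `ssim` and the constants `c1` and `c2` (the usual defaults for data scaled to the range 0 to 1) are assumptions made for this example and do not reflect the package's interface.

```python
def ssim(x, y, c1=1e-4, c2=9e-4):
    """Global structural similarity of two equal-length sequences.

    c1 and c2 are small stabilising constants; the defaults assume
    data with a dynamic range of 1 (hypothetical choice for this sketch).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample variances and covariance (divisor n - 1).
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / (n - 1)
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    a = [0.1, 0.4, 0.35, 0.8]
    b = list(reversed(a))
    print(ssim(a, a))  # identical inputs give the maximum similarity
    print(ssim(a, b))  # same mean and variance, different structure
```

The measure combines agreement in means, variances, and covariance in a single ratio, so two inputs with identical summary statistics can still score below 1 when their structure differs.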
This package provides a Hierarchical Spatial Autoregressive Model (HSAR), based on a Bayesian Markov Chain Monte Carlo (MCMC) algorithm (Dong and Harris (2014) <doi:10.1111/gean.12049>). The creation of this package was supported by the Economic and Social Research Council (ESRC) through the Applied Quantitative Methods Network: Phase II, grant number ES/K006460/1.
Detection of haplotype patterns that include single nucleotide polymorphisms (SNPs) and non-contiguous haplotypes that are associated with a phenotype. Methods for implementing HTRX are described in Yang Y, Lawson DJ (2023) <doi:10.1093/bioadv/vbad038> and Barrie W, Yang Y, Irving-Pease E.K, et al (2024) <doi:10.1038/s41586-023-06618-z>.
Estimate test-retest reliability for complex sampling strategies and extract variances using IntraClass Effect Decomposition. Developed by Brandmaier et al. (2018) "Assessing reliability in neuroimaging research through intra-class effect decomposition (ICED)" <doi:10.7554/eLife.35718>. Also includes functions to simulate data based on sampling strategy. Unofficial version release name: "Good work squirrels".
Algorithms for multivariate outlier detection when missing values occur. Algorithms are based on Mahalanobis distance or data depth. Imputation is based on the multivariate normal model or uses nearest neighbour donors. The algorithms take sample designs, in particular weighting, into account. The methods are described in Bill and Hulliger (2016) <doi:10.17713/ajs.v45i1.86>.
Download data from Brazil's Origin Destination Surveys. The package covers data from household travel surveys, dictionaries of variables, and the spatial geometries of surveys conducted in different years and across various urban areas in Brazil. For some cities, the package will include enhanced versions of the data sets with variables "harmonized" across different years.
This package provides a method of clustering functional data using subregion information of the curves. It is intended to supplement the fda and fda.usc packages in functional data object clustering. It also facilitates the printing and plotting of the results in a tree format and limits the partitioning candidates into a specific set of subregions.
Historic Pell grant data as provided by the US Department of Education. This package contains data about how much Pell grant funding was awarded by which institution in which year. Raw data can be downloaded from <https://www2.ed.gov/finaid/prof/resources/data/pell-institution.html>.
The SC-SR Algorithm is used to calculate fully non-parametric and self-consistent estimators of the cause-specific failure probabilities in the presence of interval-censoring and possible masking of the failure cause in a competing risks environment. Version 2.0 adds a function that creates the probability matrix from double-censored data.
Create correlation networks using St. Nicolas House Analysis (SNHA). The package can be used for visualizing multivariate data similar to Principal Component Analysis (PCA) or Multidimensional Scaling (MDS) using a ranking approach. In contrast to MDS and PCA, SNHA uses a network approach to explore interacting variables. For details see Hermanussen et al. (2021) <doi:10.3390/ijerph18041741>.
Computes the maximum likelihood estimator of the generalised additive and index regression with shape constraints. Each additive component function is assumed to obey one of the nine possible shape restrictions: linear, increasing, decreasing, convex, convex increasing, convex decreasing, concave, concave increasing, or concave decreasing. For details, see Chen and Samworth (2016) <doi:10.1111/rssb.12137>.
This package provides tools for a wavelet-based approach to analyzing spatial synchrony, principally in ecological data. Some tools will be useful for studying community synchrony. See, for instance, Sheppard et al. (2016) <doi:10.1038/NCLIMATE2991>, Sheppard et al. (2017) <doi:10.1051/epjnbp/2017000>, Sheppard et al. (2019) <doi:10.1371/journal.pcbi.1006744>.
Many modern biological datasets consist of small counts that are not well fit by standard linear-Gaussian methods such as principal component analysis. This package provides implementations of count-based feature selection and dimension reduction algorithms. These methods can be used to facilitate unsupervised analysis of any high-dimensional data such as single-cell RNA-seq.
MDQC is a multivariate quality assessment method for microarrays based on quality control (QC) reports. The Mahalanobis distance of an array's quality attributes is used to measure the similarity of the quality of that array against the quality of the other arrays. Then, arrays with unusually high distances can be flagged as potentially low-quality.
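The distance-based flagging described above can be sketched as follows. This is a simplified illustration in plain Python that uses the classical sample mean and covariance; it is not the MDQC implementation, and the function names and the toy quality attributes are hypothetical.

```python
def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mu):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [row[j] - mu[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += d[a] * d[b] / (n - 1)
    return cov


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    p = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def mahalanobis_distances(rows):
    """Distance of every row from the centre of all rows."""
    mu = mean_vector(rows)
    cov = covariance_matrix(rows, mu)
    distances = []
    for row in rows:
        d = [x - m for x, m in zip(row, mu)]
        z = solve(cov, d)  # z = cov^{-1} d without forming the inverse
        distances.append(sum(a * b for a, b in zip(d, z)) ** 0.5)
    return distances


# Toy quality attributes for six arrays (made-up numbers); the last one is unusual.
quality = [
    [1.0, 2.0], [1.1, 2.1], [0.9, 1.9],
    [1.0, 2.1], [1.1, 1.9], [5.0, 9.0],
]
dist = mahalanobis_distances(quality)
worst = max(range(len(dist)), key=dist.__getitem__)
print("array with the largest distance:", worst)
```

Arrays would then be ranked or thresholded by distance. Because an outlier also inflates the classical covariance estimate, a robust estimate of centre and scatter is usually preferable in practice.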
This package is an implementation of about 6 major classes of statistical regression models. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring. VGLMs can be loosely thought of as multivariate generalised linear models.
reptyr is a utility for taking an existing running program and attaching it to a new terminal. Started a long-running process over ssh, but have to leave and don't want to interrupt it? Just start a screen, use reptyr to grab it, and then kill the ssh session and head on home.
In order to make Arrow Database Connectivity (ADBC, <https://arrow.apache.org/adbc/>) accessible from R, an interface compliant with the DBI package is provided, using driver back-ends that are implemented in the adbcdrivermanager framework. This enables interacting with database systems using the Arrow data format, thereby offering an efficient alternative to ODBC for analytical applications.
This package provides a method to filter correlation and covariance matrices by averaging bootstrapped filtered hierarchical clustering and boosting. See Ch. Bongiorno and D. Challet, Covariance matrix filtering with bootstrapped hierarchies (2020) <arXiv:2003.05807> and Ch. Bongiorno and D. Challet, Reactive Global Minimum Variance Portfolios with k-BAHC covariance cleaning (2020) <arXiv:2005.08703>.
The main function generateDataset() processes a user-supplied .R file that contains metadata parameters in order to generate actual data. The metadata parameters have to be structured in the form of metadata objects, the format of which is outlined in the package vignette. This approach allows artificial data to be generated in a transparent and reproducible manner.
Unsupervised, multivariate, binary clustering for meaningful annotation of data, taking into account the uncertainty in the data. A specific constructor for trajectory analysis in movement ecology yields behavioural annotation of trajectories based on estimated local measures of velocity and turning angle, optionally with a solar position covariate as a daytime indicator ("Expectation-Maximization Binary Clustering for Behavioural Annotation").
An implementation of extended state-space SIR models developed by the Song Lab at the UM School of Public Health. The functions extend the basic SIR model by 1) including a time-varying transmission modifier, 2) adding a time-dependent quarantine compartment, and 3) adding a time-dependent antibody-immunization compartment. See Wang L. (2020) <doi:10.6339/JDS.202007_18(3).0003>.
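To illustrate the first of these extensions, the following is a minimal deterministic sketch of an SIR model whose transmission rate is scaled by a time-varying modifier. It is a discrete-time simplification in plain Python, not the package's Bayesian state-space implementation; the function `simulate_sir`, its parameters, and the example values are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, modifier, s0, i0, days):
    """Daily-step SIR model on population proportions.

    beta     : baseline transmission rate per day
    gamma    : recovery rate per day
    modifier : function t -> value in [0, 1] scaling transmission on day t
    s0, i0   : initial susceptible and infected proportions
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    path = [(s, i, r)]
    for t in range(days):
        new_infections = modifier(t) * beta * s * i
        new_recoveries = gamma * i
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        path.append((s, i, r))
    return path


def no_intervention(t):
    return 1.0


def intervention_from_day_10(t):
    # Hypothetical step function: transmission drops to 30% from day 10 onward.
    return 1.0 if t < 10 else 0.3


baseline = simulate_sir(0.3, 0.1, no_intervention, 0.999, 0.001, 200)
modified = simulate_sir(0.3, 0.1, intervention_from_day_10, 0.999, 0.001, 200)
print("peak infected, baseline:", max(i for _, i, _ in baseline))
print("peak infected, modified:", max(i for _, i, _ in modified))
```

A quarantine or immunization compartment would be added in the same way, by moving a time-dependent fraction of the susceptible proportion into an extra state at each step.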
Automated compound deconvolution, alignment across samples, and identification of metabolites by spectral library matching in Gas Chromatography - Mass spectrometry (GC-MS) untargeted metabolomics. Outputs a table with compound names, matching scores and the integrated area of the compound for each sample. Package implementation is described in Domingo-Almenara et al. (2016) <doi:10.1021/acs.analchem.6b02927>.
Because fungicide resistance is an important phenotypic trait for fungi and oomycetes, it is necessary to have a standardized method of statistically analyzing the Effective Concentration (EC) values. This package is designed for those who are not terribly familiar with R to be able to analyze and plot an entire set of isolates using the drc package.