This package provides functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". It implements and imports a large collection of methods for multiblock data analysis with common interfaces, result and plotting functions, several real data sets, and six vignettes covering a range of different applications.
This package contains a set of tools for constructing and coercing into and from the "mdate" class. This date class implements ISO 8601-2:2019(E) and allows regular dates to be annotated to express unspecified date components, approximate or uncertain date components, date ranges, and sets of dates. This is useful for describing and analysing temporal information, whether historical or recent, where date precision may vary.
Matching with string distance has never been easier! messy.cats contains various functions that employ string distance tools to make data management easier for users working with categorical data. Categorical data, especially user-inputted categorical data, which is often plagued by typos, can be difficult to work with. messy.cats aims to provide functions that make cleaning categorical data simple and easy.
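As an illustration of the string-distance matching idea described above, here is a minimal Python sketch. It does not use or reproduce the messy.cats API; the function names `levenshtein` and `closest_category` and the example categories are hypothetical, and plain edit distance is assumed as the distance measure.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def closest_category(value, clean):
    """Return the clean category with the smallest edit distance to value."""
    return min(clean, key=lambda c: levenshtein(value.lower(), c.lower()))


# Hypothetical messy input and the clean categories it should map to.
clean_categories = ["apple", "banana", "cherry"]
messy_values = ["aple", "bananna", "chery", "Apple"]
matched = [closest_category(v, clean_categories) for v in messy_values]
print(matched)
```

In practice a distance threshold would usually be added so that values far from every clean category are left unmatched rather than forced into the nearest one.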
Flexibly simulates a dataset with time-varying covariates with user-specified exchangeable correlation structures across and within clusters. Covariates can be normal or binary and can be static within a cluster or time-varying. Time-varying normal variables can optionally have linear trajectories within each cluster. See ?make_one_dataset for the main wrapper function. See Montez-Rath et al. <arXiv:1709.10074> for methodological details.
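To make the notion of an exchangeable correlation structure concrete, the following Python sketch draws standard normal values within one cluster so that every pair has the same correlation rho. It is not the package's implementation: the function name `simulate_exchangeable_cluster` is hypothetical, and the shared-component construction used here only covers non-negative rho.

```python
import math
import random


def simulate_exchangeable_cluster(size, rho, rng):
    """Draw `size` standard normal values with pairwise correlation rho.

    Each value is sqrt(rho) * shared + sqrt(1 - rho) * own_noise, so the
    variance is rho + (1 - rho) = 1 and the covariance of any pair is rho.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this construction requires 0 <= rho <= 1")
    shared = rng.gauss(0.0, 1.0)  # component common to the whole cluster
    return [
        math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        for _ in range(size)
    ]


rng = random.Random(42)
# Ten clusters of five observations each, within-cluster correlation 0.6.
clusters = [simulate_exchangeable_cluster(5, 0.6, rng) for _ in range(10)]
```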
Scale alignment is a new procedure for rescaling dimensions of between-items multidimensional Rasch family models so that dimension scores can be compared directly (Feuerstahler & Wilson, 2019; under review) <doi:10.1111/jedm.12209>. This package includes functions for implementing delta-dimensional alignment (DDA) and logistic regression alignment (LRA) for dichotomous or polytomous data. This package also includes a wrapper for models fit using the TAM package.
This package provides a set of functions for manipulating data frames in accordance with specific business rules. In addition, it includes wrapper functions for commonly used functions from the popular tidyverse package, making it easy to integrate these functions into data analysis workflows. The package is designed to streamline data preprocessing and help users quickly and efficiently perform data transformations that are specific to their business needs.
Grammatical evolution (see O'Neill, M. and Ryan, C. (2003, ISBN:1-4020-7444-1)) uses decoders to convert linear genes (binary or integer) into programs. In addition, automatic determination of codon precision with a limited rule choice bias is provided. For a recent survey of grammatical evolution, see Ryan, C., O'Neill, M., and Collins, J. J. (2018) <doi:10.1007/978-3-319-78717-6>.
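The decoding step of grammatical evolution can be sketched in Python as follows. This is an illustrative sketch, not the package's decoder: the toy grammar, the function name `decode`, and the choices to consume one codon per expansion and to wrap around the genome are assumptions. Each integer codon selects a production for the leftmost nonterminal via the modulo rule.

```python
# Toy grammar (hypothetical): nonterminal -> list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["y"]],
}


def decode(genome, grammar, start="<expr>", max_steps=100):
    """Map a list of integer codons to a program string.

    Returns None if nonterminals remain after max_steps expansions.
    """
    symbols = [start]
    for step in range(max_steps):
        # Find the leftmost nonterminal, if any.
        index = next((i for i, s in enumerate(symbols) if s in grammar), None)
        if index is None:
            break
        rules = grammar[symbols[index]]
        codon = genome[step % len(genome)]  # wrap around the genome
        symbols[index:index + 1] = rules[codon % len(rules)]
    if any(s in grammar for s in symbols):
        return None  # derivation did not terminate within max_steps
    return " ".join(symbols)


print(decode([0, 1, 0, 1, 1, 1], GRAMMAR))  # x * y
```

The modulo rule favours some productions whenever the codon range is not a multiple of the number of rules, which is the kind of rule choice bias that codon precision is meant to limit.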
The package provides ready-to-use epigenomes (obtained from TWGBS) and transcriptomes (RNA-seq) from various tissues as obtained in the study (Delacher and Imbusch 2017, PMID: 28783152). Regulatory T cells (Treg cells) perform two distinct functions: they maintain self-tolerance, and they support organ homeostasis by differentiating into specialized tissue Treg cells. The underlying dataset characterises the epigenetic and transcriptomic modifications for specialized tissue Treg cells.
PaleoClim <http://www.paleoclim.org> (Brown et al. 2019, <doi:10.1038/sdata.2018.254>) is a set of free, high resolution paleoclimate surfaces covering the whole globe. It includes data on surface temperature, precipitation and the standard bioclimatic variables commonly used in ecological modelling, derived from the HadCM3 general circulation model and downscaled to a spatial resolution of up to 2.5 minutes. Simulations are available for key time periods from the Late Holocene to mid-Pliocene. Data on current and Last Glacial Maximum climate is derived from CHELSA (Karger et al. 2017, <doi:10.1038/sdata.2017.122>) and reprocessed by PaleoClim to match their format; it is available at up to 30 seconds resolution. This package provides a simple interface for downloading PaleoClim data in R, with support for caching and filtering retrieved data by period, resolution, and geographic extent.
Estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent response and covariates are mismatched and observed intermittently within subjects. Kernel weighted estimating equations are used for generalized linear models with either time-invariant or time-dependent coefficients. Cao, H., Li, J., and Fine, J. P. (2016) <doi:10.1214/16-EJS1141>. Cao, H., Zeng, D., and Fine, J. P. (2015) <doi:10.1111/rssb.12086>.
This package performs Bayesian non-parametric calibration of multiple related radiocarbon determinations, and summarises the calendar age information to plot their joint calendar age density (see Heaton (2022) <doi:10.1111/rssc.12599>). Also models the occurrence of radiocarbon samples as a variable-rate (inhomogeneous) Poisson process, plotting the posterior estimate for the occurrence rate of the samples over calendar time, and providing information about potential change points.
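To illustrate what a variable-rate (inhomogeneous) Poisson process is, the sketch below simulates event times from one using the standard thinning method. It shows the model only, not the package's Bayesian estimation of the rate; the function names and the sinusoidal rate are hypothetical, and `rate_max` is assumed to bound the rate over the whole interval.

```python
import math
import random


def simulate_inhomogeneous_poisson(rate, rate_max, t_end, rng):
    """Simulate event times on [0, t_end) by thinning.

    Candidates come from a homogeneous process with rate rate_max; each
    candidate at time t is kept with probability rate(t) / rate_max.
    """
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)  # gap to the next candidate event
        if t >= t_end:
            return events
        if rng.random() < rate(t) / rate_max:
            events.append(t)


def example_rate(t):
    """Hypothetical occurrence rate, always between 0.5 and 3.5."""
    return 2.0 + 1.5 * math.sin(t / 50.0)


rng = random.Random(1)
events = simulate_inhomogeneous_poisson(example_rate, 3.5, 1000.0, rng)
print(len(events), "simulated events")
```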
Calculate p-values and confidence intervals using cluster-adjusted t-statistics (based on Ibragimov and Muller (2010) <DOI:10.1198/jbes.2009.08046>), pairs cluster bootstrapped t-statistics, and wild cluster bootstrapped t-statistics (the latter two techniques based on Cameron, Gelbach, and Miller (2008) <DOI:10.1162/rest.90.3.414>). Procedures are included for use with GLM, ivreg, plm (pooling or fixed effects), and mlogit models.
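The idea behind the cluster-adjusted t-statistic can be sketched in Python: estimate the coefficient separately within each cluster, then apply a one-sample t-test to the cluster-level estimates. This is a simplified sketch under stated assumptions, not the package's procedure: the function names and toy data are hypothetical, and a single-predictor OLS slope stands in for the model coefficient.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def cluster_t_statistic(clusters, null_value=0.0):
    """t-statistic and degrees of freedom from per-cluster slope estimates."""
    estimates = [ols_slope(x, y) for x, y in clusters.values()]
    g = len(estimates)
    if g < 2:
        raise ValueError("at least two clusters are required")
    standard_error = stdev(estimates) / sqrt(g)
    return (mean(estimates) - null_value) / standard_error, g - 1


# Hypothetical toy data: cluster label -> (x values, y values).
toy_clusters = {
    "A": ([0, 1, 2], [0, 1, 2]),  # slope 1
    "B": ([0, 1, 2], [1, 3, 5]),  # slope 2
    "C": ([0, 1, 2], [0, 3, 6]),  # slope 3
}
t_stat, df = cluster_t_statistic(toy_clusters)
print(round(t_stat, 3), df)
```

The statistic is then compared with a t distribution on `df` degrees of freedom, so the reference distribution depends on the number of clusters rather than the number of observations.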
Fits engression models for nonlinear distributional regression. Predictors and targets can be univariate or multivariate. Functionality includes estimation of conditional mean, estimation of conditional quantiles, or sampling from the fitted distribution. Training is done full-batch on CPU (the Python version offers GPU-accelerated stochastic gradient descent). Based on "Engression: Extrapolation for nonlinear regression?" by Xinwei Shen and Nicolai Meinshausen (2023) <arXiv:2307.00835>. Also supports classification (experimental).
Simulate general insurance policies, losses and loss emergence. The functions contemplate deterministic and stochastic policy retention and growth scenarios. Retention and growth rates are percentages relative to the expiring portfolio. Claims are simulated for each policy. This is accomplished either by assuming a frequency distribution per development lag or by generating random wait times until claim emergence and settlement. Loss simulation uses standard loss distributions for claim amounts.
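The wait-time approach to claim emergence can be sketched in Python as below. This is an illustrative sketch, not the package's functions: the function name, all parameter names and default values, the exponential waiting times, and the lognormal severity are assumptions made for the example.

```python
import random


def simulate_policy_claims(rng, claim_rate=0.5, mean_report_lag=0.25,
                           mean_settle_lag=1.0, mu=8.0, sigma=1.0, term=1.0):
    """Simulate the claims of one policy over its term (times in years).

    Occurrence times follow exponential gaps with rate claim_rate; each
    claim then waits an exponential time until it is reported and a
    further exponential time until it is settled.
    """
    claims = []
    occurred = rng.expovariate(claim_rate)
    while occurred < term:
        reported = occurred + rng.expovariate(1.0 / mean_report_lag)
        settled = reported + rng.expovariate(1.0 / mean_settle_lag)
        claims.append({
            "occurred": occurred,
            "reported": reported,
            "settled": settled,
            "amount": rng.lognormvariate(mu, sigma),  # assumed severity
        })
        occurred += rng.expovariate(claim_rate)
    return claims


rng = random.Random(2024)
portfolio = [simulate_policy_claims(rng) for _ in range(1000)]
all_claims = [claim for policy in portfolio for claim in policy]
print(len(all_claims), "claims across", len(portfolio), "policies")
```

Grouping the simulated claims by the lag between occurrence and report or settlement gives the loss emergence pattern by development period.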
This package provides a C++ backend for multivariate phylogenetic comparative models implemented in the R package PCMBase. Can be used in combination with PCMBase to enable fast and parallel likelihood calculation. Implements the pruning likelihood calculation algorithm described in Mitov et al. (2018) <arXiv:1809.09014>. Uses the SPLITT C++ library for parallel tree traversal described in Mitov and Stadler (2018) <doi:10.1111/2041-210X.13136>.
Phenotypic analysis of field trials using mixed models with and without spatial components. One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris. Some functions have been created to be used in conjunction with the R package asreml for the ASReml software, which can be obtained upon purchase from VSN International (<https://vsni.co.uk/software/asreml-r/>).
This package provides functions for fitting multi-state semi-Markov models to longitudinal data. A parametric maximum likelihood estimation method adapted to deal with Exponential, Weibull and Exponentiated Weibull distributions is considered. Right-censoring can be taken into account and both constant and time-varying covariates can be included using a Cox proportional hazards model. Reference: A. Krol and P. Saint-Pierre (2015) <doi:10.18637/jss.v066.i06>.
C++ classes for sparse matrix methods including implementation of sparse LDL decomposition of symmetric matrices and solvers described by Timothy A. Davis (2016) <https://fossies.org/linux/SuiteSparse/LDL/Doc/ldl_userguide.pdf>. Provides a set of C++ classes for basic sparse matrix specification and linear algebra, and a class to implement sparse LDL decomposition and solvers. See <https://github.com/samuel-watson/SparseChol> for details.
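For readers unfamiliar with the LDL decomposition, the Python sketch below factors a small symmetric matrix as A = L D L^T and uses the factors to solve A x = b. It is a dense, textbook version written for clarity and assumes no pivoting is needed; it is not the sparse algorithm or the C++ classes of the package, and the function names are hypothetical.

```python
def ldl_decompose(a):
    """Return (L, d) with A = L * diag(d) * L^T, L unit lower triangular."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(lower[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(lower[i][k] * lower[j][k] * d[k] for k in range(j))
            lower[i][j] = (a[i][j] - partial) / d[j]
    return lower, d


def ldl_solve(lower, d, b):
    """Solve A x = b given the factors of A."""
    n = len(b)
    z = list(b)
    for i in range(n):                      # forward solve: L z = b
        z[i] -= sum(lower[i][k] * z[k] for k in range(i))
    x = [z[i] / d[i] for i in range(n)]     # diagonal solve: D y = z
    for i in reversed(range(n)):            # backward solve: L^T x = y
        x[i] -= sum(lower[k][i] * x[k] for k in range(i + 1, n))
    return x


matrix = [[4.0, 2.0, 0.0],
          [2.0, 5.0, 3.0],
          [0.0, 3.0, 6.0]]
rhs = [8.0, 21.0, 24.0]
lower, d = ldl_decompose(matrix)
solution = ldl_solve(lower, d, rhs)
print(solution)  # approximately [1.0, 2.0, 3.0]
```

A sparse implementation stores only the nonzero entries of L and first works out its nonzero pattern symbolically, which is what makes the approach efficient for large matrices.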
This R package supports interactive visualization of multi-channel images and segmentation masks generated by imaging mass cytometry and other highly multiplexed imaging techniques using shiny. The cytoviewer interface is divided into image-level (Composite and Channels) and cell-level visualization (Masks). It allows users to overlay individual images with segmentation masks, integrates well with SingleCellExperiment and SpatialExperiment objects for metadata visualization and supports image downloads.
The ddPCRclust algorithm can automatically quantify the CPDs of non-orthogonal ddPCR reactions with up to four targets. In order to determine the correct droplet count for each target, it is crucial to both identify all clusters and label them correctly based on their position. For more information on what data can be analyzed and how a template needs to be formatted, please check the vignette.
Epialleles are specific DNA methylation patterns that are mitotically and/or meiotically inherited. This package calls and reports cytosine methylation as well as frequencies of hypermethylated epialleles at the level of genomic regions or individual cytosines in next-generation sequencing data using binary alignment map (BAM) files as an input. Among other things, this package can also extract and visualise methylation patterns and assess allele specificity of methylation.
This package facilitates phyloseq exploration and analysis of taxonomic profiling data. This package provides tools for the manipulation, statistical analysis, and visualization of taxonomic profiling data. In addition to targeted case-control studies, microbiome facilitates scalable exploration of population cohorts. This package supports the independent phyloseq data format and expands the available toolkit in order to facilitate the standardization of the analyses and the development of best practices.
Set of functions for analyzing Atomic Force Microscope (AFM) force-distance curves. It allows users to obtain the contact and unbinding points, perform the baseline correction, estimate the Young's modulus, fit up to two exponential decay functions to a stress-relaxation / creep experiment, and obtain adhesion energies. These operations can be done either over a single F-d curve or over a set of F-d curves in batch mode.
Provides a set of functions to easily generate and iterate complex networks. The functions can be used to generate realistic networks with a wide range of different clustering, density, and average path length. For more information consult research articles by Amiyaal Ilany and Erol Akcay (2016) <doi:10.1093/icb/icw068> and Ilany and Akcay (2016) <doi:10.1101/026120>, which have inspired many methods in this package.