Stratigraphic ranges of fossil marine animal genera from Sepkoski's (2002) published compendium. No changes have been made to any taxonomic names. However, first and last appearance intervals have been updated to be consistent with stages of the International Geological Timescale. Functionality for generating a plot of Sepkoski's evolutionary fauna is also included. For specific details on the compendium see: Sepkoski, J. J. (2002). A compendium of fossil marine animal genera. Bulletins of American Paleontology, 363, pp. 1-560 (ISBN 0-87710-450-6). Access: <https://www.biodiversitylibrary.org/item/40634#page/5/mode/1up>.
This package implements a maximum likelihood estimation (MLE) method for estimation and prediction of Gaussian process-based spatially varying coefficient (SVC) models (Dambon et al. (2021a) <doi:10.1016/j.spasta.2020.100470>). Covariance tapering (Furrer et al. (2006) <doi:10.1198/106186006X132178>) can be applied so that the method scales to large data. Further, it implements a joint variable selection of the fixed and random effects (Dambon et al. (2021b) <doi:10.1080/13658816.2022.2097684>). The package and its capabilities are described in Dambon et al. (2021c) <doi:10.48550/arXiv.2106.02364>.
Interactive tools for generating random samples. Users select an .xlsx, .csv, or delimited .txt file with population data and are walked through selecting the sample type (Simple Random Sample or Stratified), the number of backups desired, and a "stratify_on" value (if desired). The sample size is determined using a normal approximation to the hypergeometric distribution based on Nicholson (1956) <doi:10.1214/aoms/1177728270>. An .xlsx file is created with the sample and key metadata for reference. It is menu-driven and lets users pick an output directory. See vignettes for a detailed walk-through.
This package implements time series clustering along with optimized techniques related to the dynamic time warping distance and its corresponding lower bounds. The implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.
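The dynamic time warping distance mentioned above can be illustrated with a minimal Python sketch. This is the textbook dynamic-programming recurrence with an absolute-difference local cost, not the package's optimized implementation; the function name `dtw_distance` is hypothetical, and no window constraint or lower bound is applied.

```python
import math


def dtw_distance(x, y):
    """Unconstrained DTW distance between two numeric sequences.

    Uses |a - b| as the local cost and the three classic step
    directions (match, insertion, deletion). Runs in O(len(x) * len(y)).
    """
    n, m = len(x), len(y)
    # cost[i][j] = cheapest warping path aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because a warping path may repeat elements, two series that differ only by a stretched segment, such as [1, 2, 3] and [1, 2, 2, 3], are at distance zero.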
This package provides a statistical tool to infer multi-level partial correlations based on multi-subject time series data, especially for brain functional connectivity. It combines individual- and population-level inference by using the methods of Qiu and Zhou (2021) <doi:10.1080/01621459.2021.1917417> and Genovese and Wasserman (2006) <doi:10.1198/016214506000000339>. It implements two reliable estimation methods for partial correlation coefficients, using the scaled lasso and the lasso. It can be used to estimate individual- or population-level partial correlations, identify nonzero ones, and detect unequal partial correlation coefficients between two populations.
This package provides methods for working with nominal dates, times, and durations. Base R has sophisticated facilities for handling time, but these can give unexpected results if, for example, timezone is not handled properly. This package provides a more casual approach to support cases which do not require rigorous treatment. It systematically deconstructs the concepts origin and timezone, and de-emphasizes the display of seconds. It also converts among nominal durations such as seconds, hours, days, and weeks. See ?datetime and ?duration for examples. Adapted from metrumrg <http://r-forge.r-project.org/R/?group_id=1215>.
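The idea of converting among nominal durations can be sketched in Python as follows. This is only an illustration of fixed-ratio conversion that ignores time zones and daylight saving; the table `SECONDS_PER_UNIT` and the function `convert_duration` are hypothetical names, not part of the package's interface.

```python
# Nominal unit lengths in seconds: fixed ratios, no calendar or timezone logic.
SECONDS_PER_UNIT = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}


def convert_duration(value, from_unit, to_unit):
    """Convert a nominal duration between units via a common base of seconds."""
    return value * SECONDS_PER_UNIT[from_unit] / SECONDS_PER_UNIT[to_unit]
```

For example, `convert_duration(36, "hour", "day")` treats a day as exactly 24 hours, which is the "nominal" simplification described above.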
This zero-dependency package contains functions to declare labels and missing values, coupled with associated functions to create (weighted) tables of frequencies and various other summary measures. Some of the base functions have been rewritten to make use of the specific information about the missing values, most importantly to distinguish between empty NA and declared NA values. Some functions have similar functionality to the corresponding ones from packages "haven" and "labelled". The aim is to ensure as much compatibility as possible with these packages, while offering an alternative in the objects of class "declared".
Two Gray Level Co-occurrence Matrix ('GLCM') implementations are included: The first is a fast GLCM feature texture computation based on Python NumPy arrays (GitHub repository, <https://github.com/tzm030329/GLCM>). The second is a fast GLCM RcppArmadillo implementation which is parallelized (using 'OpenMP') with the option to return all GLCM features at once. For more information, see "Artifact-Free Thin Cloud Removal Using Gans" by Toizumi Takahiro, Zini Simone, Sagi Kazutoshi, Kaneko Eiji, Tsukada Masato, Schettini Raimondo (2019), IEEE International Conference on Image Processing (ICIP), pp. 3596-3600, <doi:10.1109/ICIP.2019.8803652>.
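For readers unfamiliar with the technique, a gray level co-occurrence matrix counts how often a pixel with level i has a neighbour with level j at a fixed offset. The following pure-Python sketch shows that counting step and one derived feature (contrast). It is not either of the implementations described above; the function names `glcm` and `glcm_contrast` and the non-symmetric counting convention are assumptions made for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels at offset (dx, dy).

    `image` is a list of rows of integers in range(levels).
    counts[i][j] is the number of pixels with level i whose neighbour
    at (row + dy, col + dx) has level j. Pairs that fall outside the
    image are skipped; the matrix is not symmetrized.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def glcm_contrast(counts):
    """Contrast feature: sum of (i - j)^2 weighted by normalized counts."""
    total = sum(sum(row) for row in counts)
    return sum(
        (i - j) ** 2 * n / total
        for i, row in enumerate(counts)
        for j, n in enumerate(row)
    )
```

With the 2x3 image [[0, 0, 1], [1, 1, 0]] and the default horizontal offset, each of the four level pairs occurs once, so the contrast is 0.5.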
Aligns peaks based on peak retention times and matches homologous peaks across samples. The underlying alignment procedure comprises three sequential steps: (1) full alignment of samples by linear transformation of retention times to maximise similarity among homologous peaks; (2) partial alignment of peaks within a user-defined retention time window to cluster homologous peaks; (3) merging rows that are likely to represent homologous substances (i.e. no sample shows peaks in both rows and the rows have similar retention time means). The algorithm is described in detail in Ottensmann et al. (2018) <doi:10.1371/journal.pone.0198311>.
Routines for two different test types, the Constant Conditional Correlation (CCC) test and the Vectorial Independence (VI) test are provided (Kurz and Spanhel (2022) <doi:10.1214/22-EJS2051>). The tests can be applied to check whether a conditional copula coincides with its partial copula. Functions to test whether a regular vine copula satisfies the so-called simplifying assumption or to test a single copula within a regular vine copula to be a (j-1)-th order partial copula are available. The CCC test comes with a decision tree approach to allow testing in high-dimensional settings.
InferCNV is used to explore tumor single cell RNA-Seq data to identify evidence for somatic large-scale chromosomal copy number alterations, such as gains or deletions of entire chromosomes or large segments of chromosomes. This is done by exploring expression intensity of genes across positions of a tumor genome in comparison to a set of reference "normal" cells. A heatmap is generated illustrating the relative expression intensities across each chromosome, and it often becomes readily apparent as to which regions of the tumor genome are over-abundant or less-abundant as compared to that of normal cells.
This package provides a versatile interior point solver that solves linear programs (LPs), quadratic programs (QPs), second-order cone programs (SOCPs), semidefinite programs (SDPs), and problems with exponential and power cone constraints (<https://clarabel.org/stable/>). Unlike interior point solvers based on the standard homogeneous self-dual embedding (HSDE) model, Clarabel handles quadratic objectives directly, without requiring any epigraphical reformulation of the objective function. It can therefore be significantly faster than other HSDE-based solvers for problems with quadratic objective functions. Infeasible problems are detected using a homogeneous embedding technique.
This package makes the qhull library available in R, in a similar manner as in Octave. Qhull computes convex hulls, Delaunay triangulations, halfspace intersections about a point, Voronoi diagrams, furthest-site Delaunay triangulations, and furthest-site Voronoi diagrams. It runs in 2-d, 3-d, 4-d, and higher dimensions. It implements the Quickhull algorithm for computing the convex hull. Qhull does not support constrained Delaunay triangulations, or mesh generation of non-convex objects, but the package does include some R functions that allow for this. Currently the package only gives access to Delaunay triangulation and convex hull computation.
This package provides an interface to build a unified database of genomic annotations and their coordinates (gene, transcript and exon levels). It is intended for use when simple tab-delimited annotations (or simple GRanges objects) are required instead of the more complex annotation Bioconductor packages. It is also useful when combinatorial annotation elements are required, such as RefSeq coordinates with Ensembl biotypes. Finally, it can download, construct and handle annotations with versioned genes and transcripts (where available, e.g. RefSeq and latest Ensembl). This is particularly useful in precision medicine applications where the latter must be reported.
Real-life time series data are rarely purely linear or purely nonlinear. Combining a linear time series model such as the autoregressive moving average (ARMA) model with a nonlinear neural network model such as the Long Short-Term Memory (LSTM) model yields a hybrid model that can be more accurate. Both the autoregressive integrated moving average (ARIMA) and autoregressive fractionally integrated moving average (ARFIMA) models can be implemented. Details can be found in Box et al. (2015, ISBN: 978-1-118-67502-1) and Hochreiter and Schmidhuber (1997) <doi:10.1162/neco.1997.9.8.1735>.
Computes the center of gravity (COG) of character-like binary images using three different methods. This package provides functions for estimating stroke-based, contour-based, and potential energy-based COG. It is useful for analyzing glyph structure in areas such as visual cognition research and font development. The contour-based method was originally proposed by Kotani et al. (2004) <https://ipsj.ixsq.nii.ac.jp/records/36793> and Kotani (2011) <https://shonan-it.repo.nii.ac.jp/records/2000243>, while the potential energy-based method was introduced by Kotani et al. (2006) <doi:10.11371/iieej.35.296>.
Generates simulated data representing the LOX drop testing process (also known as impact testing). A simulated process allows for accelerated study of test behavior. Functions are provided to simulate trials, test series, and groups of test series. Functions for creating plots specific to this process are also included. Test attributes and criteria can be set arbitrarily. This work is not endorsed by or affiliated with NASA. See "ASTM G86-17, Standard Test Method for Determining Ignition Sensitivity of Materials to Mechanical Impact in Ambient Liquid Oxygen and Pressurized Liquid and Gaseous Oxygen Environments" <doi:10.1520/G0086-17>.
It allows you to automatically monitor trends of social media messages by time, place and topic, with the aim of detecting public health threats early through the detection of signals (i.e., an unusual increase in the number of messages per time, topic and location). It was designed to focus on infectious diseases, and it can be extended to all hazards or other fields of study by modifying the topics and keywords. More information on the original package epitweetr is available in the peer-reviewed publication Espinosa et al. (2022) <doi:10.2807/1560-7917.ES.2022.27.39.2200177>.
A small subset of plots throughout the U.S. is sampled and assessed "on-the-ground" as forested or non-forested by the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program, but the FIA also has access to remotely sensed data for all land in the country. The forested package contains data frames intended for use in predictive modeling applications where the more easily accessible remotely sensed data can be used to predict whether a plot is forested or non-forested. Currently, the package provides data for Washington and Georgia.
Generalizes application of gray-level co-occurrence matrix (GLCM) metrics to objects outside of images. The current focus is to apply GLCM metrics to the study of biological networks and fitness landscapes that are used in studying evolutionary medicine and biology, particularly the evolution of cancer resistance. The package was developed as part of the author's publication in Physics in Medicine and Biology, Barker-Clarke et al. (2023) <doi:10.1088/1361-6560/ace305>. A general reference for learning more about mathematical oncology is Rockne et al. (2019) <doi:10.1088/1478-3975/ab1a09>.
An implementation of SPRE (standardised predicted random-effects) statistics in R to explore heterogeneity in genetic association meta-analyses, as described by Magosi et al. (2019) <doi:10.1093/bioinformatics/btz590>. SPRE statistics are precision weighted residuals that indicate the direction and extent with which individual study-effects in a meta-analysis deviate from the average genetic effect. Overly influential positive outliers have the potential to inflate average genetic effects in a meta-analysis whilst negative outliers might lower or change the direction of effect. See the getspres website for documentation and examples <https://magosil86.github.io/getspres/>.
This package provides a lightweight framework for creating high quality, complex heatmaps using base graphics. Supports hierarchical clustering with dendrograms, column and row scaling, cluster sub-divisions, customizable cell colours, shapes and sizes, legends, and flexible layouts for arranging multiple heatmaps. Designed to return plot objects that can be easily arranged with other plots without sacrificing resolution. Methods for hierarchical clustering and distance computations are described in Murtagh and Contreras (2012) <doi:10.1002/wics.53>. Dendrogram visualisation methods are based on the ggdendro package by de Vries and Ripley (2020) <https://CRAN.R-project.org/package=ggdendro>.
Fits the joint model proposed by Henderson and colleagues (2000) <doi:10.1093/biostatistics/1.4.465>, but extended to the case of multiple continuous longitudinal measures. The time-to-event data is modelled using a Cox proportional hazards regression model with time-varying covariates. The multiple longitudinal outcomes are modelled using a multivariate version of the Laird and Ware linear mixed model. The association is captured by a multivariate latent Gaussian process. The model is estimated using a Monte Carlo Expectation Maximization algorithm. This project was funded by the Medical Research Council (Grant number MR/M013227/1).
The Cauchy distribution is a special case of the t distribution when the degrees of freedom are equal to 1. The functions are related to the multivariate Cauchy distribution and include simulation, computation of the density, maximum likelihood estimation, contour plot of the bivariate Cauchy distribution, and discriminant analysis. References include: Nadarajah S. and Kotz S. (2008). "Estimation methods for the multivariate t distribution". Acta Applicandae Mathematicae, 102(1): 99--118. <doi:10.1007/s10440-008-9212-8>, and Kanti V. Mardia, John T. Kent and John M. Bibby (1979). "Multivariate analysis", ISBN:978-0124712522. Academic Press, London.
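The opening claim, that the Cauchy distribution is the t distribution with one degree of freedom, can be checked numerically in the univariate case. The sketch below uses only the standard closed-form densities; the function names `student_t_pdf` and `cauchy_pdf` are hypothetical and are not taken from the package, which targets the multivariate case.

```python
import math


def student_t_pdf(x, df):
    """Density of the univariate Student t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def cauchy_pdf(x):
    """Density of the standard Cauchy distribution: 1 / (pi * (1 + x^2))."""
    return 1.0 / (math.pi * (1 + x * x))


# With df = 1 the two densities agree up to floating-point rounding.
for x in (-3.0, 0.0, 0.5, 10.0):
    assert math.isclose(student_t_pdf(x, 1), cauchy_pdf(x), rel_tol=1e-12)
```

Setting df = 1 reduces the normalizing constant to 1/pi and the exponent to -1, which is exactly the Cauchy density.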