This package provides constrained triangulation of polygons. Ear cutting (or ear clipping) applies constrained triangulation by successively cutting triangles from a polygon defined by one or more paths. Holes are supported by introducing a bridge segment between polygon paths. This package wraps the header-only library earcut.hpp <https://github.com/mapbox/earcut.hpp.git> which includes a reference to the method used by Held, M. (2001) <doi:10.1007/s00453-001-0028-4>.
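The ear-cutting idea can be sketched in a few lines. The following is a minimal illustration for a single simple polygon without holes, not the earcut.hpp implementation (which adds hole bridging and several optimizations); the function names `signed_area`, `cross`, `point_in_triangle` and `ear_clip` are hypothetical.

```python
def signed_area(poly):
    """Shoelace formula: positive when vertices are in counter-clockwise order."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1])
    )


def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def point_in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of counter-clockwise triangle abc."""
    return cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0


def ear_clip(poly):
    """Triangulate a simple polygon; returns index triples into poly."""
    idx = list(range(len(poly)))
    if signed_area(poly) < 0:
        idx.reverse()  # work in counter-clockwise order
    triangles = []
    while len(idx) > 3:
        n = len(idx)
        for k in range(n):
            i, j, l = idx[k - 1], idx[k], idx[(k + 1) % n]
            a, b, c = poly[i], poly[j], poly[l]
            if cross(a, b, c) <= 0:
                continue  # reflex or degenerate corner, cannot be an ear
            if any(point_in_triangle(poly[m], a, b, c)
                   for m in idx if m not in (i, j, l)):
                continue  # another vertex blocks this candidate ear
            triangles.append((i, j, l))
            del idx[k]  # cut the ear off and rescan the smaller polygon
            break
        else:
            raise ValueError("no ear found; polygon may be self-intersecting")
    triangles.append(tuple(idx))
    return triangles
```

For a polygon with n vertices this sketch returns n - 2 triangles whose areas sum to the polygon area; it rescans the vertex list after every cut, so it is quadratic to cubic in n and meant only to show the principle.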
Fits a Gaussian process model to data. Gaussian processes are commonly used in computer experiments to fit an interpolating model. The model is stored as an R6 object and can be easily updated with new data. There are options to run in parallel, and Rcpp has been used to speed up calculations. For more info about Gaussian process software, see Erickson et al. (2018) <doi:10.1016/j.ejor.2017.10.002>.
Introduces a Copilot-like completion experience, but it knows how to talk to the objects in your R environment. ellmer chats are integrated directly into your RStudio and Positron sessions, automatically incorporating relevant context from surrounding lines of code and your global environment (like data frame columns and types). Open the package dialog box with a keyboard shortcut, type your request, and the assistant will stream its response directly into your documents.
Implements interventional effects for mediation analysis with up to 3 mediators. The methods used are based on VanderWeele, Vansteelandt and Robins (2014) <doi:10.1097/ede.0000000000000034>, Vansteelandt and Daniel (2017) <doi:10.1097/ede.0000000000000596> and Chan and Leung (2020; unpublished manuscript, available on request from the author of this package). Linear regression, logistic regression and Poisson regression are used for continuous, binary and count mediator/outcome variables respectively.
This package implements Individual Conditional Expectation (ICE) plots, a tool for visualizing the model estimated by any supervised learning algorithm. ICE plots refine Friedman's partial dependence plot by graphing the functional relationship between the predicted response and a covariate of interest for individual observations. Specifically, ICE plots highlight the variation in the fitted values across the range of a covariate of interest, suggesting where and to what extent heterogeneities may exist.
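How ICE curves relate to the partial dependence plot can be shown with a small sketch. The code below is an illustration in Python rather than the package's R interface; `ice_curves`, `partial_dependence` and the toy model are hypothetical names and data.

```python
def ice_curves(predict, rows, feature, grid):
    """One curve per observation: predictions as `feature` sweeps over `grid`
    while the observation's other covariates stay fixed."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)       # copy so the original row is untouched
            modified[feature] = value  # overwrite only the covariate of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(curves):
    """Friedman's partial dependence is the pointwise average of the ICE curves."""
    n = len(curves)
    return [sum(column) / n for column in zip(*curves)]


# Toy model with an interaction: the effect of x0 flips sign with x1.
toy_model = lambda x: x[0] * x[1]
curves = ice_curves(toy_model, rows=[[0, 1], [0, -1]], feature=0, grid=[0, 1, 2])
average = partial_dependence(curves)
```

Here the two ICE curves slope in opposite directions while their average is flat, which is the kind of heterogeneity a partial dependence plot alone would hide.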
Generate interactive HTML reports that enable quick visual review of multiple related time series stored in a data frame. For static datasets, this can help to identify any temporal artefacts that may affect the validity of subsequent analyses. For live data feeds, regularly scheduled reports can help to proactively identify data feed problems or unexpected trends that may require action. The reports are self-contained and shareable without a web server.
Maximum likelihood estimates (MLE) of the proportions of 5-mC and 5-hmC in the DNA using information from BS-conversion, TAB-conversion, and oxBS-conversion methods. One can use information from all three methods or any combination of two of them. Estimates are based on the binomial model by Qu et al. (2013) <doi:10.1093/bioinformatics/btt459> and Kiihl et al. (2019) <doi:10.1515/sagmb-2018-0031>.
This package provides tools for loading and processing passive acoustic data. Read in data that has been processed in Pamguard (<https://www.pamguard.org/>), apply a suite of processing functions, and export data for reports or external modeling tools. Parameter calculations implement methods by Oswald et al. (2007) <doi:10.1121/1.2743157>, Griffiths et al. (2020) <doi:10.1121/10.0001229> and Baumann-Pickering et al. (2010) <doi:10.1121/1.3479549>.
Sonification (or audification) is the process of representing data by sounds in the audible range. This package provides the R function sonify() that transforms univariate data, sampled at regular or irregular intervals, into a continuous sound with time-varying frequency. The ups and downs in frequency represent the ups and downs in the data. Sonify provides a substitute for R's plot function to simplify data analysis for the visually impaired.
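The mapping from data to a tone with time-varying frequency can be sketched as follows. This is a simplified illustration that assumes regularly sampled data with at least two values; it is not the sonify() implementation, and the function names, the default frequency range of 220 to 880 Hz and the sample rate are all assumptions.

```python
import math


def value_to_frequency(value, lo, hi, low_hz=220.0, high_hz=880.0):
    """Map a data value linearly from [lo, hi] onto [low_hz, high_hz]."""
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return low_hz + (value - lo) / span * (high_hz - low_hz)


def sonify_samples(values, duration=1.0, sample_rate=8000):
    """Return audio samples in [-1, 1] whose pitch follows `values`."""
    lo, hi = min(values), max(values)
    n = int(duration * sample_rate)
    samples, phase = [], 0.0
    for i in range(n):
        # Position of this audio sample along the data series.
        pos = i * (len(values) - 1) / max(n - 1, 1)
        k = min(int(pos), len(values) - 2)
        frac = pos - k
        # Linear interpolation between neighbouring data points.
        value = values[k] + frac * (values[k + 1] - values[k])
        freq = value_to_frequency(value, lo, hi)
        # Accumulate phase so the tone stays continuous as the pitch changes.
        phase += 2 * math.pi * freq / sample_rate
        samples.append(math.sin(phase))
    return samples
```

Accumulating the phase, rather than evaluating sin(2πft) directly, avoids clicks when the frequency changes from one sample to the next.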
Provides model averaging-based approaches for predicting personalized survival probabilities. The key underlying idea is to approximate the conditional survival function using a weighted average of multiple candidate models. Two scenarios of candidate models are allowed: (Scenario 1) partial linear Cox model and (Scenario 2) time-varying coefficient Cox model. The underlying methods are described in Li and Wang (2023) <doi:10.1016/j.csda.2023.107759>.
Classical methods for combining summary data from genome-wide association studies (GWAS) only use marginal genetic effects and power can be compromised in the presence of heterogeneity. subgxe is an R package that implements p-value assisted subset testing for association (pASTA), a method developed by Yu et al. (2019) <doi:10.1159/000496867>. pASTA generalizes association analysis based on subsets by incorporating gene-environment interactions into the testing procedure.
This package provides methods for spatial and spatio-temporal smoothing of demographic and health indicators using survey data, with particular focus on estimating and projecting under-five mortality rates, described in Mercer et al. (2015) <doi:10.1214/15-AOAS872>, Li et al. (2019) <doi:10.1371/journal.pone.0210645>, Wu et al. (DHS Spatial Analysis Reports No. 21, 2021), and Li et al. (2023) <doi:10.48550/arXiv.2007.05117>.
This package provides a set of commonly used distance measures and some additional functions which, although initially not designed for this purpose, can be used to measure the dissimilarity between time series. These measures can be used to perform clustering, classification or other data mining tasks which require the definition of a distance measure between time series. For details, see U. Mori, A. Mendiburu and J.A. Lozano (2016) <doi:10.32614/RJ-2016-058>.
This package provides tools for the analysis of complex survey samples. The provided features include: summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples; variances by Taylor series linearisation or replicate weights; post-stratification, calibration, and raking; two-phase subsampling designs; graphics; PPS sampling without replacement; principal components, and factor analysis.
Multivariate regression methodologies including classical reduced-rank regression (RRR) studied by Anderson (1951) <doi:10.1214/aoms/1177729580> and Reinsel and Velu (1998) <doi:10.1007/978-1-4757-2853-8>, reduced-rank regression via adaptive nuclear norm penalization proposed by Chen et al. (2013) <doi:10.1093/biomet/ast036> and Mukherjee et al. (2015) <doi:10.1093/biomet/asx080>, robust reduced-rank regression (R4) proposed by She and Chen (2017) <doi:10.1093/biomet/asx032>, generalized/mixed-response reduced-rank regression (mRRR) proposed by Luo et al. (2018) <doi:10.1016/j.jmva.2018.04.011>, row-sparse reduced-rank regression (SRRR) proposed by Chen and Huang (2012) <doi:10.1080/01621459.2012.734178>, reduced-rank regression with a sparse singular value decomposition (RSSVD) proposed by Chen et al. (2012) <doi:10.1111/j.1467-9868.2011.01002.x> and sparse and orthogonal factor regression (SOFAR) proposed by Uematsu et al. (2019) <doi:10.1109/TIT.2019.2909889>.
This package implements anomaly detection as binary classification for cross-sectional data. Uses maximum likelihood estimates and normal probability functions to classify observations as anomalous. The method is presented in the following lecture from the Machine Learning course by Andrew Ng: <https://www.coursera.org/learn/machine-learning/lecture/C8IJp/algorithm/>, and is also described in: Aleksandar Lazarevic, Levent Ertoz, Vipin Kumar, Aysel Ozgur, Jaideep Srivastava (2003) <doi:10.1137/1.9781611972733.3>.
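The classification rule described above can be sketched briefly. The version below assumes independent normal features, fits each by maximum likelihood and flags an observation when its density falls below a threshold; the function names and the threshold `epsilon` are hypothetical and do not reflect the package's interface.

```python
import math


def fit_gaussian(rows):
    """Per-feature maximum likelihood estimates of mean and variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # The MLE of the variance divides by n, not n - 1.
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(d)]
    return means, variances


def density(row, means, variances):
    """Product of univariate normal densities (features assumed independent)."""
    p = 1.0
    for x, mu, var in zip(row, means, variances):
        p *= math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return p


def is_anomaly(row, means, variances, epsilon):
    """Binary classification: anomalous when the density is below epsilon."""
    return density(row, means, variances) < epsilon


train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
means, variances = fit_gaussian(train)
```

In practice the threshold would be tuned on labelled validation data; a feature with zero variance would need special handling that this sketch omits.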
Impute the survival times for censored observations based on their conditional survival distributions derived from the Kaplan-Meier estimator. CondiS can replace the censored observations with the best approximations from the statistical model, allowing for direct application of machine learning-based methods. When covariates are available, CondiS is extended by incorporating the covariate information through machine learning-based regression modeling ('CondiS_X'), which can further improve the imputed survival time.
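One way to read the imputation idea is sketched below. A censored time c is replaced by the mean of the Kaplan-Meier distribution conditional on surviving past c, with any probability mass left beyond the last event placed at the largest observed time. That convention, the tie handling and the function names are assumptions of this sketch, not the CondiS implementation.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate as [(t, S(t))] at each distinct event time."""
    curve, s = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        s *= 1 - deaths / at_risk
        curve.append((t, s))
    return curve


def impute_censored(times, events):
    """Replace each censored time by its conditional expected survival time."""
    curve = kaplan_meier(times, events)
    t_max = max(times)
    imputed = []
    for c, e in zip(times, events):
        if e:
            imputed.append(c)  # observed events are kept as they are
            continue
        # S(c): survival probability just after the censoring time.
        s_c = 1.0
        for t, s in curve:
            if t <= c:
                s_c = s
        # Conditional mean over event times later than c.
        expected, prev = 0.0, s_c
        for t, s in curve:
            if t > c:
                expected += t * (prev - s) / s_c
                prev = s
        # Assumption: leftover mass goes to the largest observed time.
        expected += t_max * prev / s_c
        imputed.append(expected)
    return imputed
```

For example, with times 1, 2, 3, 4 where only the second observation is censored, the censored value 2 is replaced by 3.5, the equally weighted mean of the two later event times.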
Individual gene expression patterns are encoded into a series of eigenvector patterns ('WGCNA' package). Using the framework of linear model-based differential expression comparisons ('limma' package), time-course expression patterns for genes in different conditions are compared and analyzed for significant pattern changes. For reference, see: Greenham K, Sartor RC, Zorich S, Lou P, Mockler TC and McClung CR. eLife. 2020 Sep 30;9(4). <doi:10.7554/eLife.58993>.
This package contains logic for computing sparse principal components via the EESPCA method, which is based on an approximation of the eigenvector/eigenvalue identity. Includes logic to support execution of the TPower and rifle sparse PCA methods, as well as logic to estimate the sparsity parameters used by EESPCA, TPower and rifle via cross-validation to minimize the out-of-sample reconstruction error. For details, see H. Robert Frost (2021) <doi:10.1080/10618600.2021.1987254>.
This package implements a simple, likelihood-based estimation of the reproduction number (R0) using a branching process with a Poisson likelihood. This model requires knowledge of the serial interval distribution, and dates of symptom onsets. Infectiousness is determined by weighting R0 by the probability mass function of the serial interval on the corresponding day. It is a simplified version of the model introduced by Cori et al. (2013) <doi:10.1093/aje/kwt133>.
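Under one common formulation of this model, the estimate has a closed form. Assume cases on day t are Poisson with mean R0 · λ_t, where λ_t sums earlier incidence weighted by the serial interval mass function, and condition on the first day's cases. The likelihood is then maximized at the ratio of total later cases to total infectiousness. The sketch below follows that formulation; it is not the package's code, and the function names are hypothetical.

```python
def infectiousness(incidence, serial_interval):
    """lambda_t = sum over s >= 1 of incidence[t - s] * w(s),
    where serial_interval[s - 1] holds w(s), the mass at a lag of s days."""
    lam = []
    for t in range(len(incidence)):
        max_lag = min(t, len(serial_interval))
        lam.append(sum(incidence[t - s] * serial_interval[s - 1]
                       for s in range(1, max_lag + 1)))
    return lam


def estimate_r0(incidence, serial_interval):
    """Maximum likelihood estimate of R0 under the Poisson branching model,
    conditioning on the cases observed on the first day."""
    lam = infectiousness(incidence, serial_interval)
    total_lambda = sum(lam[1:])
    if total_lambda == 0:
        raise ValueError("no infectiousness after the first day")
    # Setting d/dR of sum_t [I_t * log(R * lambda_t) - R * lambda_t] to zero
    # gives R = sum_t I_t / sum_t lambda_t.
    return sum(incidence[1:]) / total_lambda
```

With daily counts of 1, 2, 4, 8 and a serial interval of exactly one day, every case produces two cases the next day and the estimate is 2.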
Computes the sample probability value (p-value) for the estimated coefficient from a standard genome-wide univariate regression. It computes the exact finite-sample p-value under the assumption that the measured phenotype (the dependent variable in the regression) has a known Bernoulli-normal mixture distribution. Finite-sample genome-wide regression p-values (Gwrpv) with a non-normally distributed phenotype (Gregory Connor and Michael O'Neill, bioRxiv 204727 <doi:10.1101/204727>).
This package provides causal inference with interactive fixed-effect models. It imputes counterfactuals for each treated unit using control group information based on a linear interactive fixed effects model that incorporates unit-specific intercepts interacted with time-varying coefficients. This method generalizes the synthetic control method to the case of multiple treated units and variable treatment periods, and improves efficiency and interpretability. This version supports unbalanced panels and implements the matrix completion method.
This package provides tools for the development of packages related to General Transit Feed Specification (GTFS) files. Establishes a standard for representing GTFS feeds using R data types. Provides fast and flexible functions to read and write GTFS feeds while sticking to this standard. Defines a basic gtfs class which is meant to be extended by packages that depend on it. It also offers utility functions that support checking the structure of GTFS objects.
Implements a coherent and flexible protocol for animal color tagging. GenTag provides a simple computational routine with low CPU usage to create color sequences for animal tags. First, a single-color tag sequence is created from an algorithm selected by the user, followed by verification of the combination uniqueness. Three methods to produce color tag sequences are provided. Users can modify the main function core to allow a wide range of applications.