Simulating bivariate survival data from copula models. Estimation of the association parameter in copula models. Two different ways to estimate the association parameter in copula models are implemented. A goodness-of-fit test for a given copula model is implemented. See Emura, Lin and Wang (2010) <doi:10.1016/j.csda.2010.03.013> for details.
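As a rough illustration of what simulating bivariate survival data from a copula model can look like, the sketch below draws pairs of survival times from a Clayton copula with exponential margins and recovers the association parameter from Kendall's tau. The Clayton family, the exponential margins, and the function names simulate_clayton_survival and kendall_tau are assumptions made for this example; they are not taken from the package, and the tau-based estimator is not claimed to be one of the two estimators it implements.

```python
import numpy as np

def simulate_clayton_survival(n, theta, rate1=1.0, rate2=1.0, seed=0):
    """Draw n pairs of survival times linked by a Clayton copula (theta > 0)."""
    rng = np.random.default_rng(seed)
    u1 = rng.uniform(size=n)
    w = rng.uniform(size=n)
    # Conditional inversion: solve dC(u1, u2)/du1 = w for u2.
    u2 = ((w ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)
    # Treat u1, u2 as survival probabilities and invert exponential survival functions.
    t1 = -np.log(u1) / rate1
    t2 = -np.log(u2) / rate2
    return t1, t2

def kendall_tau(x, y):
    """Plain O(n^2) Kendall's tau for continuous data without ties."""
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return (sx * sy).sum() / (n * (n - 1))

t1, t2 = simulate_clayton_survival(n=2000, theta=2.0)
tau_hat = kendall_tau(t1, t2)
# For the Clayton copula tau = theta / (theta + 2), so invert that relation.
theta_hat = 2.0 * tau_hat / (1.0 - tau_hat)
```

With theta = 2 the theoretical Kendall's tau is 0.5, so tau_hat should land close to that value and theta_hat close to 2.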
Quick and easy access to datasets that let you replicate the empirical examples in Cameron and Trivedi (2005) "Microeconometrics: Methods and Applications" (ISBN: 9780521848053). The data are available as data frames as soon as you install and load the package (lazy-loading). The documentation includes references to the chapter sections and page numbers where the datasets are used.
Counts colors within color range(s) in images, and provides a masked version of the image with targeted pixels changed to a different color. Output includes the locations of the pixels in the images, and the proportion of the image within the target color range with optional background masking. Users can specify multiple color ranges for masking.
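The counting and masking described above can be sketched in a few lines. This is a minimal illustration that assumes an RGB image stored as a NumPy array and a single rectangular color range; the function name count_colors and its parameters are hypothetical and do not reflect the package's interface.

```python
import numpy as np

def count_colors(image, lower, upper, mask_color=(255, 0, 255)):
    """Find pixels whose RGB values all lie within [lower, upper]."""
    image = np.asarray(image)
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    # A pixel is in range only if every channel is within its bounds.
    in_range = np.all((image >= lower) & (image <= upper), axis=-1)
    masked = image.copy()
    masked[in_range] = mask_color          # recolor the targeted pixels
    locations = np.argwhere(in_range)      # (row, column) of each targeted pixel
    proportion = in_range.mean()           # share of the image inside the range
    return proportion, locations, masked

# A tiny 2x2 test image: two green-ish pixels, one dark, one white.
img = np.array([[[0, 200, 0], [10, 10, 10]],
                [[0, 255, 0], [255, 255, 255]]], dtype=np.uint8)
proportion, locations, masked = count_colors(img, lower=(0, 150, 0), upper=(50, 255, 50))
```

Supporting several color ranges would amount to combining one boolean mask per range with a logical OR before recoloring.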
Set of functions for step-wise generation of (weighted) graphs. Intended for research in the field of single- and multi-objective combinatorial optimization. Graphs are generated by adding nodes, edges and weights. Each step may be repeated multiple times with different predefined and custom generators, resulting in high flexibility regarding the graph topology and the structure of edge weights.
This package provides a sparklyr <https://spark.rstudio.com/> extension with an R interface for GraphFrames <https://graphframes.github.io/>. GraphFrames is a package for Apache Spark that provides a DataFrame-based API for working with graphs. Functionality includes motif finding and common graph algorithms, such as PageRank and breadth-first search.
This package provides a bridge between the loon and ggplot2 packages. Extends the grammar of ggplot to add clauses to create interactive loon plots. Existing ggplot(s) can be turned into interactive loon plots and loon plots into static ggplot(s); the function loon.ggplot() is the bridge from one plot structure to the other.
R Client for the Microsoft Cognitive Services Text Analytics REST API, including Sentiment Analysis, Topic Detection, Language Detection, and Key Phrase Extraction. An account MUST be registered at the Microsoft Cognitive Services website <https://www.microsoft.com/cognitive-services/> in order to obtain a (free) API key. Without an API key, this package will not work properly.
This package contains functions intended to facilitate the production of plant taxonomic monographs. The package includes functions that convert tables into taxonomic descriptions, lists of collectors, examined specimens, and identification keys (dichotomous and interactive), and that generate a monograph skeleton. Wrapper functions to batch the production of phenology histograms and distributional and diversity maps are also available.
This package provides a collection of various oversampling techniques developed from SMOTE. SMOTE is an oversampling technique that synthesizes a new minority instance between a minority instance and one of its K nearest neighbors. Other techniques adapt this concept with other criteria in order to generate a balanced dataset for class imbalance problems.
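The basic SMOTE step described above, interpolating between a minority instance and one of its K nearest minority neighbors, can be sketched as follows. This is a minimal illustration with brute-force Euclidean distances; the function name smote and its parameters are hypothetical and are not the package's interface.

```python
import numpy as np

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by neighbor interpolation."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # a point cannot have more than n - 1 neighbors
    # Brute-force pairwise Euclidean distances between minority points.
    dist = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbors
    neighbors = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                    # instance to start from
    partner = neighbors[base, rng.integers(0, k, size=n_new)]  # one of its k neighbors
    gap = rng.uniform(size=(n_new, 1))                       # position along the segment
    return minority[base] + gap * (minority[partner] - minority[base])

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
synthetic = smote(minority, n_new=10, k=2)
```

Every synthetic point lies on a line segment between two existing minority points, so it never leaves their bounding box.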
Balancing computational and statistical efficiency, subsampling techniques offer a practical solution for handling large-scale data analysis. Subsampling methods enhance statistical modeling for massive datasets by efficiently drawing representative subsamples from the full dataset based on tailored sampling probabilities. These probabilities are optimized for specific goals, such as minimizing the variance of coefficient estimates or reducing prediction error.
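To make the idea of subsampling with non-uniform probabilities concrete, the sketch below fits a linear model on a subsample drawn with unequal probabilities and corrects for the sampling with inverse-probability weights. The probabilities used here, proportional to the row norms of the design matrix, are a simple placeholder chosen for illustration; they are not the optimized probabilities the description refers to, and the function name weighted_subsample_ols is hypothetical.

```python
import numpy as np

def weighted_subsample_ols(X, y, r, seed=0):
    """Least squares on r rows sampled with unequal probabilities."""
    rng = np.random.default_rng(seed)
    # Placeholder sampling probabilities: proportional to each row's norm.
    norms = np.linalg.norm(X, axis=1)
    probs = norms / norms.sum()
    idx = rng.choice(len(y), size=r, replace=True, p=probs)
    # Inverse-probability weights offset the unequal sampling.
    sqrt_w = np.sqrt(1.0 / probs[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * sqrt_w[:, None], y[idx] * sqrt_w, rcond=None)
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(20000, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=20000)
beta_hat = weighted_subsample_ols(X, y, r=1000)
```

Only 1000 of the 20000 rows enter the fit, yet the weighted estimate should stay close to beta_true.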
This package performs angle-based outlier detection on a given data frame. It offers three methods to process data:
a full but slow implementation that uses all the data and has cubic complexity;
a fully randomized method;
a method using k-nearest neighbours.
These algorithms are well suited for outlier detection in high-dimensional data.
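The full, cubic-complexity method listed above can be sketched as follows. This is a simplified, unweighted variant of the angle-based outlier factor, written only to show where the cubic cost comes from; it assumes all points are distinct, and the function name angle_based_outlier_scores is hypothetical rather than the package's interface.

```python
import numpy as np
from itertools import combinations

def angle_based_outlier_scores(points):
    """Variance of distance-scaled angles seen from each point (low = outlier)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    scores = np.empty(n)
    for i in range(n):
        # Difference vectors from point i to every other point.
        diffs = np.delete(points, i, axis=0) - points[i]
        sq_norms = np.einsum("ij,ij->i", diffs, diffs)
        # Scaled dot product for every pair of other points: O(n^2) per point.
        values = [
            diffs[a] @ diffs[b] / (sq_norms[a] * sq_norms[b])
            for a, b in combinations(range(n - 1), 2)
        ]
        scores[i] = np.var(values)
    return scores

rng = np.random.default_rng(0)
cluster = rng.normal(size=(20, 2))
data = np.vstack([cluster, [[10.0, 10.0]]])  # last row is an obvious outlier
scores = angle_based_outlier_scores(data)
outlier_index = int(np.argmin(scores))
```

A point far from the rest sees all other points within a narrow range of directions and at large distances, so its score is small. The randomized and k-nearest-neighbour methods reduce the cost by evaluating only a subset of the pairs.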
Suppose the data contain so many series that they are hard to tell apart by color because the differences are too subtle. With gghighlight we can highlight the lines that match certain criteria. The result is a usual ggplot object, so it is fully customizable and can be used with custom themes and facets.
Construct an explainable nomogram for a machine learning (ML) model to improve the availability of an ML prediction model in addition to a computer application, particularly in situations where a computer, a mobile phone, an internet connection, or access to the application is unreliable. This package enables nomogram creation for any ML prediction model, which is conventionally limited to linear/logistic regression models. The nomogram may indicate the explainability value per feature, e.g., the Shapley additive explanation value, for each individual. However, this package only allows nomogram creation for a model that uses categorical predictors, with or without a single numerical predictor. Detailed methodologies and examples are documented in our vignette, available at <https://htmlpreview.github.io/?https://github.com/herdiantrisufriyana/rmlnomogram/blob/master/doc/ml_nomogram_exemplar.html>.
This package provides tools to process and analyze chest expansion using 3D marker data from motion capture systems. Includes functions for data processing, marker position adjustment, volume calculation using convex hulls, and visualization in 2D and 3D. Barber et al. (1996) <doi:10.1145/235815.235821>. TAMIYA Hiroyuki et al. (2021) <doi:10.1038/s41598-021-01033-8>.
Analyzes group patterns using discourse analysis data with graph theory mathematics. Takes the order in which individuals talk and converts it to a network edge and weight list. Returns the density, centrality, centralization, and subgroup information for each group. Based on the analytical framework laid out in Chai et al. (2019) <doi:10.1187/cbe.18-11-0222>.
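The conversion from talk order to a weighted edge list can be sketched as follows. This minimal illustration assumes that an edge points from each speaker to the next speaker and that the weight counts how often that hand-over occurs; these conventions and the function names talk_order_to_edges and directed_density are assumptions for the example, not the package's definitions.

```python
from collections import Counter

def talk_order_to_edges(speakers):
    """Count hand-overs from each speaker to the next as weighted directed edges."""
    edges = Counter()
    for current, following in zip(speakers, speakers[1:]):
        if current != following:  # ignore a speaker following themselves
            edges[(current, following)] += 1
    return edges

def directed_density(edges, n_people):
    """Share of possible directed edges that occur at least once."""
    return len(edges) / (n_people * (n_people - 1))

order = ["A", "B", "A", "C", "B", "A"]
edges = talk_order_to_edges(order)
density = directed_density(edges, n_people=len(set(order)))
```

For this talk order the edge from B to A has weight 2, and 4 of the 6 possible directed edges are present.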
Open, read data from and modify Data Packages. Data Packages are an open standard for bundling and describing data sets (<https://datapackage.org>). When data is read from a Data Package, care is taken to convert the data as much as possible to R-appropriate data types. The package can be extended with plugins for additional data types.
Utility functions to be used to analyse datasets obtained from seed germination/emergence assays. Fits several types of seed germination/emergence models, including those reported in Onofri et al. (2018) "Hydrothermal-time-to-event models for seed germination", European Journal of Agronomy, 101, 129-139 <doi:10.1016/j.eja.2018.08.011>. Contains several datasets for practicing.
Four fertility models are fitted using non-linear least squares. These are the Hadwiger, the Gamma, the Model1 and Model2, following the terminology of the following paper: Peristera P. and Kostaki A. (2007). "Modeling fertility in modern populations". Demographic Research, 16(6): 141--194. <doi:10.4054/DemRes.2007.16.6>. Model based averaging is also supported.
Harmony is a tool using AI which allows you to compare items from questionnaires and identify similar content. You can try Harmony at <https://harmonydata.ac.uk/app/> and you can read our blog at <https://harmonydata.ac.uk/blog/> or at <https://fastdatascience.com/how-does-harmony-work/>. Documentation at <https://harmonydata.ac.uk/harmony-r-released/>.
Electricity is not made equal: its carbon footprint (or carbon intensity) varies depending on its source. This package lets you access and query data provided by the Carbon Intensity API (<https://carbonintensity.org.uk/>). National Grid's Carbon Intensity API provides an indicative trend of regional carbon intensity of the electricity system in Great Britain.
Here we provide an implementation of the linear and logistic regression-based Reliable Change Index (RCI), to be used with lm and binomial glm model objects, respectively, following Moral et al. <https://psyarxiv.com/gq7az/>. The RCI function returns a score assumed to be approximately normally distributed, which is helpful to detect patients that may present cognitive decline.
Makes time series prediction easier by automating the process using four main functions: prep(), modl(), pred() and postp(). Features different preprocessing methods to homogenize variance and to remove trend and seasonality. Can also bring together different predictive models for comparison. Features ARIMA and Data Mining Regression models (using caret).
Understanding the dynamics of potentially heterogeneous variables is important in statistical applications. This package provides tools for estimating the degree of heterogeneity across cross-sectional units in panel data analysis. The methods were developed by Okui and Yanagi (2019) <doi:10.1016/j.jeconom.2019.04.036> and Okui and Yanagi (2020) <doi:10.1093/ectj/utz019>.
Routines in qtl2 to study allele patterns in quantitative trait loci (QTL) mapping over a chromosome. Useful in crosses with more than two alleles to identify how sets of alleles, genetically different strands at the same locus, have different response levels. Plots show profiles over a chromosome. Can handle multiple traits together. See <https://github.com/byandell/qtl2pattern>.