This package contains a set of clustering methods and evaluation metrics to select the best number of clusters based on clustering stability. Two references describe the methodology: Fahimeh Nezhadmoghadam and Jose Tamez-Pena (2021) <doi:10.1016/j.compbiomed.2021.104753>, and Fahimeh Nezhadmoghadam et al. (2021) <doi:10.2174/1567205018666210831145825>.
This package implements estimation methods for parameters of common distribution families. The common d, p, q, r function family for each distribution is enriched with the ll, e, and v counterparts, computing the log-likelihood, performing estimation, and calculating the asymptotic variance-covariance matrix, respectively. Parameter estimation is performed analytically whenever possible.
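The ll / e / v pattern described above can be sketched for one simple case, the exponential distribution, where the maximum likelihood estimate has a closed form. This is an illustrative Python sketch, not the package's API: the function names `ll_exp`, `e_exp`, and `v_exp` are hypothetical, and the variance shown is the inverse per-observation Fisher information for the rate parameter.

```python
import math

def ll_exp(x, rate):
    """Log-likelihood of an exponential sample x for a given rate (hypothetical name)."""
    n = len(x)
    return n * math.log(rate) - rate * sum(x)

def e_exp(x):
    """Analytic maximum likelihood estimate of the rate: n / sum(x)."""
    return len(x) / sum(x)

def v_exp(rate):
    """Asymptotic variance of the rate estimator (inverse Fisher information
    per observation), which is rate squared for the exponential distribution."""
    return rate ** 2

sample = [1.0, 2.0, 3.0]
rate_hat = e_exp(sample)                # estimate from the data
log_lik = ll_exp(sample, rate_hat)      # log-likelihood at the estimate
avar = v_exp(rate_hat)                  # divide by len(sample) for a variance estimate
```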
Satellite data collected between 2003 and 2022, in conjunction with gridded bathymetric data (50-150 m resolution), are used to estimate the irradiance reaching the bottom of a series of representative EU Arctic fjords. An Earth System Science Data (ESSD) manuscript, Schlegel et al. (2024), provides a detailed explanation of the methodology.
We propose two distribution-free test statistics based on between-sample edge counts and measure the degree of relevance by standardized counts. Users can set edge costs in the graph to compare the parameters of the distributions. Methods for comparing distributions are as described in: Xiaoping Shi (2021) <arXiv:2107.00728>.
This package implements the Goldilocks adaptive trial design for a time to event outcome using a piecewise exponential model and conjugate Gamma prior distributions. The method closely follows the article by Broglio and colleagues <doi:10.1080/10543406.2014.888569>, and the package allows users to explore the operating characteristics of different trial designs.
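The conjugate update behind a piecewise exponential model with Gamma priors can be sketched as follows. With a Gamma(shape, rate) prior on the hazard of each interval, the posterior is Gamma(shape + events, rate + exposure), where events and exposure are counted within that interval. This Python sketch is illustrative only and is not the package's interface; the function names, the cut points, and the prior values are assumptions.

```python
def events_and_exposure(times, events, cuts):
    """Count events and total time at risk in each interval.

    cuts are interior break points, e.g. [1.0, 2.0] gives the
    intervals (0, 1], (1, 2], (2, inf).
    """
    bounds = [0.0] + list(cuts) + [float("inf")]
    n_intervals = len(bounds) - 1
    deaths = [0] * n_intervals
    exposure = [0.0] * n_intervals
    for t, had_event in zip(times, events):
        for j in range(n_intervals):
            lo, hi = bounds[j], bounds[j + 1]
            if t <= lo:
                break  # subject left the study before this interval
            exposure[j] += min(t, hi) - lo
            if had_event and t <= hi:
                deaths[j] += 1
    return deaths, exposure

def gamma_posterior(deaths, exposure, prior_shape, prior_rate):
    """Conjugate update: one (shape, rate) pair per interval."""
    return [(prior_shape + d, prior_rate + e) for d, e in zip(deaths, exposure)]

# Hypothetical data: follow-up times, event indicators (1 = event, 0 = censored)
times = [0.5, 1.5, 3.0]
events = [1, 0, 1]
deaths, exposure = events_and_exposure(times, events, cuts=[1.0, 2.0])
posterior = gamma_posterior(deaths, exposure, prior_shape=1.0, prior_rate=1.0)
```

The posterior mean hazard in each interval is shape divided by rate, which is what a simulation of operating characteristics would draw from or summarize.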
This package provides a set of tools to create georeferenced hillshade relief raster maps using ray-tracing and other advanced hill-shading techniques. It includes a wrapper function to create a georeferenced, ray-traced hillshade map from a digital elevation model, and other functions that can be used in a rayshader pipeline.
Datasets related to Hong Kong, including information on the 2019 elected District Councillors (<https://www.districtcouncils.gov.hk> and <https://dce2019.hk01.com/>) and traffic collision data from the Hong Kong Department of Transport (<https://www.td.gov.hk/>). All of the data in this package is available in the public domain.
Code to specify, run, and then visualize and analyze the results of Ixodidae (hard-bodied ticks) population and infection dynamics models. Such models exist in the literature, but the source code to run them is not always available. IxPopDyMod provides an easy way for these models to be written and shared.
This package provides a small collection of various network data sets, to use with the igraph package: the Enron email network, various food webs, interactions in the immunoglobulin protein, the karate club network, Koenigsberg's bridges, visuotactile brain areas of the macaque monkey, UK faculty friendship network, domestic US flights network, etc.
The mycobacrvR package contains utilities to provide detailed information for B cell and T cell epitopes for predicted adhesins from various servers such as ABCpred, Bcepred, Bimas, Propred, NetMHC and IEDB. Please refer to the URL below to download the data files (data_mycobacrvR.zip) used in functions of this package.
This package provides functions for row-reducing and inverting matrices with entries in many of the finite fields (those with a prime number of elements). With this package, users will be able to find the reduced row echelon form (RREF) of a matrix and calculate the inverse of a (square, invertible) matrix.
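The two operations described above can be sketched with Gauss-Jordan elimination in which every arithmetic step is taken modulo a prime p. This Python sketch is illustrative and does not reproduce the package's functions; the names `rref_mod_p` and `inverse_mod_p` are hypothetical, and `pow(a, -1, p)` requires Python 3.8 or later.

```python
def rref_mod_p(matrix, p):
    """Reduced row echelon form of a matrix over the field with p elements (p prime)."""
    rows = [[x % p for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Scale the pivot row so the pivot entry becomes 1 (modular inverse).
        inv = pow(rows[pivot_row][col], -1, p)
        rows[pivot_row] = [(x * inv) % p for x in rows[pivot_row]]
        # Clear the pivot column in every other row.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

def inverse_mod_p(matrix, p):
    """Invert a square matrix over GF(p) by row-reducing [A | I]."""
    n = len(matrix)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    augmented = [list(row) + identity[i] for i, row in enumerate(matrix)]
    reduced = rref_mod_p(augmented, p)
    if [row[:n] for row in reduced] != identity:
        raise ValueError("matrix is not invertible modulo p")
    return [row[n:] for row in reduced]

inverse = inverse_mod_p([[1, 2], [3, 4]], 5)
```

Restricting to prime p matters because only then does every nonzero entry have a multiplicative inverse, which the pivot-scaling step relies on.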
Generates derived parameter(s) from Markov Chain Monte Carlo (MCMC) samples using R code. This allows Bayesian models to be fitted without the inclusion of derived parameters which add unnecessary clutter and slow model fitting. For more information on MCMC samples see Brooks et al. (2011) <isbn:978-1-4200-7941-8>.
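The idea of deriving parameters after fitting can be sketched in a few lines: the model is fitted with only its primary parameters, and any derived quantity is then computed draw by draw from the stored samples. The package itself works with R code; the Python sketch below only illustrates the concept, and the `derive` helper, the parameter names, and the sample values are all hypothetical.

```python
def derive(samples, expression):
    """Apply an expression to each posterior draw, returning one derived value per draw."""
    return [expression(draw) for draw in samples]

# Hypothetical posterior draws for an intercept and a slope.
samples = [
    {"alpha": 1.0, "beta": 0.5},
    {"alpha": 1.5, "beta": 0.25},
    {"alpha": 0.5, "beta": 1.0},
]

# Derived parameter: the predicted value at x = 2, computed after model fitting.
prediction = derive(samples, lambda d: d["alpha"] + d["beta"] * 2.0)
posterior_mean = sum(prediction) / len(prediction)
```

Because the derived values are computed per draw, their spread reflects the joint posterior uncertainty in the underlying parameters.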
Identifies the optimal transformation of a surrogate marker and estimates the proportion of treatment explained (PTE) by the optimally-transformed surrogate at an earlier time point when the primary outcome of interest is a censored time-to-event outcome; details are described in Wang et al (2021) <doi:10.1002/sim.9185>.
Generates a random quotation from a database of quotes on topics in statistics, data visualization and science. Other functions allow searching the quotes database by key term tags or authors, or creating a word cloud. The output is designed to be suitable for use at the console, in Rmarkdown and LaTeX.
Message translation is often managed with po files and the gettext programme, but sometimes another solution is needed. In contrast to po files, this package takes a more flexible approach, similar to that of the Fluent <https://projectfluent.org/> project, using R Markdown snippets. The key-value approach allows easier handling of the translated messages.
This package provides functions implementing minimal distance estimation methods for parametric tail dependence models, as proposed in Einmahl, J.H.J., Kiriliouk, A., Krajina, A., and Segers, J. (2016) <doi:10.1111/rssb.12114> and Einmahl, J.H.J., Kiriliouk, A., and Segers, J. (2018) <doi:10.1007/s10687-017-0303-7>.
This package provides data sets for teaching statistics and data science courses. It includes a sample of data from John Edmund Kerrich's famous coinflip experiment. These are data sets that I used in my statistics courses. The package also contains an R Markdown template with the required formatting for assignments in my former courses.
This package provides tools to query the U.S. National Library of Medicine's Clinical Trials database. Functions are provided for a variety of techniques for searching the data, including range queries, categorical filtering, and full-text keyword search. Minimal graphical tools are also provided for interactively exploring the constructed data.
This package provides a comprehensive set of tools designed for optimizing likelihood within a tie-oriented (Butts, C., 2008, <doi:10.1111/j.1467-9531.2008.00203.x>) or an actor-oriented modelling framework (Stadtfeld, C., & Block, P., 2017, <doi:10.15195/v4.a14>) in relational event networks. The package accommodates both frequentist and Bayesian approaches. The frequentist approaches that the package incorporates are the Maximum Likelihood Optimization (MLE) and the Gradient-based Optimization (GDADAMAX). The Bayesian methodologies included in the package are the Bayesian Sampling Importance Resampling (BSIR) and the Hamiltonian Monte Carlo (HMC). The flexibility of choosing between frequentist and Bayesian optimization approaches allows researchers to select the estimation approach which aligns the most with their analytical preferences.
This package provides a spatiotemporal model that simulates the spread of Ascochyta blight in chickpea fields based on location-specific weather conditions. This model is adapted from a model developed by Diggle et al. (2002) <doi:10.1094/PHYTO.2002.92.10.1110> for simulating the spread of anthracnose in a lupin field.
This package provides WHO 2007 References for School-age Children and Adolescents (5 to 19 years) (z-scores) with confidence intervals and standard errors around the prevalence estimates, taking into account complex sample designs. More information on the methods is available online: <https://www.who.int/tools/growth-reference-data-for-5to19-years>.
Fits smoothing spline regression models using scalable algorithms designed for large samples. Seven marginal spline types are supported: linear, cubic, different cubic, cubic periodic, cubic thin-plate, ordinal, and nominal. Random effects and parametric effects are also supported. Response can be Gaussian or non-Gaussian: Binomial, Poisson, Gamma, Inverse Gaussian, or Negative Binomial.
This package implements the framework introduced in Di Francesco and Mellace (2025) <doi:10.48550/arXiv.2502.11691>, shifting the focus to well-defined and interpretable estimands that quantify how treatment affects the probability distribution over outcome categories. It supports selection-on-observables, instrumental variables, regression discontinuity, and difference-in-differences designs.
Data from various catalogs of astrophysical gamma-ray sources detected by NASA's Large Area Telescope (The Astrophysical Journal, 697, 1071, 2009 June 1), on board the Fermi gamma-ray satellite. More information on Fermi and its data products is available from the Fermi Science Support Center (http://fermi.gsfc.nasa.gov/ssc/).