Satellite data collected between 2003 and 2022, in conjunction with gridded bathymetric data (50-150 m resolution), are used to estimate the irradiance reaching the bottom of a series of representative EU Arctic fjords. An Earth System Science Data (ESSD) manuscript, Schlegel et al. (2024), provides a detailed explanation of the methodology.
We propose two distribution-free test statistics based on between-sample edge counts and measure the degree of relevance by standardized counts. Users can set edge costs in the graph to compare the parameters of the distributions. Methods for comparing distributions are as described in: Xiaoping Shi (2021) <arXiv:2107.00728>.
This package implements the Goldilocks adaptive trial design for a time-to-event outcome using a piecewise exponential model and conjugate Gamma prior distributions. The method closely follows the article by Broglio and colleagues <doi:10.1080/10543406.2014.888569>. The package allows users to explore the operating characteristics of different trial designs.
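The conjugate update behind a piecewise exponential model with Gamma priors can be sketched as follows. This is a minimal illustration in Python, not the package's own code: the function name `posterior_gamma`, the cut points, and the prior values are all assumptions made for the example. For each interval, a Gamma(shape, rate) prior on the hazard combined with d observed events and total exposure time E gives a Gamma(shape + d, rate + E) posterior.

```python
def posterior_gamma(times, events, cuts, prior_shape, prior_rate):
    """Posterior (shape, rate) per interval for a piecewise exponential model.

    times  : follow-up time of each subject
    events : 1 if the subject had the event at that time, 0 if censored
    cuts   : interval start points, e.g. [0, 6, 12] for (0,6], (6,12], (12,inf)
    The same Gamma(prior_shape, prior_rate) prior is assumed for every interval.
    """
    n_intervals = len(cuts)
    deaths = [0] * n_intervals
    exposure = [0.0] * n_intervals
    for t, e in zip(times, events):
        for j, start in enumerate(cuts):
            end = cuts[j + 1] if j + 1 < n_intervals else float("inf")
            if t <= start:
                break  # subject left the study before this interval began
            # time at risk that the subject contributed to interval j
            exposure[j] += min(t, end) - start
            # the event is counted in the interval that contains time t
            if e and t <= end:
                deaths[j] += 1
    # conjugate update: shape gains the event count, rate gains the exposure
    return [(prior_shape + d, prior_rate + x) for d, x in zip(deaths, exposure)]


# Hypothetical data: three subjects, the second one censored at time 8.
posterior = posterior_gamma(
    times=[3, 8, 15], events=[1, 0, 1], cuts=[0, 6, 12],
    prior_shape=0.1, prior_rate=0.1,
)
for shape, rate in posterior:
    print(f"Gamma(shape={shape:.1f}, rate={rate:.1f}), posterior mean hazard={shape / rate:.3f}")
```

In an adaptive design of this kind, posterior draws of the interval hazards would then feed predictive probability calculations at each interim look. That step is not shown here.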
This package provides a set of tools to create georeferenced hillshade relief raster maps using ray-tracing and other advanced hill-shading techniques. It includes a wrapper function to create a georeferenced, ray-traced hillshade map from a digital elevation model, and other functions that can be used in a rayshader pipeline.
Datasets related to Hong Kong, including information on the 2019 elected District Councillors (<https://www.districtcouncils.gov.hk> and <https://dce2019.hk01.com/>) and traffic collision data from the Hong Kong Department of Transport (<https://www.td.gov.hk/>). All of the data in this package is available in the public domain.
This package provides a small collection of various network data sets, to use with the igraph package: the Enron email network, various food webs, interactions in the immunoglobulin protein, the karate club network, Koenigsberg's bridges, visuotactile brain areas of the macaque monkey, UK faculty friendship network, domestic US flights network, etc.
Code to specify, run, and then visualize and analyze the results of Ixodidae (hard-bodied ticks) population and infection dynamics models. Such models exist in the literature, but the source code to run them is not always available. IxPopDyMod provides an easy way for these models to be written and shared.
Generates derived parameter(s) from Markov Chain Monte Carlo (MCMC) samples using R code. This allows Bayesian models to be fitted without the inclusion of derived parameters, which add unnecessary clutter and slow model fitting. For more information on MCMC samples see Brooks et al. (2011) <isbn:978-1-4200-7941-8>.
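The idea of deriving parameters after fitting can be illustrated with a short sketch. It is written in Python rather than R and does not use the package's API: the draws are simulated stand-ins for real MCMC output, and the names `draws`, `derive`, `intercept`, and `slope` are hypothetical. The derived quantity is computed once per posterior draw, so its uncertainty is carried through without the quantity ever being part of the fitted model.

```python
import random
import statistics

random.seed(1)

# Stand-in for MCMC output: 1000 joint draws of two model parameters.
# In practice these would come from a fitted Bayesian model.
draws = [
    {"intercept": random.gauss(1.0, 0.1), "slope": random.gauss(0.5, 0.05)}
    for _ in range(1000)
]


def derive(draws, expression):
    """Evaluate `expression` on every draw, giving draws of the derived parameter."""
    return [expression(d) for d in draws]


# Derived parameter: the expected response at x = 2, computed after fitting.
prediction = derive(draws, lambda d: d["intercept"] + d["slope"] * 2.0)

# Summarise the derived parameter like any other monitored quantity.
cut_points = statistics.quantiles(prediction, n=40)  # 2.5% steps
print("posterior mean:", round(statistics.fmean(prediction), 3))
print("95% interval:", round(cut_points[0], 3), "to", round(cut_points[-1], 3))
```

Because the computation happens per draw, any function of the monitored parameters can be added later without refitting the model.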
This package provides functions for row-reducing and inverting matrices with entries in many of the finite fields (those with a prime number of elements). With this package, users will be able to find the reduced row echelon form (RREF) of a matrix and calculate the inverse of a (square, invertible) matrix.
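As a rough illustration of the technique, the sketch below performs Gauss-Jordan elimination over a prime field in Python. It is not the package's interface: the function names `rref_mod_p` and `inverse_mod_p` are invented for the example, and the modulus `p` is assumed to be prime so that every nonzero entry has a multiplicative inverse.

```python
def rref_mod_p(matrix, p):
    """Return the reduced row echelon form of `matrix` over GF(p), p prime."""
    rows = [[x % p for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # scale the pivot row so the pivot is 1 (modular inverse, Python 3.8+)
        inv = pow(rows[pivot_row][col], -1, p)
        rows[pivot_row] = [(x * inv) % p for x in rows[pivot_row]]
        # clear the pivot column in every other row
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows


def inverse_mod_p(matrix, p):
    """Invert a square matrix over GF(p) by row-reducing [A | I]."""
    n = len(matrix)
    augmented = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    reduced = rref_mod_p(augmented, p)
    # the left block must be the identity, otherwise A is singular mod p
    if any(reduced[i][j] != int(i == j) for i in range(n) for j in range(n)):
        raise ValueError("matrix is not invertible modulo p")
    return [row[n:] for row in reduced]


print(rref_mod_p([[2, 4], [1, 2]], 5))
print(inverse_mod_p([[1, 2], [3, 4]], 5))
```

All arithmetic stays in the integers modulo `p`, so there is no rounding error, unlike floating-point row reduction.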
The mycobacrvR package contains utilities to provide detailed information for B cell and T cell epitopes for predicted adhesins from various servers such as ABCpred, Bcepred, Bimas, Propred, NetMHC and IEDB. Please refer to the URL below to download data files (data_mycobacrvR.zip) used in functions of this package.
Identifies the optimal transformation of a surrogate marker and estimates the proportion of treatment explained (PTE) by the optimally-transformed surrogate at an earlier time point when the primary outcome of interest is a censored time-to-event outcome; details are described in Wang et al. (2021) <doi:10.1002/sim.9185>.
Conduct a priori power analyses via Monte-Carlo style data simulation for linear and generalized linear mixed-effects models (LMMs/GLMMs). Provides a user-friendly workflow with helper functions to easily define fixed and random effects as well as diagnostic functions to evaluate the adequacy of the results of the power analysis.
Message translation is often managed with po files and the gettext programme, but sometimes another solution is needed. This package takes a more flexible approach than po files, similar to that of the Fluent <https://projectfluent.org/> project, using R Markdown snippets. The key-value approach allows easier handling of the translated messages.
Generates a random quotation from a database of quotes on topics in statistics, data visualization and science. Other functions allow searching the quotes database by key term tags or authors, or creating a word cloud. The output is designed to be suitable for use at the console, in Rmarkdown and LaTeX.
This package provides functions implementing minimal distance estimation methods for parametric tail dependence models, as proposed in Einmahl, J.H.J., Kiriliouk, A., Krajina, A., and Segers, J. (2016) <doi:10.1111/rssb.12114> and Einmahl, J.H.J., Kiriliouk, A., and Segers, J. (2018) <doi:10.1007/s10687-017-0303-7>.
This package provides data sets for teaching statistics and data science courses. It includes a sample of data from John Edmund Kerrich's famous coin-flip experiment. These are data that I used in statistics courses. The package also contains an R Markdown template with the required formatting for assignments in my former courses.
The NCI-60 cancer cell line panel has been used over the course of several decades as an anti-cancer drug screen. This panel was developed as part of the Developmental Therapeutics Program (DTP, http://dtp.nci.nih.gov/) of the U.S. National Cancer Institute (NCI). Thousands of compounds have been tested on the NCI-60, which has been extensively characterized by many platforms for gene and protein expression, copy number, mutation, and others (Reinhold, et al., 2012). The purpose of the CellMiner project (http://discover.nci.nih.gov/cellminer) has been to integrate data from multiple platforms used to analyze the NCI-60 and to provide a powerful suite of tools for exploration of NCI-60 data.
This package provides a comprehensive set of tools designed for optimizing likelihood within a tie-oriented (Butts, C., 2008, <doi:10.1111/j.1467-9531.2008.00203.x>) or an actor-oriented modelling framework (Stadtfeld, C., & Block, P., 2017, <doi:10.15195/v4.a14>) in relational event networks. The package accommodates both frequentist and Bayesian approaches. The frequentist approaches that the package incorporates are the Maximum Likelihood Optimization (MLE) and the Gradient-based Optimization (GDADAMAX). The Bayesian methodologies included in the package are the Bayesian Sampling Importance Resampling (BSIR) and the Hamiltonian Monte Carlo (HMC). The flexibility of choosing between frequentist and Bayesian optimization approaches allows researchers to select the estimation approach which aligns the most with their analytical preferences.
This package provides an R interface to Illumina's BaseSpace cloud computing environment, enabling the fast development of data analysis and visualization tools. Besides providing an easy-to-use set of tools for manipulating the data from BaseSpace, it also facilitates access to R's rich environment of statistical and data analysis tools.
spacefillr enables generation of random and quasi-random space-filling sequences. It supports the following sequences: Halton, Sobol, Owen-scrambled Sobol, Owen-scrambled Sobol with errors distributed as blue noise, progressive jittered, progressive multi-jittered (PMJ), PMJ with blue noise, PMJ02, and PMJ02 with blue noise. The package also includes a C++ API.
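For readers unfamiliar with quasi-random sequences, the simplest member of the list above, the Halton sequence, can be sketched in a few lines. This is a plain, unscrambled Halton generator written in Python for illustration only. It is not the package's implementation or API, and the function names `radical_inverse` and `halton` are assumptions of the example.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3)):
    """First `n_points` Halton points, one coprime base per dimension."""
    return [
        tuple(radical_inverse(i, b) for b in bases)
        for i in range(1, n_points + 1)
    ]


for point in halton(5):
    print(point)
```

Each coordinate lies in the unit interval and successive points fill the gaps left by earlier ones, which is what makes the sequence space-filling. The scrambled and blue-noise variants named above refine this basic construction.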
This package provides utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models, this package implements features like standardization or bootstrapping of parameters and models, feature reduction (feature extraction and variable selection) as well as conversion between indices of effect size.
An R package for deep mining of gene co-expression networks in multi-trait expression data. Provides functions for analyzing, comparing, and visualizing WGCNA networks across conditions. multiWGCNA was designed to handle the common case where there are multiple biologically meaningful sample traits, such as disease vs wildtype across development or anatomical region.
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. This package provides a collection of functions for retrieving, processing, and re-packaging UniProt web services. The package makes use of UniProt's modernized REST API and allows mapping of identifiers across different databases.
This package provides WHO 2007 References for School-age Children and Adolescents (5 to 19 years) (z-scores) with confidence intervals and standard errors around the prevalence estimates, taking into account complex sample designs. More information on the methods is available online: <https://www.who.int/tools/growth-reference-data-for-5to19-years>.