Estimation, prediction, and simulation of nonstationary Gaussian processes with modular covariate-based covariance functions. Sources of nonstationarity, such as spatial mean, variance, geometric anisotropy, smoothness, and nugget, can be modeled as functions of spatial characteristics. An induced compactly supported nonstationary covariance function is provided, enabling fast and memory-efficient computations when handling densely sampled domains.
Many modern C/C++ development tools in the clang toolchain, such as clang-tidy or clangd, rely on the presence of a compilation database in JSON format <https://clang.llvm.org/docs/JSONCompilationDatabase.html>. This package temporarily injects additional build flags into the R build process to generate such a compilation database.
Draws stylized choropleth maps -- hexagonal maps and triangular multiclass hex maps -- for New Zealand District Health Boards and Regional Council areas. These allow faceted, coloured displays of quantitative information for comparison across District Health Boards or Regional Councils. The preprint Lumley (2019) <arXiv:1912.04435> is based on the methods in this package.
Populate data from an R environment into .doc and .docx templates. Create a template document in a program such as Word, and add strings encased in guillemet characters to create flags («example»). Use getDictionary() to create a dictionary of flags and replacement values, then call docket() to generate a populated document.
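The flag-and-dictionary idea described above can be illustrated with a minimal, language-neutral sketch in Python. It works on a plain string rather than a .doc or .docx file, and the helper names `find_flags` and `populate` are hypothetical; they are not the package's getDictionary() or docket() functions.

```python
import re

# Matches a flag such as «example»; the captured group is the flag name.
FLAG = re.compile(r"«([^«»]+)»")

def find_flags(text):
    """Return the distinct flag names in order of first appearance (hypothetical helper)."""
    seen = []
    for name in FLAG.findall(text):
        if name not in seen:
            seen.append(name)
    return seen

def populate(text, dictionary):
    """Replace each «flag» with its dictionary value; unknown flags are left untouched."""
    return FLAG.sub(lambda m: str(dictionary.get(m.group(1), m.group(0))), text)

template = "Dear «name», your balance is «amount»."
values = {"name": "Ada", "amount": 42}
result = populate(template, values)
```

Leaving unknown flags in place is a design choice of this sketch: it makes any missing dictionary entries easy to spot in the output.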
This package provides tools for simulating draws from continuous time processes with well-defined exponential family random graph (ERGM) equilibria, i.e. ERGM generating processes (EGPs). A number of EGPs are supported, including the families identified in Butts (2023) <doi:10.1080/0022250X.2023.2180001>, as are functions for hazard calculation and timing calibration.
The Explainable Ensemble Trees (e2tree) approach was proposed by Aria et al. (2024) <doi:10.1007/s00180-022-01312-6>. It aims to explain and interpret decision tree ensemble models using a single tree-like structure. e2tree is a new way of explaining an ensemble of trees trained with the randomForest or xgboost packages.
Estimate gender from names in Spanish and Portuguese. Works with vectors and data frames. The estimation works not only for first names but also for full names. The package relies on a compilation of common names in both languages, each with its most frequently associated gender, which serve as lookup tables for gender inference.
This package provides an interface to HDFql <https://www.hdfql.com/> and helper functions for reading data from and writing data to HDF5 files. HDFql provides a high-level language for managing HDF5 data that is platform independent. For more information, see the reference manual <https://www.hdfql.com/resources/HDFqlReferenceManual.pdf>.
Calculates the intraclass correlation coefficient (ICC) for assessing reproducibility of interval-censored data with two repeated measurements (Kovacic and Varnai (2014) <doi:10.1097/EDE.0000000000000139>). The ICC is estimated by maximum likelihood from a model with one fixed and one random effect (both intercepts). Tools for model checking (normality of subject means and residuals) are provided.
This package contains functions for fitting a joinpoint proportional hazards model to relative survival or cause-specific survival data, including estimates of joinpoint years at which survival trends have changed and trend measures on the hazard and cumulative survival scales. See Yu et al. (2009) <doi:10.1111/j.1467-985X.2009.00580.x>.
This package provides the tables from the Sean Lahman Baseball Database as a set of R data.frames. It uses the data on pitching, hitting and fielding performance and other tables from 1871 through 2023, as recorded in the 2024 version of the database. Documentation examples show how many baseball questions can be investigated.
Simulation and estimation of univariate and multivariate log-GARCH models. The main functions of the package are: lgarchSim(), mlgarchSim(), lgarch() and mlgarch(). The first two functions simulate from a univariate and a multivariate log-GARCH model, respectively, whereas the latter two estimate a univariate and a multivariate log-GARCH model, respectively.
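As a rough illustration of what simulating from a univariate log-GARCH model involves, the following Python sketch draws from a log-GARCH(1,1) process, ln σ²_t = ω + α ln ε²_{t-1} + β ln σ²_{t-1} with ε_t = σ_t z_t and standard normal z_t. The function name, default parameter values, and starting value are assumptions made for this sketch; the package's own parameterization and the behaviour of lgarchSim() may differ.

```python
import math
import random

def simulate_log_garch(n, omega=0.0, alpha=0.05, beta=0.9, seed=0):
    """Simulate n observations from a log-GARCH(1,1) process (illustrative sketch)."""
    rng = random.Random(seed)
    log_sigma2 = 0.0  # arbitrary starting value for ln(sigma^2); an assumption
    eps = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                # standard normal innovation
        e = math.exp(0.5 * log_sigma2) * z     # epsilon_t = sigma_t * z_t
        eps.append(e)
        # Guard against log(0) if the squared value underflows to zero.
        log_eps2 = math.log(max(e * e, 1e-300))
        # Recursion for the next period's log-variance.
        log_sigma2 = omega + alpha * log_eps2 + beta * log_sigma2
    return eps

series = simulate_log_garch(500)
```

Because the recursion is written in logs, the conditional variance stays positive without any constraints on the signs of the coefficients.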
Computes efficient data distributions from highly inconsistent datasets with many missing values using multi-set intersections. Based upon hash functions, mulset can quickly identify intersections from very large matrices of input vectors across columns and rows and thus provides a scalable solution for dealing with missing values. See Tomic et al. (2019) <doi:10.1101/545186>.
This package implements an interface to the legacy Fortran code from O'Connell and Dobson (1984) <DOI:10.2307/2531148>. Implements Fortran 77 code for the methods developed by Schouten (1982) <DOI:10.1111/j.1467-9574.1982.tb00774.x>. Includes estimates of average agreement for each observer and average agreement for each subject.
Supports visual interpretation of hierarchical composite endpoints (HCEs). HCEs are complex constructs used as primary endpoints in clinical trials, combining outcomes of different types into ordinal endpoints, in which each patient contributes the most clinically important event (one and only one) to the analysis. See Karpefors M et al. (2022) <doi:10.1177/17407745221134949>.
Procedures to fit species distribution models from occurrence records and environmental variables, using glmnet for model fitting. Model structure is the same as for the Maxent Java package, version 3.4.0, with the same feature types and regularization options. See the Maxent website <http://biodiversityinformatics.amnh.org/open_source/maxent> for more details.
This is a small package designed for interpreting continuous and categorical latent variables. You provide a data set with a latent variable you want to understand and some other explanatory variables. It provides a description of the latent variable based on the explanatory variables. It also suggests a name for the latent variable.
This package provides a set of techniques that can be used to develop, validate, and implement automated classifiers. A powerful tool for transforming raw data into meaningful information, ncodeR (Shaffer, D. W. (2017) Quantitative Ethnography. ISBN: 0578191687) is designed specifically for working with big data: large document collections, logfiles, and other text data.
An implementation of the National Information Platforms for Nutrition or NiPN's analytic methods for assessing quality of anthropometric datasets that include measurements of weight, height or length, mid-upper arm circumference, sex and age. The focus is on anthropometric status but many of the presented methods could be applied to other variables.
Collection of pivotal algorithms for: relabelling the MCMC chains in order to undo the label switching problem in Bayesian mixture models; fitting sparse finite mixtures; initializing the centers of the classical k-means algorithm in order to obtain a better clustering solution. For further details see Egidi, Pappadà, Pauli and Torelli (2018b) <ISBN:9788891910233>.
Downloads current and historical METAR weather reports, extracts and parses basic parameters, and presents the main weather information. Current reports are downloaded from the Aviation Weather Center <https://aviationweather.gov/data/metar/> and historical reports from the Iowa Environmental Mesonet web page of Iowa State University ASOS-AWOS-METAR <http://mesonet.agron.iastate.edu/AWOS/>.
Constructs a model in 2D space from 2D embedding data and then lifts it to the high-dimensional space. Additionally, it provides tools to visualize the model in 2D space and to overlay the fitted model on data using the tour technique. Furthermore, it facilitates the generation of summaries of high-dimensional distributions.
Create and format tables and APA statistics for scientific publication. This includes making a Table 1 to summarize demographics across groups, correlation tables with significance indicated by stars, and extracting formatted statistical summaries from simple tests for in-text notation. The package also includes functions for Winsorizing data based on a Z-statistic cutoff.
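One common reading of Winsorizing with a Z-statistic cutoff is to clamp any value whose z-score exceeds the cutoff to the value at exactly that cutoff. The Python sketch below follows that reading; the function name `winsorize_z`, the default cutoff of 3, and the use of the sample standard deviation are assumptions, and the package's own rule may differ.

```python
import statistics

def winsorize_z(values, cutoff=3.0):
    """Clamp values lying more than `cutoff` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample standard deviation; needs >= 2 values
    lower = mean - cutoff * sd      # smallest value allowed after Winsorizing
    upper = mean + cutoff * sd      # largest value allowed after Winsorizing
    return [min(max(v, lower), upper) for v in values]

data = [1.0, 2.0, 3.0, 4.0, 100.0]
clamped = winsorize_z(data, cutoff=1.0)
```

In this example only the extreme value 100.0 is pulled in, to one standard deviation above the mean; the remaining values already lie within the bounds and are unchanged.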
Fits univariate Bayesian spatial regression models for large datasets using Nearest Neighbor Gaussian Processes (NNGP) detailed in Finley, Datta, and Banerjee (2022) <doi:10.18637/jss.v103.i05>, Finley, Datta, Cook, Morton, Andersen, and Banerjee (2019) <doi:10.1080/10618600.2018.1537924>, and Datta, Banerjee, Finley, and Gelfand (2016) <doi:10.1080/01621459.2015.1044091>.