This package provides functions for quickly writing (and reading back) a data.frame to file in SQLite format. The name stands for *Store Tables using SQLite*, or alternatively for *Quick Store Tables* (either way, it could be pronounced as *Quest*). For data.frames containing the supported data types it is intended to work as a drop-in replacement for the write_*() and read_*() functions provided by similar packages.
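The write-then-read round trip described above can be sketched in Python with the standard-library `sqlite3` module. This is only an illustration of the idea, not the package's R interface: the helper names `write_table` and `read_table`, the table name, and the sample rows are all hypothetical.

```python
import sqlite3


def write_table(path, name, columns, rows):
    """Write rows (a list of tuples) to a table in an SQLite file.

    Hypothetical helper: any existing table with the same name is replaced.
    """
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    con = sqlite3.connect(path)
    try:
        with con:  # commits on success, rolls back on error
            con.execute(f'DROP TABLE IF EXISTS "{name}"')
            con.execute(f'CREATE TABLE "{name}" ({col_defs})')
            con.executemany(
                f'INSERT INTO "{name}" VALUES ({placeholders})', rows
            )
    finally:
        con.close()


def read_table(path, name):
    """Read a table back as (column names, list of row tuples)."""
    con = sqlite3.connect(path)
    try:
        cur = con.execute(f'SELECT * FROM "{name}" ORDER BY rowid')
        columns = [d[0] for d in cur.description]
        rows = cur.fetchall()
    finally:
        con.close()
    return columns, rows
```

A call such as `write_table("data.sqlite", "df", ["id", "name"], [(1, "a"), (2, "b")])` followed by `read_table("data.sqlite", "df")` returns the same column names and rows that were written.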
This package provides functions and example files to calculate the tRNA adaptation index, a measure of the level of co-adaptation between the set of tRNA genes and the codon usage bias of protein-coding genes in a given genome. The methodology is described in dos Reis, Wernisch and Savva (2003) <doi:10.1093/nar/gkg897>, and dos Reis, Savva and Wernisch (2004) <doi:10.1093/nar/gkh834>.
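As a rough illustration of the final step of the method, the index of a gene is the geometric mean of the relative adaptiveness values of its codons. The Python sketch below assumes those per-codon weights have already been computed; the function name `tai` and the weights in the usage note are hypothetical and are not taken from the package or its example files.

```python
import math


def tai(codons, weights):
    """Geometric mean of the relative adaptiveness values of a gene's codons.

    `codons` is the gene split into codons; `weights` maps each codon to a
    positive relative adaptiveness value (assumed to be precomputed).
    """
    if not codons:
        raise ValueError("gene has no codons")
    # Sum logs instead of multiplying to avoid underflow on long genes.
    log_sum = sum(math.log(weights[c]) for c in codons)
    return math.exp(log_sum / len(codons))
```

For example, with made-up weights `{"AAA": 1.0, "AAG": 0.25}`, the gene `["AAA", "AAG"]` scores 0.5, the square root of 1.0 × 0.25.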
Call job::job(<code here>) to run R code as an RStudio job and keep your console free in the meantime. This allows for a productive workflow while testing (multiple) long-running chunks of code. It can also be used to organize results using the RStudio Jobs GUI or to test code in a clean environment. Two RStudio Addins can be used to run selected code as a job.
This package provides a set of three two-census methods to estimate the degree of death registration coverage for a population. Implemented methods include the Generalized Growth Balance method (GGB), the Synthetic Extinct Generation method (SEG), and a hybrid of the two, GGB-SEG. Each method offers automatic estimation, but users may also specify exact parameters or use a graphical interface to guess parameters in the traditional way if desired.
OpenAI Gym is an open-source Python toolkit for developing and comparing reinforcement learning algorithms. This is a wrapper for the OpenAI Gym API, and enables access to an ever-growing variety of environments. For more details on OpenAI Gym, please see here: <https://github.com/openai/gym>. For more details on the OpenAI Gym API specification, please see here: <https://github.com/openai/gym-http-api>.
This package provides a library for generic interval manipulations using a new interval vector class. Capabilities include: locating various kinds of relationships between two interval vectors, merging overlaps within a single interval vector, splitting an interval vector on its overlapping endpoints, and applying set theoretical operations on interval vectors. Many of the operations in this package were inspired by James Allen's interval algebra, Allen (1983) <doi:10.1145/182.358434>.
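One of the operations listed above, merging overlaps within a single set of intervals, can be sketched in a few lines of Python. This is an illustrative sketch only, not the package's interval vector class: intervals are represented as plain `(start, end)` tuples, they are assumed to be closed, and intervals that merely touch are treated as overlapping.

```python
def merge_overlaps(intervals):
    """Merge overlapping closed intervals given as (start, end) tuples.

    Assumption: start <= end for every interval, and intervals that share
    an endpoint are merged.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend it if needed.
            last_start, last_end = merged[-1]
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged
```

Sorting first means each interval only has to be compared with the most recently merged one, so the whole pass is dominated by the sort.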
Several robust estimators for linear regression and variable selection are provided. Included are the maximum tangent likelihood estimator by Qin et al. (2017), arXiv preprint <doi:10.48550/arXiv.1708.05439>, the least absolute deviance estimator, and Huber regression. The penalized version of each of these estimators incorporates an L1 penalty function, i.e., LASSO and Adaptive Lasso. They are able to produce consistent estimates for both fixed and high-dimensional settings.
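To make the combination of a robust loss and an L1 penalty concrete, the Python sketch below evaluates a Huber loss with a LASSO penalty. It is a minimal illustration under stated assumptions, not the package's implementation: the function names, the averaging over observations, and the default tuning constant of 1.345 (a common textbook choice) are all assumptions.

```python
def huber_loss(residual, delta=1.345):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def penalized_objective(residuals, coefficients, lam, delta=1.345):
    """Mean Huber loss plus an L1 (LASSO) penalty on the coefficients.

    The scaling (mean loss, unweighted penalty) is an assumption made for
    illustration; an adaptive LASSO would weight each |coefficient|.
    """
    loss = sum(huber_loss(r, delta) for r in residuals) / len(residuals)
    penalty = lam * sum(abs(b) for b in coefficients)
    return loss + penalty
```

Because the loss grows only linearly beyond `delta`, a single large residual contributes far less than it would under squared error, which is what makes the estimator robust to outliers.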
Fitting and testing multinomial processing tree (MPT) models, a class of nonlinear models for categorical data. The parameters are the link probabilities of a tree-like graph and represent the latent cognitive processing steps executed to arrive at observable response categories (Batchelder & Riefer, 1999 <doi:10.3758/bf03210812>; Erdfelder et al., 2009 <doi:10.1027/0044-3409.217.3.108>; Riefer & Batchelder, 1988 <doi:10.1037/0033-295x.95.3.318>).
Utility functions to convert between the Spatial classes specified by the package sp, and the well-known binary (WKB) representation for geometry specified by the Open Geospatial Consortium. Supports Spatial objects of class SpatialPoints, SpatialPointsDataFrame, SpatialLines, SpatialLinesDataFrame, SpatialPolygons, and SpatialPolygonsDataFrame. Supports WKB geometry types Point, LineString, Polygon, MultiPoint, MultiLineString, and MultiPolygon. Includes extensions to enable creation of maps with TIBCO Spotfire.
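For readers unfamiliar with the WKB layout, the Python sketch below encodes and decodes the simplest geometry type, a two-dimensional Point, using only the standard-library `struct` module. It illustrates the byte layout (byte-order flag, geometry type code, then coordinates) and is not the package's R interface; the function names are hypothetical.

```python
import struct

WKB_POINT = 1  # geometry type code for Point


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes).

    Layout: 1 byte order flag (1 = little-endian), 4-byte unsigned
    geometry type, then x and y as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def wkb_to_point(data):
    """Decode a WKB-encoded 2D point, honoring the byte-order flag."""
    byte_order = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(byte_order + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"not a WKB Point (type code {geom_type})")
    return x, y
```

The other geometry types build on the same pattern: a LineString stores a point count followed by coordinate pairs, and the Multi* types store a count followed by complete nested geometries.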
The CLL package contains the chronic lymphocytic leukemia (CLL) gene expression data. The CLL data had 24 samples that were classified as either progressive or stable with regard to disease progression. The data came from Dr. Sabina Chiaretti at the Division of Hematology, Department of Cellular Biotechnologies and Hematology, University La Sapienza, Rome, Italy and Dr. Jerome Ritz at the Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
The fst package for R provides a fast, easy and flexible way to serialize data frames. With access speeds of multiple GB/s, fst is specifically designed to unlock the potential of high speed solid state disks. Data frames stored in the fst format have full random access, in both columns and rows. The fst format also supports compression of stored data with the LZ4 and ZSTD compressors.
This package provides methods for choosing the rank of an SVD (singular value decomposition) approximation via cross validation. The package provides both Gabriel-style "block" holdouts and Wold-style "speckled" holdouts. It also includes an implementation of the SVDImpute algorithm. For more information about Bi-cross-validation, see Owen & Perry's 2009 AoAS article (at <arXiv:0908.2062>) and Perry's 2009 PhD thesis (at <arXiv:0909.3052>).
Set of forecasting tools to predict ICU beds using a Vector Error Correction model with a single cointegrating vector. The method is described in Berta, P., Lovaglio, P.G., Paruolo, P., Verzillo, S. (2020), "Real Time Forecasting of Covid-19 Intensive Care Units demand", Health, Econometrics and Data Group (HEDG) Working Papers 20/16, HEDG, Department of Economics, University of York, <https://www.york.ac.uk/media/economics/documents/hedg/workingpapers/2020/2016.pdf>.
Structure mining from XGBoost and LightGBM models. Key functionalities of this package cover: visualisation of tree-based ensemble models, identification of interactions, measuring of variable importance, measuring of interaction importance, explanation of single prediction with break down plots (based on xgboostExplainer and iBreakDown packages). To download LightGBM, use the following link: <https://github.com/Microsoft/LightGBM>. EIX is a part of the DrWhy.AI universe.
R interface for H2O, the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
These datasets and functions accompany Wolfe and Schneider (2017) - Intuitive Introductory Statistics (ISBN: 978-3-319-56070-0) <doi:10.1007/978-3-319-56072-4>. They are used in the examples throughout the text and in the end-of-chapter exercises. The datasets are meant to cover a broad range of topics in order to appeal to the diverse set of interests and backgrounds typically present in an introductory Statistics class.
Calibrate and apply multivariate bias correction algorithms for climate model simulations of multiple climate variables. Three methods described by Cannon (2016) <doi:10.1175/JCLI-D-15-0679.1> and Cannon (2018) <doi:10.1007/s00382-017-3580-6> are implemented: (i) MBC Pearson correlation (MBCp), (ii) MBC rank correlation (MBCr), and (iii) MBC N-dimensional PDF transform (MBCn). The Rank Resampling for Distributions and Dependences (R2D2) method is implemented as well.
Used for general multiple mediation analysis. The analysis method is described in Yu and Li (2022) (ISBN: 9780367365479) "Statistical Methods for Mediation, Confounding and Moderation Analysis Using R and SAS", published by Chapman and Hall/CRC; and Yu et al. (2017) <doi:10.1016/j.sste.2017.02.001> "Exploring racial disparity in obesity: a mediation analysis considering geo-coded environmental factors", published in Spatial and Spatio-temporal Epidemiology, 21, 13-23.
This package provides functionality for structural equation modeling for the social relations model (Kenny & La Voie, 1984, <doi:10.1016/S0065-2601(08)60144-6>; Warner, Kenny, & Soto, 1979, <doi:10.1037/0022-3514.37.10.1742>). Maximum likelihood estimation (Gill & Swartz, 2001, <doi:10.2307/3316080>; Nestler, 2018, <doi:10.3102/1076998617741106>) and least squares estimation (Bond & Malloy, 2018, <doi:10.1016/B978-0-12-811967-9.00014-X>) are supported.
This package provides a collection of interactive shiny applications for performing comprehensive analyses in the field of tree breeding and genetics. The package is designed to assist users in visualizing and interpreting experimental data through a user-friendly interface. Each application is launched via a simple function, and users can upload data in Excel format for analysis. For more information, refer to Singh, R.K. and Chaudhary, B.D. (1977, ISBN:9788176633079).
The TWN-list (Taxa Waterbeheer Nederland) is the Dutch standard for naming taxa in Dutch water management. This package makes it easier to use the TWN-list for ecological analyses. It consists of two parts. First, it makes the TWN-list itself available in R. Second, it has a few functions that make it easy to perform some basic and often recurring tasks for checking and consulting taxonomic data from the TWN-list.
An implementation of the additive heredity model for the mixture-of-mixtures experiments of Shen et al. (2019) in Technometrics <doi:10.1080/00401706.2019.1630010>. The additive heredity model considers an additive structure to inherently connect the major components with the minor components. The additive heredity model has a meaningful interpretation for the estimated model because of the hierarchical and heredity principles applied and the nonnegative garrote technique used for variable selection.
This package provides functions to fit temporal lag models to dynamic networks. The models are built on top of the exponential random graph model (ERGM) framework. There are functions for simulating or forecasting networks for future time points. See Abhirup Mallik & Zack W. Almquist (2019), Stable Multiple Time Step Simulation/Prediction From Lagged Dynamic Network Regression Models, Journal of Computational and Graphical Statistics, 28:4, 967-979, <doi:10.1080/10618600.2019.1594834>.
Collection of datasets as prepared by Profs. A.P. Gore, S.A. Paranjape, and M.B. Kulkarni of the Department of Statistics, Poona University, India. With their permission, the first letters of their names form the name of this package. The package has been built by me and made available for the benefit of R users. This collection requires a rich class of models and can be a very useful building block for a beginner.