The STRINGdb package provides an R interface to the STRING protein-protein interactions database. STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations. Each interaction is associated with a combined confidence score that integrates the various lines of evidence.
Variants of strategy estimation (Dal Bo & Frechette, 2011, <doi:10.1257/aer.101.1.411>), including the model with parameters for the choice probabilities of the strategies (Breitmoser, 2015, <doi:10.1257/aer.20130675>), and the model with individual level covariates for the selection of strategies by individuals (Dvorak & Fehrler, 2018, <doi:10.2139/ssrn.2986445>).
Graphical outputs and treatment for a database of fish pass monitoring. It is a part of the STACOMI open source project developed in France by the French Office for Biodiversity institute to centralize data obtained by fish pass monitoring. This version is available in French and English. See <http://stacomir.r-forge.r-project.org/> for more information on STACOMI.
An implementation of local and global statistical complexity measures (aka Information Theory Quantifiers, ITQ) for time series analysis based on ordinal statistics (Bandt and Pompe (2002) <DOI:10.1103/PhysRevLett.88.174102>). Several distance measures that operate on ordinal pattern distributions, auxiliary functions for ordinal pattern analysis, and generating functions for stochastic and deterministic-chaotic processes for ITQ testing are provided.
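As a rough illustration of the ordinal statistics mentioned above, the following minimal Python sketch computes an ordinal pattern distribution and the normalized permutation entropy of Bandt and Pompe. The function names and the tie-breaking rule (ties resolved by position) are assumptions made for this example; they are not the package's API.

```python
import math
from collections import Counter


def ordinal_pattern_distribution(series, order=3):
    """Relative frequency of each ordinal pattern of length `order`."""
    if order < 2 or len(series) < order:
        raise ValueError("need order >= 2 and at least `order` observations")
    counts = Counter()
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        # The pattern is the index order that sorts the window;
        # ties are broken by position (an assumption of this sketch).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def permutation_entropy(series, order=3):
    """Shannon entropy of the pattern distribution, scaled to [0, 1]."""
    dist = ordinal_pattern_distribution(series, order)
    entropy = -sum(p * math.log(p) for p in dist.values())
    return entropy / math.log(math.factorial(order))
```

For the short series 4, 7, 9, 10, 6, 11, 3 with order 2, four of the six adjacent pairs are increasing and two are decreasing, so the normalized entropy is about 0.918. A strictly monotone series has a single pattern and an entropy of zero.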
This package provides a set of statistical tools for spatio-temporal data exploration. Includes simple plotting functions, covariance calculations and computations similar to principal component analysis for spatio-temporal data. Can use both data frames and stars objects for all plots and computations. For more details, refer to Spatio-Temporal Statistics with R (Christopher K. Wikle, Andrew Zammit-Mangion, Noel Cressie, 2019, ISBN:9781138711136).
This package provides functions to estimate a strategic selection estimator. A strategic selection estimator is an agent error model in which the two random components are not assumed to be orthogonal. In addition, this package provides generic functions to print and plot objects of its class as well as the necessary functions to create tables for LaTeX. There is also a function to create dyadic data sets.
This package provides functions for selecting variables for the joint modeling of mean and dispersion (including models for mixture experiments) based on hypothesis testing and the quality of the model's fit. In each iteration of the selection process, a criterion for checking the goodness of fit is used as a filter for choosing the terms that will be evaluated by a hypothesis test. See Pinto & Pereira (2021) <arXiv:2109.07978>.
This package provides a set of methods to implement Generalized Method of Moments and Maximum Likelihood methods for Random Utility Models. These methods are meant to provide inference on rank comparison data. They accept full, partial, and pairwise rankings, and the package provides methods to break down full or partial rankings into their pairwise components. Please see "Generalized Method-of-Moments for Rank Aggregation" from NIPS 2013 for a description of some of our methods.
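Breaking a ranking into pairwise components can be sketched in a few lines of Python. This is only an illustration of the simplest scheme (every implied pair is kept, sometimes called full breaking); the function names are hypothetical, and the package may weight or select pairs differently.

```python
from collections import Counter
from itertools import combinations


def pairwise_breaking(ranking):
    """Return every (preferred, less_preferred) pair implied by a ranking.

    `ranking` lists items from best to worst; it may cover all items
    (a full ranking) or only a subset (a partial ranking).
    """
    # combinations() preserves input order, so the better item comes first.
    return list(combinations(ranking, 2))


def pairwise_counts(rankings):
    """Count how often each ordered pair occurs across many rankings."""
    counts = Counter()
    for ranking in rankings:
        counts.update(pairwise_breaking(ranking))
    return counts
```

For example, the ranking a, b, c yields the pairs (a, b), (a, c), and (b, c). Counts aggregated this way are the kind of pairwise summary that moment-based estimators can work from.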
Users can build and test customized quantitative trading strategies. Some quantitative trading strategies are already implemented, e.g. various moving-average filters with trend following approaches. The implemented class called "Strategy" gives users access to several methods to analyze performance figures, plot results, and backtest the strategies. Furthermore, custom strategies can be added; a generic template is available. The custom strategies require a certain input and output so they can be called from the Strategy-constructor.
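To make the idea of a moving-average trend-following filter concrete, here is a minimal Python sketch. It is not the package's "Strategy" class: the function names, the long-or-flat rule, and the omission of trading costs are all assumptions chosen for illustration.

```python
def simple_moving_average(prices, window):
    """Trailing average over `window` prices, starting once enough data exists."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [
        sum(prices[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def trend_signals(prices, window):
    """1 (hold the asset) when price is above its moving average, else 0."""
    averages = simple_moving_average(prices, window)
    return [1 if price > avg else 0
            for price, avg in zip(prices[window - 1:], averages)]


def backtest(prices, window):
    """Total return from holding the asset in each period after a 1 signal."""
    signals = trend_signals(prices, window)
    aligned = prices[window - 1:]
    value = 1.0
    for t in range(len(aligned) - 1):
        if signals[t] == 1:
            # The signal at time t decides the position held until t + 1.
            value *= aligned[t + 1] / aligned[t]
    return value - 1.0
```

With steadily rising prices the filter stays invested throughout; with steadily falling prices it never enters the market and the return is zero.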
Large data files can be difficult to work with in R, where data generally resides in memory. This package encourages a style of programming where data is streamed from disk into R via a "producer" and through a series of "consumers" that typically reduce the original data to a manageable size. The package provides useful Producer and Consumer stream components for operations such as data input, sampling, indexing, and transformation; see package?Streamer for details.
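The producer and consumer style described above can be sketched with Python generators. This is an analogy only, not the package's Producer and Consumer classes; the function names and the in-memory stand-in for a file are assumptions made for the example.

```python
import io


def producer(handle, chunk_size):
    """Yield successive lists of at most `chunk_size` lines from `handle`."""
    chunk = []
    for line in handle:
        chunk.append(line.rstrip("\n"))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def to_numbers(chunks):
    """Consumer: convert each chunk of text lines to floats."""
    for chunk in chunks:
        yield [float(value) for value in chunk]


def chunk_sums(chunks):
    """Consumer: reduce each chunk to a single number."""
    for chunk in chunks:
        yield sum(chunk)


def demo():
    """Stream five lines through the pipeline two at a time."""
    handle = io.StringIO("1\n2\n3\n4\n5\n")  # stands in for a large file
    return list(chunk_sums(to_numbers(producer(handle, 2))))
```

Only one chunk is held in memory at a time, and each consumer sees the output of the stage before it; `demo()` returns `[3.0, 7.0, 5.0]`.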
The main function is icweib(), which fits a stratified Weibull proportional hazards model for left censored, right censored, interval censored, and non-censored survival data. We parameterize the Weibull regression model so that it allows a stratum-specific baseline hazard function, but where the effects of other covariates are assumed to be constant across strata. Please refer to Xiangdong Gu, David Shapiro, Michael D. Hughes and Raji Balasubramanian (2014) <doi:10.32614/RJ-2014-003> for more details.
This is an interface for the Python package StepMix. It is a Python package following the scikit-learn API for model-based clustering and generalized mixture modeling (latent class/profile analysis) of continuous and categorical data. StepMix handles missing values through Full Information Maximum Likelihood (FIML) and provides multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory. Additional features include support for covariates and distal outcomes, various simulation utilities, and non-parametric bootstrapping, which allows inference in semi-supervised and unsupervised settings.
Efficient algorithms for fully Bayesian estimation of stochastic volatility (SV) models with and without asymmetry (leverage) via Markov chain Monte Carlo (MCMC) methods. Methodological details are given in Kastner and Frühwirth-Schnatter (2014) <doi:10.1016/j.csda.2013.01.002> and Hosszejni and Kastner (2019) <doi:10.1007/978-3-030-30611-3_8>; the most common use cases are described in Hosszejni and Kastner (2021) <doi:10.18637/jss.v100.i12> and Kastner (2016) <doi:10.18637/jss.v069.i05> and the package examples.
Allows the user to estimate a vector logistic smooth transition autoregressive model via maximum log-likelihood or nonlinear least squares. It also allows testing for linearity in the multivariate framework against a vector logistic smooth transition autoregressive model with a single transition variable. The estimation method is discussed in Terasvirta and Yang (2014, <doi:10.1108/S0731-9053(2013)0000031008>). Also, realized covariances can be constructed from stock market prices or returns, as explained in Andersen et al. (2001, <doi:10.1016/S0304-405X(01)00055-1>).
Traditional model evaluation metrics fail to capture model performance under less than ideal conditions. This package employs techniques to evaluate models "under stress". This includes testing a model's extrapolation ability or testing accuracy on specific sub-samples of the overall model space. Details describing stress-testing methods in this package are provided in Haycock (2023) <doi:10.26076/2am5-9f67>. The other primary contribution of this package is to provide R users with access to the Python library PyCaret <https://pycaret.org/> for quick and easy access to auto-tuned machine learning models.
This package provides a minimalist implementation of model stacking by Wolpert (1992) <doi:10.1016/S0893-6080(05)80023-1> for boosted tree models. A classic, two-layer stacking model is implemented, where the first layer generates features using gradient boosting trees, and the second layer employs a logistic regression model that uses these features as inputs. Utilities for training the base models and tuning parameters are provided, allowing users to experiment with different ensemble configurations easily. It aims to provide a simple and efficient way to combine multiple gradient boosting models to improve predictive model performance and robustness.
Implementation of popular mortality models using the rstan package, which provides the R interface to the Stan C++ library for Bayesian estimation. The package supports well-known models proposed in the actuarial and demographic literature including the Lee-Carter (1992) <doi:10.1080/01621459.1992.10475265> and the Cairns-Blake-Dowd (2006) <doi:10.1111/j.1539-6975.2006.00195.x> models. With a simple call, the user inputs deaths and exposures and the package outputs the MCMC simulations for each parameter, the log likelihoods and predictions. Moreover, the package includes tools for model selection and Bayesian model averaging by leave-future-out validation.
Estimates the authors or speakers of texts. Methods developed in Huang, Perry, and Spirling (2020) <doi:10.1017/pan.2019.49>. The model is built on a Bayesian framework in which the distinctiveness of each speaker is defined by how different, on average, the speaker's terms are from those of everyone else in the corpus of texts. An optional cross-validation method is implemented to select the subset of terms that generate the most accurate speaker predictions. Once a set of terms is selected, the model can be estimated. Speaker distinctiveness and term influence can be recovered from parameters in the model using package functions. Once fitted, the model can be used to predict authorship of new texts.
The fossil record is a joint expression of ecological, taphonomic, evolutionary, and stratigraphic processes (Holland and Patzkowsky, 2012, ISBN:978-0226649382). This package allows simulating biological processes in the time domain (e.g., trait evolution, fossil abundance) and examining how their expression in the rock record (stratigraphic domain) is influenced by age-depth models, ecological niche models, and taphonomic effects. Functions simulating common processes used in modeling trait evolution or event type data such as first/last occurrences are provided and can be used standalone or as part of a pipeline. The package comes with example data sets and tutorials in several vignettes, which can be used as a template to set up one's own simulation.
Collision Risk Models for avian fauna (seabirds and migratory birds) at offshore wind farms. The base deterministic model is derived from Band (2012) <https://tethys.pnnl.gov/publications/using-collision-risk-model-assess-bird-collision-risks-offshore-wind-farms>. This was further expanded on by Masden (2015) <doi:10.7489/1659-1> and code used here is heavily derived from this work with input from Dr A. Cook at the British Trust for Ornithology. These collision risk models are useful for marine ornithologists who are working in the offshore wind industry, particularly in UK waters. However, many of the species included in the stochastic collision risk models can also be found in the North Atlantic in the United States and Canada, so the models could be applied there as well.
Practitioners of Bayesian statistics often use Markov chain Monte Carlo (MCMC) samplers to sample from a posterior distribution. This package determines whether the MCMC sample is large enough to yield reliable estimates of the target distribution. In particular, this calculates a Gelman-Rubin convergence diagnostic using stable and consistent estimators of Monte Carlo variance. Additionally, this uses the connection between an MCMC sample's effective sample size and the Gelman-Rubin diagnostic to produce a threshold for terminating MCMC simulation. Finally, this informs the user whether enough samples have been collected and (if necessary) estimates the number of samples needed for a desired level of accuracy. The theory underlying these methods can be found in "Revisiting the Gelman-Rubin Diagnostic" by Vats and Knudson (2018) <arXiv:1812.09384>.
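For orientation, the classic Gelman-Rubin statistic for a scalar quantity can be sketched as follows. This Python example deliberately uses the textbook between-chain and within-chain variances, not the stable and consistent Monte Carlo variance estimators that the package relies on, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    m = len(chains)
    if m < 2:
        raise ValueError("need at least two chains")
    n = len(chains[0])
    if n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("chains must share a common length of at least 2")
    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    # B / n: sample variance of the chain means.
    between_over_n = variance([mean(chain) for chain in chains])
    # Pooled estimate of the target variance.
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)
```

Values close to 1 suggest that the chains agree with each other, while chains stuck in different regions give values well above 1.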
This package provides multiple sources of stopwords, for use in text analysis and natural language processing.
This package provides a consistently well-behaved method of interpolation based on piecewise rational functions using Stineman's algorithm.
Interfaces the stepcount Python module <https://github.com/OxWearables/stepcount> to estimate step counts and other activities from accelerometry data.