This package provides simple crosstab output with optional statistics (e.g., Goodman-Kruskal gamma, Somers' d, and Kendall's tau-b) as well as two-way and one-way tables. The package is used within the statistics component of the Master of Science (MSc) in Social Science of the Internet at the Oxford Internet Institute (OII), University of Oxford, but the functions should be useful for general data analysis and especially for analysis of categorical and ordinal data.
Permutation (randomisation) test for single-case phase design data with two phases (e.g., pre- and post-treatment). Correction for dependency of observations is achieved by stepwise resampling of the time series while varying the distance between observations. The required distance (0, 1, 2, 3, ...) is determined by repeated dependency testing while stepwise increasing the distance. In preparation: Vroegindeweij et al., "A permutation distancing test for single-case observational AB phase design data: A Monte Carlo simulation study".
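As a generic base-R illustration of the underlying idea (a plain two-phase permutation test on the phase-mean difference; it does not implement the package's distancing correction for serial dependency):

  set.seed(123)
  y <- c(rnorm(15, mean = 0), rnorm(15, mean = 1))    # phase A followed by phase B
  phase <- rep(c("A", "B"), each = 15)
  obs_diff <- mean(y[phase == "B"]) - mean(y[phase == "A"])
  perm_diff <- replicate(5000, {
    p <- sample(phase)                                # randomly permute the phase labels
    mean(y[p == "B"]) - mean(y[p == "A"])
  })
  mean(abs(perm_diff) >= abs(obs_diff))               # two-sided permutation p-value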
Implementation of the wavelet-based spatial verification method of Buschow and Friederichs "SAD: Verifying the Scale, Anisotropy and Direction of precipitation forecasts" (2020, submitted to QJRMS). Forecasts and observations are transformed by a decimated or redundant dual-tree complex wavelet transform to analyze the spatial scale, degree of anisotropy and preferred direction in each field. These structural attributes are compared by a series of scores. An experimental algorithm for the correction of these errors is included as well.
Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous or categorical variable that is thought to be informative about the statistical properties of each hypothesis test, provided it is independent of the p-value under the null hypothesis.
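A minimal sketch of the workflow, assuming Bioconductor's IHW package and simulated inputs:

  library("IHW")
  set.seed(1)
  m <- 10000
  covariate <- runif(m)
  # signal is concentrated at high covariate values, so the covariate is informative
  pvalue <- ifelse(covariate > 0.7, rbeta(m, 0.25, 1), runif(m))
  dat <- data.frame(pvalue = pvalue, covariate = covariate)
  res <- ihw(pvalue ~ covariate, data = dat, alpha = 0.1)
  rejections(res)         # number of discoveries at an FDR of 0.1
  head(adj_pvalues(res))  # covariate-weighted adjusted p-values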
Nonparametric detection of nonuniformity and dependence with Binary Expansion Testing (BET). See Kai Zhang (2019), "BET on Independence", Journal of the American Statistical Association, 114:528, 1620-1637, <doi:10.1080/01621459.2018.1537921>; Kai Zhang, Wan Zhang, Zhigen Zhao, and Wen Zhou (2023), "BEAUTY Powered BEAST", <doi:10.48550/arXiv.2103.00674>; and Wan Zhang, Zhigen Zhao, Michael Baiocchi, Yao Li, and Kai Zhang (2023), "SorBET: A Fast and Powerful Algorithm to Test Dependence of Variables", technical report.
An ensemble method for the statistical detection of a rare class in two-class classification problems. The method uses an ensemble of classifiers where the constituent models of the ensemble use disjoint subsets (phalanxes) of explanatory variables. We provide an implementation of the phalanx-formation algorithm. Please see Tomal et al. (2015) <doi:10.1214/14-AOAS778>, Tomal et al. (2016) <doi:10.1021/acs.jcim.5b00663>, and Tomal et al. (2019) <arXiv:1706.06971>
for more details.
This package provides a selection of three inference rules (plus clamped variants of each) and four threshold functions for computing the inference of a fuzzy cognitive map (FCM). Moreover, the fcm package returns a data frame of the concept values at each state of the inference procedure. Fuzzy cognitive maps were introduced by Kosko (1986) <doi:10.1002/int.4550010405> and provide ideal causal cognition tools for modeling and simulating dynamic systems.
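A minimal sketch, assuming the fcm.infer() interface of the fcm package (the concept names and weights below are purely illustrative):

  library("fcm")
  act <- data.frame(C1 = 1, C2 = 1, C3 = 0)        # initial activation of the concepts
  w   <- data.frame(C1 = c(0, 0.5, 0),             # 3 x 3 weight matrix of causal links
                    C2 = c(-0.4, 0, 0.6),
                    C3 = c(0.3, 0, 0))
  out <- fcm.infer(act, w, iter = 25, infer = "k", transform = "s")
  out$values                                       # concept values at each iteration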
After developing an ODK <https://opendatakit.org/> frame, the frame can be linked to Google Sheets <https://www.google.com/sheets/about/> and data can be collected through Android <https://www.android.com/> devices; the collected data are uploaded to a Google Sheet. The odk2spss() function helps to convert the ODK frame into an SPSS <https://www.ibm.com/analytics/us/en/technology/spss/> frame. The package can also attach downloaded Google Sheets data, or read data directly from Google Sheets using the ODK frame's submission_url.
Streamlines and accelerates the process of saving and loading R objects, improving speed and compression compared to other methods. The package provides two compression formats: the qs2 format, which uses R serialization via the C API while optimizing compression and disk I/O, and the qdata format, featuring custom serialization for slightly faster performance and better compression. Additionally, the qs2 format can be directly converted to the standard RDS format, ensuring long-term compatibility with future versions of R.
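A minimal sketch, assuming the qs_save()/qs_read() interface and the qs_to_rds() converter of the qs2 package:

  library("qs2")
  x <- data.frame(a = rnorm(1e5), b = sample(letters, 1e5, replace = TRUE))
  qs_save(x, "x.qs2")           # write with the qs2 format
  y <- qs_read("x.qs2")         # read it back
  identical(x, y)               # TRUE
  qs_to_rds("x.qs2", "x.rds")   # convert to standard RDS for long-term compatibility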
This package produces quality scores for each of the US companies from the Russell 3000, following the approach described in "Quality Minus Junk" (Asness, Frazzini, & Pedersen, 2013) <http://www.aqr.com/library/working-papers/quality-minus-junk>. The package includes datasets for users who wish to view the most recently uploaded quality scores. It also provides tools to automatically gather relevant financials and stock price information, allowing users to update their data and customize their universe for further analysis.
We provide a suite of tools for estimating the sample complexity of a chosen model through theoretical bounds and simulation. The package incorporates methods for estimating the Vapnik-Chervonenkis dimension (VCD) of a chosen algorithm, which can be used to estimate its sample complexity. Alternatively, we provide simulation methods to estimate sample complexity directly. For more details, see Carter, P & Choi, D (2024). "Learning from Noise: Applying Sample Complexity for Political Science Research" <doi:10.31219/osf.io/evrcj>.
This package provides conditional maximum likelihood (CML) item parameter estimation of both sequential and cumulative deterministic multistage designs (Zwitser & Maris, 2015, <doi:10.1007/s11336-013-9369-6>) and probabilistic sequential and cumulative multistage designs (Steinfeld & Robitzsch, 2024, <doi:10.1007/s41237-024-00228-3>). Supports CML item parameter estimation of conventional linear designs and additional functions for the likelihood ratio test (Andersen, 1973, <doi:10.1007/BF02291180>) as well as functions for simulating various types of multistage designs.
This package provides tools for the visualization of missing and/or imputed values, which can be used for exploring the data and the structure of the missing and/or imputed values. Depending on this structure, the corresponding methods may help to identify the mechanism generating the missing values and to explore the data including missing values. In addition, the quality of imputation can be visually explored using various univariate, bivariate, multiple and multivariate plot methods.
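A minimal sketch using the aggr() and marginplot() functions and the sleep dataset shipped with the VIM package:

  library("VIM")
  data(sleep, package = "VIM")              # mammal sleep data with missing values
  aggr(sleep, numbers = TRUE)               # amount and pattern of missing values
  marginplot(sleep[, c("Gest", "Dream")])   # bivariate view of the missingness structure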
Extract and process bird sighting records from eBird (<http://ebird.org>), an online tool for recording bird observations. Public access to the full eBird database is via the eBird Basic Dataset (EBD; see <http://ebird.org/ebird/data/download> for access), a downloadable text file. This package is an interface to AWK for extracting data from the EBD based on taxonomic, spatial, or temporal filters, to produce a manageable file size that can be imported into R.
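A minimal sketch of an extraction pipeline, assuming the auk package's bundled EBD sample and a local AWK installation (the species and country filters are illustrative and may match few records in the sample):

  library("auk")
  f_in  <- system.file("extdata/ebd-sample.txt", package = "auk")
  f_out <- tempfile(fileext = ".txt")
  auk_ebd(f_in) |>
    auk_species(species = "Canada Jay") |>
    auk_country(country = "Canada") |>
    auk_filter(file = f_out)     # builds and runs the AWK extraction
  obs <- read_ebd(f_out)         # import the filtered records into R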
This package provides a collection of tests to analyze the causal direction of dependence in linear models (Wiedermann, W., & von Eye, A., 2025, ISBN: 9781009381390). The package includes functions to perform Direction Dependence Analysis for variable distributions, residual distributions, and independence properties of predictors and residuals in competing causal models. In addition, the package contains functions to test the causal direction of dependence in conditional models (i.e., models with interaction terms). For more information see <https://www.ddaproject.com>.
Supports the analysis of oceanographic data, including ADCP measurements, measurements made with argo floats, CTD measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature according to either the UNESCO or the TEOS-10 equation of state. Produces graphical displays that conform to the conventions of the oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
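A minimal sketch with the oce package, using the swTheta() seawater function and the bundled ctd dataset:

  library("oce")
  # potential temperature via the UNESCO equation of state
  swTheta(salinity = 35, temperature = 10, pressure = 1000, eos = "unesco")
  data(ctd)        # bundled CTD profile
  summary(ctd)
  plot(ctd)        # conventional profile, T-S, and map panels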
Computes the optimal flow, Nash flow and the Price of Anarchy for any routing game defined within the game-theoretical framework. The input is a routing game in the form of its cost and flow functions. The package then transforms this into an optimisation problem, allowing both Nash and optimal flows to be solved by nonlinear optimisation. See <https://en.wikipedia.org/wiki/Congestion_game> and Knight and Harper (2013) <doi:10.1016/j.ejor.2013.04.003> for more information.
Provides binding models that are useful when analysing protein-ligand interactions by techniques such as Biolayer Interferometry (BLI) or Surface Plasmon Resonance (SPR); see Shah and Duncan (2014) <doi:10.3791/51383> and Nguyen et al. (2015) <doi:10.3390/s150510481>. Once initial binding parameters are known, binding curves can be simulated and parameters can be varied. The models within this package may also be used to fit a curve to measured binding data using non-linear regression.
Fit computational and measurement models using full Bayesian inference. The package provides a simple and accessible interface by translating complex domain-specific models into brms syntax, a powerful and flexible framework for fitting Bayesian regression models using Stan. The package is designed so that users can easily apply state-of-the-art models in various research fields, and so that researchers can use it as a new model development framework. References: Frischkorn and Popov (2023) <doi:10.31234/osf.io/umt57>.
This package implements the dynamic panel models described by Allison, Williams, and Moral-Benito (2017) <doi:10.1177/2378023117710578> in R. This class of models uses structural equation modeling to specify dynamic (lagged dependent variable) models with fixed effects for panel data. Additionally, models may have predictors that are only weakly exogenous, i.e., are affected by prior values of the dependent variable. Options also allow for random effects, dropping the lagged dependent variable, and a number of other specification choices.
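A minimal sketch, assuming the dpm() formula interface with pre() marking weakly exogenous (predetermined) predictors and data prepared with panelr::panel_data():

  library("dpm")
  library("panelr")
  data("WageData", package = "panelr")
  wages <- panel_data(WageData, id = id, wave = t)        # declare panel structure
  fit <- dpm(wks ~ pre(lag(union)) + lag(lwage), data = wages)
  summary(fit)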
Perform robust inference by applying the Fast and Robust Bootstrap to robust estimators (Van Aelst and Willems (2013) <doi:10.18637/jss.v053.i03>). This method constitutes an alternative to ordinary bootstrap or asymptotic inference procedures when using robust estimators such as S-, MM- or GS-estimators. The available methods are multivariate regression, principal component analysis, and one-sample and two-sample Hotelling tests. It provides both the robust point estimates and uncertainty measures based on the fast and robust bootstrap.
This package provides a general and efficient tool for fitting a response surface to a dataset via Gaussian processes. The dataset can have multiple responses and be noisy (with stationary variance). The fitted GP model can predict the gradient as well. The package is based on the work of Bostanabad, R., Kearney, T., Tao, S. Y., Apley, D. W. & Chen, W. (2018) Leveraging the nugget parameter for efficient Gaussian process modeling. International Journal for Numerical Methods in Engineering, 114, 501-516.
This package contains the framework for estimation, sampling, and hypothesis testing for two special distributions (Exponentiated Exponential-Pareto and Exponentiated Inverse Gamma-Pareto) within the family of Generalized Exponentiated Composite distributions. A detailed explanation and applications of these two distributions are given in Bowen Liu, Malwane M.A. Ananda (2022) <doi:10.1080/03610926.2022.2050399>, Bowen Liu, Malwane M.A. Ananda (2022) <doi:10.3390/math10111895>, and Bowen Liu, Malwane M.A. Ananda (2022) <doi:10.3390/app13010645>.
Informal implementation of some algorithms from graph theory and combinatorial optimization which arise in the subject "Graphs and Network Optimization", taught in the first year of the Data Engineering in Industrial Processes degree at EUPLA (Escuela Universitaria Politecnica de La Almunia). References used are: Cook et al. (1998, ISBN:0-471-55894-X), Korte and Vygen (2018) <doi:10.1007/978-3-662-56039-6>, Hromkovic (2004) <doi:10.1007/978-3-662-05269-3>, and Hartmann and Weigt (2005, ISBN:978-3-527-40473-5).