Miscellaneous functions for data cleaning and data analysis of educational assessments. Includes functions for descriptive analyses, character vector manipulations and weighted statistics. Mainly a lightweight dependency for the packages 'eatRep', 'eatGADS', 'eatPrep' and 'eatModel' (which will be subsequently submitted to 'CRAN'). The function for defining (weighted) contrasts in weighted effect coding refers to te Grotenhuis et al. (2017) <doi:10.1007/s00038-016-0901-1>. Functions for weighted statistics refer to Wolter (2007) <doi:10.1007/978-0-387-35099-8>.
Calculate numerical asymptotic distribution functions of likelihood ratio statistics for fractional unit root tests and tests of cointegration rank. For these distributions, the included functions calculate critical values and P-values used in unit root tests, cointegration tests, and rank tests in the Fractionally Cointegrated Vector Autoregression (FCVAR) model. The functions implement procedures for tests described in the following articles: Johansen, S. and M. Ø. Nielsen (2012) <doi:10.3982/ECTA9299>, MacKinnon, J. G. and M. Ø. Nielsen (2014) <doi:10.1002/jae.2295>.
Raster-based flood modelling internally using 'hyd1d', an R package to interpolate 1d water level and gauging data. The package computes flood extent and duration through strategies originally developed for 'INFORM', an 'ArcGIS'-based hydro-ecological modelling framework. It does not provide a full, physical hydraulic modelling algorithm, but a simplified, near real-time GIS approach for flood extent and duration modelling. Computationally demanding annual flood durations have been computed already and data products were published by Weber (2022) <doi:10.1594/PANGAEA.948042>.
This package provides a local haplotyping tool for use in trait association and trait prediction analysis pipelines. HaploVar enables users to take single nucleotide polymorphisms (SNPs) (in VCF format) and a linkage disequilibrium (LD) matrix, calculate local haplotypes and format the output to be compatible with a wide range of trait association and trait prediction tools. The local haplotypes are calculated from the LD matrix using a clustering algorithm called density-based spatial clustering of applications with noise ('DBSCAN') (Ester et al., 1996) <ISBN: 1577350049>.
R is great for installing software. Through the 'installr' package you can automate the updating of R (on Windows, using updateR()) and install new software. Software installation is initiated through a GUI (just run installr()), or through functions such as: install.Rtools(), install.pandoc(), install.git(), and many more. The updateR() command performs the following: finding the latest R version, downloading it, running the installer, deleting the installation file, and copying and updating old packages to the new R installation.
This package provides a function for classifying a landscape into different categories based on the Topographic Position Index (TPI) and slope. It offers two types of classifications: Slope Position Classification, and Landform Classification. The function internally calculates the TPI for the given landscape and then uses it along with the slope to perform the classification. Optionally, descriptive statistics for every class are calculated and plotted. The classifications are useful for identifying the position of a location on a slope and for identifying broader landform types.
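As a rough illustration of the TPI-plus-slope idea described above, the following Python sketch computes the TPI as cell elevation minus the mean of its neighbours and then assigns a slope position class. The function names, the square neighbourhood, and the class thresholds (multiples of the TPI standard deviation and a 5-degree flatness cut-off) are assumptions made for this example; they are not taken from the package.

```python
def tpi(dem, radius=1):
    """Topographic Position Index: cell elevation minus mean neighbour elevation.

    `dem` is a list of equal-length rows; the neighbourhood is a square
    window of the given radius, clipped at the grid edges (an assumption).
    """
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                dem[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
                if (i, j) != (r, c)
            ]
            if neighbours:  # a 1x1 grid has no neighbours; TPI stays 0
                out[r][c] = dem[r][c] - sum(neighbours) / len(neighbours)
    return out


def slope_position(tpi_value, tpi_sd, slope_deg, flat_slope_deg=5.0):
    """Six-class slope position from standardized TPI and slope.

    Thresholds are illustrative assumptions, not the package's defaults.
    """
    z = tpi_value / tpi_sd if tpi_sd > 0 else 0.0
    if z > 1.0:
        return "ridge"
    if z > 0.5:
        return "upper slope"
    if z >= -0.5:
        return "middle slope" if slope_deg > flat_slope_deg else "flat"
    if z >= -1.0:
        return "lower slope"
    return "valley"
```

A single peak in a flat grid gets a strongly positive TPI, which the classifier then labels as a ridge once it is standardized.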
Fast imputations under the object-oriented programming paradigm. In addition, a few functions are offered that are built to work with popular R packages such as 'data.table' or 'dplyr'. The biggest improvement in time performance can be achieved for calculations where a grouping variable has to be used. A single evaluation of a quantitative model for the multiple imputations is another major enhancement. A further major improvement is one of the fastest predictive mean matching implementations in the R world, made possible by presorting and binary search.
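The presorting and binary search idea behind fast predictive mean matching can be sketched as follows in Python. This is a simplified illustration, not the package's implementation: the function name is hypothetical, and it always takes the single nearest donor, whereas predictive mean matching commonly samples from several nearest donors.

```python
from bisect import bisect_left


def pmm_nearest(donor_pred, donor_obs, target_pred):
    """Impute each target with the observed value of the nearest donor.

    donor_pred:  model predictions for cases with observed values
    donor_obs:   the observed values of those cases
    target_pred: model predictions for cases with missing values
    """
    # Presort donors once by predicted value.
    order = sorted(range(len(donor_pred)), key=donor_pred.__getitem__)
    sorted_pred = [donor_pred[i] for i in order]

    imputed = []
    for p in target_pred:
        # Binary search: only the two donors around the insertion
        # point can be the closest ones.
        pos = bisect_left(sorted_pred, p)
        candidates = [j for j in (pos - 1, pos) if 0 <= j < len(sorted_pred)]
        best = min(candidates, key=lambda j: abs(sorted_pred[j] - p))
        imputed.append(donor_obs[order[best]])
    return imputed
```

With n donors and m missing cases, the sort costs O(n log n) once and each lookup costs O(log n), instead of scanning all donors for every missing case.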
Fits mixed membership models with discrete multivariate data (with or without repeated measures) following the general framework of Erosheva et al. (2004). This package uses a Variational EM approach by approximating the posterior distribution of latent memberships and selecting hyperparameters through a pseudo-MLE procedure. Currently supported data types are Bernoulli, multinomial and rank (Plackett-Luce). The extended GoM model with fixed stayers from Erosheva et al. (2007) is now also supported. See Airoldi et al. (2014) for other examples of mixed membership models.
This package provides tools for the structured processing of PET neuroimaging data in preparation for the estimation of Simultaneous Confidence Corridors (SCCs) for one-group, two-group, or single-patient vs group comparisons. The package facilitates PET image loading, data restructuring, integration into a Functional Data Analysis framework, contour extraction, identification of significant results, and performance evaluation. It bridges established packages (e.g., 'oro.nifti') with novel statistical methodologies (e.g., 'ImageSCC') and enables reproducible analysis pipelines, including comparison with Statistical Parametric Mapping ('SPM').
Create surface forms from matrix or raster data for flexible plotting and conversion to other mesh types. The functions quadmesh or triangmesh produce a continuous surface as a mesh3d object as used by the rgl package. This is used for plotting raster data in 3D (optionally with texture), and allows the application of a map projection without data loss and many processing applications that are restricted by inflexible regular grid rasters. There are discrete forms of these continuous surfaces available with dquadmesh and dtriangmesh functions.
Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), and assumption-lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).
This package stores the data employed in the vignette of the GSVA package. These data belong to the following publications: Armstrong et al. Nat Genet 30:41-47, 2002; Cahoy et al. J Neurosci 28:264-278, 2008; Carrel and Willard, Nature, 434:400-404, 2005; Huang et al. PNAS, 104:9758-9763, 2007; Pickrell et al. Nature, 464:768-722, 2010; Skaletsky et al. Nature, 423:825-837; Verhaak et al. Cancer Cell 17:98-110, 2010; Costa et al. FEBS J, 288:2311-2331, 2021.
This package provides a comprehensive toolbox for analysing Spatial Point Patterns. It is focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. It also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. It supports spatial covariate data such as pixel images and contains over 2000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference.
This package provides a toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure (genind class), allele counts by populations (genpop), and genome-wide SNP data (genlight). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.
This package contains data and functions that can be used to make actuarial life tables. Each function adds a column to the inputted dataset for each intermediate calculation between mortality rate and life expectancy. Users can run any of our functions to complete the life table until that step, or run lifetable() to output a full life table that can be customized to remove optional columns. Methods for creating lifetables are as described in Zedstatistics (2021) <https://www.youtube.com/watch?v=Dfe59glNXAQ>.
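The chain of intermediate columns between mortality rate and life expectancy can be sketched in Python as below. This is a generic textbook-style life table under stated assumptions (a radix of 100,000, deaths spread evenly within each age interval, and a final mortality rate of 1 to close the table); the function name and column names are hypothetical and are not the package's API.

```python
def life_table(qx, radix=100_000):
    """Build life table columns from age-specific mortality rates qx.

    Assumes deaths are uniformly distributed within each interval and
    that the last entry of qx is 1.0 so the table is closed.
    """
    # lx: survivors at the start of each age interval
    lx = [float(radix)]
    for q in qx[:-1]:
        lx.append(lx[-1] * (1 - q))
    # dx: deaths within each interval
    dx = [l * q for l, q in zip(lx, qx)]
    # Lx: person-years lived within each interval
    Lx = [l - d / 2 for l, d in zip(lx, dx)]
    # Tx: person-years remaining from each age onward (reverse cumulative sum)
    Tx, running = [], 0.0
    for years in reversed(Lx):
        running += years
        Tx.append(running)
    Tx.reverse()
    # ex: life expectancy at each age
    ex = [t / l if l > 0 else 0.0 for t, l in zip(Tx, lx)]
    return [
        {"age": age, "qx": q, "lx": l, "dx": d, "Lx": L, "Tx": t, "ex": e}
        for age, (q, l, d, L, t, e) in enumerate(zip(qx, lx, dx, Lx, Tx, ex))
    ]
```

Each column depends only on the columns computed before it, which is why a table can be completed up to any intermediate step.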
This package provides a comprehensive system for selecting variables and weighting data to match the specifications of the American National Election Studies. The package includes methods for identifying discrepant variables, raking data, and assessing the effects of the raking algorithm. It also allows automated re-raking if target variables fall outside identified bounds and allows greater user specification than other available raking algorithms. A variety of simple weighted statistics that were previously in this package (version .55 and earlier) have been moved to the package 'weights'.
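Raking itself (iterative proportional fitting of weights to target margins) can be illustrated with a short Python sketch. The function name, argument layout, and convergence rule are assumptions for this example; the variable selection, bounds checking, and re-raking described above are not covered.

```python
def rake(rows, targets, max_iter=100, tol=1e-8):
    """Rake unit weights so weighted margins match target proportions.

    rows:    list of dicts mapping variable name -> category
    targets: {variable: {category: target proportion}}
    Every category present in `rows` must appear in `targets`.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            current = {}
            for w, row in zip(weights, rows):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each unit so this variable's margin hits its target.
            for i, row in enumerate(rows):
                factor = target[row[var]] * total / current[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # all margins already matched
            break
    return weights
```

Each pass adjusts one variable at a time, so matching a later margin slightly disturbs earlier ones; repeating the passes until the adjustment factors are close to 1 brings all margins into agreement.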
Power calculations are a critical component of any research study to determine the minimum sample size necessary to detect differences between multiple groups. Researchers often work with data taking the form of proportions that can be modeled with a beta distribution. Here we present an R package, 'BetaPASS', that performs power and sample size calculations for data following a beta distribution with comparative nonparametric output. This package allows flexibility with multiple options for link functions to fit the data and graphing functionality for visual comparisons.
An implementation of double generalized linear model (DGLM) building with variable selection procedures and handling of interaction terms and other complex situations. We also provide a method of handling convergence issues within the dglm() function. The package offers a simulation function for generating simulated data for testing purposes and utilizes the forward stepwise variable selection procedure in model-building. It also provides a new custom bootstrap function for mean and standard deviation estimation and functions for building crossplots and squareplots from a data set.
Provide simple functions to (i) compute a class of multi-functionality measures for a single ecosystem for given function weights, (ii) decompose gamma multi-functionality for pairs of ecosystems and K ecosystems (K can be greater than 2) into a within-ecosystem component (alpha multi-functionality) and an among-ecosystem component (beta multi-functionality). In each case, the correlation between functions can be corrected for. Based on biodiversity and ecosystem function data, this software also facilitates graphics for assessing biodiversity-ecosystem functioning relationships across scales.
This package creates sophisticated models of training data and validates the models with an independent test set, cross validation, or Out Of Bag (OOB) predictions on the training data. Creates graphs and tables of the model validation results. Applies these models to GIS .img files of predictors to create detailed prediction surfaces. Handles large predictor files for map making by reading in the .img files in chunks and writing the predictions for each data chunk to a .txt file before reading the next chunk of data.
Due to the status of 'RStudio' as open source software, we believe it will be utilized frequently for future data analysis by users who lack formal training or experience with 'R'. NMVANOVA (Novice Model Variation ANOVA) is a streamlined variation of experimental design functions that allows novice 'RStudio' users to perform different model variations of one-way analysis of variance without downloading multiple libraries or packages. Users can easily manipulate the data block and the needed inputs so that they only have to plug in the four designed variables/values.
Helps you determine the analysis window to use when analyzing densely-sampled time-series data, such as EEG data, using permutation testing (Maris & Oostenveld, 2007) <doi:10.1016/j.jneumeth.2007.03.024>. These permutation tests can help identify the timepoints where significance of an effect begins and ends, and the results can be plotted in various types of heatmap for reporting. Mixed-effects models are supported using an implementation of the approach by Lee & Braun (2012) <doi:10.1111/j.1541-0420.2011.01675.x>.
This package provides some basic routines for simulating a clinical trial. The primary intent is to provide some tools to generate trial simulations for trials with time to event outcomes. Piecewise exponential failure rates and piecewise constant enrollment rates are the underlying mechanism used to simulate a broad range of scenarios such as those presented in Lin et al. (2020) <doi:10.1080/19466315.2019.1697738>. However, the basic generation of data is done using pipes to allow maximum flexibility for users to meet different needs.
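To illustrate how piecewise exponential failure times can be generated, the Python sketch below uses the inverse cumulative hazard method: draw a unit exponential value and find the time at which the accumulated hazard reaches it. The function names and argument layout are hypothetical and do not reflect the package's interface; enrollment simulation is not shown.

```python
import random


def time_at_cumulative_hazard(target, durations, rates):
    """Return the time at which the cumulative hazard equals `target`.

    rates:     hazard rate in each interval (all must be positive)
    durations: length of each interval; the last rate applies forever,
               so a duration for it is not needed
    """
    if any(rate <= 0 for rate in rates):
        raise ValueError("all hazard rates must be positive")
    t = 0.0
    for k, rate in enumerate(rates):
        is_last = k == len(rates) - 1
        if not is_last and rate * durations[k] < target:
            # Survived this whole interval: consume its hazard and move on.
            target -= rate * durations[k]
            t += durations[k]
        else:
            # The event falls inside this interval.
            return t + target / rate


def rpwexp(n, durations, rates, rng=None):
    """Draw n piecewise exponential failure times."""
    rng = rng or random.Random()
    return [
        time_at_cumulative_hazard(rng.expovariate(1.0), durations, rates)
        for _ in range(n)
    ]
```

For example, `rpwexp(5, durations=[6.0], rates=[0.05, 0.02], rng=random.Random(42))` would draw five times with a hazard of 0.05 during the first six time units and 0.02 afterwards.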