This package implements wild bootstrap tests for autocorrelation in Vector Autoregressive (VAR) models based on Ahlgren and Catani (2016) <doi:10.1007/s00362-016-0744-0>, a combined Lagrange Multiplier (LM) test for Autoregressive Conditional Heteroskedasticity (ARCH) in VAR models from Catani and Ahlgren (2016) <doi:10.1016/j.ecosta.2016.10.006>, and bootstrap-based methods for determining the cointegration rank from Cavaliere, Rahbek, and Taylor (2012) <doi:10.3982/ECTA9099> and Cavaliere, Rahbek, and Taylor (2014) <doi:10.1080/07474938.2013.825175>.
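The wild bootstrap idea behind the tests in the entry above can be illustrated with a minimal sketch. It assumes Rademacher multipliers (each equal to -1 or +1 with probability 1/2) and applies one multiplier per time point to the whole residual vector, so that contemporaneous correlation between equations is preserved. The function name and the choice of multiplier distribution are assumptions for illustration; they are not taken from the package or the cited papers.

```python
import random

def wild_bootstrap_residuals(residuals, rng):
    """Return one wild bootstrap draw of VAR residuals.

    residuals: list of per-period residual vectors (one tuple per time point).
    rng: a random.Random instance, passed in so that draws are reproducible.
    Each period's vector is multiplied by a single Rademacher draw (assumed here).
    """
    draw = []
    for vector in residuals:
        sign = rng.choice((-1.0, 1.0))  # one multiplier shared by all equations at time t
        draw.append(tuple(sign * e for e in vector))
    return draw

# Hypothetical residuals from a two-equation VAR
resid = [(0.5, -1.0), (-0.2, 0.3), (1.1, 0.4)]
boot = wild_bootstrap_residuals(resid, random.Random(1))
```

Because every residual keeps its magnitude and only its sign may flip, the bootstrap sample retains the period-by-period pattern of the residual variances, which is why this scheme is used when heteroskedasticity is suspected.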
This package provides a comprehensive data analysis framework for NIH-funded research that streamlines workflows for both data cleaning and preparing NIH Data Archive ('NDA') submission templates. Provides unified access to multiple data sources ('REDCap', 'MongoDB', 'Qualtrics', 'SQL', 'ORACLE') through interfaces to their APIs, with specialized functions for data cleaning, filtering, merging, and parsing. Features automatic validation, field harmonization, and memory-aware processing to enhance reproducibility in multi-site collaborative research as described in Mittal et al. (2021) <doi:10.20900/jpbs.20210011>.
This package implements nested cross-validation applied to the glmnet and caret packages. With glmnet this includes cross-validation of the elastic net alpha parameter. A number of filter functions for feature selection (t-test, Wilcoxon test, ANOVA, Pearson/Spearman correlation, random forest, ReliefF) are provided and can be embedded within the outer loop of the nested CV. Nested CV can also be performed with the caret package, giving access to the large number of prediction methods available in caret.
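The nested structure described in the entry above can be sketched as index bookkeeping: inner folds (used for tuning and feature filtering) are drawn only from each outer training set, so the outer test set never influences model selection. This is a minimal sketch with hypothetical function names and interleaved, unshuffled folds; it does not reproduce the package's interface.

```python
def k_fold_indices(indices, k):
    """Split a list of indices into k (train, test) pairs using interleaved folds."""
    folds = [indices[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((train, test))
    return pairs

def nested_cv_splits(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_train, outer_test, inner_pairs) for each outer fold.

    inner_pairs are (train, test) splits built from outer_train only.
    """
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        yield outer_train, outer_test, k_fold_indices(outer_train, inner_k)

for outer_train, outer_test, inner_pairs in nested_cv_splits(20):
    # Feature filtering and hyperparameter tuning would use inner_pairs here;
    # the final fit on outer_train would then be scored once on outer_test.
    pass
```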
`BatchSVG` is a feature-based Quality Control (QC) method to identify spatially variable genes (SVGs) in spatial transcriptomics data with specific types of batch effect. In spatial transcriptomics experiments, the batch can be defined as, for example, "sample" or "sex". The `BatchSVG` method is based on the binomial deviance model (Townes et al., 2019) and applies cutoffs based on the number of standard deviations (nSD) of the relative change in deviance and of the rank difference as a data-driven thresholding approach to detect batch-biased outliers.
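The nSD thresholding mentioned in the entry above can be illustrated generically: express each value of a metric as a number of standard deviations from the mean and flag values at or above a chosen cutoff. This is a minimal sketch under that assumption. The function names, the use of the sample standard deviation, and the one-sided rule are illustrative choices, not the `BatchSVG` implementation.

```python
from statistics import mean, stdev

def n_sd(values):
    """Express each value as a number of sample standard deviations from the mean."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]

def flag_outliers(values, cutoff=3.0):
    """Return the indices of values lying at least `cutoff` SDs above the mean."""
    return [i for i, z in enumerate(n_sd(values)) if z >= cutoff]

# Hypothetical metric: twenty unremarkable features and one extreme feature
metric = [1.0] * 20 + [50.0]
flagged = flag_outliers(metric, cutoff=3.0)
```

A larger cutoff flags fewer features, so the choice of nSD trades sensitivity against the risk of discarding features that are not batch-biased.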
iSEEfier provides a set of functions to quickly and intuitively create, inspect, and combine initial configuration objects. These can be passed directly to the function call that launches iSEE() with the specified configuration. This package currently works seamlessly with the sets of panels provided by the iSEE and iSEEu packages, but can be extended to accommodate the usage of any custom panel (e.g. from iSEEde, iSEEpathways, or any panel developed independently by the user).
This package provides users with an EZ-to-use platform for representing data with biplots. Currently principal component analysis (PCA), canonical variate analysis (CVA) and simple correspondence analysis (CA) biplots are included. This is accompanied by various formatting options for the samples and axes. Alpha-bags and concentration ellipses are included for visual enhancements and interpretation. For an extensive discussion on the topic, see Gower, J.C., Lubbe, S. and le Roux, N.J. (2011, ISBN: 978-0-470-01255-0) Understanding Biplots. Wiley: Chichester.
Designed to test for association between methylation at CpG sites across the genome and a phenotype of interest, adjusting for any relevant covariates. The package can perform standard analyses of large datasets very quickly with no need to impute the data. It can also handle mixed effects models with chip or batch entering the model as a random intercept. Also includes tools to apply quality control filters, perform permutation tests, and create QQ plots, Manhattan plots, and scatterplots for individual CpG sites.
Light-weight functions for computing descriptive statistics in different circular spaces (e.g., 2pi, 180, or 360 degrees), to handle angle-dependent biases, pad circular data, and more. Specifically aimed at psychologists and neuroscientists analyzing circular data. Basic methods are based on Jammalamadaka and SenGupta (2001) <doi:10.1142/4031>; removal of cardinal biases is based on the approach introduced in van Bergen, Ma, Pratte, & Jehee (2015) <doi:10.1038/nn.4150> and Chetverikov and Jehee (2023) <doi:10.1038/s41467-023-43251-w>.
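To illustrate what a descriptive statistic in "different circular spaces" involves, the following minimal sketch computes a circular mean for data with an arbitrary period (for example 360 for directions or 180 for orientations). Angles are rescaled to radians on the full circle, averaged as unit vectors, and mapped back. The function name and signature are hypothetical and are not the package's API.

```python
import math

def circular_mean(angles, period=360.0):
    """Mean direction of `angles` on a circle with the given period.

    The result is expressed in the same units as the input and wrapped
    to the range from 0 to `period`.
    """
    scale = 2 * math.pi / period          # map one period onto the full circle
    s = sum(math.sin(a * scale) for a in angles)
    c = sum(math.cos(a * scale) for a in angles)
    return (math.atan2(s, c) / scale) % period

mean_direction = circular_mean([10.0, 30.0])                  # directions in degrees
mean_orientation = circular_mean([170.0, 10.0], period=180.0)  # orientations in degrees
```

The second call shows why the period matters: as orientations, 170 and 10 are only 20 degrees apart, so their circular mean lies at the 0/180 boundary rather than at the arithmetic mean of 90.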
Create correlation heatmaps from a numeric matrix. Ensembl Gene ID row names can be converted to Gene Symbols using, e.g., BioMart. Optionally, data can be clustered and filtered by correlation, tree cutting and/or number of missing values. Genes of interest can be highlighted in the plot, and correlation significance can be indicated by asterisks encoding the corresponding p-values. Plot dimensions and label measures are adjusted automatically by default. The plot features rely on the heatmap.n2() function in the heatmapFlex package.
Given count data from two conditions, it determines which transcripts are differentially expressed across the two conditions using Bayesian inference of the parameters of a bottom-up model for PCR amplification. This model is developed in Ndifon Wilfred, Hilah Gal, Eric Shifrut, Rina Aharoni, Nissan Yissachar, Nir Waysbort, Shlomit Reich Zeliger, Ruth Arnon, and Nir Friedman (2012), <http://www.pnas.org/content/109/39/15865.full>, and results in a distribution for the counts that is a superposition of the binomial and negative binomial distributions.
Miscellaneous functions for data cleaning and data analysis of educational assessments. Includes functions for descriptive analyses, character vector manipulations and weighted statistics. Mainly a lightweight dependency for the packages 'eatRep', 'eatGADS', 'eatPrep' and 'eatModel' (which will be subsequently submitted to 'CRAN'). The function for defining (weighted) contrasts in weighted effect coding refers to te Grotenhuis et al. (2017) <doi:10.1007/s00038-016-0901-1>. Functions for weighted statistics refer to Wolter (2007) <doi:10.1007/978-0-387-35099-8>.
This package provides methods for estimating species niche position and niche breadth under continuous environmental gradients. The package implements canonical correspondence analysis (CCA), partial CCA (pCCA), generalized additive models (GAM), and Levins niche breadth metrics for species-level and community-level analyses. Methods are based on ter Braak (1986) <doi:10.2307/1938672>, Okie et al. (2015) <doi:10.1098/rspb.2014.2630>, Feng et al. (2020) <doi:10.1111/mec.15441>, Wood (2017) <doi:10.1201/9781315370279>, and Levins (1968, ISBN:978-0691080628).
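As an illustration of the Levins niche breadth metric named in the entry above, the following minimal sketch uses the common formulation B = 1 / sum(p_i^2), where p_i is the proportion of a species' abundance found in environmental state i, together with the standardized form (B - 1) / (n - 1) for n states. The function names are hypothetical, and the package may define or scale the metric differently.

```python
def levins_breadth(abundances):
    """Levins niche breadth B = 1 / sum(p_i^2) from abundances across states."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    proportions = [a / total for a in abundances]
    return 1.0 / sum(p * p for p in proportions)

def standardized_breadth(abundances):
    """Standardized breadth (B - 1) / (n - 1), ranging from 0 to 1."""
    n = len(abundances)
    if n < 2:
        raise ValueError("at least two states are required")
    return (levins_breadth(abundances) - 1.0) / (n - 1)

generalist = levins_breadth([5, 5, 5, 5])    # evenly spread over four states
specialist = levins_breadth([10, 0, 0, 0])   # confined to a single state
```

B equals the number of states for a perfectly even distribution and 1 for a species confined to one state, so the standardized form makes species comparable across gradients sampled with different numbers of states.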
Calculate numerical asymptotic distribution functions of likelihood ratio statistics for fractional unit root tests and tests of cointegration rank. For these distributions, the included functions calculate critical values and P-values used in unit root tests, cointegration tests, and rank tests in the Fractionally Cointegrated Vector Autoregression (FCVAR) model. The functions implement procedures for tests described in the following articles: Johansen, S. and M. Ø. Nielsen (2012) <doi:10.3982/ECTA9299>, MacKinnon, J. G. and M. Ø. Nielsen (2014) <doi:10.1002/jae.2295>.
This package provides a local haplotyping tool for use in trait association and trait prediction analysis pipelines. HaploVar enables users to take single nucleotide polymorphisms (SNPs) (in VCF format) and a linkage disequilibrium (LD) matrix, calculate local haplotypes, and format the output to be compatible with a wide range of trait association and trait prediction tools. The local haplotypes are calculated from the LD matrix using a clustering algorithm called density-based spatial clustering of applications with noise ('DBSCAN') (Ester et al., 1996) <ISBN: 1577350049>.
Raster based flood modelling internally using 'hyd1d', an R package to interpolate 1d water level and gauging data. The package computes flood extent and duration through strategies originally developed for 'INFORM', an 'ArcGIS'-based hydro-ecological modelling framework. It does not provide a full, physical hydraulic modelling algorithm, but a simplified, near real time GIS approach for flood extent and duration modelling. Computationally demanding annual flood durations have been computed already and data products were published by Weber (2022) <doi:10.1594/PANGAEA.948042>.
R is great for installing software. Through the installr package you can automate the updating of R (on Windows, using updateR()) and install new software. Software installation is initiated through a GUI (just run installr()), or through functions such as: install.Rtools(), install.pandoc(), install.git(), and many more. The updateR() command performs the following: finding the latest R version, downloading it, running the installer, deleting the installation file, and copying and updating old packages to the new R installation.
This package provides a function for classifying a landscape into different categories based on the Topographic Position Index (TPI) and slope. It offers two types of classifications: Slope Position Classification, and Landform Classification. The function internally calculates the TPI for the given landscape and then uses it along with the slope to perform the classification. Optionally, descriptive statistics for every class are calculated and plotted. The classifications are useful for identifying the position of a location on a slope and for identifying broader landform types.
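The two ingredients described in the entry above, the TPI and a slope-aware classification, can be sketched as follows. The TPI of a cell is taken here as its elevation minus the mean elevation of its neighbours in a 3x3 window. The six-class rule and its thresholds (standardized TPI cutoffs of 0.5 and 1, and a 5 degree slope limit for flat terrain) follow a commonly cited slope position scheme and are assumptions for illustration, not values taken from the package. Function names are hypothetical, and the grid is assumed to contain at least two cells.

```python
def tpi(grid, row, col):
    """TPI of one cell: elevation minus the mean of its neighbours in a 3x3 window."""
    neighbours = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the focal cell itself
            r, c = row + dr, col + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                neighbours.append(grid[r][c])
    return grid[row][col] - sum(neighbours) / len(neighbours)

def slope_position(tpi_z, slope_deg, flat_slope=5.0):
    """Classify a cell from its standardized TPI and slope (assumed thresholds)."""
    if tpi_z > 1.0:
        return "ridge"
    if tpi_z > 0.5:
        return "upper slope"
    if tpi_z >= -0.5:
        return "middle slope" if slope_deg > flat_slope else "flat"
    if tpi_z >= -1.0:
        return "lower slope"
    return "valley"

# Hypothetical elevation grid with a single peak in the centre
elevation = [[1, 1, 1],
             [1, 9, 1],
             [1, 1, 1]]
peak_tpi = tpi(elevation, 1, 1)
```

The slope is needed because a TPI near zero is ambiguous on its own: it occurs both on flat plains and on uniform mid-slopes, which the slope threshold separates.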
Fits mixed membership models with discrete multivariate data (with or without repeated measures) following the general framework of Erosheva et al (2004). This package uses a Variational EM approach by approximating the posterior distribution of latent memberships and selecting hyperparameters through a pseudo-MLE procedure. Currently supported data types are Bernoulli, multinomial and rank (Plackett-Luce). The extended GoM model with fixed stayers from Erosheva et al (2007) is now also supported. See Airoldi et al (2014) for other examples of mixed membership models.
This package provides tools for the structured processing of PET neuroimaging data in preparation for the estimation of Simultaneous Confidence Corridors (SCCs) for one-group, two-group, or single-patient vs group comparisons. The package facilitates PET image loading, data restructuring, integration into a Functional Data Analysis framework, contour extraction, identification of significant results, and performance evaluation. It bridges established packages (e.g., 'oro.nifti') with novel statistical methodologies (e.g., 'ImageSCC') and enables reproducible analysis pipelines, including comparison with Statistical Parametric Mapping ('SPM').
Create surface forms from matrix or raster data for flexible plotting and conversion to other mesh types. The functions quadmesh or triangmesh produce a continuous surface as a mesh3d object as used by the rgl package. This is used for plotting raster data in 3D (optionally with texture), and allows the application of a map projection without data loss and many processing applications that are restricted by inflexible regular grid rasters. There are discrete forms of these continuous surfaces available with dquadmesh and dtriangmesh functions.
Variant determination and genotyping from high throughput sequences from multilocus amplicon libraries, typically sequenced on Illumina MiSeq or similar platforms. It provides a set of core functions for the central steps: demultiplexing by locus, read truncation, variant calling, and genotype calling. Additionally, it provides a set of functions for diagnosis and estimation of the best running parameters, and multiple extensions for genotype/variant manipulation and reformatting. Variants and genotypes are output in tidy format, thus facilitating reformatting, manipulation and potential connection to other R packages.
Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), and assumption lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).
This package stores the data employed in the vignette of the GSVA package. These data belong to the following publications: Armstrong et al. Nat Genet 30:41-47, 2002; Cahoy et al. J Neurosci 28:264-278, 2008; Carrel and Willard, Nature, 434:400-404, 2005; Huang et al. PNAS, 104:9758-9763, 2007; Pickrell et al. Nature, 464:768-772, 2010; Skaletsky et al. Nature, 423:825-837; Verhaak et al. Cancer Cell 17:98-110, 2010; Costa et al. FEBS J, 288:2311-2331, 2021.
This package provides a toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure (genind class), allele counts by population (genpop), and genome-wide SNP data (genlight). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.