Manually bin data using weight of evidence and information value. Includes other binning methods such as equal length, quantile and winsorized. Options for combining levels of categorical data are also available. Dummy variables can be generated based on the bins created using any of the available binning methods. References: Siddiqi, N. (2006) <doi:10.1002/9781119201731.biblio>.
The Common Workflow Language (CWL) is an open standard for developing data analysis workflows that are portable and scalable across different tools and working environments. Rcwl provides a simple way to wrap command line tools and build CWL data analysis pipelines programmatically within R. It makes CWL pipelines easier to use, develop, and maintain.
This package provides a summarization method to estimate allele-specific copy number signals for Affymetrix SNP microarrays using non-negative matrix factorization (NMF).
Fast Bayesian inference of marginal and conditional independence structures from high-dimensional data. Leday and Richardson (2019), Biometrics, <doi:10.1111/biom.13064>.
Returns details for Brazilian postal codes (CEPs), including neighborhoods, street addresses and so on.
Detect and quantify community assembly processes using trait values of individuals or populations, the T-statistics and other metrics, and dedicated null models.
API wrapper to download statistical information from the Economic Statistics System (ECOS) of the Bank of Korea <https://ecos.bok.or.kr/api/#/>.
Statistical tools for environmental and ecological surveys. Simulation-based power and precision analysis; detection probabilities from different survey designs; visual fast count estimation.
Implements the method from the paper "Fast Covariance Estimation for Sparse Functional Data", published in Statistics and Computing <doi:10.1007/s11222-017-9744-8>.
Fits latent space models for single networks and hierarchical latent space models for ensembles of networks as described in Sweet, Thomas & Junker (2013).
Datasets and code examples that accompany our book Visser & Speekenbrink (2021), "Mixture and Hidden Markov Models with R", <https://depmix.github.io/hmmr/>.
This package provides functions and datasets from Hilbe, J.M., and Robinson, A.P. 2013. Methods of Statistical Model Estimation. Chapman & Hall / CRC.
This package provides the Arctic Ice Studio's Nord and Group of Seven inspired colour palettes for use with ggplot2 via custom functions.
Fits two-dimensional data by means of orthogonal nonlinear least-squares using Levenberg-Marquardt minimization and provides functionality for fit diagnostics and plotting.
Simultaneous inference procedures for high-dimensional linear models as described by Zhang, X., and Cheng, G. (2017) <doi:10.1080/01621459.2016.1166114>.
This package provides tools to download and merge data files on sub-national conflict, violence and protests from <http://www.x-sub.org>.
This package is designed to perform statistical analysis to identify significantly differentially bound regions between multiple groups of ChIP-seq datasets.
The GSRI package estimates the number of differentially expressed genes in gene sets, utilizing the concept of the Gene Set Regulation Index (GSRI).
This R package provides functions to perform gene set significance analysis over simple cross-sectional data between 2 and 5 phenotypes of interest.
This package contains functions for the efficient design of factorial two-colour microarray experiments and for the statistical analysis of factorial microarray data.
This package provides software for the book Spectral Analysis for Physical Applications, Donald B. Percival and Andrew T. Walden, Cambridge University Press, 1993.
This package provides tools to read, write, create, and manipulate DESCRIPTION files. It is intended for packages that create or manipulate other packages.
Focused on (but not exclusive to) data sets hosted on PhysioNet (<https://physionet.org>), ricu provides utilities for download, setup and access of intensive care unit (ICU) data sets. In addition to functions for running arbitrary queries against available data sets, a system for defining clinical concepts and encoding their representations in tabular ICU data is presented.
We introduce a robust matrix factor model that explicitly incorporates tail behavior and employs a mean-shift term to avoid efficiency losses through pre-centering of observed matrices. The paper describing the methods is currently under submission; a full reference will be provided in future versions once it is published.