This package provides efficient low-level and highly reusable S4 classes for storing ranges of integers, RLE vectors (Run-Length Encoding), and, more generally, data that can be organized sequentially (formally defined as Vector objects), as well as views on these Vector objects. Efficient list-like classes are also provided for storing big collections of instances of the basic classes. All classes in the package use consistent naming and share the same rich and consistent "Vector API" as much as possible.
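As a rough illustration of what run-length encoding means for a vector, here is a minimal Python sketch. It is not the package's S4 implementation; the function names `rle_encode` and `rle_decode` and the sample `coverage` vector are hypothetical and chosen only for this example.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the full sequence."""
    return [value for value, length in runs for _ in range(length)]


# Hypothetical vector with long runs of repeated values.
coverage = [0, 0, 0, 5, 5, 1, 1, 1, 1]
runs = rle_encode(coverage)
restored = rle_decode(runs)
```

Storing runs instead of individual elements pays off when the data contain long stretches of identical values, since memory then grows with the number of runs rather than the vector length.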
This package reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
This package provides the data for the gene expression enrichment analysis conducted in the package ABAEnrichment. The package includes three datasets which are derived from the Allen Brain Atlas:
Gene expression data from Human Brain (adults) averaged across donors,
Gene expression data from the Developing Human Brain pooled into five age categories and averaged across donors, and
a developmental effect score based on the Developing Human Brain expression data.
All datasets are restricted to protein coding genes.
The robin-map library is a C++ implementation of a fast hash map and hash set using open-addressing and linear robin hood hashing with backward shift deletion to resolve collisions.
Four classes are provided: tsl::robin_map, tsl::robin_set, tsl::robin_pg_map and tsl::robin_pg_set. The first two are faster and use a power-of-two growth policy; the last two use a prime growth policy instead and cope better with a poor hash function.
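The trade-off between the two growth policies can be sketched in a few lines of Python. This is an illustration of the general idea only, not the library's C++ code; the helper names and the deliberately poor hash values are assumptions made for the example.

```python
def bucket_power_of_two(hash_value, bucket_count):
    """Map a hash to a bucket when bucket_count is a power of two.

    A bit mask keeps only the low bits, which is cheaper than a modulo.
    """
    return hash_value & (bucket_count - 1)


def bucket_prime(hash_value, bucket_count):
    """Map a hash to a bucket when bucket_count is prime, using a modulo."""
    return hash_value % bucket_count


# Hypothetical poor hash function: every hash is a multiple of 16,
# so the low four bits carry no information.
poor_hashes = [k * 16 for k in range(8)]

pow2_buckets = {bucket_power_of_two(h, 16) for h in poor_hashes}
prime_buckets = {bucket_prime(h, 17) for h in poor_hashes}
```

With 16 buckets and a mask, all eight keys land in the same bucket, whereas with 17 buckets and a modulo they spread over eight distinct buckets. The mask is cheaper to compute, which is one plausible reason the power-of-two variants are described as faster.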
This package provides a color palette generator inspired by American politics, with colors ranging from blue on the left to gray in the middle and red on the right. A variety of palettes allow for a range of applications from brief discrete scales (e.g., three colors for Democrats, Independents, and Republicans) to continuous interpolated arrays including dozens of shades graded from blue (left) to red (right). This package greatly benefitted from building on the source code (with permission) from Ram and Wickham (2015).
This package creates an area-proportional Venn diagram of 2 or 3 circles. BioVenn is the only R package that can automatically generate an accurate area-proportional Venn diagram by having only lists of (biological) identifiers as input. Also offers the option to map Entrez and/or Affymetrix IDs to Ensembl IDs. In SVG mode, text and numbers can be dragged and dropped. Based on the BioVenn web interface available at <https://www.biovenn.nl>. Hulsen (2021) <doi:10.3233/DS-210032>.
This package provides a convenient framework to simulate, test, power, and visualize data for differential expression studies with lognormal or negative binomial outcomes. Supported designs are two-sample comparisons of independent or dependent outcomes. Power may be summarized in the context of controlling the per-family error rate or family-wise error rate. Negative binomial methods are described in Yu, Fernandez, and Brock (2017) <doi:10.1186/s12859-017-1648-2> and Yu, Fernandez, and Brock (2020) <doi:10.1186/s12859-020-3541-7>.
This package provides read and write access to data and metadata from the DataONE network <https://www.dataone.org> of data repositories. Each DataONE repository implements a consistent repository application programming interface. Users call methods in R to access these remote repository functions, such as methods to query the metadata catalog, get access to metadata for particular data packages, and read the data objects from the data repository. Users can also insert and update data objects on repositories that support these methods.
This package allows running Dynare programs from base R, R Markdown and Quarto. Dynare is a software platform for handling a wide class of economic models, in particular dynamic stochastic general equilibrium ('DSGE') and overlapping generations ('OLG') models. The package not only integrates R and Dynare but also serves as a Dynare knit engine for the knitr package. The package requires Dynare (<https://www.dynare.org/>) and Octave (<https://www.octave.org/download.html>). All Dynare commands can be written in an R script or an R Markdown chunk.
Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a JAGS model, which will then automatically be passed to JAGS <https://mcmc-jags.sourceforge.io/> with the help of the package rjags.
The lognormal distribution (Limpert et al. (2001) <doi:10.1641/0006-3568(2001)051%5B0341:lndats%5D2.0.co;2>) can characterize uncertainty that is bounded by zero. This package provides estimation of distribution parameters, computation of moments and other basic statistics, and an approximation of the distribution of the sum of several correlated lognormally distributed variables (Lo 2013 <doi:10.12988/ams.2013.39511>) and the approximation of the difference of two correlated lognormally distributed variables (Lo 2012 <doi:10.1155/2012/838397>).
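For orientation, the basic moments of a lognormal variable follow directly from the mean `mu` and standard deviation `sigma` of its logarithm. The Python sketch below uses these standard closed-form expressions; the function names are hypothetical and are not the package's API, and the approximations for sums and differences of correlated variables are not covered.

```python
import math


def lognormal_mean(mu, sigma):
    """Expected value of X when log(X) ~ Normal(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2)


def lognormal_median(mu):
    """Median of X; it depends only on the log-scale location mu."""
    return math.exp(mu)


def lognormal_variance(mu, sigma):
    """Variance of X when log(X) ~ Normal(mu, sigma^2)."""
    return (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)


# Example parameters on the log scale (chosen only for illustration).
mean_value = lognormal_mean(0.0, 1.0)
variance_value = lognormal_variance(0.0, 1.0)
```

Because the distribution is right-skewed, the mean always exceeds the median whenever `sigma` is greater than zero.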
Snow water equivalent is modeled with the process-based delta.snow model and empirical regression models using relationships between density and diverse at-site parameters. The methods are described in Winkler et al. (2021) <doi:10.5194/hess-25-1165-2021>, Guyennon et al. (2019) <doi:10.1016/j.coldregions.2019.102859>, Pistocchi (2016) <doi:10.1016/j.ejrh.2016.03.004>, Jonas et al. (2009) <doi:10.1016/j.jhydrol.2009.09.021> and Sturm et al. (2010) <doi:10.1175/2010JHM1202.1>.
This package provides a method that analyzes quality control metrics from multi-sample genomic sequencing studies and nominates poor quality samples for exclusion. Per-sample quality control data are transformed into z-scores and aggregated. The distribution of aggregated z-scores is modelled using parametric distributions. The parameters of the optimal model, selected either by goodness-of-fit statistics or user-designation, are used for outlier nomination. Two implementations of the Cosine Similarity Outlier Detection algorithm are provided with flexible parameters for dataset customization.
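The first two steps, converting per-sample metrics to z-scores and aggregating them, can be sketched as follows. This is a simplified Python illustration under stated assumptions: the aggregation rule (a sum of absolute z-scores), the function names and the metric values are all hypothetical, and the parametric model fitting and cosine similarity steps are not shown.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one quality control metric across samples."""
    centre = mean(values)
    spread = stdev(values)
    return [(value - centre) / spread for value in values]


def aggregate_scores(metrics):
    """Sum absolute z-scores per sample (an assumed aggregation rule).

    `metrics` maps a metric name to a list of per-sample values,
    with samples in the same order for every metric.
    """
    per_metric = [z_scores(values) for values in metrics.values()]
    return [sum(abs(z) for z in sample) for sample in zip(*per_metric)]


# Hypothetical metrics for five samples; the last one looks poor on both.
qc_metrics = {
    "mean_depth": [30, 31, 29, 30, 10],
    "duplication_rate": [0.1, 0.1, 0.1, 0.1, 0.5],
}
scores = aggregate_scores(qc_metrics)
worst_sample = scores.index(max(scores))
```

Absolute values are used here so that a metric where low is bad and a metric where high is bad do not cancel each other out; a real pipeline would more likely orient each metric explicitly.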
Displays provenance graphically for provenance collected by the rdt or rdtLite packages, or other tools providing compatible PROV JSON output. The exact format of the JSON created by rdt and rdtLite is described in <https://github.com/End-to-end-provenance/ExtendedProvJson>. More information about rdtLite and associated tools is available at <https://github.com/End-to-end-provenance/> and Barbara Lerner, Emery Boose, and Luis Perez (2018), Using Introspection to Collect Provenance in R, Informatics, <doi:10.3390/informatics5010012>.
The algorithm combines the most predictive variable, such as the count of the main International Classification of Diseases (ICD) codes, with other Electronic Health Record (EHR) features (e.g. health utilization and processed clinical note data) to obtain a score for accurate risk prediction and disease classification. In particular, it normalizes the surrogate to resemble a Gaussian mixture and leverages the remaining features through random corruption denoising. Background and details about the method can be found in Yu et al. (2018) <doi:10.1093/jamia/ocx111>.
We build a Susceptible-Infectious-Recovered (SIR) model where the rate of infection is the sum of the household rate and the community rate. We estimate the posterior distribution of the parameters using the Metropolis algorithm. Further details may be found in: F Scott Dahlgren, Ivo M Foppa, Melissa S Stockwell, Celibell Y Vargas, Philip LaRussa, Carrie Reed (2021) "Household transmission of influenza A and B within a prospective cohort during the 2013-2014 and 2014-2015 seasons" <doi:10.1002/sim.9181>.
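The Metropolis algorithm mentioned above can be sketched generically in Python. This is a random-walk Metropolis sampler for a single parameter, shown on a toy standard-normal log-posterior rather than the SIR posterior from the paper; the function names, step size and seed are assumptions made for illustration.

```python
import math
import random


def metropolis(log_post, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional parameter."""
    samples = []
    current = start
    current_lp = log_post(current)
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred on the current value.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def toy_log_post(x):
    """Toy target: log-density of a standard normal, up to a constant."""
    return -0.5 * x * x


draws = metropolis(toy_log_post, start=0.0, n_steps=20000,
                   step_size=1.0, rng=random.Random(1))
posterior_mean = sum(draws) / len(draws)
```

Only ratios of posterior densities are needed, so the normalizing constant of the posterior never has to be computed. In the SIR setting the toy target would be replaced by the log-likelihood of the observed infections plus the log-prior of the household and community rates.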
This package provides a suite of tests for segregation distortion in F1 polyploid populations (for now, just tetraploids). The tests are provided under different assumptions about meiosis. Details of these methods are described in Gerard et al. (2025) <doi:10.1007/s00122-025-04816-z>. This material is based upon work supported by the National Science Foundation under Grant No. 2132247. The opinions, findings, and conclusions or recommendations expressed are those of the author and do not necessarily reflect the views of the National Science Foundation.
Cluster-independent method based on the topological structure of the gene co-expression network for identifying feature gene sets, extracting cellular subpopulations, and elucidating intrinsic relationships among these subpopulations. Without prior cell clustering, SifiNet circumvents potential inaccuracies in clustering that may influence subsequent analyses. This method is introduced in Qi Gao, Zhicheng Ji, Liuyang Wang, Kouros Owzar, Qi-Jing Li, Cliburn Chan, Jichun Xie "SifiNet: a robust and accurate method to identify feature gene sets and annotate cells" (2024) <doi:10.1093/nar/gkae307>.
This package provides functions to compute compositional turnover using zeta-diversity, the number of species shared by multiple assemblages. The package includes functions to compute zeta-diversity for a specific number of assemblages and to compute zeta-diversity for a range of numbers of assemblages. It also includes functions to explain how zeta-diversity varies with distance and with differences in environmental variables between assemblages, using generalised linear models, linear models with negative constraints, generalised additive models, shape constrained additive models, and I-splines.
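To make the definition concrete, zeta-diversity of a given order can be computed as the mean number of species shared across groups of that many assemblages. The Python sketch below enumerates every combination exhaustively, which is an assumption of this example (a real implementation may sample combinations instead); the function name and the toy `sites` data are hypothetical.

```python
from itertools import combinations


def zeta_diversity(assemblages, order):
    """Mean number of species shared by each combination of `order` assemblages."""
    shared_counts = [
        len(set.intersection(*combo))
        for combo in combinations(assemblages, order)
    ]
    return sum(shared_counts) / len(shared_counts)


# Hypothetical species lists for three sites.
sites = [
    {"a", "b", "c"},
    {"b", "c", "d"},
    {"c", "d", "e"},
]

zeta_by_order = {order: zeta_diversity(sites, order) for order in (1, 2, 3)}
```

Order 1 is simply the mean species richness per site, and the values can only stay level or decline as the order increases, because adding an assemblage can never enlarge the shared set.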
When testing multiple hypotheses simultaneously, this package provides functionality to calculate a lower bound for the number of correct rejections (as a function of the number of rejected hypotheses), which holds simultaneously, with high probability, for all possible numbers of rejections. As a special case, a lower bound for the total number of false null hypotheses can be inferred. Dependent test statistics can be handled for multiple tests of associations. For independent test statistics, it is sufficient to provide a list of p-values.
This package provides userspace components for the InfiniBand subsystem of the Linux kernel. Specifically it contains userspace libraries for the following device nodes:
/dev/infiniband/uverbsX (libibverbs)
/dev/infiniband/rdma_cm (librdmacm)
/dev/infiniband/umadX (libibumad)
The following service daemons are also provided:
srp_daemon (for the ib_srp kernel module)
iwpmd (for iWARP kernel providers)
ibacm (for InfiniBand communication management assistant)
Collection of procedures to perform Bayesian analysis on a variety of factor models. Currently, it includes: "Bayesian Exploratory Factor Analysis" (befa) from G. Conti, S. Frühwirth-Schnatter, J.J. Heckman, R. Piatek (2014) <doi:10.1016/j.jeconom.2014.06.008>, an approach to dedicated factor analysis with stochastic search on the structure of the factor loading matrix. The number of latent factors, as well as the allocation of the manifest variables to the factors, are not fixed a priori but determined during MCMC sampling.
Easy access to data from Brazil's population censuses. The package provides a simple and efficient way to download and read the data sets and the documentation of all the population censuses taken in and after 1960 in the country. The package is built on top of the Arrow platform <https://arrow.apache.org/docs/r/>, which allows users to work with larger-than-memory census data using familiar dplyr functions <https://arrow.apache.org/docs/r/articles/arrow.html#analyzing-arrow-data-with-dplyr>.
Implemented are three Wald-type statistics and their respective permuted versions for null hypotheses formulated in terms of cumulative hazard rate functions, medians and the concordance measure, respectively, in the general framework of survival factorial designs with possibly heterogeneous survival and/or censoring distributions, for crossed designs with an arbitrary number of factors and nested designs with up to three factors. Ditzhaus, Dobler and Pauly (2020) <doi:10.1177/0962280220980784>, Ditzhaus, Janssen and Pauly (2020) <arXiv:2004.10818v2>, Dobler and Pauly (2019) <doi:10.1177/0962280219831316>.