Designed to facilitate the preprocessing and linking of GIS (Geographic Information System) databases <https://www.sciencedirect.com/topics/computer-science/gis-database>, the R package GISINTEGRATION offers a robust solution for efficiently preparing GIS data for advanced spatial analyses. This package excels in simplifying intricate procedures like data cleaning, normalization, and format conversion, ensuring that the data are optimally primed for precise and thorough analysis.
The Open Data Format (ODF) is a new, non-proprietary, multilingual, metadata enriched, and zip-compressed data format with metadata structured in the Data Documentation Initiative (DDI) Codebook standard. This package allows reading and writing of data files in the Open Data Format (ODF) in R, and displaying metadata in different languages. For further information on the Open Data Format, see <https://opendataformat.github.io/>.
Computes optimal changepoint models using the Poisson likelihood for non-negative count data, subject to the PeakSeg constraint: the first change must be up, second change down, third change up, etc. For more info about the models and algorithms, read "Constrained Dynamic Programming and Supervised Penalty Learning Algorithms for Peak Detection" <https://jmlr.org/papers/v21/18-843.html> by TD Hocking et al.
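The constraint and the likelihood described above can be illustrated with a minimal Python sketch. It is not the package's algorithm (which finds the optimal model by constrained dynamic programming); it only checks whether a given sequence of segment means alternates up, down, up, and evaluates the Poisson loss of one segment. The function names are hypothetical, and treating the changes as strict inequalities is an assumption.

```python
import math


def satisfies_peakseg(segment_means):
    """Return True if successive changes alternate up, down, up, ...

    Assumption: changes are strict (no two adjacent segments share a mean).
    """
    for i in range(1, len(segment_means)):
        diff = segment_means[i] - segment_means[i - 1]
        if i % 2 == 1 and diff <= 0:  # 1st, 3rd, ... change must go up
            return False
        if i % 2 == 0 and diff >= 0:  # 2nd, 4th, ... change must go down
            return False
    return True


def poisson_loss(counts, mean):
    """Poisson negative log-likelihood of one segment, dropping the
    log(x!) terms that do not depend on the mean."""
    if mean <= 0:
        # A zero mean only fits a segment made entirely of zeros.
        return 0.0 if all(c == 0 for c in counts) else math.inf
    return sum(mean - c * math.log(mean) for c in counts)
```

For example, the means `[1, 5, 2, 7]` satisfy the constraint, while `[5, 1, 2]` do not because the first change goes down.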
This package provides a collection of functions for estimating spatial regimes, aggregations of neighboring spatial units that are homogeneous in functional terms. The term spatial regime, therefore, should not be understood as a synonym for cluster. More precisely, the term cluster does not presuppose any functional relationship between the variables considered, while the term regime is linked to a regressive relationship underlying the spatial process.
This package allows importing the most common specific structure (motif) types into R for use by functions provided by other Bioconductor motif-related packages. Motifs can be exported into most major motif formats from various classes as defined by other Bioconductor packages. A suite of motif and sequence manipulation and analysis functions is included, covering enrichment, comparison, P-value calculation, shuffling, trimming, higher-order motifs, and others.
This package provides datasets for the nullranges package vignette, in particular example datasets for DNase hypersensitivity sites (DHS), CTCF binding sites, and CTCF genomic interactions. These are used to demonstrate generation of null hypothesis feature sets, either through block bootstrapping or matching, in the nullranges vignette. For more details, see the data object man pages, and the R scripts for object construction provided within the package.
Interactive data visualization for data practitioners. flourishcharts allows users to visualize their data using Flourish graphs that are grounded in data storytelling principles. Users can create racing bar & line charts, as well as other interactive elements commonly found in D3 graphics, easily in R and Python. The package relies on an enterprise API provided by Flourish, a data visualization platform <https://developers.flourish.studio/api/introduction/>.
Identification of putative causal variants in genome-wide association studies using hybrid analysis of both the trio and population designs. The package implements the method in the paper: Yang, Y., Wang, Q., Wang, C., Buxbaum, J., & Ionita-Laza, I. (2024). KnockoffHybrid: A knockoff framework for hybrid analysis of trio and population designs in genome-wide association studies. The American Journal of Human Genetics, in press.
Helpers for customizing selected outputs from lavaan by Rosseel (2012) <doi:10.18637/jss.v048.i02> and printing them. The functions are intended to be used by package developers in their packages and so are not designed to be user-friendly. They are designed to let developers customize the tables using other functions. Currently the parameter estimates tables of a fitted object are supported.
Monte Carlo simulations of a game-theoretic model for the legal exemption system of the European cartel law are implemented in order to estimate the (mean) deterrent effect of this system. The input and output parameters of the simulated cartel opportunities can be visualized by three-dimensional projections. A description of the model is given in Moritz et al. (2018) <doi:10.1515/bejeap-2017-0235>.
This is a compendium of C++ routines useful for Bayesian statistics. We steal other people's C++ code, repurpose it, and export it so developers of R packages can use it in their C++ code. We actually don't steal anything, or claim that Thomas Bayes did, but copy code that is compatible with our GPL 3 licence, fully acknowledging the authorship of the original code.
This package implements an Entropy measure of dependence based on the Bhattacharya-Hellinger-Matusita distance. Can be used as a (nonlinear) autocorrelation/crosscorrelation function for continuous and categorical time series. The package includes tests for serial and cross dependence and nonlinearity based on it. Some routines have a parallel version that can be used in a multicore/cluster environment. The package makes use of S4 classes.
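As a rough illustration of such a measure for a categorical series, the Python sketch below assumes the normalized Hellinger-type form S = (1/2) Σ (√p(x,y) − √(p(x)p(y)))², estimated with plug-in relative frequencies at a given lag. Both the formula and the function name `srho_categorical` are assumptions made for illustration; the package's own estimators and tests may differ.

```python
import math
from collections import Counter


def srho_categorical(series, lag):
    """Plug-in estimate of a Hellinger-type dependence measure between
    a categorical series and its lagged copy (assumed form, see lead-in)."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    pairs = list(zip(series[:-lag], series[lag:]))
    n = len(pairs)
    joint = Counter(pairs)                  # counts of (x_t, x_{t+lag})
    left = Counter(x for x, _ in pairs)     # marginal counts of x_t
    right = Counter(y for _, y in pairs)    # marginal counts of x_{t+lag}
    total = 0.0
    for x, cx in left.items():
        for y, cy in right.items():
            p_xy = joint.get((x, y), 0) / n
            p_indep = (cx / n) * (cy / n)
            total += (math.sqrt(p_xy) - math.sqrt(p_indep)) ** 2
    return 0.5 * total
```

Under this form the value is 0 when the joint frequencies factor into the product of the marginals and grows as the lagged pairs depart from independence.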
The package provides a single macro \randomize{TEXT} that typesets the characters of TEXT in random order, such that the resulting output appears correct, but most automated attempts to read the file will misunderstand it. This function allows one to include an email address in a TeX document and publish it online without fear of email address harvesters or spammers easily picking up the address.
Data for the mosaics package, consisting of (1) chromosome 22 ChIP and control sample data from a ChIP-seq experiment of STAT1 binding and H3K4me3 modification in MCF7 cell line from ENCODE database (HG19) and (2) chromosome 21 ChIP and control sample data from a ChIP-seq experiment of STAT1 binding, with mappability, GC content, and sequence ambiguity scores of human genome HG18.
Supports import/export for a number of datetime string standards and R datetime classes, often including lossless re-export of any original reduced precision, including ISO 8601 <https://en.wikipedia.org/wiki/ISO_8601> and pdfmark <https://opensource.adobe.com/dc-acrobat-sdk-docs/library/pdfmark/> datetime strings. Supports local/global datetimes with optional UTC offsets and/or (possibly heterogeneous) time zones with up to nanosecond precision.
Calculation of informative simultaneous confidence intervals for graphically described multiple test procedures and given information weights; see Bretz et al. (2009) <doi:10.1002/sim.3495> and Brannath et al. (2024) <doi:10.48550/arXiv.2402.13719>. Furthermore, the behavior of the informative bounds as a function of the information weights can be explored. Comparisons with compatible bounds are possible; see Strassburger and Bretz (2008) <doi:10.1002/sim.3338>.
Read in SAS Data (.sas7bdat Files) into Apache Spark from R. Apache Spark is an open source cluster computing framework available at <http://spark.apache.org>. This R package uses the spark-sas7bdat Spark package (<https://spark-packages.org/package/saurfang/spark-sas7bdat>) to import and process SAS data in parallel using Spark. This allows dplyr statements to be executed in parallel on top of SAS data.
The stress addition approach is an alternative to the traditional concentration addition or effect addition models. It allows the modelling of tri-phasic concentration-response relationships either as single toxicant experiments, in combination with an environmental stressor or as mixtures of two toxicants. See Liess et al. (2019) <doi:10.1038/s41598-019-51645-4> and Liess et al. (2020) <doi:10.1186/s12302-020-00394-7>.
Implementation of the following methods for event history analysis. Risk regression models for survival endpoints also in the presence of competing risks are fitted using binomial regression based on a time sequence of binary event status variables. A formula interface for the Fine-Gray regression model and an interface for the combination of cause-specific Cox regression models. A toolbox for assessing and comparing performance of risk predictions (risk markers and risk prediction models). Prediction performance is measured by the Brier score and the area under the ROC curve for binary possibly time-dependent outcome. Inverse probability of censoring weighting and pseudo values are used to deal with right censored data. Lists of risk markers and lists of risk models are assessed simultaneously. Cross-validation repeatedly splits the data, trains the risk prediction models on one part of each split and then summarizes and compares the performance across splits.
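The Brier score mentioned above is the mean squared difference between predicted risks and observed binary event status. The Python sketch below shows only the uncensored case; the inverse probability of censoring weighting and pseudo values that the package uses for right censored data are not implemented here. The function name `brier_score` is hypothetical.

```python
def brier_score(predicted_risks, outcomes):
    """Mean squared difference between predicted risks in [0, 1] and
    observed binary outcomes (0 or 1). Lower is better.

    Assumption: every outcome is fully observed (no censoring)."""
    if len(predicted_risks) != len(outcomes):
        raise ValueError("predictions and outcomes must have equal length")
    if not outcomes:
        raise ValueError("at least one observation is required")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_risks, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

A model that predicts a risk of 0.5 for everyone scores 0.25 regardless of the outcomes, which is a common reference point when comparing risk prediction models.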
This package provides vectorized distribution objects with tools for manipulating, visualizing, and using probability distributions. It was designed to allow model prediction outputs to return distributions rather than their parameters, allowing users to directly interact with predictive distributions in a data-oriented workflow. In addition to providing generic replacements for p/d/q/r functions, other useful statistics can be computed including means, variances, intervals, and highest density regions.
The package provides commands to define enumerable items with a number and a long name, which can be referenced later with the name or just the short form. For instance, Milestone M1: Specification created can be defined and later on be referenced with M1 or M1 ("Specification created"). The text in the references is derived from the definition and also rendered as a hyperlink to the definition.
Collection of ancillary functions and utilities to be used in conjunction with the TraMineR package for sequence data exploration. Includes, among others, specific functions such as state survival plots, position-wise group-typical states, dynamic sequence indicators, and dissimilarities between event sequences. Also includes contributions by non-members of the TraMineR team such as methods for polyadic data and for the comparison of groups of sequences.
Queries multiple resources, HGNC (2019) <https://www.genenames.org> and limma (2015) <doi:10.1093/nar/gkv007>, to find the correspondence between the evolving nomenclature of human gene symbols, aliases, previous symbols or synonyms and the stable, curated gene entrezID from the NCBI database. This allows fast, accurate and up-to-date correspondence between human gene expression datasets from various dates and platforms (e.g., gene symbol: BRCA1 - ID: 672).
Constructs optimal policy trees which provide a rule-based treatment prescription policy. Input is covariate and reward data, where, typically, the rewards will be doubly robust reward estimates. This package aims to construct optimal policy trees more quickly than the existing policytree package and is intended to be used alongside that package. For more details see Cussens, Hatamyar, Shah and Kreif (2025) <doi:10.48550/arXiv.2506.15435>.