Quantify the causal effect of a binary exposure on a binary outcome with adjustment for multiple biases. The functions can simultaneously adjust for any combination of uncontrolled confounding, exposure/outcome misclassification, and selection bias. The underlying method generalizes the concept of combining inverse probability of selection weighting with predictive value weighting. Simultaneous multi-bias analysis can be used to enhance the validity and transparency of real-world evidence obtained from observational, longitudinal studies. Based on the work from Paul Brendel, Aracelis Torres, and Onyebuchi Arah (2023) <doi:10.1093/ije/dyad001>.
Optimal scaling of a data vector, relative to a set of targets, is obtained through a least-squares transformation subject to appropriate measurement constraints. The targets are usually predicted values from a statistical model. If the data are nominal level, then the transformation must be identity-preserving. If the data are ordinal level, then the transformation must be monotonic. If the data are discrete, then tied data values must remain tied in the optimal transformation. If the data are continuous, then tied data values can be untied in the optimal transformation.
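The ordinal, discrete case described above (a monotonic transformation in which tied data values stay tied) can be sketched with the pool-adjacent-violators algorithm. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `optimal_scale_ordinal_discrete` is hypothetical, and the targets are assumed to be plain numeric predictions aligned with the data vector.

```python
from itertools import groupby


def optimal_scale_ordinal_discrete(data, targets):
    """Least-squares monotone transformation of `data` toward `targets`,
    keeping tied data values tied (ordinal, discrete case).

    Hypothetical helper for illustration only.
    """
    # Visit observations in the order implied by the data values.
    order = sorted(range(len(data)), key=lambda i: data[i])
    # Each block holds [mean of targets, weight, member indices].
    blocks = []
    for _, grp in groupby(order, key=lambda i: data[i]):
        idx = list(grp)
        mean = sum(targets[i] for i in idx) / len(idx)
        blocks.append([mean, len(idx), idx])
        # Pool adjacent violators: merge while the block means decrease.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, i2 = blocks.pop()
            m1, w1, i1 = blocks.pop()
            pooled = (m1 * w1 + m2 * w2) / (w1 + w2)
            blocks.append([pooled, w1 + w2, i1 + i2])
    # Write each block mean back to its member observations.
    scaled = [0.0] * len(data)
    for mean, _, idx in blocks:
        for i in idx:
            scaled[i] = mean
    return scaled


# Illustrative values only: the two observations with data value 2 stay tied,
# and the order violation with the last target is pooled away.
scaled = optimal_scale_ordinal_discrete([1, 2, 2, 3], [1.0, 3.0, 5.0, 2.0])
```

Averaging the targets within each tie group before pooling is what enforces the discrete constraint; a continuous variant would instead allow values inside a tie group to differ.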
Computation and visualization of Taxicab Correspondence Analysis, Choulakian (2006) <doi:10.1007/s11336-004-1231-4>. Classical correspondence analysis (CA) is a statistical method to analyse 2-dimensional tables of positive numbers and is typically applied to contingency tables (Benzecri, J.-P. (1973). L'Analyse des Donnees. Volume II. L'Analyse des Correspondances. Paris, France: Dunod). Classical CA is based on the Euclidean distance. Taxicab CA is like classical CA but is based on the Taxicab or Manhattan distance. For some tables, Taxicab CA gives more informative results than classical CA.
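The difference between the two distances named above can be seen on the row profiles of a small table. The sketch below only contrasts the two metrics; it does not perform correspondence analysis, and the table values and function names are made up for illustration.

```python
import math


def row_profile(row):
    """Divide a row of counts by its total so the entries sum to 1."""
    total = sum(row)
    return [x / total for x in row]


def euclidean(u, v):
    """Straight-line (L2) distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def taxicab(u, v):
    """Taxicab or Manhattan (L1) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical 2 x 3 contingency table.
p1 = row_profile([10, 20, 30])
p2 = row_profile([30, 20, 10])

d_l2 = euclidean(p1, p2)  # about 0.471
d_l1 = taxicab(p1, p2)    # about 0.667
```

Because the taxicab distance sums absolute rather than squared differences, a single large cell difference contributes proportionally less to it than to the Euclidean distance.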
Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. These signals represent the observed states in a multivariate Markov model used to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de novo. The goal of this package is to call ChromHMM from within R, capture the output files in an S4 object, and interface with other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.
TEKRABber provides a user-friendly pipeline for comparing orthologs and transposable elements (TEs) between two species. It uses the orthology confidence between the two species from BioMart to normalize expression counts and detect differentially expressed orthologs/TEs. It then provides one-to-one correlation analysis for selected orthologs and TEs. There is also an app function for a first look at the results. Users can prepare ortholog/TE RNA-seq expression data however they prefer, provided it follows the data structure described in the vignettes.
Manage storage in Microsoft's Azure cloud: <https://azure.microsoft.com/en-us/products/category/storage/>. On the admin side, AzureStor includes features to create, modify and delete storage accounts. On the client side, it includes an interface to blob storage, file storage, and Azure Data Lake Storage Gen2: upload and download files and blobs; list containers and files/blobs; create containers; and so on. Authenticated access to storage is supported, via either a shared access key or a shared access signature (SAS). Part of the AzureR family of packages.
This package implements two complementary high-dimensional feature screening methods, Adaptive Iterative Ridge High-dimensional Ordinary Least-squares Projection (Air-HOLP, suitable when the number of predictors p is greater than or equal to the sample size n) and Adaptive Iterative Ridge Ordinary Least Squares (Air-OLS, for n greater than p). Also provides helper functions to generate compound-symmetry and AR(1) correlated data, plus a unified Air() front end and a summary method. For methodological details see Joudah, Muller and Zhu (2025) <doi:10.1007/s11222-025-10599-6>.
Generates confidence intervals for standardized regression coefficients using delta method standard errors for models fitted by lm() as described in Yuan and Chan (2011) <doi:10.1007/s11336-011-9224-6> and Jones and Waller (2015) <doi:10.1007/s11336-013-9380-y>. The package can also be used to generate confidence intervals for differences of standardized regression coefficients and as a general approach to performing the delta method. A description of the package and code examples are presented in Pesigan, Sun, and Cheung (2023) <doi:10.1080/00273171.2023.2201277>.
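The general delta-method step mentioned above (propagating the covariance of parameter estimates through a function of those parameters) can be sketched as follows. This is a generic first-order approximation with a numerical gradient, not the package's code; `delta_method_se` is a hypothetical name, and the estimates and covariance matrix are made-up numbers rather than results from any model.

```python
import math


def delta_method_se(g, theta, cov, eps=1e-6):
    """First-order delta-method standard error of g(theta).

    g     : function taking a list of parameters and returning a number
    theta : list of parameter estimates
    cov   : covariance matrix of the estimates (list of lists)
    """
    k = len(theta)
    # Central-difference approximation to the gradient of g at theta.
    grad = []
    for j in range(k):
        up = list(theta)
        dn = list(theta)
        up[j] += eps
        dn[j] -= eps
        grad.append((g(up) - g(dn)) / (2 * eps))
    # Var[g(theta)] is approximated by grad' * cov * grad.
    var = sum(grad[i] * cov[i][j] * grad[j]
              for i in range(k) for j in range(k))
    return math.sqrt(var)


# Illustrative inputs: two coefficients and their covariance matrix.
theta_hat = [0.50, 0.20]
cov_hat = [[0.04, 0.01],
           [0.01, 0.09]]


def diff(t):
    """Difference between the two coefficients."""
    return t[0] - t[1]


se = delta_method_se(diff, theta_hat, cov_hat)
estimate = diff(theta_hat)
# Normal-approximation 95% confidence interval for the difference.
ci = (estimate - 1.96 * se, estimate + 1.96 * se)
```

For a difference of two coefficients the variance reduces to the two variances minus twice the covariance, which makes this example easy to check by hand.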
Typical morphological profiling datasets have millions of cells and hundreds of features per cell. When working with this data, you must clean the data, normalize the features to make them comparable across experiments, transform the features, select features based on their quality, and aggregate the single-cell data, if needed. cytominer makes these steps fast and easy. Methods used in practice in the field are discussed in Caicedo (2017) <doi:10.1038/nmeth.4397>. An overview of the field is presented in Caicedo (2016) <doi:10.1016/j.copbio.2016.04.003>.
Allows for the specification of deep conditional transformation models (DCTMs) and ordinal neural network transformation models, as described in Baumann et al (2021) <doi:10.1007/978-3-030-86523-8_1> and Kook et al (2022) <doi:10.1016/j.patcog.2021.108263>. Extensions such as autoregressive DCTMs (Ruegamer et al, 2023, <doi:10.1007/s11222-023-10212-8>) and transformation ensembles (Kook et al, 2022, <doi:10.48550/arXiv.2205.12729>) are implemented. The software package is described in Kook et al (2024, <doi:10.18637/jss.v111.i10>).
We provide comprehensive software to estimate general K-stage DTRs from SMARTs with Q-learning and a variety of outcome-weighted learning methods. Penalizations are allowed for variable selection and model regularization. With the outcome-weighted learning scheme, different loss functions - SVM hinge loss, SVM ramp loss, binomial deviance loss, and L2 loss - are adopted to solve the weighted classification problem at each stage; augmentation in the outcomes is allowed to improve efficiency. The estimated DTR can be easily applied to a new sample for individualized treatment recommendations or DTR evaluation.
A flexible framework for Agent-Based Models (ABM), the epiworldR package provides methods for prototyping disease outbreaks and transmission models using a C++ backend, making it very fast. It supports multiple epidemiological models, including the Susceptible-Infected-Susceptible (SIS), Susceptible-Infected-Removed (SIR), Susceptible-Exposed-Infected-Removed (SEIR), and others, involving arbitrary mitigation policies and multiple-disease models. Users can specify infectiousness/susceptibility rates as a function of agents' features, providing great complexity for the model dynamics. Furthermore, epiworldR is ideal for simulation studies featuring large populations.
This package provides a comprehensive suite of functions for processing and visualizing taxonomic data. It includes functionality to clean and transform taxonomic data, categorize it into hierarchical ranks (such as Phylum, Class, Order, Family, and Genus), and calculate the relative abundance of each category. The package also generates a color palette for visual representation of the taxonomic data, allowing users to easily identify and differentiate between various taxonomic groups. Additionally, it features a river plot visualization that displays the distribution of individuals across different taxonomic ranks.
This package contains miscellaneous functions to manage NetCDF files (see <https://en.wikipedia.org/wiki/NetCDF>); get the moon phase, sunrise and sunset times, and tide level; analyse and reconstruct periodic time series of temperature with an irregular sinusoidal pattern; show scales and a wind rose in a plot with a change of text color; run the Metropolis-Hastings algorithm for Bayesian MCMC analysis; plot graphs or boxplots with error bars; search files on disk by their names or their content; and read the contents of all files in a folder at once.
Automate the detection of gaps and elevations in mapped sequencing read coverage using a 2D pattern-matching algorithm. ProActive detects, characterizes and visualizes read coverage patterns in both genomes and metagenomes. Optionally, users may provide gene annotations associated with their genome or metagenome in the form of a .gff file. In this case, ProActive will generate an additional output table containing the gene annotations found within the detected regions of gapped and elevated read coverage. Additionally, users can search for gene annotations of interest in the output read coverage plots.
Machine learning provides algorithms that can learn from data and make inferences or predictions. Stochastic automata are a class of input/output devices which can model components. This work provides an implementation of an inference algorithm for stochastic automata which is similar to the Viterbi algorithm. Moreover, we specify a learning algorithm using the expectation-maximization technique and provide a more efficient implementation of the Baum-Welch algorithm for stochastic automata. This work is based on "Inference and learning in stochastic automata" by Karl-Heinz Zimmermann (2017) <doi:10.12732/ijpam.v115i3.15>.
Efficient Markov chain Monte Carlo (MCMC) algorithms for fully Bayesian estimation of time-varying parameter models with shrinkage priors, both dynamic and static. Details on the algorithms used are provided in Bitto and Frühwirth-Schnatter (2019) <doi:10.1016/j.jeconom.2018.11.006>, Cadonna et al. (2020) <doi:10.3390/econometrics8020020>, and Knaus and Frühwirth-Schnatter (2023) <doi:10.48550/arXiv.2312.10487>. For details on the package, please see Knaus et al. (2021) <doi:10.18637/jss.v100.i13>. For the multivariate extension, see the shrinkTVPVAR package.
This package provides a Shiny app for visual exploration of omic datasets as compositions, and differential abundance analysis using ALDEx2. Useful for exploring RNA-seq, meta-RNA-seq, and 16S rRNA gene sequencing data with visualizations such as principal component analysis biplots (coloured using metadata for visualizing each variable), dendrograms and stacked bar plots, and effect plots (ALDEx2). Input is a table of counts and a metadata file (if metadata exists), with options to filter data by count or by metadata to remove low counts, or to visualize select samples according to selected metadata.
SpotClean is a computational method to adjust for spot swapping in spatial transcriptomics data. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind mRNA. Ideally, unique molecular identifiers at a spot measure spot-specific expression, but this is often not the case due to bleed from nearby spots, an artifact we refer to as spot swapping. SpotClean is able to estimate the contamination rate in observed data and decontaminate the spot swapping effect, thus increasing the sensitivity and precision of downstream analyses.
This package provides functions for importing external vector images and drawing them as part of R plots. This package is different from the grImport package because, where that package imports PostScript format images, this package imports SVG format images. Furthermore, this package imports a specific subset of SVG, so external images must be preprocessed using a package like rsvg to produce SVG that this package can import. SVG features that are not supported by R graphics, such as gradient fills, can be imported and then exported via the gridSVG package.
It offers simplified access to Brazilian macroeconomic and financial indicators selected from official sources, such as the IBGE (Brazilian Institute of Geography and Statistics) via the SIDRA API and the Central Bank of Brazil via the SGS API. It allows users to quickly retrieve and visualize data series such as the unemployment rate and the Selic interest rate. This package was developed for data access and visualization purposes, without generating forecasts or statistical results. For more information, see the official APIs: <https://sidra.ibge.gov.br/> and <https://dadosabertos.bcb.gov.br/dataset/>.
This package provides a simple tool to quantify the amount of transmission of an infectious disease of interest occurring within and between population groups. bumblebee uses counts of observed directed transmission pairs, identified phylogenetically from deep-sequence data or from epidemiological contacts, to quantify transmission flows within and between population groups accounting for sampling heterogeneity. Population groups might include: geographical areas (e.g. communities, regions), demographic groups (e.g. age, gender) or arms of a randomized clinical trial. See the bumblebee website for statistical theory, documentation and examples <https://magosil86.github.io/bumblebee/>.
Color values in R are often represented as strings of hexadecimal colors or named colors. This package offers fast conversion of these color representations to either an array of red/green/blue/alpha values or to the packed integer format used in native raster objects. Functions for conversion are also exported at the C level for use in other packages. This fast conversion of colors is implemented using an order-preserving minimal perfect hash derived from Majewski et al (1996) "A Family of Perfect Hashing Methods" <doi:10.1093/comjnl/39.6.547>.
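The conversion of hexadecimal color strings described above can be sketched as follows. This is an illustration only: it covers hexadecimal strings, not named colors or the perfect-hash lookup, the function names are hypothetical, and the byte layout of the packed integer (red in the lowest byte, alpha in the highest) is an assumption about the native raster format rather than something stated in the description.

```python
def hex_to_rgba(color):
    """Convert '#RRGGBB' or '#RRGGBBAA' to a (red, green, blue, alpha) tuple."""
    digits = color.lstrip("#")
    if len(digits) == 6:
        digits += "ff"  # treat a missing alpha channel as fully opaque
    if len(digits) != 8:
        raise ValueError(f"unsupported color string: {color!r}")
    return tuple(int(digits[i:i + 2], 16) for i in range(0, 8, 2))


def pack_rgba(r, g, b, a):
    """Pack four 0-255 channels into one integer.

    Assumed layout: red in the lowest byte, alpha in the highest.
    """
    return r | (g << 8) | (b << 16) | (a << 24)


rgba = hex_to_rgba("#FF8000")  # (255, 128, 0, 255)
packed = pack_rgba(*rgba)      # 0xFF0080FF under the assumed layout
```

Python integers are unbounded, so `packed` is shown as an unsigned value; a language with signed 32-bit integers would represent the same bits as a negative number whenever alpha is 128 or higher.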
This package provides fast moving-window ("focal") and buffer-based extraction for raster data using the terra package. Automatically selects between a C++ backend (via terra) and a Fast Fourier Transform (FFT) backend depending on problem size. The FFT backend supports sum and mean, while other statistics (e.g., median, min, max, standard deviation) are handled by the terra backend. Supports multiple kernel types (e.g., circle, rectangle, gaussian), with NA handling consistent with terra via na.rm and na.policy. Operates on SpatRaster objects and returns results with the same geometry.