ChemmineOB
provides an R interface to a subset of cheminformatics functionalities implemented by the OpelBabel
C++ project. OpenBabel
is a free cheminformatics toolbox that includes utilities for structure format interconversions, descriptor calculations, compound similarity searching and more. ChemineOB
aims to make a subset of these utilities available from within R. For non-developers, ChemineOB
is primarily intended to be used from ChemmineR
as an add-on package rather than used directly.
Provide a sparse matrix format with data stored on disk, to be used in both R and C++. This is intended for more efficient use of sparse data in C++ and also when parallelizing, since data on disk does not need copying. Only a limited number of features will be implemented. For now, conversion can be performed from a dgCMatrix
or a dsCMatrix
from R package Matrix'. A new compact format is also now available.
One haplotype is a combination of SNP (Single Nucleotide Polymorphisms) within the QTL (Quantitative Trait Loci). clusterhap groups together all individuals of a population with the same haplotype. Each group contains individual with the same allele in each SNP, whether or not missing data. Thus, clusterhap groups individuals, that to be imputed, have a non-zero probability of having the same alleles in the entire sequence of SNP's. Moreover, clusterhap calculates such probability from relative frequencies.
This package provides a modified hierarchical test (Liu (2017) <doi:10.1214/17-AOS1539>) for detecting the structural difference between two Semiparametric Gaussian graphical models. The multiple testing procedure asymptotically controls the false discovery rate (FDR) at a user-specified level. To construct the test statistic, a truncated estimator is used to approximate the transformation functions and two R functions including lassoGGM()
and lassoNPN()
are provided to compute the lasso estimates of the regression coefficients.
This package provides a small set of tools for formatting numbers in R-markdown documents. Convert a numerical vector to character strings in power-of-ten form, decimal form, or measurement-units form; all are math-delimited for rendering as inline equations. Can also convert text into math-delimited text to match the font face and size of math-delimited numbers. Useful for rendering single numbers in inline R code chunks and for rendering columns in tables.
Reconstruct birth-year specific probabilities of immune imprinting to influenza A, using the methods of Gostic et al. (2016) <doi:10.1126/science.aag1322>. Plot, save, or export the calculated probabilities for use in your own research. By default, the package calculates subtype-specific imprinting probabilities, but with user-provided frequency data, it is possible to calculate probabilities for arbitrary kinds of primary exposure to influenza A, including primary vaccination and exposure to specific clades, strains, etc.
Enables small area estimation (SAE) of health and demographic indicators in low- and middle-income countries (LMICs). It powers an R shiny application for generating subnational estimates and prevalence maps of 150+ binary indicators from Demographic and Health Surveys (DHS). It builds on the SAE analysis workflow from the surveyPrev
package. For documentation, visit <https://sae4health.stat.uw.edu/>. Methodological details can be found at Wu et al. (2025) <doi:10.48550/arXiv.2505.01467>
.
EdDSA over Curve25519 is specified in RFC 8032. This package contains an ed25519::Signature type which other packages can use in conjunction with the signature::Signer and signature::Verifier traits. It doesn't contain an implementation of Ed25519.
These traits allow packages which produce and consume Ed25519 signatures to be written abstractly in such a way that different signer/verifier providers can be plugged in, enabling support for using different Ed25519 implementations, including HSMs or Cloud KMS services.
EdDSA over Curve25519 is specified in RFC 8032. This package contains an ed25519::Signature type which other packages can use in conjunction with the signature::Signer and signature::Verifier traits. It doesn't contain an implementation of Ed25519.
These traits allow packages which produce and consume Ed25519 signatures to be written abstractly in such a way that different signer/verifier providers can be plugged in, enabling support for using different Ed25519 implementations, including HSMs or Cloud KMS services.
This package provides a number of functions to create and analyze factorial plans according to the Design of Experiments (DoE
) approach, with the addition of some utility function to perform some statistical analyses. DoE
approach follows the approach in "Design and Analysis of Experiments" by Douglas C. Montgomery (2019, ISBN:978-1-119-49244-3). The package also provides utilities used in the course "Analysis of Data and Statistics" at the University of Trento, Italy.
Blocks units into experimental blocks, with one unit per treatment condition, by creating a measure of multivariate distance between all possible pairs of units. Maximum, minimum, or an allowable range of differences between units on one variable can be set. Randomly assign units to treatment conditions. Diagnose potential interference between units assigned to different treatment conditions. Write outputs to .tex and .csv files. For more information on the methods implemented, see Moore (2012) <doi:10.1093/pan/mps025>.
Processing and analyzing omics data from genomics, transcriptomics, proteomics, and metabolomics platforms. It provides functions for preprocessing, normalization, visualization, and statistical analysis, as well as machine learning algorithms for predictive modeling. omicsTools
is an essential tool for researchers working with high-throughput omics data in fields such as biology, bioinformatics, and medicine.The QC-RLSC (quality controlâ based robust LOESS signal correction) algorithm is used for normalization. Dunn et al. (2011) <doi:10.1038/nprot.2011.335>.
ProTracker
is a popular music tracker to sequence music on a Commodore Amiga machine. This package offers the opportunity to import, export, manipulate and play ProTracker
module files. Even though the file format could be considered archaic, it still remains popular to this date. This package intends to contribute to this popularity and therewith keeping the legacy of ProTracker
and the Commodore Amiga alive. This package is the successor of ProTrackR
providing better performance.
Standardized accuracy (staccuracy) is a framework for expressing accuracy scores such that 50% represents a reference level of performance and 100% is a perfect prediction. The staccuracy package provides tools for creating staccuracy functions as well as some recommended staccuracy measures. It also provides functions for some classic performance metrics such as mean absolute error (MAE), root mean squared error (RMSE), and area under the receiver operating characteristic curve (AUCROC), as well as their winsorized versions when applicable.
ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices. ParMETIS extends the functionality provided by METIS and includes routines that are especially suited for parallel AMR computations and large scale numerical simulations. The algorithms implemented in ParMETIS are based on the parallel multilevel k-way graph-partitioning, adaptive repartitioning, and parallel multi-constrained partitioning schemes developed in our lab.
This package implements the hybrid framework for event prediction described in Fang & Zheng (2011, <doi:10.1016/j.cct.2011.05.013>). To estimate the survival function the event prediction is based on, a piecewise exponential hazard function is fit to the time-to-event data to infer the potential change points. Prior to the last identified change point, the survival function is estimated using Kaplan-Meier, and the tail after the change point is fit using piecewise exponential.
Multimodal mediation analysis is an emerging problem in microbiome data analysis. Multimedia make advanced mediation analysis techniques easy to use, ensuring that all statistical components are transparent and adaptable to specific problem contexts. The package provides a uniform interface to direct and indirect effect estimation, synthetic null hypothesis testing, bootstrap confidence interval construction, and sensitivity analysis. More details are available in Jiang et al. (2024) "multimedia: Multimodal Mediation Analysis of Microbiome Data" <doi:10.1101/2024.03.27.587024>.
This package provides a basic interface for accessing annotation data from the Multi-CAST collection, a database of spoken natural language texts edited by Geoffrey Haig and Stefan Schnell. The collection draws from a diverse set of languages and has been annotated across multiple levels. Annotation data is downloaded on request from the servers of the University of Bamberg. See the Multi-CAST website <https://multicast.aspra.uni-bamberg.de/> for more information and a list of related publications.
Estimates micro effects on macro structures (MEMS) and average micro mediated effects (AMME). URL: <https://github.com/sduxbury/netmediate>. BugReports
: <https://github.com/sduxbury/netmediate/issues>. Robins, Garry, Phillipa Pattison, and Jodie Woolcock (2005) <doi:10.1086/427322>. Snijders, Tom A. B., and Christian E. G. Steglich (2015) <doi:10.1177/0049124113494573>. Imai, Kosuke, Luke Keele, and Dustin Tingley (2010) <doi:10.1037/a0020761>. Duxbury, Scott (2023) <doi:10.1177/00811750231209040>. Duxbury, Scott (2024) <doi:10.1177/00811750231220950>.
The function generates and plots random snowflakes. Each snowflake is defined by a given diameter, width of the crystal, color, and random seed. Snowflakes are plotted in such way that they always remain round, no matter what the aspect ratio of the plot is. Snowflakes can be created using transparent colors, which creates a more interesting, somewhat realistic, image. Images of the snowflakes can be separately saved as svg files and used in websites as static or animated images.
Fast, lightweight toolkit for data splitting. Data sets can be partitioned into disjoint groups (e.g. into training, validation, and test) or into (repeated) k-folds for subsequent cross-validation. Besides basic splits, the package supports stratified, grouped as well as blocked splitting. Furthermore, cross-validation folds for time series data can be created. See e.g. Hastie et al. (2001) <doi:10.1007/978-0-387-84858-7> for the basic background on data partitioning and cross-validation.
With given inputs that include number of points, discrete design space, a measure of skewness, models and parameter value, this package calculates the objective value, optimal designs and plot the equivalence theory under A- and D-optimal criteria under the second-order Least squares estimator. This package is based on the paper "Properties of optimal regression designs under the second-order least squares estimator" by Chi-Kuang Yeh and Julie Zhou (2021) <doi:10.1007/s00362-018-01076-6>.
Privacy protected raster maps can be created from spatial point data. Protection methods include smoothing of dichotomous variables by de Jonge and de Wolf (2016) <doi:10.1007/978-3-319-45381-1_9>, continuous variables by de Wolf and de Jonge (2018) <doi:10.1007/978-3-319-99771-1_23>, suppressing revealing values and a generalization of the quad tree method by Suñé, Rovira, Ibáñez and Farré (2017) <doi:10.2901/EUROSTAT.C2017.001>.
The h5Seurat file format is specifically designed for the storage and analysis of multi-modal single-cell and spatially-resolved expression experiments, for example, from CITE-seq or 10X Visium technologies. It holds all molecular information and associated metadata, including (for example) nearest-neighbor graphs, dimensional reduction information, spatial coordinates and image data, and cluster labels. This package also supports rapid and on-disk conversion between h5Seurat and AnnData objects, with the goal of enhancing interoperability between Seurat and Scanpy.