An ASCII ruler is for measuring text and is especially useful for sequence analysis. Included in this package are methods to create ASCII rulers and associated GenBank sequence blocks, multi-column text displays that make it easy for viewers to locate nucleotides by position.
This package implements methods for sample size reduction within Linear and Quadratic Discriminant Analysis in Lapanowski and Gaynanova (2020) <arXiv:2005.03858>. Also includes methods for non-linear discriminant analysis with simultaneous sparse feature selection in Lapanowski and Gaynanova (2019) PMLR 89:1704-1713.
Flexible and efficient cleaning of data with interactivity. datacleanr facilitates best practices in data analyses and reproducibility with built-in features and by translating interactive/manual operations to code. The package is designed for interoperability, and so seamlessly fits into reproducible analysis pipelines in R.
This package provides methods for efficient algebraic operations and factorization of dyadic matrices using Rcpp and RcppArmadillo. The details of dyadic matrices and the corresponding methodology are described in Kos, M., Podgórski, K., and Wu, H. (2025) <doi:10.48550/arXiv.2505.08144>.
This package provides a disk-based data manipulation tool for working with larger-than-RAM datasets. It aims to lower the barrier to entry for manipulating large datasets by adhering closely to popular and familiar data manipulation paradigms like dplyr verbs and data.table syntax.
The goal of forstringr is to enable complex string manipulation in R, especially for those more familiar with the LEFT(), RIGHT(), and MID() functions in Microsoft Excel. The package combines the power of stringr with other manipulation packages such as dplyr and tidyr.
Grows families of features by selecting features that maximize a weighted score calculated from empirical feature scores and graphical knowledge. The final weighted score for a feature is determined by summing a feature's family-weighted scores across all families in which the feature appears.
Graph clustering using an agglomerative algorithm to maximize the integrated classification likelihood criterion and a mixture of stochastic block models. The method is described in the article "Model-based clustering of multiple networks with a hierarchical algorithm" by T. Rebafka (2022) <arXiv:2211.02314>.
This package provides functions and classes to compute, handle and visualise incidence from dated events for a defined time interval. Dates can be provided in various standard formats. The class incidence2 is used to store computed incidence and can be easily manipulated, subsetted, and plotted.
This is an extension package to logrx, which is a log creation program focused on Clinical Reporting within the Pharma Industry. This package enables a simple shiny-based Add-in that provides a point and click interface to produce a log for a single program.
This package provides a collection of moment-matching methods for computing the cumulative distribution function of a positively-weighted sum of chi-squared random variables. Methods include the Satterthwaite-Welch method, Hall-Buckley-Eagleson method, Wood's F method, and the Lindsay-Pilla-Basak method.
Regression methods for the meta-SDT model. The package implements methods for cognitive experiments of metacognition as described in Kristensen, S. B., Sandberg, K., & Bibby, B. M. (2020). Regression methods for metacognitive sensitivity. Journal of Mathematical Psychology, 94. <doi:10.1016/j.jmp.2019.102297>.
In this implementation of the Naive Bayes classifier, the following class conditional distributions are available: Bernoulli, Categorical, Gaussian, Poisson, Multinomial, and a non-parametric representation of the class conditional density estimated via Kernel Density Estimation. Implemented classifiers handle missing data and can take advantage of sparse data.
Implementation of the PsychroLib <https://github.com/psychrometrics/psychrolib> library, which contains functions to enable the calculation of properties of moist and dry air in both metric (SI) and imperial (IP) systems of units. References: Meyer, D. and Thevenard, D. (2019) <doi:10.21105/joss.01137>.
This package performs minimax linkage hierarchical clustering. Every cluster has an associated prototype element that represents that cluster as described in Bien, J., and Tibshirani, R. (2011), "Hierarchical Clustering with Prototypes via Minimax Linkage," The Journal of the American Statistical Association, 106(495), 1075-1084.
Input/Output, processing and visualization of spectra taken with different spectrometers, including SVC (Spectra Vista), ASD and PSR (Spectral Evolution). Implements an S3 class spectra that other packages can build on. Provides methods to access, plot, manipulate, splice sensor overlap, vector normalize and smooth spectra.
This package provides a pipeline to perform small area estimation and prevalence mapping of binary indicators using health and demographic survey data, described in Fuglstad et al. (2022) <doi:10.48550/arXiv.2110.09576> and Wakefield et al. (2020) <doi:10.1111/insr.12400>.
This package provides an intuitive interface for working with competing risk endpoints. The package wraps the cmprsk package, and exports functions for univariate cumulative incidence estimates and competing risk regression. Methods follow those introduced in Fine and Gray (1999) <doi:10.1002/sim.7501>.
Converts XML documents to R dataframes and dataframes to XML documents. A wide variety of options allows for different XML formats and flexible control of the conversion process. Results can be exported to CSV and Excel, if desired. Also converts XML data to R lists.
This package provides an mlr3 extension with various resampling-based confidence interval (CI) methods for estimating the generalization error. These CI methods are implemented as mlr3 measures, enabling the evaluation of individual algorithms on specific tasks as well as the comparison of different learning algorithms.
Single cell RNA sequencing datasets can be large, consisting of matrices that contain expression data for several thousand features across several thousand cells. This package is designed to easily install, manage, and learn about various single-cell datasets, provided as Seurat objects and distributed as independent packages.
This package provides a variety of descriptive multivariate analyses with the singular value decomposition, such as principal components analysis, correspondence analysis, and multidimensional scaling. See "An ExPosition of the Singular Value Decomposition in R" (Beaton et al., 2014) <doi:10.1016/j.csda.2013.11.006>.
Clircle provides a cross-platform API to detect read or write cycles from your user-supplied arguments. You can get the important identifiers of a file (from a path) and for all three stdio streams, if they are piped from or to a file as well.