This package implements the truncated harmonic mean estimator (THAMES) of the reciprocal marginal likelihood for uni- and multivariate mixture models using posterior samples and unnormalized log posterior values via reciprocal importance sampling. Metodiev, Irons, Perrot-Dockès, Latouche & Raftery (2025) <doi:10.48550/arXiv.2504.21812>.
Disaggregates low frequency time series data to higher frequency series. Implements the following methods for temporal disaggregation: Boot, Feibes and Lisman (1967) <DOI:10.2307/2985238>, Chow and Lin (1971) <DOI:10.2307/1928739>, Fernandez (1981) <DOI:10.2307/1924371> and Litterman (1983) <DOI:10.2307/1391858>.
Non-imputational method for handling missing values in a prediction context, meaning that not only are there missing values in the training dataset, but also some values may be missing in future cases to be predicted. Based on the notion of regression averaging (Matloff (2017, ISBN: 9781498710916)).
The Gene Expression Omnibus (<https://www.ncbi.nlm.nih.gov/geo/>) and The Cancer Genome Atlas (<https://portal.gdc.cancer.gov/>) are widely used medical public databases. Our platform integrates routine analysis and visualization tools for expression data to provide concise and intuitive data analysis and presentation.
This package provides new classes for (rotated) BB1, BB6, BB7, BB8, and Tawn copulas, extends the existing Gumbel and Clayton families with rotations, and allows setting up a vine copula model using the copula API. Corresponding objects from the VineCopula API can easily be converted.
In this package, a Hidden Semi Markov Model (HSMM) and one homogeneous segmentation model are designed and implemented for segmenting genomic data, with the aim of assisting in transcript detection using high-throughput technology like RNA-seq or tiling arrays, and copy number analysis using aCGH or sequencing.
This package provides plotting functions for posterior analysis, model checking, and MCMC diagnostics. The package is designed not only to provide convenient functionality for users, but also to offer a common set of functions that can be easily used by developers working on a variety of R packages for Bayesian modeling.
spacetime provides classes and methods for spatio-temporal data, including space-time regular lattices, sparse lattices, irregular data, and trajectories; utility functions for plotting data as map sequences (lattice or animation) or multiple time series; methods for spatial and temporal matching or aggregation, retrieving coordinates, print, summary, etc.
This package contains a number of common astronomy conversion routines, particularly for the HMS and degrees schemes, which can be fiddly to convert between en masse due to the textual nature of the former. It allows users to coordinate-match datasets quickly. It also contains functions for various cosmological calculations.
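The HMS-to-degrees conversion mentioned above can be sketched as follows. This is a minimal illustration in Python, not the package's own code: the function names `hms_to_degrees`, `parse_hms`, and `degrees_to_hms` are hypothetical, and the colon-separated input format is an assumption. It relies only on the fact that 24 hours of right ascension span 360 degrees, so one hour corresponds to 15 degrees.

```python
def hms_to_degrees(hours, minutes, seconds):
    """Convert right ascension in hours/minutes/seconds to decimal degrees.

    24 h correspond to 360 degrees, so 1 h = 15 degrees.
    """
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def parse_hms(text):
    """Parse a string such as "06:30:00" (assumed colon-separated) to degrees."""
    h, m, s = text.split(":")
    return hms_to_degrees(float(h), float(m), float(s))


def degrees_to_hms(degrees):
    """Convert decimal degrees back to an (hours, minutes, seconds) tuple."""
    total_hours = (degrees % 360.0) / 15.0
    hours = int(total_hours)
    remainder_minutes = (total_hours - hours) * 60.0
    minutes = int(remainder_minutes)
    seconds = (remainder_minutes - minutes) * 60.0
    return hours, minutes, seconds


# Converting a whole column "en masse" is then a simple comprehension.
ra_strings = ["00:00:00", "06:30:00", "12:00:00"]
ra_degrees = [parse_hms(s) for s in ra_strings]
```

Because the HMS form is text, the parsing step is where most of the fiddliness sits; the arithmetic itself is a single scaling.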
This package provides tools to create pretty tables for HTML documents and other formats. Functions are provided to let users create tables and modify and format their content. It extends the officer package and can be used within R markdown documents when rendering to HTML and to Word documents.
Doom Runner is yet another launcher of common Doom source ports (e.g. GZDoom, Zandronum, PrBoom) with a graphical user interface. It is written in C++ and Qt, and it is designed around the idea of presets for various multi-file modifications to allow one-click switching between them.
This Python module enables remote procedure calls, clustering, and distributed computing. For this purpose, it makes use of object proxying, a technique that employs Python's dynamic nature to overcome the physical boundaries between processes and computers, so that remote objects can be manipulated as if they were local.
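The object-proxying idea described above can be illustrated with a small sketch. This is not the module's API: the names `Proxy`, `make_local_transport`, and `Counter` are hypothetical, and the "transport" here simply calls a local object, whereas a real implementation would serialize the call and send it to another process or machine.

```python
class Proxy:
    """Stand-in object that forwards every method call through a transport."""

    def __init__(self, call):
        # `call` is any callable taking (method_name, args, kwargs).
        self._call = call

    def __getattr__(self, name):
        # Invoked only for attributes not found normally, i.e. the
        # "remote" methods. Return a function that forwards the call.
        def method(*args, **kwargs):
            return self._call(name, args, kwargs)
        return method


class Counter:
    """Example target object that would live in another process."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


def make_local_transport(target):
    """Hypothetical transport: dispatches directly to a local object.

    A networked version would send (name, args, kwargs) over a socket
    and return the deserialized reply instead.
    """
    def call(name, args, kwargs):
        return getattr(target, name)(*args, **kwargs)
    return call


# The caller uses the proxy exactly like the real object.
counter = Proxy(make_local_transport(Counter()))
first = counter.add(2)
second = counter.add(3)
```

The caller never needs to know where the target lives; swapping the transport changes the location of the object without changing the calling code.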
Generate SuperSigs (supervised mutational signatures) from single nucleotide variants in the cancer genome. Functions included in the package allow the user to learn supervised mutational signatures from their data and apply them to new data. The methodology is based on the one described in Afsari (2021, eLife).
Read and manipulate Camera Trap Data Packages ('Camtrap DP'). Camtrap DP (<https://camtrap-dp.tdwg.org>) is a data exchange format for camera trap data. With camtrapdp you can read, filter and transform data (including to Darwin Core) before further analysis in e.g. camtraptor or camtrapR.
Researchers carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated.
Uses species occupancy at coarse grain sizes to predict species occupancy at fine grain sizes. Ten models are provided to fit and extrapolate the occupancy-area relationship, as well as methods for preparing atlas data for modelling. See Marsh et al. (2018) <doi:10.18637/jss.v086.c03>.
This package provides a common interface for applying dimensionality reduction methods, such as Principal Component Analysis ('PCA'), Independent Component Analysis ('ICA'), diffusion maps, Locally-Linear Embedding ('LLE'), t-distributed Stochastic Neighbor Embedding ('t-SNE'), and Uniform Manifold Approximation and Projection ('UMAP'). Has built-in support for sparse matrices.
This package provides a consistent set of functions for enriching and analyzing sovereign-level economic data. Economists, data scientists, and financial professionals can use the package to add standardized identifiers, demographic and macroeconomic indicators, and derived metrics such as gross domestic product per capita or government expenditure shares.
Stores small spatial datasets used to teach basic spatial analysis concepts. Datasets are based on the GeoDa software workbook and data site <https://geodacenter.github.io/data-and-lab/> developed by Luc Anselin and team at the University of Chicago. Datasets are stored as sf objects.
Given a high-dimensional dataset that typically represents a cytometry dataset, and a subset of the datapoints, this algorithm outputs a hyperrectangle so that datapoints within the hyperrectangle best correspond to the specified subset. In essence, this allows the output of clustering algorithms to be converted into gating strategies.
Quick indexation of any type of vector or of any combination of those. Indexation turns a vector into an integer vector going from 1 to the number of unique elements. Indexes are important building blocks for many algorithms. The method is described at <https://github.com/lrberge/indexthis/>.
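The indexation described above can be sketched in a few lines. This is an illustrative Python version, not the package's implementation: the function name `index_by_first_appearance` is hypothetical, and numbering values in order of first appearance is an assumption (a sorted numbering would be equally valid under the description).

```python
def index_by_first_appearance(values):
    """Map each element to an integer from 1 to the number of unique elements.

    Assumption: codes are assigned in order of first appearance.
    """
    codes = {}
    index = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes) + 1  # next unused integer, 1-based
        index.append(codes[value])
    return index


# A single vector.
letters = ["u", "a", "a", "s", "u", "u"]
letter_index = index_by_first_appearance(letters)

# A combination of vectors: index the tuples formed element by element.
years = [2020, 2020, 2021, 2021, 2020, 2021]
combined_index = index_by_first_appearance(list(zip(letters, years)))
```

Indexing a combination of vectors reduces to indexing their element-wise tuples, which is why the same routine covers both cases.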
Compute several variations of the Implicit Association Test (IAT) scores, including the D scores (Greenwald, Nosek, Banaji, 2003, <doi:10.1037/0022-3514.85.2.197>) and the new scores that were developed using robust statistics (Richetin, Costantini, Perugini, and Schonbrodt, 2015, <doi:10.1371/journal.pone.0129601>).
Clustering or classification of longitudinal data based on a mixture of multivariate t or Gaussian distributions with a Cholesky-decomposed covariance structure. Details in McNicholas and Murphy (2010) <doi:10.1002/cjs.10047> and McNicholas and Subedi (2012) <doi:10.1016/j.jspi.2011.11.026>.
This package provides a variety of association tests for microbiome data analysis including Quasi-Conditional Association Tests (QCAT) described in Tang Z.-Z. et al. (2017) <doi:10.1093/bioinformatics/btw804> and Zero-Inflated Generalized Dirichlet Multinomial (ZIGDM) tests described in Tang Z.-Z. & Chen G. (2017, submitted).