Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.
It is designed to work with text written in Bahasa Malaysia. We provide functions and data sets that will make working with Bahasa Malaysia text much easier. For word stemming in particular, we will look up the Malay words in a dictionary and then proceed to remove "extra suffix" as explained in Khan, Rehman Ullah, Fitri Suraya Mohamad, Muh Inam UlHaq, Shahren Ahmad Zadi Adruce, Philip Nuli Anding, Sajjad Nawaz Khan, and Abdulrazak Yahya Saleh Al-Hababi (2017) <https://ijrest.net/vol-4-issue-12.html>. This package includes a dictionary of Malay words that may be used to perform word stemming, a dataset of Malay stop words, a dataset of sentiment words and a dataset of normalized words.
This package contains efficient implementations of Discrete Optimal Transport algorithms for the computation of Kantorovich-Wasserstein distances between pairs of large spatial maps (Bassetti, Gualandi, Veneroni (2020), <doi:10.1137/19M1261195>). All the algorithms are based on an ad-hoc implementation of the Network Simplex algorithm. The package has four main helper functions: compareOneToOne() (to compare two spatial maps), compareOneToMany() (to compare a reference map with a list of other maps), compareAll() (to compute a matrix of distances between a list of maps), and focusArea() (to compute the KWD distance within a focus area). In non-convex maps, the helper functions first build the convex-hull of the input bins and pad the weights with zeros.
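As a rough illustration of what a Kantorovich-Wasserstein distance measures, the sketch below computes the order-1 distance between two histograms on a one-dimensional grid with unit bin spacing. This is a deliberate simplification: the package itself works on two-dimensional spatial maps with a Network Simplex solver, and the function name `wasserstein_1d` is hypothetical, not part of the package.

```python
def wasserstein_1d(weights_a, weights_b):
    """Order-1 Wasserstein distance between two 1D histograms.

    Assumes both histograms share the same bins, spaced one unit apart.
    Weights are normalised to sum to one before comparison.
    """
    if len(weights_a) != len(weights_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(weights_a), sum(weights_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each histogram needs positive total weight")

    cumulative_gap = 0.0  # running difference between the two CDFs
    distance = 0.0
    for a, b in zip(weights_a, weights_b):
        cumulative_gap += a / total_a - b / total_b
        # Mass equal to |cumulative_gap| must cross into the next bin.
        distance += abs(cumulative_gap)
    return distance
```

In one dimension the optimal transport cost reduces to the area between the two cumulative distributions, which is why no linear programme is needed here; in two dimensions that shortcut no longer holds.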
This is a package for normalization, testing for differential variability and differential methylation and gene set testing for data from Illumina's Infinium HumanMethylation arrays. The normalization procedure is subset-quantile within-array normalization (SWAN), which allows Infinium I and II type probes on a single array to be normalized together. The test for differential variability is based on an empirical Bayes version of Levene's test. Differential methylation testing is performed using RUV, which can adjust for systematic errors of unknown origin in high-dimensional data by using negative control probes. Gene ontology analysis is performed by taking into account the number of probes per gene on the array, as well as taking into account multi-gene associated probes.
This package implements comprehensive test data engineering methods as described in Shojima (2022, ISBN:978-9811699856). Provides statistical techniques for engineering and processing test data: Classical Test Theory (CTT) with reliability coefficients for continuous ability assessment; Item Response Theory (IRT) including Rasch, 2PL, and 3PL models with item/test information functions; Latent Class Analysis (LCA) for nominal clustering; Latent Rank Analysis (LRA) for ordinal clustering with automatic determination of cluster numbers; Biclustering methods including infinite relational models for simultaneous clustering of examinees and items without predefined cluster numbers; and Bayesian Network Models (BNM) for visualizing inter-item dependencies. Features local dependence analysis through LRA and biclustering, parameter estimation, dimensionality assessment, and network structure visualization for educational, psychological, and social science research.
When added to an existing shiny app, users may subset any developer-chosen R data.frame on the fly. That is, users are empowered to slice & dice data by applying multiple (order specific) filters using the AND (&) operator between each, and getting real-time updates on the number of rows affected/available along the way. Thus, any downstream processes that leverage this data source (like tables, plots, or statistical procedures) will re-render after new filters are applied. The shiny module's user interface has a minimalist aesthetic so that the focus can be on the data & other visuals. In addition to returning a reactive (filtered) data.frame, IDEAFilter also returns the dplyr filter statements used to actually slice the data.
Manages, builds and computes statistics and datasets for the construction of quarterly (sub-annual) life tables by exploiting micro-data from either a general or an insured population. References: Pavía and Lledó (2022) <doi:10.1111/rssa.12769>. Pavía and Lledó (2023) <doi:10.1017/asb.2023.16>. Pavía and Lledó (2025) <doi:10.1371/journal.pone.0315937>. Acknowledgements: The authors wish to thank Conselleria de Educación, Universidades y Empleo, Generalitat Valenciana (grants AICO/2021/257; CIAICO/2024/031), Ministerio de Ciencia e Innovación (grant PID2021-128228NB-I00) and Fundación Mapfre (grant 'Modelización espacial e intra-anual de la mortalidad en España. Una herramienta automática para el calculo de productos de vida') for supporting this research.
Sample surveys use scientific methods to draw inferences about population parameters by observing a representative part of the population, called a sample. SRSWOR (Simple Random Sampling Without Replacement) is one of the most widely used probability sampling designs, wherein every unit has an equal chance of being selected and units are not repeated. This function draws multiple SRSWOR samples from a finite population and estimates the population total using the Horvitz-Thompson (HT), ratio, and regression estimators. Repeated simulations (e.g., 500 times) are used to assess and compare the estimators using metrics such as percent relative bias (%RB) and percent relative root mean square error (%RRMSE). For details on sampling methodology, see Cochran (1977) "Sampling Techniques" <https://archive.org/details/samplingtechniqu0000coch_t4x6>.
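The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `srswor_simulation` and its arguments are hypothetical, a single auxiliary variable `x` with known population total is assumed, and %RB and %RRMSE are computed relative to the true total.

```python
import random
import statistics


def srswor_simulation(y, x, n, reps=500, seed=1):
    """Compare HT, ratio and regression estimators of the total of y.

    y, x : study and auxiliary values for every unit of the finite population
    n    : sample size drawn by SRSWOR in each replicate
    """
    rng = random.Random(seed)
    N = len(y)
    true_total = sum(y)
    x_total = sum(x)  # auxiliary total, assumed known
    estimates = {"HT": [], "ratio": [], "regression": []}

    for _ in range(reps):
        idx = rng.sample(range(N), n)  # SRSWOR: no unit is repeated
        ys = [y[i] for i in idx]
        xs = [x[i] for i in idx]
        ybar, xbar = statistics.fmean(ys), statistics.fmean(xs)

        # Under SRSWOR every inclusion probability is n/N, so HT = N * ybar.
        estimates["HT"].append(N * ybar)
        estimates["ratio"].append(ybar / xbar * x_total)

        # Regression estimator with the sample least-squares slope.
        sxy = sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
        sxx = sum((a - xbar) ** 2 for a in xs)
        slope = sxy / sxx
        estimates["regression"].append(N * (ybar + slope * (x_total / N - xbar)))

    summary = {}
    for name, est in estimates.items():
        bias = statistics.fmean(est) - true_total
        mse = statistics.fmean([(e - true_total) ** 2 for e in est])
        summary[name] = {
            "RB": 100 * bias / true_total,
            "RRMSE": 100 * mse ** 0.5 / true_total,
        }
    return summary
```

When `y` is strongly related to `x`, the ratio and regression estimators would be expected to show a smaller %RRMSE than the HT estimator, which is the kind of comparison the simulation is meant to expose.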
Datasets to accompany Harman, a PCA and constrained optimisation based technique. Contains three example datasets: IMR90, human lung fibroblast cells exposed to nitric oxide; NPM, an experiment to test skin penetration of metal oxide nanoparticles following topical application of sunscreens in non-pregnant mice; and OLF, an experiment to gauge the response of human olfactory neurosphere-derived (hONS) cells to ZnO nanoparticles. Since version 1.24, this package also contains the Infinium5 dataset, a set of batch correction adjustments across 5 Illumina Infinium Methylation BeadChip datasets. This file does not contain methylation data, but summary statistics of 5 datasets after correction. There is also an EpiSCOPE_sample file as an example for the new methylation clustering functionality in Harman.
The design of this package allows us to run different clustering packages and compare the results between them, to determine which algorithm behaves best for the data provided. See Martos, L.A.P., García-Vico, Á.M., González, P. et al. (2023) <doi:10.1007/s13748-022-00294-2> "Clustering: an R library to facilitate the analysis and comparison of cluster algorithms.", Martos, L.A.P., García-Vico, Á.M., González, P. et al. "A Multiclustering Evolutionary Hyperrectangle-Based Algorithm" <doi:10.1007/s44196-023-00341-3> and Martos, L.A.P., García-Vico, Á.M., González, P. et al. "An Evolutionary Fuzzy System for Multiclustering in Data Streaming" <doi:10.1016/j.procs.2023.12.058>.
This package provides a comprehensive suite of helper functions designed to facilitate the analysis of genomic annotations from the GENCODE database <https://www.gencodegenes.org/>, supporting both human and mouse genomes. This toolkit enables users to extract, filter, and analyze a wide range of annotation features including genes, transcripts, exons, and introns across different GENCODE releases. It provides functionality for cross-version comparisons, allowing researchers to systematically track annotation updates, structural changes, and feature-level differences between releases. In addition, the package can generate high-quality FASTA files containing donor and acceptor splice site motifs, which are formatted for direct input into the MaxEntScan tool (Yeo and Burge, 2004 <doi:10.1089/1066527041410418>), enabling accurate calculation of splice site strength scores.
This package provides functions to estimate the disparities across categories (e.g. Black and white) that persist if a treatment variable (e.g. college) is equalized. Makes estimates by treatment modeling, outcome modeling, and doubly-robust augmented inverse probability weighting estimation, with standard errors calculated by a nonparametric bootstrap. Cross-fitting is supported. Survey weights are supported for point estimation but not for standard error estimation; those applying this package with complex survey samples should consult the data distributor to select an appropriate approach for standard error construction, which may involve calling the functions repeatedly for many sets of replicate weights provided by the data distributor. The methods in this package are described in Lundberg (2021) <doi:10.31235/osf.io/gx4y3>.
This package provides functions to estimate latent dimensions of choice and judgment using Aldrich-McKelvey and Blackbox scaling methods, as described in Poole et al. (2016, <doi:10.18637/jss.v069.i07>). These techniques allow researchers (particularly those analyzing political attitudes, public opinion, and legislative behavior) to recover spatial estimates of political actors' ideal points and stimuli from issue scale data, accounting for perceptual bias, multidimensional spaces, and missing data. The package uses singular value decomposition and alternating least squares (ALS) procedures to scale self-placement and perceptual data into a common latent space for the analysis of ideological or evaluative dimensions. Functionality also includes tools for assessing model fit, handling complex survey data structures, and reproducing simulated datasets for methodological validation.
This package provides a compilation of tests for hypotheses regarding covariance and correlation matrices for one or more groups. The hypothesis can be specified through a corresponding hypothesis matrix and a vector or by choosing one of the basic hypotheses; for the structure test, only the latter is available. Monte Carlo and bootstrap techniques are used, and the respective method must be chosen. The functions provide p-values and, in most cases, estimators of the calculated covariance matrices of the test statistics. For more details on the methodology, see Sattler et al. (2022) <doi:10.1016/j.jspi.2021.12.001>, Sattler and Pauly (2024) <doi:10.1007/s11749-023-00906-6>, and Sattler and Dobler (2025) <doi:10.48550/arXiv.2310.11799>.
Takes a distance matrix and plots it as an interactive graph. One point is focused at the center of the graph, around which all other points are plotted at their exact distances as given in the distance matrix. All other non-focus points are plotted as best as possible in relation to one another. Double click on any point to choose a new focus point, and hover over points to see their ID labels. If color label categories are given, hover over colors in the legend to highlight only those points and click on colors to highlight multiple groups. For more information on the rationale and mathematical background, as well as an interactive introduction, see <https://lea-urpa.github.io/focusedMDS.html>.
This is an add-on package to 'gamlss'. The purpose of this package is to allow users to fit GAMLSS (Generalised Additive Models for Location Scale and Shape) models when the response variable is defined either in the intervals [0,1), (0,1] and [0,1] (inflated at zero and/or one distributions), or in the positive real line including zero (zero-adjusted distributions). The mass points at zero and/or one are treated as extra parameters with the possibility to include a linear predictor for both. The package also allows transformed or truncated distributions from the GAMLSS family to be used for the continuous part of the distribution. Standard methods and GAMLSS diagnostics can be used with the resulting fitted object.
Provides tools for microRNA (miRNA) text mining. miRetrieve summarizes miRNA literature by extracting, counting, and analyzing miRNA names, thus aiming at gaining biological insights into a large amount of text within a short period of time. To do so, miRetrieve uses regular expressions to extract miRNAs and tokenization to identify meaningful miRNA associations. In addition, miRetrieve uses the latest miRTarBase version 8.0 (Hsi-Yuan Huang et al. (2020) "miRTarBase 2020: updates to the experimentally validated microRNA–target interaction database" <doi:10.1093/nar/gkz896>) to display field-specific miRNA-mRNA interactions. The most important functions are available as a Shiny web application under <https://miretrieve.shinyapps.io/miRetrieve/>.
This package provides a seamless interface to access diverse public data about Colombia through the 'API-Colombia', a RESTful API. The package enables users to explore various aspects of Colombia, including general information, geography, and cultural insights. It includes five API-related functions to retrieve data on topics such as Colombia's general information, airports, departments, regions, and presidents. Additionally, ColombiAPI offers a built-in function to view the datasets available within the package. The package also includes curated datasets covering Bogota air stations, business and holiday dates, public schools, Colombian coffee exports, cannabis licenses, Medellin rainfall, and malls in Bogota, making it a comprehensive tool for exploring Colombia's data. For more details on the 'API-Colombia', see <https://api-colombia.com/>.
This package provides a suite of computer model test functions that can be used to test and evaluate algorithms for Bayesian (also known as sequential) optimization. Some of the functions have known functional forms, however, most are intended to serve as black-box functions where evaluation requires running computer code that reveals little about the functional forms of the objective and/or constraints. The primary goal of the package is to provide users (especially those who do not have access to real computer models) a source of reproducible and shareable examples that can be used for benchmarking algorithms. The package is a living repository, and so more functions will be added over time. For function suggestions, please do contact the author of the package.
Alternative implementation of the beautiful MissForest algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) <doi:10.1093/bioinformatics/btr597>. Under the hood, it uses the lightning fast random forest package 'ranger'. Between the iterative model fitting, we offer the option of using predictive mean matching. This firstly avoids imputation with values not already present in the original data (like a value 0.3334 in 0-1 coded variable). Secondly, predictive mean matching tries to raise the variance in the resulting conditional distributions to a realistic level. This would allow, e.g., to do multiple imputation when repeating the call to missRanger(). Out-of-sample application is supported as well.
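To make the predictive mean matching step concrete, here is a minimal sketch of the general idea. It is not the code used inside missRanger(): the function name `predictive_mean_matching`, its arguments and the choice of `k` are hypothetical, and the model predictions are taken as given rather than produced by a random forest.

```python
import random


def predictive_mean_matching(pred_missing, pred_observed, y_observed, k=3, seed=0):
    """Replace each raw prediction with an actually observed value.

    pred_missing  : model predictions for rows whose value is missing
    pred_observed : model predictions for rows whose value is observed
    y_observed    : the observed values for those same rows
    k             : number of nearest donor rows to draw from
    """
    rng = random.Random(seed)
    donor_ids = range(len(pred_observed))
    imputed = []
    for p in pred_missing:
        # The k observed rows whose predictions lie closest to p are the donors.
        nearest = sorted(donor_ids, key=lambda i: abs(pred_observed[i] - p))[:k]
        # Drawing one donor at random keeps realistic variance in the imputations.
        imputed.append(y_observed[rng.choice(nearest)])
    return imputed
```

Because every imputed value is copied from a donor row, a 0-1 coded variable can only receive 0 or 1, never an intermediate prediction such as 0.3334.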
This software has evolved from fisheries research conducted at the Pacific Biological Station (PBS) in Nanaimo, British Columbia, Canada. It extends the R language to include two-dimensional plotting features similar to those commonly available in a Geographic Information System (GIS). Embedded C code speeds algorithms from computational geometry, such as finding polygons that contain specified point events or converting between longitude-latitude and Universal Transverse Mercator (UTM) coordinates. Additionally, we include C++ code developed by Angus Johnson for the Clipper library, data for a global shoreline, and other data sets in the public domain. Under the user's R library directory '.libPaths()', specifically in './PBSmapping/doc', a complete user's guide is offered and should be consulted to use package functions effectively.
Providing equivalent functions for the dummy classifier and regressor used in Python scikit-learn library. Our goal is to allow R users to easily identify baseline performance for their classification and regression problems. Our baseline models use no predictors, and are useful in cases of class imbalance, multiclass classification, and when users want to quickly identify how much improvement their statistical and machine learning models are over several baseline models. We use a "better" default (proportional guessing) for the dummy classifier than the Python implementation ("prior", which is the most frequent class in the training set). The functions in the package can be used on their own, or introduce methods named dummy_regressor or dummy_classifier that can be used within the caret package pipeline.
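The difference between the two baseline strategies mentioned above can be sketched in a few lines. This is an illustrative, dependency-free sketch rather than the package's or scikit-learn's code; the function names below are hypothetical.

```python
import random
from collections import Counter


def fit_dummy_classifier(labels):
    """Learn class proportions from the training labels (no predictors used)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}


def predict_most_frequent(class_probs, n):
    """Always predict the most frequent training class."""
    top = max(class_probs, key=class_probs.get)
    return [top] * n


def predict_proportional(class_probs, n, seed=0):
    """Guess each label at random, in proportion to the training frequencies."""
    rng = random.Random(seed)
    classes = list(class_probs)
    weights = [class_probs[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)
```

With imbalanced classes, the most-frequent baseline never predicts a minority class, whereas proportional guessing predicts every class at roughly its training rate, which gives a more informative floor for metrics such as recall.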
It uses the first-order sensitivity index to measure whether the weights assigned by the creator of the composite indicator match the actual importance of the variables. Moreover, the variance inflation factor is used to reduce the set of correlated variables. In the case of a discrepancy between the importance and the assigned weight, the script determines weights that allow adjustment of the weights to the intended impact of variables. If the optimised weights are unable to reflect the desired importance, the highly correlated variables are reduced, taking into account variance inflation factor. The final outcome of the script is the calculated value of the composite indicator based on optimal weights and a reduced set of variables, and the linear ordering of the analysed objects.
This package provides a workflow for the correction of Differential Interferometric Synthetic Aperture Radar (DInSAR) atmospheric delay based on Generic Atmospheric Correction Online Service for InSAR (GACOS) data and the correction algorithms proposed by Chen Yu. The package calculates the correction in both the zenith and line-of-sight (LOS) directions, depending on the user's choice. You only need to download the GACOS product for your area and provide preprocessed, unwrapped DInSAR images. When using this framework, cite the following references and this package in your work. References: Yu, C., N. T. Penna, and Z. Li (2017) <doi:10.1016/j.rse.2017.10.038>. Yu, C., Li, Z., & Penna, N. T. (2017) <doi:10.1016/j.rse.2017.10.038>. Yu, C., Penna, N. T., and Li, Z. (2017) <doi:10.1002/2016JD025753>.
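The relation between zenith and LOS corrections can be illustrated with a small sketch. It assumes the common mapping in which a zenith delay is divided by the cosine of the incidence angle, that the differential delay is taken as secondary minus primary acquisition, and that delays are expressed relative to a reference pixel. The function name `differential_los_delay` and these conventions are assumptions for illustration, not the package's interface.

```python
import math


def differential_los_delay(ztd_primary, ztd_secondary, incidence_deg, ref_index=0):
    """Project a differential zenith delay onto the radar line of sight.

    ztd_primary, ztd_secondary : zenith total delays (metres) per pixel
                                 for the two acquisition dates
    incidence_deg              : radar incidence angle in degrees
    ref_index                  : pixel used as the zero-delay reference
    """
    cos_inc = math.cos(math.radians(incidence_deg))
    # Differential zenith delay between the two dates (assumed sign convention).
    dztd = [s - p for p, s in zip(ztd_primary, ztd_secondary)]
    ref = dztd[ref_index]
    # Reference to one pixel, then map from zenith to LOS.
    return [(d - ref) / cos_inc for d in dztd]
```

The resulting LOS delay would then be subtracted from the unwrapped interferogram once both are expressed in the same units.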