The data depth concept offers a variety of powerful and user-friendly tools for robust exploration and inference for multivariate data. The offered techniques may be used successfully when the parametric model generating the data is unknown. The package consists of, among others, implementations of several data depth techniques involving multivariate quantile-quantile plots, multivariate scatter estimators, multivariate Wilcoxon tests and robust regressions.
Includes functions that researchers or practitioners may use to clean raw data and to convert html, xlsx, and txt data files into other formats. It can also be used to manipulate text variables, extract numeric variables from text variables, and perform other variable cleaning processes. It originated from an author's project that focuses on creative performance in online education environments. The resulting paper of that study will be published soon.
Randomly generate a wide range of interaction networks with specified size, average degree, modularity, and topological structure. Sample nodes and links from within simulated networks randomly, by degree, by module, or by abundance. Simulations and sampling routines are implemented in FORTRAN, providing efficient generation times even for large networks. Basic visualization methods are also included. Algorithms implemented here are described in de Aguiar et al. (2017) <arXiv:1708.01242>.
Applies a sequential clustering algorithm to animal location data based on user-defined parameters. Plots interactive cluster maps and provides a summary dataframe with attributes for each cluster commonly used as covariates in subsequent modeling efforts. Additional functions provide individual keyhole markup language plots for quick assessment, and export of global positioning system exchange format files for navigation purposes. Methods can be found at <doi:10.1111/2041-210X.13572>.
This package provides an interface to the GenderAPI.io web service (<https://www.genderapi.io>) for determining gender from personal names, email addresses, or social media usernames. Functions are available to submit single or batch queries and retrieve additional information such as accuracy scores and country-specific gender predictions. This package simplifies integration of GenderAPI.io into R workflows for data cleaning, user profiling, and analytics tasks.
This package performs variable selection with data from genome-wide association studies (GWAS), or other high-dimensional data with continuous, binary or survival outcomes. It combines, in an iterative framework, the computational efficiency of the structured screen-and-select variable selection strategy based on association learning with the parsimonious uncertainty quantification provided by non-local priors (see Sanyal et al., 2019 <DOI:10.1093/bioinformatics/bty472>).
This package provides a genomic simulation approach for creating biologically informed individual genotypes from empirical data that 1) samples alleles from populations without replacement and 2) segregates alleles based on species-specific recombination rates. gscramble is a flexible simulation approach that allows users to create pedigrees of varying complexity in order to simulate admixed genotypes. Furthermore, it allows users to track haplotype blocks from the source populations through the pedigrees.
Uses slice sampling-based Markov chain Monte Carlo to conduct Bayesian fitting and inference for generalized additive mixed models. Generalized linear mixed models and generalized additive models are also handled as special cases of generalized additive mixed models. The methodology and software are described in Pham, T.H. and Wand, M.P. (2018). Australian and New Zealand Journal of Statistics, 60, 279-330 <DOI:10.1111/ANZS.12241>.
Provides functions to design historical controlled trials with survival outcomes using the group sequential method. The options for interim look boundaries are efficacy only, efficacy and futility, or futility only. It also provides a function to monitor the trial at any unplanned look. The package is based on Jianrong Wu, Xiaoping Xiong (2016) <doi:10.1002/pst.1756> and Jianrong Wu, Yimei Li (2020) <doi:10.1080/10543406.2019.1684305>.
The improved trimmed weighted Hochberg procedure provides increased statistical power and relaxes the dependence assumptions for familywise error rate control compared to the original weighted Hochberg procedure. This package computes the boundaries required for implementing the proposed methodology and includes sample size optimization methods. See Gou, J., Chang, Y., Li, T., and Zhang, F. (2025). Improved trimmed weighted Hochberg procedures with two endpoints and sample size optimization. Technical Report.
This package provides methods for analyzing DNA methylation data via Most Recurrent Methylation Patterns (MRMPs). Supports cell-type annotation, spatial deconvolution, unsupervised clustering, and cancer cell-of-origin inference. Includes C-backed summaries for YAME .cg/.cm files (overlap counts, log2 odds ratios, beta/depth aggregation), an XGBoost classifier, NNLS deconvolution, and plotting utilities. Scales to large spatial and single-cell methylomes and is robust to extreme sparsity.
Mouse-tracking, the analysis of mouse movements in computerized experiments, is a method that is becoming increasingly popular in the cognitive sciences. The mousetrap package offers functions for importing, preprocessing, analyzing, aggregating, and visualizing mouse-tracking data. An introduction to mouse-tracking analyses using mousetrap can be found in Wulff, Kieslich, Henninger, Haslbeck, & Schulte-Mecklenbeck (2023) <doi:10.31234/osf.io/v685r> (preprint: <https://osf.io/preprints/psyarxiv/v685r>).
Utility functions that may be of general interest but are specifically required by the NeuroAnatomy Toolbox ('nat'). Includes functions to provide a basic make-style system to update files based on timestamp information, file locking and a touch utility. Convenience functions for working with file paths include abs2rel, split_path and common_path. Finally, there are utility functions for working with zip and gzip files, including integrity tests.
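As a rough illustration of what path helpers of this kind do, the sketch below gives Python analogues built on the standard `posixpath` and `pathlib` modules. The function names are borrowed from the description above; the bodies and the example paths are assumptions made for illustration, not the R package's implementation, and the R functions may return their results in a different form.

```python
import posixpath
from pathlib import PurePosixPath


def abs2rel(path, start):
    """Express an absolute path relative to the directory `start`."""
    return posixpath.relpath(path, start)


def split_path(path):
    """Split a path into its components, root first."""
    return list(PurePosixPath(path).parts)


def common_path(paths):
    """Return the longest directory prefix shared by all paths."""
    return posixpath.commonpath(paths)


# Hypothetical example paths, chosen only for illustration.
first = "/data/project/raw/a.swc"
second = "/data/project/out/b.swc"

relative = abs2rel(first, "/data/project")
parts = split_path("/data/project/raw")
shared = common_path([first, second])
```

With these inputs, `relative` is `raw/a.swc`, `parts` is `['/', 'data', 'project', 'raw']`, and `shared` is `/data/project`.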
Reads Protein Data Bank (PDB) files, analyzes them, and presents the results using different visualization types, including 3D. The package can also handle Virus Report data from the National Center for Biotechnology Information (NCBI) database. Nature Structural Biology 10, 980 (2003) <doi:10.1038/nsb1203-980>. US National Library of Medicine (2021) <https://www.ncbi.nlm.nih.gov/datasets/docs/reference-docs/data-reports/virus/>.
Calculate and optimize dynamic performance ratings of association football teams competing in matches, in accordance with the method used in the research paper "Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries", by Constantinou and Fenton (2013) <doi:10.1515/jqas-2012-0036>. This dynamic rating system has proven to provide superior results for predicting association football outcomes.
Computes sequential A-, MV-, D- and E-optimal or near-optimal block and row-column designs for two-colour cDNA microarray experiments using the linear fixed effects and mixed effects models where the interest is in a comparison of all possible elementary treatment contrasts. The package also provides an optional graphical user interface (GUI), built with the R package tcltk, to make it user-friendly.
This package provides tools for making, retrieving, displaying and solving sudoku games. This package is an alternative to the earlier sudoku-solver package, sudoku. The present package uses a slightly different algorithm, has simpler code and offers a few more convenience tools, such as plot and print methods. Solved sudoku games are of some interest in Experimental Design as examples of Latin Square designs with additional balance constraints.
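The link between solved sudoku games and Latin squares can be made concrete with a small check. The Python sketch below is illustrative only and is not the package's code: it tests the Latin square property (every symbol exactly once per row and per column) and then the additional balance constraint (every symbol exactly once per 3x3 box). The function names and the two example grids are assumptions introduced for the illustration.

```python
def is_latin_square(grid):
    """True if every row and column holds each of 1..n exactly once."""
    n = len(grid)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in grid)
    cols_ok = all(set(col) == symbols for col in zip(*grid))
    return rows_ok and cols_ok


def boxes_balanced(grid, box=3):
    """True if every box-by-box block holds each of 1..n exactly once."""
    n = len(grid)
    symbols = set(range(1, n + 1))
    for r0 in range(0, n, box):
        for c0 in range(0, n, box):
            cells = {grid[r][c]
                     for r in range(r0, r0 + box)
                     for c in range(c0, c0 + box)}
            if cells != symbols:
                return False
    return True


def is_solved_sudoku(grid):
    """A solved sudoku is a Latin square that is also balanced over boxes."""
    return is_latin_square(grid) and boxes_balanced(grid)


# A solved 9x9 grid built from a shifted pattern (satisfies all constraints).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
          for r in range(9)]

# A cyclic Latin square: rows and columns are fine, the boxes are not.
latin_only = [[(r + c) % 9 + 1 for c in range(9)] for r in range(9)]
```

Here `is_solved_sudoku(solved)` holds, while `latin_only` passes `is_latin_square` but fails `boxes_balanced`, which shows that the box constraint is a genuine restriction on top of the Latin square structure.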
This package provides a framework to work with decision rules. Rules can be extracted from supported models, augmented with (custom) metrics using validation data, manipulated using standard dataframe operations, reordered and pruned based on a metric, and used to predict on unseen (test) data. Utilities include creating a rulelist manually and exporting a rulelist as a SQL case statement, among others. The package offers two classes, rulelist and ruleset, based on dataframe.
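To show what exporting an ordered list of rules as a SQL case statement can look like, here is a minimal Python sketch. The `Rule` class, the `rulelist_to_sql_case` function, and the example conditions are hypothetical names introduced for illustration; they are not the package's API, and the package's actual output format may differ.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: str   # a SQL boolean expression (hypothetical field)
    prediction: str  # label returned when the condition matches


def rulelist_to_sql_case(rules, default=None, column="prediction"):
    """Render rules, in order, as a SQL CASE expression.

    Order matters: as in a rulelist, the first matching rule wins.
    """
    lines = ["CASE"]
    for rule in rules:
        label = rule.prediction.replace("'", "''")  # escape single quotes
        lines.append(f"  WHEN {rule.condition} THEN '{label}'")
    if default is not None:
        fallback = default.replace("'", "''")
        lines.append(f"  ELSE '{fallback}'")
    lines.append(f"END AS {column}")
    return "\n".join(lines)


# Hypothetical rules, for illustration only.
rules = [
    Rule("age > 60", "high"),
    Rule("age > 30", "medium"),
]
sql = rulelist_to_sql_case(rules, default="low")
```

The resulting string places `WHEN age > 60` before `WHEN age > 30`, so a row with `age = 70` is labelled `high` rather than `medium`, mirroring first-match semantics.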
When using the R package exams to write mathematics questions in Sweave files, the output of many R functions needs to be adjusted for display in mathematical formulas. Specifically, the functions were accumulated when writing questions for the topics of the mathematics courses College Algebra, Precalculus, Calculus, Differential Equations, Introduction to Probability, and Linear Algebra. The output of the developed functions can be used in Sweave files.
This package performs Principal Components Analysis (also known as PCA) dimensionality reduction in the context of a linear regression. In most cases, PCA dimensionality reduction is performed independently of the response variable for a regression. This captures the majority of the variance of the model's predictors, but may not actually be the optimal dimensionality reduction solution for a regression against the response variable. An alternative method, optimized for a regression against the response variable, is to use both PCA and a relative importance measure. This package applies PCA to a given data frame of predictors, and then calculates the relative importance of each PCA factor against the response variable. It outputs ordered factors that are optimized for model fit. By performing dimensionality reduction with this method, an individual can achieve the same R-squared value as performing just PCA, but with fewer PCA factors. References: Yuri Balasanov (2017) <https://ilykei.com>.
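The idea of ordering PCA factors by their relevance to the response, rather than by explained predictor variance, can be sketched in a few lines of Python with NumPy. This is an illustration under a stated assumption, not the package's implementation: because PCA scores are mutually uncorrelated, the regression R-squared splits into each factor's squared correlation with the response, and that quantity is used here as the relative importance measure. The package may use a different measure. The function name and the toy data are hypothetical.

```python
import numpy as np


def order_pca_factors(X, y):
    """Rank PCA factors of X by their squared correlation with y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center predictors
    yc = y - y.mean()        # center response

    # PCA via SVD: columns of `scores` are the uncorrelated PCA factors,
    # sorted by decreasing predictor variance.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s

    # Squared correlation of each factor with the response
    # (assumes no factor has zero variance).
    r2 = (scores.T @ yc) ** 2 / ((scores ** 2).sum(axis=0) * (yc @ yc))

    order = np.argsort(-r2)  # most relevant factor first
    return order, r2[order]


# Toy data: three orthogonal predictors with very different variances.
c1 = 5.0 * np.array([1, 1, 1, 1, -1, -1, -1, -1])
c2 = 2.0 * np.array([1, 1, -1, -1, 1, 1, -1, -1])
c3 = 0.5 * np.array([1, -1, 1, -1, 1, -1, 1, -1])
X = np.column_stack([c1, c2, c3])

# The response depends almost entirely on the lowest-variance predictor.
y = 3.0 * c3 + 0.01 * c1

order, r2 = order_pca_factors(X, y)
```

In this toy case the third factor, which plain PCA would rank last because it carries the least predictor variance, is ranked first: it alone accounts for nearly all of the R-squared, so a single factor chosen this way fits about as well as all three.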
This Ruby library provides an implementation of the Matrix and Vector classes. The Matrix class represents a mathematical matrix. It provides methods for creating matrices, operating on them arithmetically and algebraically, and determining their mathematical properties (trace, rank, inverse, determinant, eigensystem, etc.). The Vector class represents a mathematical vector, which is useful in its own right, and also constitutes a row or column of a Matrix.
rga is a line-oriented search tool for searching in both text and binary formats. It is a wrapper for ripgrep with adapters for common binary formats, enabling it to search in a multitude of file types: pdf, docx, sqlite, jpg, movie subtitles (mkv, mp4), etc.
This package also supports adding custom adapters in its configuration file, matching on MIME types or file extensions and running arbitrary executables to do the parsing.
Texinfo is the official documentation format of the GNU project. It uses a single source file with explicit commands to produce a final document in any of several supported output formats, such as HTML or PDF. This package includes both the tools necessary to produce Info documents from their source and the command-line Info reader. The emphasis of the language is on expressing the content semantically, avoiding physical markup commands.
hoodscanR is a user-friendly R package providing functions to assist cellular neighborhood analysis of any spatial transcriptomics data with single-cell resolution. All functions in the package are built on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. The package produces cell-level neighborhood annotation output, along with functions to perform neighborhood colocalization analysis and neighborhood-based cell clustering.