Color values in R are often represented as strings of hexadecimal colors or named colors. This package offers fast conversion of these color representations to either an array of red/green/blue/alpha values or to the packed integer format used in native raster objects. Functions for conversion are also exported at the C level for use in other packages. This fast conversion of colors is implemented using an order-preserving minimal perfect hash derived from Majewski et al. (1996) "A Family of Perfect Hashing Methods" <doi:10.1093/comjnl/39.6.547>.
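As a rough illustration of the two target formats, here is a minimal Python sketch. It is not the package's C implementation and does not use a perfect hash; the function names are hypothetical, and the byte order of the packed integer (red in the lowest byte, alpha in the highest) is an assumption about the native raster layout.

```python
def hex_to_rgba(color):
    """Parse '#RRGGBB' or '#RRGGBBAA' into an (r, g, b, a) tuple of ints in 0..255."""
    if not color.startswith("#"):
        raise ValueError("expected a string starting with '#'")
    digits = color[1:]
    if len(digits) == 6:
        digits += "FF"  # no alpha given: treat the color as fully opaque
    if len(digits) != 8:
        raise ValueError("expected 6 or 8 hexadecimal digits")
    # int(..., 16) raises ValueError on non-hexadecimal characters
    return tuple(int(digits[i:i + 2], 16) for i in range(0, 8, 2))


def pack_rgba(r, g, b, a):
    """Pack four 0..255 channels into one integer (assumed layout: R lowest byte, A highest)."""
    return r | (g << 8) | (b << 16) | (a << 24)
```

For example, `pack_rgba(*hex_to_rgba("#FF8000"))` combines the four channels of an opaque orange into a single integer.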
This package provides fast moving-window ("focal") and buffer-based extraction for raster data using the terra package. Automatically selects between a C++ backend (via terra) and a Fast Fourier Transform (FFT) backend depending on problem size. The FFT backend supports sum and mean, while other statistics (e.g., median, min, max, standard deviation) are handled by the terra backend. Supports multiple kernel types (e.g., circle, rectangle, gaussian), with NA handling consistent with terra via na.rm and na.policy. Operates on SpatRaster objects and returns results with the same geometry.
Grey zones occur locally in an agreement table due to the subjective evaluation of raters, driven by factors such as the lack of uniform guidelines, differences between the raters' levels of expertise, or low variability among the levels of the categorical variable. It is important to detect grey zones since they cause a negative bias in the estimate of the agreement level. This package provides a function for detecting the existence of grey zones in two-way inter-rater agreement tables (Demirhan and Yilmaz (2023) <doi:10.1186/s12874-022-01759-7>).
Class imbalance usually damages the performance of classifiers. Thus, it is important to treat data before applying a classifier algorithm. This package includes recent resampling algorithms in the literature: (Barua et al. 2014) <doi:10.1109/tkde.2012.232>; (Das et al. 2015) <doi:10.1109/tkde.2014.2324567>; (Zhang et al. 2014) <doi:10.1016/j.inffus.2013.12.003>; (Gao et al. 2014) <doi:10.1016/j.neucom.2014.02.006>; (Almogahed et al. 2014) <doi:10.1007/s00500-014-1484-5>. It also includes a useful interface to perform oversampling.
Analysis of dichotomous and continuous response data using latent factors with both the 1PL LSIRM and the 2PL LSIRM, as described in Jeon et al. (2021) <doi:10.1007/s11336-021-09762-5>. It includes the original 1PL LSIRM and 2PL LSIRM for binary response data and their extensions for continuous response data. Bayesian model selection with a spike-and-slab prior and methods for handling data with missing values under the missing at random and missing completely at random assumptions are also supported. Various diagnostic plots are available to inspect the latent space and summaries of the estimated parameters.
Vitamin and mineral deficiencies continue to be a significant public health problem. This is particularly critical in developing countries where deficiencies in vitamin A, iron, iodine, and other micronutrients lead to adverse health consequences. Cross-sectional surveys are helpful in answering questions related to the magnitude and distribution of deficiencies of selected vitamins and minerals. This package provides tools for calculating and determining select vitamin and mineral deficiencies based on World Health Organization (WHO) guidelines found at <https://www.who.int/teams/nutrition-and-food-safety/databases/vitamin-and-mineral-nutrition-information-system>.
Using the adjustment method from Benjamini & Hochberg (1995) <doi:10.1111/j.2517-6161.1995.tb02031.x>, this package determines which variables are significant under repeated testing, given a dataframe of p values and a user-defined "q" threshold. It then returns the original dataframe along with a significance column where an asterisk denotes a significant p value after FDR calculation, and NA denotes all other p values. This package uses the Benjamini & Hochberg method specifically as described in Lee, S., & Lee, D. K. (2018) <doi:10.4097/kja.d.18.00242>.
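The Benjamini & Hochberg step-up rule behind the significance column can be sketched in a few lines of Python. This is an illustrative sketch, not the package's R code: the function name is hypothetical, it works on a plain list rather than a dataframe, and `None` stands in for R's NA.

```python
def bh_significance(p_values, q=0.05):
    """Flag p-values that are significant under the Benjamini-Hochberg procedure.

    Returns a list aligned with p_values: "*" if significant at FDR level q,
    otherwise None.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k such that p_(k) <= (k / m) * q
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank

    # Every p-value ranked at or below the cutoff is declared significant
    flags = [None] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            flags[idx] = "*"
    return flags
```

Because the rule is step-up, a p value that misses its own rank threshold is still flagged when a larger p value meets its threshold.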
The SoundexBR package provides an algorithm for encoding names into phonetic codes, as pronounced in Portuguese. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling. The algorithm mainly encodes consonants; a vowel will not be encoded unless it is the first letter. The resulting soundex code is a four-character string composed of one letter followed by three numerical digits: the letter is the first letter of the name, and the digits encode the remaining consonants.
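The shape of such a code (first letter plus three digits for the following consonants) can be illustrated with a simplified Python sketch. Note the assumptions: the consonant table below is the classic English Soundex table, not the Portuguese-specific rules of SoundexBR; accented characters are simply dropped; and the function name is hypothetical.

```python
# Classic English Soundex consonant groups (an assumption; SoundexBR's
# Portuguese-specific mapping is not reproduced here).
_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex_sketch(name):
    """Return a letter followed by three digits, e.g. 'S410'."""
    # Keep only the unaccented letters A-Z
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]                    # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")    # its digit suppresses an adjacent duplicate
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")       # vowels and H, W, Y have no digit
        if digit and digit != prev:
            code += digit
        prev = digit                     # an uncoded letter resets the duplicate check
    return (code + "000")[:4]            # pad with zeros or truncate to four characters
```

With this sketch, spelling variants such as "Silva" and "Sylva" map to the same code, which is the matching behavior the description refers to.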
Finite mixture models are a popular technique for modelling unobserved heterogeneity or approximating general distribution functions in a semi-parametric way. They are used in many different areas such as astronomy, biology, economics, marketing and medicine. This package implements popular robust mixture regression methods based on different algorithms, including: fleximix, finite mixture models and latent class regression; CTLERob, Component-wise adaptive Trimming Likelihood Estimation based mixture regression; mixbi, mixture regression based on bi-square estimation; mixL, mixture regression based on the Laplacian distribution; mixt, mixture regression based on the t-distribution; and TLE, Trimmed Likelihood Estimation based mixture regression. For details of the algorithms, see the following references: Chun Yu, Weixin Yao, Kun Chen (2017) <doi:10.1002/cjs.11310>; Neykov N, Filzmoser P, Dimova R et al. (2007) <doi:10.1016/j.csda.2006.12.024>; Bai X, Yao W, Boyer JE (2012) <doi:10.1016/j.csda.2012.01.016>; Wennan Chang, Xinyu Zhou, Yong Zang, Chi Zhang, Sha Cao (2020) <arXiv:2005.11599>.
Reads, writes, and edits EXIF and other file metadata using ExifTool <https://exiftool.org/>, returning read results as a data frame. ExifTool supports many different metadata formats including EXIF, GPS, IPTC, XMP, JFIF, GeoTIFF, ICC Profile, Photoshop IRB, FlashPix, AFCP, ID3 and Lyrics3, as well as the maker notes of many digital cameras by Canon, Casio, DJI, FLIR, FujiFilm, GE, GoPro, HP, JVC/Victor, Kodak, Leaf, Minolta/Konica-Minolta, Motorola, Nikon, Nintendo, Olympus/Epson, Panasonic/Leica, Pentax/Asahi, Phase One, Reconyx, Ricoh, Samsung, Sanyo, Sigma/Foveon and Sony.
This package provides unsupervised selection and clustering of microarray data using mixture models. Following the methods described in McLachlan, Bean and Peel (2002) <doi:10.1093/bioinformatics/18.3.413>, a subset of genes is selected based on the likelihood ratio statistic for the test of one versus two components when fitting mixtures of t-distributions to the expression data for each gene. The dimensionality of this gene subset is further reduced through the use of mixtures of factor analyzers, allowing the tissue samples to be clustered by fitting mixtures of normal distributions.
DNA methylation of 5-methylcytosine (5mC) is the result of a multi-step, enzyme-dependent process. Predicting these sites in vitro is laborious, time-consuming and costly. The Gb5mC-Pred package is an in silico pipeline for predicting DNA sequences containing 5mC sites. It uses a machine learning approach, Stochastic Gradient Boosting, to predict the sequences with 5mC sites. This package has been developed using the concept of Navarez and Roxas (2022) <doi:10.1109/TCBB.2021.3082184>.
By analyzing time series, it is possible to observe significant changes in the behavior of observations that frequently characterize events. Events present themselves as anomalies, change points, or motifs. In the literature, there are several methods for detecting events. However, searching for a suitable time series method is a complex task, especially considering that the nature of events is often unknown. This work presents Harbinger, a framework for integrating and analyzing event detection methods. Harbinger contains several state-of-the-art methods described in Salles et al. (2020) <doi:10.5753/sbbd.2020.13626>.
This is open-source software designed specifically for text mining in the Persian language. It allows users to examine word frequencies, download data for analysis, and generate word clouds. This tool is particularly useful for researchers and analysts working with Persian language data. This package mainly makes use of the PersianStemmer (Safshekan, R., et al. (2019). <https://CRAN.R-project.org/package=PersianStemmer>), udpipe (Wijffels, J., et al. (2023). <https://CRAN.R-project.org/package=udpipe>), and shiny (Chang, W., et al. (2023). <https://CRAN.R-project.org/package=shiny>) packages.
Several tests of quantitative palaeoenvironmental reconstructions from microfossil assemblages, including the null model tests of the statistical significance of reconstructions developed by Telford and Birks (2011) <doi:10.1016/j.quascirev.2011.03.002>, and tests of the effect of spatial autocorrelation on transfer function model performance using methods from Telford and Birks (2009) <doi:10.1016/j.quascirev.2008.12.020> and Trachsel and Telford (2016) <doi:10.5194/cp-12-1215-2016>. Age-depth models with generalized mixed-effect regression from Heegaard et al. (2005) <doi:10.1191/0959683605hl836rr> are also included.
Plot both fixed and random effects of linear mixed models or multilevel models in a single spaghetti plot. The package allows users to visualize the effect of a predictor on a criterion between different levels of a grouping variable. Additionally, confidence intervals can be displayed for fixed effects. Because of how predicted values of random effects are calculated, only models with one random intercept and/or one random slope can be plotted. Confidence intervals and predicted values of fixed effects are computed using the ggpredict function from the ggeffects package (Lüdecke, D. (2018) <doi:10.21105/joss.00638>).
For making Trellis-type conditioning plots without strip labels. This is useful for displaying the structure of results from factorial designs and other studies when many conditioning variables would clutter the display with layers of redundant strip labels. Settings of the variables are encoded by layout and spacing in the trellis array and decoded by a separate legend. The functionality is implemented by a single S3 generic strucplot() function that is a wrapper for the Lattice package's xyplot() function. This allows access to all Lattice graphics capabilities in the usual way.
Some R functions, such as optim(), require a function and its gradient to be passed as separate arguments. When these are expensive to calculate, it may be much faster to calculate the function (fn) and gradient (gr) together since they often share many calculations (chain rule). This package allows the user to pass in a single function that returns both the function and gradient, then splits (hence splitfngr) them so the results can be accessed separately. The functions provided allow this to be done with any number of functions/values, not just for functions and gradients.
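The splitting idea can be sketched in Python as a pair of closures that share a one-entry cache. This is an illustration of the general technique under stated assumptions, not the package's R implementation: the name split_fn_gr is hypothetical, and only the most recent input is cached.

```python
def split_fn_gr(fn_gr):
    """Split a function returning (value, gradient) into separate fn and gr.

    The combined function is evaluated at most once per distinct consecutive
    input; the second accessor reuses the cached result.
    """
    cache = {"x": None, "value": None, "grad": None}

    def _evaluate(x):
        key = tuple(x)  # tuples compare by value, so equal inputs hit the cache
        if cache["x"] != key:
            cache["value"], cache["grad"] = fn_gr(x)
            cache["x"] = key

    def fn(x):
        _evaluate(x)
        return cache["value"]

    def gr(x):
        _evaluate(x)
        return cache["grad"]

    return fn, gr
```

An optimizer that calls fn(x) and then gr(x) at the same point triggers only one evaluation of the expensive combined function.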
Estimates the time-varying (tv) parameters of the GARCH(1,1) model, enabling the modeling of non-stationary volatilities by allowing the model parameters to change gradually over time. The estimation and prediction processes are facilitated through the application of the Kalman filter and state-space equations. This package supports the estimation of tv parameters for various deterministic functions, which can be identified through exploratory analysis of different time periods or segments of return data. The methodology is grounded in the framework presented by Ferreira et al. (2017) <doi:10.1080/00949655.2017.1334778>.
EDIRquery provides a tool to search for genes of interest within the Exome Database of Interspersed Repeats (EDIR). A gene name is a required input, and users can additionally specify repeat sequence lengths, minimum and maximum distance between sequences, and whether to allow a 1-bp mismatch. Outputs include a summary of results by repeat length, as well as a dataframe of query results. Example data provided includes a subset of the data for the gene GAA (ENSG00000171298). Querying the full database requires providing a path to the downloaded database files as a parameter.
This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypotheses. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility, e.g., by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.
The Pigengene package provides an efficient way to infer biological signatures from gene expression profiles. The signatures are independent of the underlying platform, e.g., the input can be microarray or RNA-Seq data. It can even infer the signatures using data from one platform, and evaluate them on the other. Pigengene identifies the modules (clusters) of highly coexpressed genes using coexpression network analysis, summarizes the biological information of each module in an eigengene, learns a Bayesian network that models the probabilistic dependencies between modules, and builds a decision tree based on the expression of eigengenes.
This package provides a client for the OmniPath web service and many other resources. It also includes functions to transform and pretty print some of the downloaded data, and functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation nichenetr.
ggplot2 is flexible and widely used, so many drawing tools operate on ggplot2 objects. This tool provides a heatmap drawing system based on ggplot2, mainly to solve the problem of assembling heatmaps with other plots (the heatmap "puzzle" problem) and to connect heatmaps flexibly with ggplot2 objects. The advantages of this tool are as follows: 1. More flexible label settings; 2. Linkage between the heatmap and the ggplot2 drawing system, which helps with operations such as assembling plots; 3. Simple and easy to operate; 4. Optimized visualization of clustering trees.