Recent technological advances have enabled the simultaneous collection of multi-omics data, i.e., different types or modalities of molecular data. This presents challenges for integrative prediction modeling due to the heterogeneous, high-dimensional nature of the data and the possibility of missing modalities for some individuals. We introduce this package for late-integration prediction modeling: it enables modality-specific variable selection and prediction modeling, followed by aggregation of the modality-specific predictions to train a final meta-model. The package facilitates conducting late-integration predictive modeling in a systematic, structured, and reproducible way.
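A generic base-R sketch of the late-integration idea described above (not this package's API): fit one model per omics modality, then stack the modality-specific predictions in a meta-model. All data, variable choices, and names below are illustrative.

    set.seed(1)
    n <- 100
    y <- rbinom(n, 1, 0.5)                               # outcome
    omics1 <- matrix(rnorm(n * 50), n)                   # modality 1 (e.g., expression)
    omics2 <- matrix(rnorm(n * 30), n)                   # modality 2 (e.g., methylation)
    fit1 <- glm(y ~ omics1[, 1:5], family = binomial)    # modality-specific models
    fit2 <- glm(y ~ omics2[, 1:5], family = binomial)    # (variable selection omitted here)
    meta_x <- data.frame(p1 = fitted(fit1), p2 = fitted(fit2))
    meta <- glm(y ~ p1 + p2, data = meta_x, family = binomial)  # meta-model on predictions
    # In practice, cross-validated (out-of-fold) predictions would feed the meta-model.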
This package provides tools to streamline the extraction, processing, and visualization of Computable General Equilibrium (CGE) results from GTAP models. Designed for compatibility with both .har and .sl4 files, the package enables users to automate data preparation, apply mapping metadata, and generate high-quality plots and summary tables with minimal coding. GTAPViz supports flexible export options (e.g., text, CSV, Stata, or Excel formats), which facilitates efficient post-simulation analysis for economic research and policy reporting. It also includes helper functions to filter, format, and customize outputs with reproducible styling.
Collection of packages for working with the APIs of Google Ads <https://developers.google.com/google-ads/api/docs/start>, Yandex Direct <https://yandex.ru/dev/direct/>, Yandex Metrica <https://yandex.ru/dev/metrika/>, MyTarget <https://target.my.com/help/advertisers/api_arrangement/ru>, Vkontakte <https://vk.com/dev/methods>, Facebook <https://developers.facebook.com/docs/marketing-apis/> and AppsFlyer <https://support.appsflyer.com/hc/en-us/articles/207034346-Using-Pull-API-aggregate-data>. These packages allow you to load data from your advertising accounts and manage your advertising materials.
This package contains most of the hex font files from the GNU Unifont Project <https://unifoundry.com/unifont/>, compressed with xz. GNU Unifont is a duospaced bitmap font that attempts to cover all the official Unicode glyphs plus several of the artificial scripts in the (Under-)ConScript Unicode Registry <https://www.kreativekorp.com/ucsur/>. Provides a convenience function for loading several of them at once as a bittermelon bitmap font object for easy rendering of the glyphs in an R terminal or graphics device.
This package provides functions for normalizing standard laboratory measurements (e.g. hemoglobin, cholesterol levels) according to age and sex, based on the algorithms described in "Personalized lab test models to quantify disease potentials in healthy individuals" (Netta Mendelson Cohen, Omer Schwartzman, Ram Jaschek, Aviezer Lifshitz, Michael Hoichman, Ran Balicer, Liran I. Shlush, Gabi Barbash & Amos Tanay, <doi:10.1038/s41591-021-01468-6>). Allows users to easily obtain normalized values for standard lab results, and to visualize their distributions. See more at <https://tanaylab.weizmann.ac.il/labs/>.
Traditional methods typically detect breakpoints in each signal separately, so when applied to multiple signals the detected breakpoints are not aligned. This package instead implements common breakpoint detection across multiple piecewise-constant signals, which increases detection sensitivity and specificity. Several algorithmic techniques are employed to improve performance and accelerate computation. We hope that this package will be useful to researchers in signal processing, bioinformatics, economics, and other related fields. The main functions of the package are segmentation() and lambda_estimator().
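A minimal usage sketch for the two functions named above. Only the function names are given in the description, so the argument structure shown here (a matrix of signals and a penalty value) is an assumption; the assumed calls are left as comments.

    set.seed(2)
    mu <- rep(c(0, 2, 0), each = 100)                  # shared breakpoints at positions 100 and 200
    signals <- rbind( mu + rnorm(300, sd = 0.5),
                     -mu + rnorm(300, sd = 0.5))       # two signals, one per row
    # lambda <- lambda_estimator(signals)              # assumed: estimates the penalty from the data
    # fit    <- segmentation(signals, lambda)          # assumed: returns the common breakpoints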
Calculates the number of four-taxon subtrees consistent with a pair of cladograms, computing the symmetric quartet distance of Bandelt & Dress (1986), Reconstructing the shape of a tree from observed dissimilarity data, Advances in Applied Mathematics, 7, 309-343 <doi:10.1016/0196-8858(86)90038-2>, and using the tqDist algorithm of Sand et al. (2014), tqDist: a library for computing the quartet and triplet distances between binary or general trees, Bioinformatics, 30, 2079-2080 <doi:10.1093/bioinformatics/btu157> for pairs of binary trees.
Finding hidden clusters in structured data can be hindered by the presence of masking variables. If not detected, masking variables enter the calculation of the overall similarities between units, making the cluster assignment less accurate. The q-vars algorithm implements an optimization method to find the variables that best separate units between clusters. Masking variables can then be discarded from the data frame, yielding more accurate clustering. Tests can be found in Benati et al. (2017) <doi:10.1080/01605682.2017.1398206>.
This package provides the SMOTE with Boosting (SMOTEWB) algorithm. See F. Sağlam, M. A. Cengiz (2022) <doi:10.1016/j.eswa.2022.117023>. It is a SMOTE-based resampling technique which creates synthetic data on the links between nearest neighbors. SMOTEWB uses boosting weights to determine where to generate new samples and automatically decides the number of neighbors for each sample. It is robust to noise and outperforms most alternatives according to the Matthews Correlation Coefficient metric. Alternative resampling methods are also available in the package.
Using principal component analysis as a base model, SCOUTer offers a new approach to simulating outliers in a simple and precise way. The user can generate new observations defined by a pair of well-known statistics: the Squared Prediction Error (SPE) and Hotelling's T^2. By simply specifying target values of the SPE and T^2, SCOUTer returns a new set of observations with the desired properties. Authors: Alba González, Abel Folch-Fortuny, Francisco Arteaga and Alberto Ferrer (2020).
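For reference, a short base-R sketch (not the SCOUTer API) of the two statistics the package targets, computed from a PCA model with k retained components; the example data set is arbitrary.

    X <- scale(as.matrix(iris[, 1:4]))                 # example data, centered and scaled
    pca <- prcomp(X, center = FALSE, scale. = FALSE)
    k <- 2                                             # number of retained components
    scores <- pca$x[, 1:k, drop = FALSE]
    loads  <- pca$rotation[, 1:k, drop = FALSE]
    T2  <- rowSums(sweep(scores^2, 2, pca$sdev[1:k]^2, "/"))  # Hotelling's T^2 per observation
    E   <- X - scores %*% t(loads)                     # residuals outside the model subspace
    SPE <- rowSums(E^2)                                # Squared Prediction Error per observation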
Provides convenience functions to connect R with the Spotify application programming interface (API). First, it helps to set up the OAuth 2.0 authentication flow. The default output of the get_*() functions is tidy, but optionally the functions can return the raw response from the API as well. The search_*() and get_*() functions can be combined. See the vignette for more information and examples, and the official Spotify for Developers website <https://developer.spotify.com/documentation/web-api/> for information about the Web API.
Cancer is a genetic disease caused by somatic mutations in genes controlling key biological functions such as cellular growth and division. Such mutations may arise through both cell-intrinsic and exogenous processes, generating characteristic mutational patterns over the genome named mutational signatures. The study of mutational signatures has become a standard component of modern genomics studies, since it can reveal which (environmental and endogenous) mutagenic processes are active in a tumor and may highlight markers for therapeutic response. However, computational analysis of mutational signatures presents many pitfalls. First, the task of determining the number of signatures is very complex and depends on heuristics. Second, several signatures have no clear etiology, raising the doubt that they are computational artifacts rather than the result of mutagenic processes. Last, approaches for signature assignment are greatly influenced by the set of signatures used for the analysis. To overcome these limitations, we developed RESOLVE (Robust EStimation Of mutationaL signatures Via rEgularization), a framework that allows the efficient extraction and assignment of mutational signatures. RESOLVE implements a novel algorithm that enables (i) efficient signature extraction, (ii) exposure estimation, and (iii) confidence assessment during the computational inference of mutational signatures.
This package implements functions to find influential TFs and targets based on different input types. It has five modules: multi-peak multi-gene annotation (mmPeakAnno module), regulatory potential calculation (calcRP module), identification of influential targets based on ChIP-Seq and RNA-Seq data (Find influential Target module), identification of influential TFs based on different inputs (Find influential TF module), and peak-gene or peak-peak correlation (peakGeneCor module). There are also other useful functions, such as integrating information from different sources and calculating the Jaccard similarity for your TFs.
SpikeLI is a package that performs the analysis of the Affymetrix spike-in data using the Langmuir Isotherm. The aim of this package is to show the advantages of a physical-chemistry based analysis of Affymetrix microarray data compared to the traditional methods. The spike-in (or Latin square) data for the HGU95 and HGU133 chipsets have been downloaded from the Affymetrix web site. The model used in the spikeLI package is described in detail in E. Carlon and T. Heim, Physica A 362, 433 (2006).
Fits cumulative link models (CLMs) for ordinal categorical data using CmdStanR. Supports various link functions including logit, probit, cloglog, loglog, cauchit, and flexible parametric links such as Generalized Extreme Value (GEV), Asymmetric Exponential Power (AEP), and Symmetric Power. Models are pre-compiled using the instantiate package for fast execution without runtime compilation. Methods are described in Agresti (2010, ISBN:978-0-470-08289-8), Wang and Dey (2011) <doi:10.1007/s10651-010-0154-8>, and Naranjo, Perez, and Martin (2015) <doi:10.1007/s11222-014-9449-1>.
This package provides a fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the a la carte embeddings approach developed by Khodak et al. (2018) <doi:10.48550/arXiv.1805.05388>, and to evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <doi:10.1017/S0003055422001228>. The new version of the package applies a new estimator to measure the distance between word embeddings, as described in Green et al. (2025) <doi:10.1017/pan.2024.22>.
Estimate sample sizes needed to capture target levels of genetic diversity from a population (multivariate allele frequencies) for applications like germplasm conservation and breeding efforts. Compares bootstrap samples to the full population using linear regression, employing the R-squared value to represent the proportion of diversity captured, and iteratively increases the sample size until a user-defined target R-squared is met. Offers a parallelized R implementation of a previously developed Python method. All ploidy levels are supported. For more details, see Sandercock et al. (2024) <doi:10.1073/pnas.2403505121>.
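A generic base-R illustration of the resampling idea described above (not this package's functions): regress the allele frequencies observed in a bootstrap sample of n individuals against the full-population frequencies and grow n until the R-squared reaches the target. The simulated diploid data are purely illustrative.

    set.seed(42)
    p   <- runif(200, 0.05, 0.95)                        # per-locus allele frequencies
    pop <- sapply(p, function(pp) rbinom(500, 2, pp))    # 500 diploid individuals x 200 loci
    full_freq <- colMeans(pop) / 2                       # full-population allele frequencies
    target_r2 <- 0.95
    n <- 10
    repeat {
      idx <- sample(nrow(pop), n, replace = TRUE)        # bootstrap sample of individuals
      samp_freq <- colMeans(pop[idx, , drop = FALSE]) / 2
      r2 <- summary(lm(samp_freq ~ full_freq))$r.squared # proportion of diversity captured
      if (r2 >= target_r2 || n >= nrow(pop)) break
      n <- n + 5                                         # increase the candidate sample size
    }
    c(sample_size = n, r_squared = round(r2, 3))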
Algebra of operations for blending, copying, adjusting, and compositing layers in ggplot2. Supports copying and adjusting the aesthetics or parameters of an existing layer, partitioning a layer into multiple pieces for re-composition, applying affine transformations to layers, and combining layers (or partitions of layers) using blend modes (including commutative blend modes, like multiply and darken). Blend mode support is particularly useful for creating plots with overlapping groups where the layer drawing order does not change the output; see Kindlmann and Scheidegger (2014) <doi:10.1109/TVCG.2014.2346325>.
An R port of the hashids library. hashids generates YouTube-like hashes from integers or vectors of integers. Hashes generated from integers are relatively short, unique, and non-sequential. hashids can be used to generate unique ids for URLs and to hide database row numbers from the user. By default, hashids will avoid generating common English curse words by preventing certain letters from being next to each other. hashids are not one-way: it is easy to encode an integer to a hashid and decode a hashid back into an integer.
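A minimal round-trip sketch of the encode/decode workflow mentioned above; the function names hashid_settings(), encode(), and decode() follow the reference hashids ports and are assumed here, and the salt value is illustrative.

    library(hashids)
    s <- hashid_settings(salt = "example salt")   # salt value is illustrative
    h <- encode(12345L, s)                        # integer -> short hash string
    decode(h, s)                                  # hash string -> original integer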
Generates the Langa-Weir classification of cognitive function for the 2022 Health and Retirement Study (HRS) cognition data. It is particularly useful for researchers studying cognitive aging who wish to work with the most recent release of HRS data. The package provides user-friendly functions for data preprocessing, scoring, and classification, allowing users to easily apply the Langa-Weir classification system. For details, see the HRS <https://hrsdata.isr.umich.edu/> and the Langa-Weir classification <https://hrsdata.isr.umich.edu/data-products/langa-weir-classification-cognitive-function-1995-2020>.
This package performs monotonic binning of numeric risk factors in the development of credit rating models (PD, LGD, EAD). All functions handle both binary and continuous target variables. Functions that use isotonic regression in the first stage of the binning process have an additional feature for enforcing a minimum percentage of observations and a minimum target rate per bin. Additionally, the monotonic trend can be identified from the raw data or, if known in advance, forced via a function argument. Missing values and other possible special values are treated separately from the so-called complete cases.
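A generic base-R illustration (not this package's functions) of the isotonic-regression first stage mentioned above: fit a non-decreasing regression of a binary target on the risk factor and use the resulting blocks as candidate bins; the simulated data are illustrative.

    set.seed(1)
    x <- rnorm(1000)                                      # numeric risk factor
    y <- rbinom(1000, 1, plogis(-1 + 0.8 * x))            # binary target, increasing in x
    iso <- isoreg(x, y)                                   # isotonic (non-decreasing) fit
    fit <- numeric(length(x)); fit[iso$ord] <- iso$yf     # fitted values in original order
    breaks <- c(-Inf, sort(unique(tapply(x, fit, max))))  # block upper bounds as cut points
    bins <- cut(x, breaks = breaks)
    tapply(y, bins, mean)                                 # monotone target rate per candidate bin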
User-friendly package for reporting replicability-analysis methods, appended to a meta-analysis summary. The replicability-analysis output provides an assessment of the investigated intervention, quantifying effect replicability and the consistency of findings. For fixed-effect and random-effects meta-analyses it provides: the r(u)-value; lower bounds on the number of studies with a replicated positive and/or negative effect; detection of inconsistent signals; and forest plots with a summary of the replicability-analysis results. Replicability analysis can be performed with or without the common-effect assumption.
This package is deprecated; please use redatamx instead. Provides an API to work with Redatam (see <https://redatam.org>) databases in both formats, RXDB (new format) and DICX (old format), and to run Redatam programs written in the SPC language. It is a wrapper around the Redatam core and provides functions to open/close a database (redatam_open()/redatam_close()), list entities and variables from the database (redatam_entities(), redatam_variables()), and execute an SPC program and get the results as data frames (redatam_query(), redatam_run()).
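A minimal usage sketch built only from the functions named above; the package name in library(), the database path, the entity name, and the argument structure are assumptions.

    library(redatam)                          # package name assumed from the description
    db <- redatam_open("census.rxdb")         # open an RXDB (or DICX) database; path is a placeholder
    redatam_entities(db)                      # list the entities in the database
    redatam_variables(db, "person")           # list variables of an entity (signature assumed)
    # redatam_query(db, spc_program)          # run an SPC program, returning a data frame
    redatam_close(db)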
Similarity plots based on correlation and median absolute deviation (MAD); color adjustment for heatmaps; aggregation of technical replicates; calculation of pairwise fold-changes and log fold-changes; one- and two-way ANOVA; a simplified interface to the limma package (Ritchie et al. (2015) <doi:10.1093/nar/gkv007>) for moderated t-tests and one-way ANOVA; Hamming and Levenshtein (edit) distances of strings, as well as optimal alignment scores for global (Needleman-Wunsch) and local (Smith-Waterman) alignments with constant gap penalties (Merkl and Waack (2009), ISBN:978-3-527-32594-8).