Machine Learning models are widely used and have various applications in classification or regression. Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance, but such black-box models usually lack interpretability. The DALEX package contains various explainers that help to understand the link between input variables and model output.
This package provides tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bezier lines with arrows complementing the ones in the grid package, and more.
ggdag is built on top of dagitty, an R package that uses the DAGitty web tool for creating and analyzing DAGs. ggdag makes it easy to tidy and plot dagitty objects using ggplot2 and ggraph, and provides common analytic and graphical functions, such as determining adjustment sets and node relationships.
Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses make it possible not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.
An R implementation of the mCOPA package published by Wang et al. (2012). oppar provides methods for Cancer Outlier Profile Analysis. Although initially developed to detect outlier genes in cancer studies, the methods presented in oppar can be used for outlier profile analysis in general. In addition, tools are provided for gene set enrichment and pathway analysis.
The package provides `rlang` data masks for the SummarizedExperiment class. This enables the evaluation of unquoted expressions in different contexts of the SummarizedExperiment object, with optional access to other contexts. The goal for `plyxp` is for evaluation to feel like working with a data.frame object without ever needing to unwind to a rectangular data.frame.
The method models RNA-seq reads using a mixture of 3 beta-binomial distributions to generate posterior probabilities for genotyping bi-allelic single nucleotide polymorphisms. Elena Vigorito, Anne Barton, Costantino Pitzalis, Myles J. Lewis and Chris Wallace (2023) <doi:10.1093/bioinformatics/btad393> "BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing".
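As a rough illustration of how a three-component beta-binomial mixture can turn read counts into genotype posterior probabilities, here is a minimal Python sketch. It is not the package's implementation: the function names, the component shape parameters, and the mixture weights are all hypothetical values chosen for the example, and the parameters are treated as fixed rather than estimated.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, a, b):
    """Log-probability of k successes in n trials under BetaBinomial(n, a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def genotype_posteriors(alt_reads, total_reads, components, weights):
    """Posterior probability of each mixture component given the read counts.

    components: list of (a, b) shape parameters, one per genotype.
    weights: prior mixture weights, same length as components.
    """
    log_terms = [
        math.log(w) + beta_binomial_logpmf(alt_reads, total_reads, a, b)
        for w, (a, b) in zip(weights, components)
    ]
    # Normalise in log space for numerical stability.
    top = max(log_terms)
    unnormalised = [math.exp(t - top) for t in log_terms]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical parameters: homozygous reference, heterozygous, homozygous alternative.
components = [(1.0, 50.0), (20.0, 20.0), (50.0, 1.0)]
weights = [0.5, 0.3, 0.2]

post = genotype_posteriors(alt_reads=9, total_reads=20,
                           components=components, weights=weights)
```

With these illustrative parameters, a site with 9 alternative reads out of 20 gets most of its posterior mass on the heterozygous component, while a site with 0 alternative reads favours the homozygous reference component.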
This package provides functions for cobin and micobin regression models, a new family of generalized linear models for continuous proportional data (Y in the closed unit interval [0, 1]). It also includes an exact, efficient sampler for the Kolmogorov-Gamma random variable. For details, see Lee et al. (2025+) <doi:10.48550/arXiv.2504.15269>.
This package provides Portuguese translations of the following datasets: airlines, airports, ames_raw, AwardsManagers, babynames, Batting, diamonds, faithful, fueleconomy, Fielding, flights, gapminder, gss_cat, iris, Managers, mpg, mtcars, atmos, penguins, People, Pitching, pixarfilms, planes, presidential, table1, table2, table3, table4a, table4b, table5, vehicles, weather, who.
This package provides functions to compute state-specific and marginal life expectancies. The computation is based on a fitted continuous-time multi-state model that includes an absorbing death state; see Van den Hout (2017, ISBN:9781466568402). The fitted multi-state model should be estimated with the msm package, using age as the time scale.
Given a set of parameters describing model dynamics and a corresponding cost function, FAMoS performs a dynamic forward-backward model selection on a specified selection criterion. It also applies a non-local swap search method. Works on any cost function. For detailed information see Gabel et al. (2019) <doi:10.1371/journal.pcbi.1007230>.
Analysis of Bayesian adaptive enrichment clinical trials using the Free-Knot Bayesian Model Averaging (FK-BMA) method of Maleyeff et al. (2024) for Gaussian data. Maleyeff, L., Golchi, S., Moodie, E. E. M., & Hudson, M. (2024) "An adaptive enrichment design using Bayesian model averaging for selection and threshold-identification of predictive variables" <doi:10.1093/biomtc/ujae141>.
Wrapper for computing parameters for univariate distributions using MLE. It creates an object that stores d, p, q, r functions as well as parameters and statistics for diagnostics. Currently supports automated fitting of distributions from the base and actuar packages. A manual distribution-fitting function is included to support directly specifying parameters for any distribution from ancillary packages.
This package provides optimized C++ code for computing the partial Receiver Operating Characteristic (ROC) test used in niche and species distribution modeling. The implementation follows Peterson et al. (2008) <doi:10.1016/j.ecolmodel.2007.11.008>. Parallelization via OpenMP was implemented with assistance from the DeepSeek Artificial Intelligence Assistant (<https://www.deepseek.com/>).
Utilizing Generative Artificial Intelligence models like GPT-4 and Gemini Pro as coding and writing assistants for R users. Through these models, GenAI offers a variety of functions, encompassing text generation, code optimization, natural language processing, chat, and image interpretation. The goal is to aid R users in streamlining laborious coding and language processing tasks.
Apply an adaptation of the SuperFastHash algorithm to any R object. Hash whole R objects or, for vectors or lists, hash R objects to obtain a set of hash values that is stored in a structure equivalent to the input. See <http://www.azillionmonkeys.com/qed/hash.html> for a description of the hash algorithm.
These are data and functions to support quantitative peace science research. The data are important state-year information on democracy and wealth, which require periodic updates and regular maintenance. The functions permit some exploratory and diagnostic assessment of the kinds of data in demand by the community, but do not impose many dependencies on the user.
This package provides tools for analysing multivariate time series with wavelets. This includes: simulation of a multivariate locally stationary wavelet (mvLSW) process from a multivariate evolutionary wavelet spectrum (mvEWS); estimation of the mvEWS, local coherence and local partial coherence. See Park, Eckley and Ombao (2014) <doi:10.1109/TSP.2014.2343937> for details.
This package provides tools for data analysis with multivariate Bayesian structural time series (MBSTS) models. Specifically, the package provides facilities for implementing general structural time series models, flexibly adding on different time series components (trend, season, cycle, and regression), simulating them, fitting them to multivariate correlated time series data, and conducting feature selection on the regression component.
An S4 implementation of the unbiased extension of the model-assisted synthetic-regression estimator proposed by Mandallaz (2013) <DOI:10.1139/cjfr-2012-0381>, Mandallaz et al. (2013) <DOI:10.1139/cjfr-2013-0181> and Mandallaz (2014) <DOI:10.1139/cjfr-2013-0449>. It yields smaller variances than the standard bias correction, the generalised regression estimator.
Multiple imputation using XGBoost, subsampling, and predictive mean matching as described in Deng and Lumley (2024) <doi:10.1080/10618600.2023.2252501>. The package supports various types of variables, offers flexible settings, and enables saving an imputation model to impute new data. Data processing and memory usage have been optimised to speed up the imputation process.
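To make the predictive mean matching step concrete, here is a minimal Python sketch of the general technique. It is not this package's code: the function name `pmm_impute`, its parameters, and the toy numbers are hypothetical, and the model predictions are assumed to be supplied as plain lists rather than produced by XGBoost.

```python
import random


def pmm_impute(pred_observed, y_observed, pred_missing, k=5, rng=None):
    """Impute each missing value by sampling an observed value from a donor pool.

    pred_observed: model predictions for rows where the target is observed.
    y_observed:    the observed target values for those same rows.
    pred_missing:  model predictions for rows where the target is missing.
    k:             number of nearest donors to sample from.
    """
    rng = rng or random.Random()
    imputed = []
    for p in pred_missing:
        # Rank observed rows by how close their prediction is to this one.
        order = sorted(range(len(pred_observed)),
                       key=lambda i: abs(pred_observed[i] - p))
        donors = order[:k]
        # Borrow the actual observed value of a randomly chosen donor.
        imputed.append(y_observed[rng.choice(donors)])
    return imputed


# Toy data (hypothetical values).
pred_observed = [1.0, 2.1, 2.9, 4.2, 5.0]
y_observed = [1.2, 1.9, 3.1, 4.0, 5.3]
pred_missing = [2.0, 4.8]

imputed = pmm_impute(pred_observed, y_observed, pred_missing,
                     k=2, rng=random.Random(0))
```

Because every imputed value is copied from an observed row, the imputations always stay within the range of values actually seen in the data, which is the main appeal of matching over using the raw predictions.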
Permutation-based non-parametric analysis of CRISPR screen data. The algorithm is described in Jia et al. (2017) <doi:10.1186/s12864-017-3938-5>, "A permutation-based non-parametric analysis of CRISPR screen data", published in BMC Genomics. Please cite this paper if you use the algorithm in your work.
This package creates images that are the proper size for social media. Beautiful plots, charts and graphs wither and die if they are not shared. Social media is perfect for this, but every platform has its own image dimensions. With smpic you can easily save your plots with the exact dimensions needed for the different platforms.
The spork syntax describes label formatting concisely, supporting mixed nesting of subscripts and superscripts to arbitrary depth. It is intended to be easy to read and write in plain text, and easy to convert to equivalent presentations in plotmath, latex, and html. Greek symbols and a multiplication symbol are explicitly supported. See ?as_spork and ?as_previews.