Requires a rooted phylogeny as input and creates a table of genera, their monophyly status, which taxa cause problems in monophyly, etc. Different information can be extracted from the output, and a plot function allows visualization of the results in a number of ways. "MonoPhy: a simple R package to find and visualize monophyly issues." Schwery, O. & O'Meara, B.C. (2016) <doi:10.7717/peerj-cs.56>.
Analyzes longitudinal clinical data from Electronic Health Records (EHRs) using linear mixed models (LMM) and visualizes the results as networks. It includes functions for fitting LMM, normalizing adjacency matrices, and comparing networks. The package is designed for researchers in clinical and biomedical fields who need to model longitudinal data and explore relationships between variables. For more details see Bates et al. (2015) <doi:10.18637/jss.v067.i01>.
Enhances mlexperiments <https://CRAN.R-project.org/package=mlexperiments> with additional machine learning ('ML') learners. The package provides R6-based learners for the following algorithms: glmnet <https://CRAN.R-project.org/package=glmnet>, ranger <https://CRAN.R-project.org/package=ranger>, xgboost <https://CRAN.R-project.org/package=xgboost>, and lightgbm <https://CRAN.R-project.org/package=lightgbm>. These can be used directly with the mlexperiments R package.
This package provides a compilation of functions to create visually appealing and information-rich plots of meta-analytic data using ggplot2. It currently supports forest plots, funnel plots, and many of their variants, such as rainforest plots, thick forest plots, additional evidence contour funnel plots, and sunset funnel plots. In addition, functionalities for visual inference with the funnel plot in the context of meta-analysis are provided.
An open-source implementation of latent variable methods and multivariate modeling tools. The focus is on exploratory analyses using dimensionality reduction methods including low dimensional embedding, classical multivariate statistical tools, and tools for enhanced interpretation of machine learning methods (i.e. intelligible models to provide important information for end-users). Target domains include extension to dedicated applications e.g. for manufacturing process modeling, spectroscopic analyses, and data mining.
Generates design matrix for analysing real paired comparisons and derived paired comparison data (Likert type items/ratings or rankings) using a loglinear approach. Fits loglinear Bradley-Terry model (LLBT) exploiting an eliminate feature. Computes pattern models for paired comparisons, rankings, and ratings. Some treatment of missing values (MCAR and MNAR). Fits latent class (mixture) models for paired comparison, rating and ranking patterns using a non-parametric ML approach.
Fits and evaluates three-state partitioned survival analyses (PartSAs) and Markov models (clock forward or clock reset) to progression and overall survival data typically collected in oncology clinical trials. These model structures are typically considered in cost-effectiveness modeling in advanced/metastatic cancer indications. Muston (2024). "Informing structural assumptions for three state oncology cost-effectiveness models through model efficiency and fit". Applied Health Economics and Health Policy.
This package implements sparse regression with paired covariates (<doi:10.1007/s11634-019-00375-6>). The paired lasso is designed for settings where each covariate in one set forms a pair with a covariate in the other set (one-to-one correspondence). For the optional correlation shrinkage, install ashr (<https://github.com/stephens999/ashr>) and CorShrink (<https://github.com/kkdey/CorShrink>) from GitHub (see README).
This package provides a Shiny input widget, pasteBoxInput, that allows users to paste images directly into a Shiny application. The pasted images are captured as Base64 encoded strings and can be used within the application for various purposes, such as display or further processing. This package is particularly useful for applications that require easy and quick image uploads without the need for traditional file selection dialog boxes.
Performs analysis of variance when the experimental units are spatially correlated. There are two methods to deal with spatial dependence: spatial autoregressive models (see Rossoni, D. F., & Lima, R. R. (2019) <doi:10.28951/rbb.v37i2.388>) and geostatistics (see Pontes, J. M., & Oliveira, M. S. D. (2004) <doi:10.1590/S1413-70542004000100018>). For both methods, three multiple comparison procedures are available: Tukey, multivariate T, and Scott-Knott.
The Wordle game. Players have six attempts to guess a five-letter word. After each guess, the player is informed which letters in their guess are either: anywhere in the word; in the right position in the word. This can be used to inform the next guess. Can be played interactively in the console, or programmatically. Based on Josh Wardle's game <https://www.powerlanguage.co.uk/wordle/>.
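The feedback rule described above can be sketched in a few lines. This is a minimal illustration in Python, not the package's implementation: the function name `score_guess` and the `G`/`Y`/`-` output encoding are assumptions made for the example, and the handling of repeated letters follows the commonly used convention rather than anything stated in the description.

```python
from collections import Counter


def score_guess(guess: str, target: str) -> str:
    """Score a guess against the target word (hypothetical helper).

    Returns one character per letter of the guess:
    'G' = letter is in the right position,
    'Y' = letter is elsewhere in the word,
    '-' = letter is not in the word (or all its copies are already matched).
    """
    if len(guess) != len(target):
        raise ValueError("guess and target must have the same length")

    feedback = ["-"] * len(guess)
    unmatched = Counter()  # target letters not matched in the right position

    # First pass: mark letters in the right position.
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            feedback[i] = "G"
        else:
            unmatched[t] += 1

    # Second pass: mark letters that occur elsewhere, consuming each
    # unmatched target letter at most once.
    for i, g in enumerate(guess):
        if feedback[i] == "-" and unmatched[g] > 0:
            feedback[i] = "Y"
            unmatched[g] -= 1

    return "".join(feedback)


print(score_guess("speed", "abide"))
```

The two-pass structure is a design choice: exact matches are claimed first so that a repeated letter in the guess is not reported as "elsewhere" when its only copy in the target is already accounted for.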
Quickly generates summary measures and graphs for discrete or continuous numerical data in simple series. It also creates classic frequency tables and graphs for the analysis of grouped series. It is intended for teaching an introductory Biostatistics course using the R software, aimed at undergraduate programs and other educational offerings of the Faculty of Agricultural Sciences at the National University of Jujuy (UNJu).
This package provides functions to identify Homozygous-by-Descent (HBD) segments associated with runs of homozygosity (ROH) and to estimate individual autozygosity (or inbreeding coefficient). HBD segments and autozygosity are assigned to multiple HBD classes with a model-based approach relying on a mixture of exponential distributions. The rate of the exponential distribution is distinct for each HBD class and defines the expected length of the HBD segments. These HBD classes are therefore related to the age of the segments (longer segments and smaller rates correspond to more recent autozygosity, i.e. a more recent common ancestor). The functions estimate the parameters of the model (rates of the exponential distributions, mixing proportions), estimate global and local autozygosity probabilities, and identify HBD segments with Viterbi decoding. The method is fully described in Druet and Gautier (2017) <doi:10.1111/mec.14324> and Druet and Gautier (2022) <doi:10.1016/j.tpb.2022.03.001>.
LLVM is a compiler infrastructure designed for compile-time, link-time, runtime, and idle-time optimization of programs from arbitrary programming languages. It currently supports compilation of C and C++ programs, using front-ends derived from GCC 4.0.1. A new front-end for the C family of languages is in development. The compiler infrastructure includes mirror sets of programming tools as well as libraries with equivalent functionality.
Combines results for meta-analysis using significance and effect size only. P-values and fold-changes are combined to obtain a global significance for each metabolite. Produces a volcano plot summarising the relevant results from the meta-analysis, vote-counting reports for metabolites, and an explore plot to detect discrepancies between studies at a glance. The methodology is described in Llambrich et al. (2021) <doi:10.1093/bioinformatics/btab591>.
Twelve confidence intervals for one binomial proportion or a vector of binomial proportions are computed. The confidence intervals are: Jeffreys, Wald, Wald corrected, Wald, Blyth and Still, Agresti and Coull, Wilson, Score, Score corrected, Wald logit, Wald logit corrected, Arcsine and Exact binomial. References include, among others: Vollset, S. E. (1993). "Confidence intervals for a binomial proportion". Statistics in Medicine, 12(9): 809-824. <doi:10.1002/sim.4780120902>.
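To make the difference between two of the listed intervals concrete, here is a small Python sketch of the textbook Wald interval and the textbook Wilson score interval for a single proportion. It is an illustration under stated assumptions, not the package's code: the function names are hypothetical, and the package's own "Wilson" and "Score" variants may differ in detail from the formulas used here.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Textbook Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(x: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Textbook Wilson score interval for x successes in n trials."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Compare the two intervals for 5 successes in 10 trials.
print(wald_interval(5, 10))
print(wilson_interval(5, 10))
```

Unlike the Wald interval, the Wilson interval stays inside [0, 1] and does not collapse to a single point when x is 0 or n, which is one reason several alternatives to Wald are usually offered side by side.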
This package provides a first-principle, phylogeny-aware comparative genomics tool for investigating associations between terms used to annotate genomic components (e.g., Pfam IDs, Gene Ontology terms) with quantitative or rank variables such as number of cell types, genome size, or density of specific genomic elements. See the project website for more information, documentation and examples, and <doi:10.1016/j.patter.2023.100728> for the full paper.
An intuitive, cross-platform graphical data analysis system. It uses menus and dialogs to guide the user efficiently through the data manipulation and analysis process, and has an Excel-like spreadsheet for easy data frame visualization and editing. Deducer works best when used with the Java-based R GUI JGR, but the dialogs can be called from the command line. Dialogs have also been integrated into the Windows Rgui.
Spatial analyses involving binning require that every bin have the same area, but this is impossible using a rectangular grid laid over the Earth or over any projection of the Earth. Discrete global grids use hexagons, triangles, and diamonds to overcome this issue, overlaying the Earth with equally-sized bins. This package provides utilities for working with discrete global grids, along with utilities to aid in plotting such data.
This package provides a collection of machine learning helper functions, particularly assisting in the Exploratory Data Analysis phase. Makes heavy use of the data.table package for optimal speed and memory efficiency. Highlights include a versatile bin_data() function, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical Multivariate Cumulative Distribution Functions.
This package provides tools to help convert credit risk data at two timepoints into traditional credit state migration (aka, "transition") matrices. At a higher level, migrate is intended to help an analyst understand how risk moved in their credit portfolio over a time interval. References to this methodology include: 1. Schuermann, T. (2008) <doi:10.1002/9780470061596.risk0409>. 2. Perederiy, V. (2017) <doi:10.48550/arXiv.1708.00062>.
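As a rough illustration of what a migration matrix built from two timepoints looks like, the Python sketch below counts how many accounts move from each starting state to each ending state and normalises every row to proportions. It is a generic sketch, not the package's API: the function name `migration_matrix`, its arguments, and the sample ratings are all assumptions made for the example.

```python
from collections import Counter


def migration_matrix(start_states, end_states, states):
    """Row-normalised migration matrix (hypothetical helper).

    start_states[i] and end_states[i] are the credit states of account i
    at the first and second timepoint. Row r, column c of the result is
    the share of accounts starting in states[r] that end in states[c].
    """
    if len(start_states) != len(end_states):
        raise ValueError("both timepoints must cover the same accounts")

    counts = Counter(zip(start_states, end_states))
    matrix = []
    for s in states:
        row_total = sum(counts[(s, e)] for e in states)
        # A state with no accounts at the first timepoint gets a zero row.
        matrix.append(
            [counts[(s, e)] / row_total if row_total else 0.0 for e in states]
        )
    return matrix


# Made-up ratings for five accounts at two timepoints.
start = ["A", "A", "A", "B", "B"]
end = ["A", "A", "B", "B", "A"]
for state, row in zip(["A", "B"], migration_matrix(start, end, ["A", "B"])):
    print(state, row)
```

Weighting by count is only one option; a portfolio analysis might instead weight each account by exposure, which would replace the counts with summed balances before normalising.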
An implementation of 14 parsimonious mixture models for model-based clustering or model-based classification. Gaussian, Student's t, generalized hyperbolic, variance-gamma or skew-t mixtures are available. All approaches work with missing data. Celeux and Govaert (1995) <doi:10.1016/0031-3203(94)00125-6>, Browne and McNicholas (2014) <doi:10.1007/s11634-013-0139-1>, Browne and McNicholas (2015) <doi:10.1002/cjs.11246>.
Format numbers and plots for publication; includes the removal of leading zeros, standardization of number of digits, addition of affixes, and a p-value formatter. These tools combine the functionality of several base functions such as paste(), format(), and sprintf() into specific use case functions that are named in a way that is consistent with usage, making their names easy to remember and easy to deploy.
Simulate DNA sequences for the node substitution model. In the node substitution model, substitutions accumulate additionally during a speciation event, providing a potential mechanistic explanation for substitution rate variation. This package provides tools to simulate such a process, simulate a reference process with only substitutions along the branches, and provides tools to infer phylogenies from alignments. More information can be found in Janzen (2021) <doi:10.1093/sysbio/syab085>.