Add mean comparison annotations to a ggplot. This package provides an easy way to indicate whether two or more groups are significantly different in a ggplot. Usually you do not need to specify the test method: you only tell stat_compare() whether to perform a parametric or a nonparametric test, and stat_compare() automatically chooses the appropriate test based on your data. For comparisons between two groups, the p-value is calculated by a t-test (parametric) or a Wilcoxon rank-sum test (nonparametric). For comparisons among more than two groups, the p-value is calculated by one-way ANOVA (parametric) or the Kruskal-Wallis test (nonparametric).
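A minimal sketch of the usage just described, assuming stat_compare() behaves as a regular ggplot2 layer; whether (and how) the parametric/nonparametric choice is passed as an argument is an assumption:

    # After loading the package that provides stat_compare().
    library(ggplot2)

    # Three dose groups, so the annotation would come from one-way ANOVA or the
    # Kruskal-Wallis test depending on the chosen test family.
    ggplot(ToothGrowth, aes(x = factor(dose), y = len)) +
      geom_boxplot() +
      stat_compare()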
Raman and (FT)IR spectral analysis tool for plastic particles and other environmental samples (Cowger et al. 2021, <doi:10.1021/acs.analchem.1c00123>). With read_any(), Open Specy provides a single function for reading individual, batch, or map spectral data files such as .asp, .csv, .jdx, .spc, .spa, .0, and .zip. process_spec() simplifies spectral processing, including smoothing, baseline correction, range restriction and flattening, intensity conversions, wavenumber alignment, and min-max normalization. Spectra can be identified in batch against an onboard reference library (Cowger et al. 2020, <doi:10.1177/0003702820929064>) with match_spec(). A Shiny app is available via run_app() or online at <https://openanalysis.org/openspecy/>.
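A sketch of that workflow built from the functions named above; the file name, the reference-library helper, and the arguments passed to match_spec() are assumptions:

    library(OpenSpecy)

    spec <- read_any("sample_spectrum.csv")   # single, batch, or map file
    proc <- process_spec(spec)                # smoothing, baseline correction, etc.

    # How the onboard reference library is loaded and passed is assumed here.
    lib <- load_lib()
    ids <- match_spec(proc, library = lib)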
Analysis tools to investigate changes in intercellular communication from scRNA-seq data. Using a Seurat object as input, the package infers which cell-cell interactions are present in the dataset and how these interactions change between two conditions of interest (e.g. young vs old). It relies on an internal database of ligand-receptor interactions (available for human, mouse and rat) that have been gathered from several published studies. Detection and differential analyses rely on permutation tests. The package also contains several tools to perform over-representation analysis and visualize the results. See Lagger, C. et al. (2023) <doi:10.1038/s43587-023-00514-x> for a full description of the methodology.
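A hypothetical sketch of that detection-and-differential workflow; the entry-point function and its argument names are assumptions, not the package's documented interface:

    # seurat_obj: a Seurat object whose metadata contains a cell-type column and a
    # condition column with the two groups of interest (e.g. "young" vs "old").
    # run_interaction_analysis() and its arguments are assumed names.
    result <- run_interaction_analysis(
      seurat_object    = seurat_obj,
      LRI_species      = "mouse",        # species of the ligand-receptor database
      celltype_column  = "cell_type",
      condition_column = "age_group"
    )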
The extrafont package makes it easier to use fonts other than the basic PostScript fonts that R uses. Fonts that are imported into extrafont can be used with PDF or PostScript output files. There are two hurdles to using fonts in PDF (or PostScript) output files:
Making R aware of the font and the dimensions of the characters.
Embedding the fonts in the PDF file so that the PDF can be displayed properly on a device that doesn't have the font. This is usually needed if you want to print the PDF file or share it with others.
The extrafont package makes both of these things easier.
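A typical workflow covering both hurdles with font_import(), loadfonts(), and embed_fonts(); the font family used here is an assumption about what is installed on the system:

    library(extrafont)

    font_import()               # scan and import system fonts (run once; slow)
    loadfonts(device = "pdf")   # register the imported fonts with the pdf() device

    pdf("plot.pdf", family = "Arial")   # "Arial" assumed to be among the imported fonts
    plot(cars, main = "Stopping distance vs speed")
    dev.off()

    # Embed the font in the PDF so it displays correctly on devices without the
    # font installed; requires Ghostscript.
    embed_fonts("plot.pdf", outfile = "plot-embedded.pdf")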
This package provides a novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. The framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation. The original AutoScore structure is described in the research paper <doi:10.2196/21798>. A full tutorial can be found at <https://nliulab.github.io/AutoScore/>. Users or clinicians can seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.
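A sketch of how the six modules might map onto a call sequence; the data layout (train/validation/test data frames with a binary outcome column) and the exact function arguments are assumptions rather than the package's precise interface:

    library(AutoScore)

    ranking <- AutoScore_rank(train_set)                   # variable ranking
    vars    <- names(ranking)[1:6]                         # keep the top-ranked variables
    cut_vec <- AutoScore_weighting(train_set, validation_set,
                                   final_variables = vars)              # score derivation
    scores  <- AutoScore_fine_tuning(train_set, validation_set,
                                     final_variables = vars,
                                     cut_vec = cut_vec)                 # score fine-tuning
    AutoScore_testing(test_set, final_variables = vars,
                      cut_vec = cut_vec, scoring_table = scores)        # performance evaluation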
Enables off-the-shelf functionality for fully Bayesian, nonstationary Gaussian process modeling. The approach to nonstationary modeling involves a closed-form, convolution-based covariance function with spatially-varying parameters; these parameter processes can be specified either deterministically (using covariates or basis functions) or stochastically (using approximate Gaussian processes). Stationary Gaussian processes are a special case of our methodology, and we furthermore implement approximate Gaussian process inference to account for very large spatial data sets (Finley et al. (2017) <arXiv:1702.00434v2>). Bayesian inference is carried out using Markov chain Monte Carlo methods via the nimble package, and posterior prediction for the Gaussian process at unobserved locations is provided as a post-processing step.
This package provides a toolbox for implementing the Ecological Dynamic Regime framework (Sánchez-Pinillos et al., 2023 <doi:10.1002/ecm.1589>) to characterize and compare groups of ecological trajectories in multidimensional spaces defined by state variables. The package includes the RETRA-EDR algorithm to identify representative trajectories, functions to generate, summarize, and visualize representative trajectories, and several metrics to quantify the distribution and heterogeneity of trajectories in an ecological dynamic regime and quantify the dissimilarity between two or more ecological dynamic regimes. The package also includes a set of functions to assess ecological resilience based on ecological dynamic regimes (Sánchez-Pinillos et al., 2024 <doi:10.1016/j.biocon.2023.110409>).
Training and prediction functions for Single Hidden-layer Feedforward Neural Networks (SLFN) using the Extreme Learning Machine (ELM) algorithm. The ELM algorithm differs from traditional gradient-based algorithms in its very short training times (it needs no iterative tuning, which makes learning very fast), and there is no need to set parameters such as learning rate, momentum, or number of epochs. This is a reimplementation of the elmNN package using RcppArmadillo, written after the elmNN package was archived. For more information, see "Extreme learning machine: Theory and applications" by Guang-Bin Huang, Qin-Yu Zhu, Chee-Kheong Siew (2006), Elsevier B.V., <doi:10.1016/j.neucom.2005.12.126>.
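A small regression sketch; elm_train() and elm_predict() with a hidden-layer size and activation function are assumptions based on the elmNN-style interface described above:

    library(elmNNRcpp)

    x <- as.matrix(iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")])
    y <- matrix(iris$Sepal.Length, ncol = 1)

    # Single hidden layer with 20 neurons and a sigmoid activation; no learning
    # rate, momentum, or epochs to set, since the output weights are solved directly.
    fit  <- elm_train(x, y, nhid = 20, actfun = "sig")
    pred <- elm_predict(fit, newdata = x)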
This package provides an elastic net penalized maximum likelihood estimator for structural equation models (SEM). The package implements `lasso` and `elastic net` (l1/l2) penalized SEM and estimates the model parameters with an efficient block coordinate ascent algorithm that maximizes the penalized likelihood of the SEM. Hyperparameters are inferred from cross-validation (CV). A Stability Selection (STS) function is also available to provide accurate causal effect selection. The software achieves high accuracy through a `Network Generative Pre-trained Transformer` (Network GPT) framework with two steps: 1) pre-train the model to generate a complete (fully connected) graph; and 2) use the complete graph as the initial state to fit the `elastic net` penalized SEM.
Launch an application with a simple click, without opening R or RStudio. The package has three functions, of which only one is essential: `shiny.exe()`. It generates a script in the open Shiny project, then creates a shortcut in the same folder that lets you launch the app by clicking. If you set `host = "public"`, the application is launched on the public server to which you are connected, so all other devices connected to the same server can access the application through the link of your `IPv4` address extended by the port. You can stop the application by closing the terminal opened by the shortcut.
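A minimal sketch, run from the console after loading the package inside an open Shiny project; the value passed to host follows the description above and is otherwise an assumption:

    # Creates the launch script and a clickable shortcut in the project folder.
    shiny.exe()

    # Serve on the public/server address so other devices on the same network can
    # reach the app via your IPv4 address plus the port.
    shiny.exe(host = "public")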
This is a data only package providing the algorithmic complexity of short strings, computed using the coding theorem method. For a given set of symbols in a string, all possible or a large number of random samples of Turing machines with a given number of states (e.g., 5) and number of symbols corresponding to the number of symbols in the strings were simulated until they reached a halting state or failed to end. This package contains data on 4.5 million strings from length 1 to 12 simulated on Turing machines with 2, 4, 5, 6, and 9 symbols. The complexity of the string corresponds to the distribution of the halting states.
Enables users to evaluate long-term trends using a Generalized Additive Modeling (GAM) approach. The model development includes selecting a GAM structure to describe nonlinear, seasonally-varying changes over time, incorporating hydrologic variability via either river flow or salinity, using an intervention term to deal with method or laboratory changes suspected to impact data values, and representing left- and interval-censored data. The approach has been applied to water quality data in the Chesapeake Bay, a major estuary on the east coast of the United States, to provide insights into a range of management- and research-focused questions. Methodology described in Murphy (2019) <doi:10.1016/j.envsoft.2019.03.027>.
This package provides functions for generating tables required for drawing and calculating hypsometric curves and hypsometric integrals. These functions accept as input the DEM of the region of interest (your watershed) and a spatial data frame file specifying delineation of sub-catchments within the watershed. They then generate output in the form of PNG images and HTML files contained in a folder named "HYPSO_OUTPUT" created in the current directory. S. K. Sharma, S. Gajbhiye, et al. (2018) <doi:10.1007/978-981-10-5801-1_19>. Omvir Singh, A. Sarangi, and Milap C. Sharma (2006) <doi:10.1007/s11269-008-9242-z>. James A. Vanderwaal and Herbert Ssegane (2013) <doi:10.1111/jawr.12089>.
Fits Bayesian dose-response model-based network meta-analysis (MBNMA) models that incorporate multiple doses within an agent by modelling different dose-response functions, as described by Mawdsley et al. (2016) <doi:10.1002/psp4.12091>. By modelling dose-response relationships, MBNMA can connect networks of evidence that might otherwise be disconnected and can improve precision on treatment estimates. Several common dose-response functions are provided; others may be added by the user. Various characteristics and assumptions can be flexibly added to the models, such as shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting at the treatment level.
This package provides functions used to fit and test the phenology of species based on counts. Based on Girondot, M. (2010) <doi:10.3354/esr00292> for the phenology function, Girondot, M. (2017) <doi:10.1016/j.ecolind.2017.05.063> for the convolution of negative binomial, Girondot, M. and Rizzo, A. (2015) <doi:10.2993/etbi-35-02-337-353.1> for Bayesian estimate, Pfaller JB, ..., Girondot M (2019) <doi:10.1007/s00227-019-3545-x> for tag-loss estimate, Hancock J, ..., Girondot M (2019) <doi:10.1016/j.ecolmodel.2019.04.013> for nesting history, Laloe J-O, ..., Girondot M, Hays GC (2020) <doi:10.1007/s00227-020-03686-x> for aggregating several seasons.
Systematic 3D interaction calls and differential analysis for Hi-C and HiChIP. The HiC-DC+ (Hi-C/HiChIP direct caller plus) package enables principled statistical analysis of Hi-C and HiChIP data sets, including calling significant interactions within a single experiment and performing differential analysis between conditions given replicate experiments, to facilitate global integrative studies. HiC-DC+ estimates significant interactions in a Hi-C or HiChIP experiment directly from the raw contact matrix for each chromosome up to a specified genomic distance, binned by uniform genomic intervals or restriction enzyme fragments, by training a background model to account for random polymer ligation and systematic sources of read count variation.
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. decoupleR is a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge: for example, gene sets regulated by a transcription factor in transcriptomics, or phosphosites targeted by a kinase in phospho-proteomics.
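A toy sketch of scoring a feature-by-sample matrix against a prior-knowledge network with decoupleR; the choice of run_ulm() and the column conventions (source, target, mor) are assumptions about the interface:

    library(decoupleR)

    # Feature-by-sample matrix (e.g. gene expression) and a signed, weighted network
    # linking sources (e.g. transcription factors) to target features.
    mat <- matrix(rnorm(50 * 4), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("sample", 1:4)))
    net <- data.frame(source = rep(c("TF1", "TF2"), each = 10),
                      target = paste0("gene", 1:20),
                      mor    = 1)

    # Univariate linear model scores per source and sample; regulon sizes here are
    # kept above typical minimum-size filters.
    activities <- run_ulm(mat, net)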
Dual Scaling, developed by Professor Shizuhiko Nishisato (1994, ISBN: 0-9691785-3-6), is a fundamental technique in multivariate analysis used for data scaling and correspondence analysis. Its utility lies in its ability to represent multidimensional data in a lower-dimensional space, making it easier to visualize and understand underlying patterns in complex data. This technique has been implemented to handle various types of data, including Contingency and Frequency data (CF), Multiple-Choice data (MC), Sorting data (SO), Paired-Comparison data (PC), and Rank-Order data (RO), providing users with a powerful tool to explore relationships between variables and observations in various fields, from sociology to ecology, enabling deeper and more efficient analysis of multivariate datasets.
This package implements two estimations related to the foundations of info-metrics applied to ecological inference. These methodologies address the lack of disaggregated data and provide an approach to obtaining disaggregated territorial-level data. For more details, see the following references: Fernández-Vázquez, E., Díaz-Dapena, A., Rubiera-Morollón, F. et al. (2020) "Spatial Disaggregation of Social Indicators: An Info-Metrics Approach." <doi:10.1007/s11205-020-02455-z>. Díaz-Dapena, A., Fernández-Vázquez, E., Rubiera-Morollón, F., & Vinuela, A. (2021) "Mapping poverty at the local level in Europe: A consistent spatial disaggregation of the AROPE indicator for France, Spain, Portugal and the United Kingdom." <doi:10.1111/rsp3.12379>.
Package for training interpretable machine learning models. Historically, the most interpretable machine learning models were not very accurate, and the most accurate models were not very interpretable. Microsoft Research has developed an algorithm called the Explainable Boosting Machine (EBM) which has both high accuracy and interpretable characteristics. EBM uses machine learning techniques like bagging and boosting to breathe new life into traditional GAMs (Generalized Additive Models). This makes them as accurate as random forests and gradient boosted trees, and also enhances their intelligibility and editability. Details on the EBM algorithm can be found in the paper by Rich Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad (2015, <doi:10.1145/2783258.2788613>).
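A hedged sketch of fitting an EBM on a small binary-outcome example; the function names (ebm_classify(), ebm_predict_proba()) are assumptions and may differ between package versions:

    library(interpret)

    # Binary outcome and a few numeric predictors from a built-in data set.
    X <- mtcars[, c("mpg", "wt", "hp")]
    y <- mtcars$am

    # Fit an Explainable Boosting Machine and get predicted probabilities; the
    # calls below are assumed names, not a confirmed interface.
    model <- ebm_classify(X, y)
    p     <- ebm_predict_proba(model, X)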
The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi:10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated data, competing risks, recurrent events and multi-state settings. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing, as well as visualization.
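A short sketch of the tidy PAMM workflow: transform the survival data into piece-wise exponential data (PED) format with as_ped() and fit the hazard with a Poisson GAM; the data set and model terms are illustrative assumptions:

    library(pammtools)
    library(mgcv)
    library(survival)   # provides the veteran lung cancer data

    # Split follow-up into intervals and build the PED representation with the
    # event indicator (ped_status), interval end points (tend), and an offset.
    ped <- as_ped(Surv(time, status) ~ age + celltype, data = veteran)

    # Piece-wise exponential additive model: smooth log-baseline hazard over time
    # plus a smooth age effect and a celltype effect.
    pam <- gam(ped_status ~ s(tend) + s(age) + celltype,
               data = ped, family = poisson(), offset = offset)
    summary(pam)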
In the single-cell world, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additionally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a plug-and-play feel and immediately lends itself as flexible and compatible with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can add features to these with ease.
Variable selection for latent class analysis for model-based clustering of multivariate categorical data. The package implements a general framework for selecting the subset of variables with relevant clustering information and discarding those that are redundant and/or not informative. The variable selection method is based on the approach of Fop et al. (2017) <doi:10.1214/17-AOAS1061> and Dean and Raftery (2010) <doi:10.1007/s10463-009-0258-9>. Different algorithms are available to perform the selection: stepwise, swap-stepwise and evolutionary stochastic search. Concomitant covariates used to predict the class membership probabilities can also be included in the latent class analysis model. The selection procedure can be run in parallel on multi-core machines.
This package provides a toolbox to facilitate the calculation of political system indicators for researchers. It offers a variety of basic indicators related to electoral systems, party systems, elections, and parliamentary studies, as well as others. Main references are: Loosemore and Hanby (1971) <doi:10.1017/S000712340000925X>; Gallagher (1991) <doi:10.1016/0261-3794(91)90004-C>; Laakso and Taagepera (1979) <doi:10.1177/001041407901200101>; Rae (1968) <doi:10.1177/001041406800100305>; Hirschman (1945) <ISBN:0-520-04082-1>; Kesselman (1966) <doi:10.2307/1953769>; Jones and Mainwaring (2003) <doi:10.1177/13540688030092002>; Rice (1925) <doi:10.2307/2142407>; Pedersen (1979) <doi:10.1111/j.1475-6765.1979.tb01267.x>; Santos (2002) <ISBN:85-225-0395-8>.