Herein, we provide a broad variety of functions which are useful for handling, manipulating, and visualizing satellite-based remote sensing data. These operations range from mere data import and layer handling (e.g., subsetting), over typical Raster* data wrangling (e.g., crop, extend), to more sophisticated (pre-)processing tasks typically applied to satellite imagery (e.g., atmospheric and topographic correction). This functionality is complemented by full access to the satellite layers' metadata at any stage and by documentation of performed actions in a separate log file. Currently supported sensors include Landsat 4-5 (TM), 7 (ETM+), and 8 (OLI/TIRS combined), and additional compatibility is ensured for the Landsat Global Land Survey data set.
This package provides a collection of functions which (i) assess the quality of variable subsets as surrogates for a full data set, either in exploratory data analysis or in the context of a multivariate linear model, and (ii) search for subsets which are optimal under various criteria. Theoretical support for the heuristic search methods and exploratory data analysis criteria is given in Cadima, Cerdeira, and Minhoto (2003, <doi:10.1016/j.csda.2003.11.001>). Theoretical support for the leaps and bounds algorithm and the criteria for the general multivariate linear model is given in Duarte Silva (2001, <doi:10.1006/jmva.2000.1920>). There is a package vignette, "subselect", which includes additional references.
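A minimal sketch of the exploratory use case (the anneal() arguments and the returned components are assumptions based on the package documentation):

```r
library(subselect)

cm <- cor(swiss)                        # correlation matrix of the candidate variables
res <- anneal(cm, kmin = 2, kmax = 3)   # simulated annealing search for 2- to 3-variable subsets
res$bestvalues                          # best criterion value found for each subset size
res$bestsets                            # indices of the corresponding variables
```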
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see, for example, Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRMs within the tern <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and for tabulating results easily using rtables <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on mmrm <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.
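A minimal sketch of the underlying MMRM fit with the mmrm package, using its bundled fev_data example data set; the simplified formula is illustrative only:

```r
library(mmrm)

# Treatment-by-visit fixed effects with an unstructured covariance per subject
fit <- mmrm(FEV1 ~ ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data)
summary(fit)
```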
This package provides functions to create descriptive statistics tables for continuous and categorical variables. By default, summary statistics such as mean, standard deviation, quantiles, minimum, and maximum for continuous variables and relative and absolute frequencies for categorical variables are calculated. DescrTab2 features a sophisticated algorithm to choose appropriate test statistics for your data and provides p-values. On top of this, confidence intervals for group differences of appropriate summary measures are automatically produced for two-group comparisons. Tables generated by DescrTab2 can be integrated into a variety of document formats, including .html, .tex, and .docx documents. DescrTab2 also allows printing tables to the console and saving table objects for later use.
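A minimal sketch of typical usage (descr() as the main entry point and its group argument are assumptions based on the package documentation):

```r
library(DescrTab2)

tab <- descr(iris, group = "Species")   # summary statistics and tests per Species group
print(tab)                              # console output; .html/.tex/.docx via R Markdown
```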
Models integrate environmental DNA (eDNA) detection data and traditional survey data to jointly estimate species catch rate (see package vignette: <https://ednajoint.netlify.app/>). Models can be used with count data from traditional survey methods (i.e., trapping, electrofishing, visual surveys) and replicated eDNA detection/nondetection data from polymerase chain reaction (i.e., PCR or qPCR) assays at multiple survey locations. Estimated parameters include the probability of a false positive eDNA detection, site-level covariates that scale the sensitivity of eDNA surveys relative to traditional surveys, and catchability coefficients for traditional gear types. Models are implemented in a Bayesian framework (Markov chain Monte Carlo) using the Stan probabilistic programming language.
Add mean comparison annotations to a 'ggplot'. This package provides an easy way to indicate whether two or more groups are significantly different in a 'ggplot'. Usually you do not need to specify the test method; you only need to tell stat_compare() whether you want a parametric or a nonparametric test, and stat_compare() will automatically choose the appropriate test based on your data. For comparisons between two groups, the p-value is calculated by a t-test (parametric) or the Wilcoxon rank-sum test (nonparametric). For comparisons among more than two groups, the p-value is calculated by one-way ANOVA (parametric) or the Kruskal-Wallis test (nonparametric).
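A minimal sketch of intended usage; the package name loaded below and the default behaviour of stat_compare() without arguments are assumptions:

```r
library(ggplot2)
library(ggcompare)   # assumed name of the package providing stat_compare()

ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  stat_compare()     # defaults choose the test; the argument selecting parametric vs.
                     # nonparametric is omitted here because its exact name is uncertain
```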
Analysis tools to investigate changes in intercellular communication from scRNA-seq data. Using a Seurat object as input, the package infers which cell-cell interactions are present in the dataset and how these interactions change between two conditions of interest (e.g. young vs old). It relies on an internal database of ligand-receptor interactions (available for human, mouse and rat) that have been gathered from several published studies. Detection and differential analyses rely on permutation tests. The package also contains several tools to perform over-representation analysis and visualize the results. See Lagger, C. et al. (2023) <doi:10.1038/s43587-023-00514-x> for a full description of the methodology.
The extrafont package makes it easier to use fonts other than the basic PostScript fonts that R uses. Fonts that are imported into extrafont can be used with PDF or PostScript output files. There are two hurdles to using fonts in PDF (or PostScript) output files: making R aware of the font and the dimensions of its characters, and embedding the fonts in the PDF file so that the PDF can be displayed properly on a device that doesn't have the font (usually needed if you want to print the PDF file or share it with others). The extrafont package makes both of these things easier.
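A typical workflow sketch; font_import() only needs to be run once per machine, and "Arial" is just an example of a font family that has been imported:

```r
library(extrafont)

font_import(prompt = FALSE)   # hurdle 1: register system fonts and their metrics with R (run once)
loadfonts(device = "pdf")     # make the imported fonts available to the pdf() device

pdf("plot.pdf", family = "Arial")     # assumes Arial is among the imported fonts
plot(1:10, main = "Title in Arial")
dev.off()

embed_fonts("plot.pdf")       # hurdle 2: embed the font so the PDF displays correctly elsewhere
```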
This package provides a novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. The framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation. The original AutoScore structure is described in the research paper <doi:10.2196/21798>, and a full tutorial can be found at <https://nliulab.github.io/AutoScore/>. Users or clinicians can seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.
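A minimal sketch of the six-module pipeline; train_set, validation_set, and test_set are hypothetical data splits with a binary outcome column named "label", and the exact function arguments follow the online tutorial, so treat them as assumptions:

```r
library(AutoScore)

ranking <- AutoScore_rank(train_set)                                   # module 1: variable ranking
AUC <- AutoScore_parsimony(train_set, validation_set, rank = ranking,  # modules 2-3: transformation
                           n_min = 1, n_max = 20)                      #   and parsimony analysis
final_vars <- names(ranking)[1:6]                                      # module 4: choose a model size
cut_vec <- AutoScore_weighting(train_set, validation_set, final_vars)  # score derivation
score_tab <- AutoScore_fine_tuning(train_set, validation_set,          # module 5: fine-tune cut-offs
                                   final_vars, cut_vec)
AutoScore_testing(test_set, final_vars, cut_vec, score_tab)            # module 6: evaluation
```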
Enables off-the-shelf functionality for fully Bayesian, nonstationary Gaussian process modeling. The approach to nonstationary modeling involves a closed-form, convolution-based covariance function with spatially-varying parameters; these parameter processes can be specified either deterministically (using covariates or basis functions) or stochastically (using approximate Gaussian processes). Stationary Gaussian processes are a special case of our methodology, and we furthermore implement approximate Gaussian process inference to account for very large spatial data sets (Finley et al., 2017, <arXiv:1702.00434v2>). Bayesian inference is carried out using Markov chain Monte Carlo methods via the nimble package, and posterior prediction for the Gaussian process at unobserved locations is provided as a post-processing step.
This package provides a toolbox for implementing the Ecological Dynamic Regime framework (Sánchez-Pinillos et al., 2023 <doi:10.1002/ecm.1589>) to characterize and compare groups of ecological trajectories in multidimensional spaces defined by state variables. The package includes the RETRA-EDR algorithm to identify representative trajectories, functions to generate, summarize, and visualize representative trajectories, and several metrics to quantify the distribution and heterogeneity of trajectories in an ecological dynamic regime and quantify the dissimilarity between two or more ecological dynamic regimes. The package also includes a set of functions to assess ecological resilience based on ecological dynamic regimes (Sánchez-Pinillos et al., 2024 <doi:10.1016/j.biocon.2023.110409>).
Training and prediction functions for Single Hidden-layer Feedforward Neural Networks (SLFN) using the Extreme Learning Machine (ELM) algorithm. The ELM algorithm differs from traditional gradient-based algorithms in its very short training times (it doesn't need any iterative tuning, which makes learning very fast), and there is no need to set other parameters like learning rate, momentum, or number of epochs. This is a reimplementation of the elmNN package using RcppArmadillo, created after the elmNN package was archived. For more information, see "Extreme learning machine: Theory and applications" by Guang-Bin Huang, Qin-Yu Zhu, Chee-Kheong Siew (2006), Elsevier B.V, <doi:10.1016/j.neucom.2005.12.126>.
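A minimal regression sketch (the elm_train()/elm_predict() arguments and the "relu" activation name are assumptions based on the package documentation):

```r
library(elmNNRcpp)

x <- as.matrix(iris[, 1:3])              # predictors as a numeric matrix
y <- as.matrix(iris[, 4, drop = FALSE])  # response as a one-column matrix

fit  <- elm_train(x, y, nhid = 20, actfun = "relu")   # single hidden layer with 20 random neurons
pred <- elm_predict(fit, newdata = x)
```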
A simple and efficient method for geolocating data in Brazil. The package is based on open spatial datasets of Brazilian addresses, primarily using the Cadastro Nacional de Endereços para Fins Estatísticos (CNEFE), published by the Instituto Brasileiro de Geografia e Estatística (IBGE), Brazil's official statistics and geography agency.
Raman and (FT)IR spectral analysis tool for plastic particles and other environmental samples (Cowger et al. 2021, <doi:10.1021/acs.analchem.1c00123>). With read_any(), Open Specy provides a single function for reading individual, batch, or map spectral data files like .asp, .csv, .jdx, .spc, .spa, .0, and .zip. process_spec() simplifies processing spectra, including smoothing, baseline correction, range restriction and flattening, intensity conversions, wavenumber alignment, and min-max normalization. Spectra can be identified in batch against an onboard reference library (Cowger et al. 2020, <doi:10.1177/0003702820929064>) using match_spec(). A Shiny app is available via run_app() or online at <https://www.openanalysis.org/openspecy/>.
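A minimal identification workflow sketch (the file name is hypothetical, and the get_lib()/load_lib() calls for fetching the reference library are assumptions):

```r
library(OpenSpecy)

spec <- read_any("particle_1.asp")   # hypothetical file; batch and map files work the same way
spec <- process_spec(spec)           # smoothing, baseline correction, normalization, ...

get_lib()                            # download the onboard reference library (assumption)
lib <- load_lib()                    # load it into the session (assumption)
match_spec(spec, library = lib)      # ranked library matches for the spectrum
```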
Launch an application with a simple click, without opening R or RStudio. The package has three functions, of which only one is essential in its use: `shiny.exe()`. It generates a script in the open Shiny project and then creates a shortcut in the same folder that lets you launch the app by clicking it. If you set `host = "public"`, the application is launched on the public server to which you are connected, so all other devices connected to the same server can access the application via your `IPv4` address followed by the port. You can stop the application by closing the terminal opened by the shortcut.
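A minimal sketch, run from the R console with the Shiny project open (assuming the package is loaded; the host value follows the description above):

```r
# Creates a launch script plus a clickable shortcut in the open Shiny project folder
shiny.exe()

# Serve on the connected server so other devices can reach the app via your IPv4 and port
shiny.exe(host = "public")
```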
This package provides an elastic net penalized maximum likelihood estimator for structural equation models (SEM). The package implements `lasso` and `elastic net` (l1/l2) penalized SEM and estimates the model parameters with an efficient block coordinate ascent algorithm that maximizes the penalized likelihood of the SEM. Hyperparameters are inferred from cross-validation (CV). A Stability Selection (STS) function is also available to provide accurate causal effect selection. The software achieves high accuracy through a `Network Generative Pre-trained Transformer` (Network GPT) framework with two steps: 1) pre-training the model to generate a complete (fully connected) graph; and 2) using the complete graph as the initial state to fit the `elastic net` penalized SEM.
This is a data-only package providing the algorithmic complexity of short strings, computed using the coding theorem method. For a given alphabet size, all possible Turing machines (or a large number of randomly sampled ones) with a given number of states (e.g., 5) and with a number of symbols matching the number of symbols in the strings were simulated until they reached a halting state or failed to end. This package contains data on 4.5 million strings of length 1 to 12, simulated on Turing machines with 2, 4, 5, 6, and 9 symbols. The complexity of a string is derived from the distribution of the halting states.
Enables users to evaluate long-term trends using a Generalized Additive Modeling (GAM) approach. The model development includes selecting a GAM structure to describe nonlinear, seasonally-varying changes over time, incorporation of hydrologic variability via either river flow or salinity, the use of an intervention to deal with method or laboratory changes suspected to impact data values, and representation of left- and interval-censored data. The approach has been applied to water quality data in the Chesapeake Bay, a major estuary on the east coast of the United States, to provide insights into a range of management- and research-focused questions. The methodology is described in Murphy (2019) <doi:10.1016/j.envsoft.2019.03.027>.
This package provides functions for generating the tables required for drawing and calculating hypsometric curves and hypsometric integrals. These functions accept as input the DEM of the region of interest (your watershed) and a spatial data frame specifying the delineation of sub-catchments within the watershed. They then generate output in the form of PNG images and HTML files placed in a folder named "HYPSO_OUTPUT" created in the current directory. The methodology draws on S. K. Sharma, S. Gajbhiye, et al. (2018) <doi:10.1007/978-981-10-5801-1_19>; Omvir Singh, A. Sarangi, and Milap C. Sharma (2006) <doi:10.1007/s11269-008-9242-z>; and James A. Vanderwaal and Herbert Ssegane (2013) <doi:10.1111/jawr.12089>.
Fits Bayesian dose-response model-based network meta-analysis (MBNMA) models that incorporate multiple doses within an agent by modelling different dose-response functions, as described by Mawdsley et al. (2016) <doi:10.1002/psp4.12091>. By modelling dose-response relationships, MBNMA can connect networks of evidence that might otherwise be disconnected and can improve precision of treatment estimates. Several common dose-response functions are provided; others may be added by the user. Various characteristics and assumptions can be flexibly added to the models, such as shared class effects. The consistency of direct and indirect evidence in the network can be assessed using unrelated mean effects models and/or by node-splitting at the treatment level.
This package provides functions used to fit and test the phenology of species based on counts. Based on Girondot, M. (2010) <doi:10.3354/esr00292> for the phenology function; Girondot, M. (2017) <doi:10.1016/j.ecolind.2017.05.063> for the convolution of negative binomial distributions; Girondot, M. and Rizzo, A. (2015) <doi:10.2993/etbi-35-02-337-353.1> for the Bayesian estimate; Pfaller JB, ..., Girondot M (2019) <doi:10.1007/s00227-019-3545-x> for the tag-loss estimate; Hancock J, ..., Girondot M (2019) <doi:10.1016/j.ecolmodel.2019.04.013> for nesting history; and Laloe J-O, ..., Girondot M, Hays GC (2020) <doi:10.1007/s00227-020-03686-x> for aggregating several seasons.
Systematic 3D interaction calls and differential analysis for Hi-C and HiChIP. The HiC-DC+ (Hi-C/HiChIP direct caller plus) package enables principled statistical analysis of Hi-C and HiChIP data sets, including calling significant interactions within a single experiment and performing differential analysis between conditions given replicate experiments, to facilitate global integrative studies. HiC-DC+ estimates significant interactions in a Hi-C or HiChIP experiment directly from the raw contact matrix for each chromosome up to a specified genomic distance, binned by uniform genomic intervals or restriction enzyme fragments, by training a background model to account for random polymer ligation and systematic sources of read count variation.
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. decoupleR is a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic data, as long as its features can be linked to a biological process based on prior knowledge: for example, gene sets regulated by a transcription factor in transcriptomics, or phosphosites targeted by a kinase in phospho-proteomics.
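A minimal sketch with toy inputs (the run_ulm() argument names are assumptions based on the decoupleR documentation):

```r
library(decoupleR)

# Toy inputs: a 10-gene x 2-sample statistic matrix (e.g. t-values) and a
# prior-knowledge network with the usual source/target/mor columns
set.seed(1)
mat <- matrix(rnorm(20), nrow = 10,
              dimnames = list(paste0("g", 1:10), c("s1", "s2")))
net <- data.frame(source = rep(c("tf1", "tf2"), each = 5),
                  target = paste0("g", 1:10),
                  mor    = c(1, 1, -1, 1, 1, 1, -1, 1, 1, -1))

acts <- run_ulm(mat, net, .source = "source", .target = "target", .mor = "mor")
head(acts)   # one activity score per source (tf1/tf2) and sample
```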
Dual Scaling, developed by Professor Shizuhiko Nishisato (1994, ISBN: 0-9691785-3-6), is a fundamental technique in multivariate analysis used for data scaling and correspondence analysis. Its utility lies in its ability to represent multidimensional data in a lower-dimensional space, making it easier to visualize and understand underlying patterns in complex data. The technique has been implemented to handle various types of data, including Contingency and Frequency data (CF), Multiple-Choice data (MC), Sorting data (SO), Paired-Comparison data (PC), and Rank-Order data (RO). This gives users a powerful tool to explore relationships between variables and observations in fields ranging from sociology to ecology, enabling deeper and more efficient analysis of multivariate data sets.