This package contains efficient implementations of Discrete Optimal Transport algorithms for the computation of Kantorovich-Wasserstein distances between pairs of large spatial maps (Bassetti, Gualandi, Veneroni (2020), <doi:10.1137/19M1261195>). All the algorithms are based on an ad-hoc implementation of the Network Simplex algorithm. The package has four main helper functions: compareOneToOne() (to compare two spatial maps), compareOneToMany() (to compare a reference map with a list of other maps), compareAll() (to compute a matrix of distances between a list of maps), and focusArea() (to compute the KWD distance within a focus area). In non-convex maps, the helper functions first build the convex hull of the input bins and pad the weights with zeros.
This package provides a graph community detection algorithm that aims to be performant on large graphs and robust, returning consistent results across runs. SpeakEasy 2 (SE2), the underlying algorithm, is described in Chris Gaiteri, David R. Connell & Faraz A. Sultan et al. (2023) <doi:10.1186/s13059-023-03062-0>. The core algorithm is written in C, providing speed and keeping the memory requirements low. This implementation can take advantage of multiple computing cores without increasing memory usage. SE2 can detect community structure across scales, making it a good choice for biological data, which often has hierarchical structure. Graphs can be passed to the algorithm as adjacency matrices using base R matrices, the Matrix library, igraph graphs, or any data that can be coerced into a matrix.
Fit latent variable models with the GEV distribution as the data likelihood and the GEV parameters following latent Gaussian processes. The models in this package are built using the template model builder TMB in R, which can quickly integrate out the latent variables using the Laplace approximation. In the fit function, users choose which GEV parameters are treated as spatially varying random effects following a Gaussian process, so they can fit spatial GEV models of different complexity to their dataset without having to write the models in TMB themselves. This package also offers methods to sample from both fixed and random effects posteriors as well as the posterior predictive distributions at different spatial locations. Methods for fitting this class of models are described in Chen, Ramezan, and Lysy (2024) <doi:10.48550/arXiv.2110.07051>.
This package provides a multidimensional dataset of students' performance assessment in high school physics. The SPHERE dataset was collected from 497 students in four public high schools and measures their conceptual understanding, scientific ability, and attitude toward physics [see Santoso et al. (2024) <doi:10.17632/88d7m2fv7p.1>]. The data collection was conducted using research-based assessments established by the physics education research community. They include the Force Concept Inventory, the Force and Motion Conceptual Evaluation, the Rotational and Rolling Motion Conceptual Survey, the Fluid Mechanics Concept Inventory, the Mechanical Waves Conceptual Survey, the Thermal Concept Evaluation, the Survey of Thermodynamic Processes and First and Second Laws, the Scientific Abilities Assessment Rubrics, and the Colorado Learning Attitudes about Science Survey. Student attributes related to gender, age, socioeconomic status, domicile, literacy, physics identity, and results of tests administered using teacher-developed items are also reported in this dataset.
Spatial downscaling of climate data (Global Circulation Models/Regional Climate Models) using the quantile-quantile bias correction technique.
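As a rough illustration of quantile-quantile bias correction in general, the Python sketch below maps a model value to the observed value that sits at the same empirical quantile. It is a minimal stepwise variant under stated assumptions, not the package's implementation; the function names `empirical_quantile` and `quantile_map` and the sample numbers are hypothetical.

```python
from bisect import bisect_right


def empirical_quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data, for p in [0, 1]."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def quantile_map(value, model_hist, observed):
    """Bias-correct one model value by matching empirical quantiles.

    Assumption: model_hist and observed cover the same reference period.
    """
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(observed)
    # Empirical non-exceedance probability of the value in the model climate
    # (a step function: values between two model points share a quantile).
    rank = bisect_right(model_sorted, value)
    if len(model_sorted) > 1:
        p = (rank - 1) / (len(model_sorted) - 1)
    else:
        p = 0.5
    p = min(max(p, 0.0), 1.0)  # values outside the model range are clamped
    # Read the observed distribution at the same probability.
    return empirical_quantile(obs_sorted, p)


# Hypothetical data: the model runs warmer and more variable than observations.
model_hist = [10, 12, 14, 16, 18]
observed = [8, 9, 10, 11, 12]
corrected = quantile_map(14, model_hist, observed)
```

Values outside the model's historical range are clamped to the observed extremes here; practical methods usually treat the tails more carefully.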
This package implements the diffusion map method of dimensionality reduction and the spectral method of combining multiple diffusion maps, including creation of the spectra and visualization of maps.
This package provides a set of functions for obtaining positional parameters and magnitude difference between components of binary and multiple stellar systems from series of speckle images.
Utility functions for spectroscopy. 1. Functions to simulate spectra for use in teaching or testing. 2. Functions to process files created by LoggerPro and SpectraSuite software.
An R data package containing setlists from all Bruce Springsteen concerts from 1973 to 2021. It also includes details of all his songs, such as lyrics and albums. Data extracted from: <http://brucebase.wikidot.com/>.
Visualization and analysis of Vectra Immunofluorescent data. Options for calculating both the univariate and bivariate Ripley's K are included. Calculations are performed using a permutation-based approach presented by Wilson et al. <doi:10.1101/2021.04.27.21256104>.
This package provides functions to generate or sample from all possible splits of features or variables into a number of specified groups. Also computes the best split selection estimator (for low-dimensional data) as defined in Christidis, Van Aelst and Zamar (2019) <arXiv:1812.05678>.
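To make "all possible splits into a number of specified groups" concrete, the Python sketch below enumerates every way of dividing a set of features into exactly k non-empty, unlabeled groups. It assumes every feature must be used and that group order does not matter, which may differ from the package's definition; `splits_into_groups` is a hypothetical name.

```python
def splits_into_groups(items, n_groups):
    """Yield every split of items into exactly n_groups non-empty groups.

    Assumption: groups are unlabeled, so [[a], [b]] and [[b], [a]]
    count as the same split and are produced only once.
    """
    items = list(items)

    def place(index, groups):
        if index == len(items):
            if len(groups) == n_groups:
                yield [list(g) for g in groups]
            return
        item = items[index]
        # Option 1: add the item to each existing group in turn.
        for group in groups:
            group.append(item)
            yield from place(index + 1, groups)
            group.pop()
        # Option 2: open a new group, if the group budget allows it.
        if len(groups) < n_groups:
            groups.append([item])
            yield from place(index + 1, groups)
            groups.pop()

    yield from place(0, [])


# Hypothetical feature names.
all_splits = list(splits_into_groups(["x1", "x2", "x3", "x4"], 2))
```

The number of splits grows quickly with the number of features (it is a Stirling number of the second kind), which is one reason to sample from the splits rather than enumerate them.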
This package provides sparse vectors powered by ALTREP (Alternative Representations for R Objects) that behave like regular vectors, and can thus be used in data frames. It also provides tools to convert between sparse matrices and data frames with sparse columns and functions to interact with sparse vectors.
Corrects the spelling of a given word in English using a modification of Peter Norvig's spell correct algorithm (see <http://norvig.com/spell-correct.html>) which handles up to three edits. The algorithm tries to find the spelling with maximum probability of intended correction out of all possible candidate corrections from the original word.
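The idea behind Norvig-style correction can be sketched in Python as follows: generate every string one edit away (deletion, transposition, replacement, insertion), widen the search one edit at a time, and return the most frequent known word found. This is a simplified illustration, not the package's modified algorithm; the word counts, the `max_edits` parameter, and its default are assumptions.

```python
import string


def edits1(word):
    """All strings one edit away from word (lowercase ASCII alphabet)."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts, max_edits=2):
    """Return the most frequent known word within max_edits edits of word.

    counts maps known words to their corpus frequency (hypothetical data).
    Candidates at a smaller edit distance are always preferred.
    """
    if word in counts:
        return word
    frontier = {word}
    for _ in range(max_edits):
        # Expand every current candidate by one more edit.
        frontier = {e for w in frontier for e in edits1(w)}
        known = {w for w in frontier if w in counts}
        if known:
            return max(known, key=counts.get)
    return word  # nothing known within reach: leave the word unchanged


counts = {"spelling": 10, "spewing": 2, "corrected": 5}
fixed = correct("speling", counts)
```

Setting `max_edits=3` matches the three-edit reach described above, but the candidate set grows by roughly a factor of several hundred per extra edit, so a practical implementation needs pruning.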
Making specification curve analysis easy, fast, and pretty. It improves upon existing offerings with additional features and tidyverse integration. Users can easily visualize and evaluate how their models behave under different specifications with a high degree of customization. For a description and applications of specification curve analysis see Simonsohn, Simmons, and Nelson (2020) <doi:10.1038/s41562-020-0912-z>.
This package performs estimation and testing of the treatment effect in a 2-group randomized clinical trial with a quantitative, dichotomous, or right-censored time-to-event endpoint. The method improves efficiency by leveraging baseline predictors of the endpoint. The inverse probability weighting technique of Robins, Rotnitzky, and Zhao (JASA, 1994) is used to provide unbiased estimation when the endpoint is missing at random.
This package provides methods for spatial risk calculations. It offers an efficient approach to determine the sum of all observations within a circle of a certain radius. This might be beneficial for insurers who are required (by a recent European Commission regulation) to determine the maximum value of insured fire risk policies of all buildings that are partly or fully located within a circle of a radius of 200m. See Church (1974) <doi:10.1007/BF01942293> for a description of the problem.
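The underlying computation can be sketched in Python as a brute-force sum of the values of all points within a given distance of a centre. The sketch assumes point locations (not building footprints), great-circle distances on a spherical Earth, and hypothetical function names and data; it does not reflect the package's efficient algorithm.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points (degrees)."""
    dlon = radians(lon2 - lon1)
    dlat = radians(lat2 - lat1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def sum_within_radius(points, center_lon, center_lat, radius_m=200.0):
    """Sum the values of all (lon, lat, value) points inside the circle."""
    return sum(
        value
        for lon, lat, value in points
        if haversine_m(center_lon, center_lat, lon, lat) <= radius_m
    )


def max_sum_centered_on_points(points, radius_m=200.0):
    """Largest circle sum when only the points themselves serve as centres."""
    return max(
        sum_within_radius(points, lon, lat, radius_m) for lon, lat, _ in points
    )


# Hypothetical insured values: two buildings about 68 m apart, one far away.
points = [(4.9000, 52.3700, 100), (4.9010, 52.3700, 50), (4.9100, 52.3700, 1000)]
nearby_total = sum_within_radius(points, 4.9000, 52.3700)
```

Restricting the candidate centres to the points themselves gives only a lower bound on the true maximum, because the best circle may be centred between buildings.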
Deconvolution of spatial transcriptomics data based on neural networks and single-cell RNA-seq data. SpatialDDLS implements a workflow to create neural network models able to make accurate estimates of cell composition of spots from spatial transcriptomics data using deep learning and the meaningful information provided by single-cell RNA-seq data. See Torroja and Sanchez-Cabo (2019) <doi:10.3389/fgene.2019.00978> and Mañanes et al. (2024) <doi:10.1093/bioinformatics/btae072> to get an overview of the method and see some examples of its performance.
Selection of spatially balanced samples. In particular, the implemented sampling designs allow the selection of probability samples that are well spread over the population of interest, in any dimension and using any distance function (e.g. Euclidean distance, Manhattan distance). For more details, see Pantalone F, Benedetti R, and Piersimoni F (2022) <doi:10.18637/jss.v103.c02>, Benedetti R and Piersimoni F (2017) <doi:10.1002/bimj.201600194>, and Benedetti R and Piersimoni F (2017) <arXiv:1710.09116>. The implementation has been done in C++ through the use of Rcpp and RcppArmadillo.
From output files obtained from the software ModestR, the relative contribution of factors to explain species distribution is depicted using several plots. A global geographic raster file for each environmental variable may also be obtained with the mean relative contribution, considering all species present in each raster cell, of the factor to explain species distribution. Finally, for each variable it is also possible to compare the frequencies of any variable obtained in the cells where the species is present with the frequencies of the same variable in the cells of the extent.
This package provides the core framework for a discrete event system to implement a complete data-to-decisions, reproducible workflow. The core components facilitate the development of modular pieces, and enable the user to include additional functionality by running user-built modules. Includes conditional scheduling, restart after interruption, packaging of reusable modules, tools for developing arbitrary automated workflows, automated interweaving of modules of different temporal resolution, and tools for visualizing and understanding the within-project dependencies. The suggested package NLMR can be installed from the repository (<https://PredictiveEcology.r-universe.dev>).
Programs to find the sample size or power of studies using the Sequential Parallel Comparison Design (SPCD) and programs to analyze such studies. This is a clinical trial design where patients initially on placebo who did not respond are re-randomized between placebo and active drug in a second phase and the results of the two phases are pooled. The method of analyzing binary data with this design is described in Fava, Evins, Dorer and Schoenfeld (2003) <doi:10.1159/000069738>, and the method of analyzing continuous data is described in Chen, Yang, Hung and Wang (2011) <doi:10.1016/j.cct.2011.04.006>.
This package provides tools to assess the association between two spatial processes. Currently, several methodologies are implemented: a modified t-test to perform hypothesis testing about the independence between the processes, a suitable nonparametric correlation coefficient, the codispersion coefficient, and an F test for assessing the multiple correlation between one spatial process and several others. Functions for image processing and computing the spatial association between images are also provided. Functions contained in the package are intended to accompany Vallejos, R., Osorio, F., Bevilacqua, M. (2020). Spatial Relationships Between Two Georeferenced Variables: With Applications in R. Springer, Cham <doi:10.1007/978-3-030-56681-4>.
The SparseArray package is an infrastructure package that provides an array-like container for efficient in-memory representation of multidimensional sparse data in R. The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data, the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.
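For readers unfamiliar with coordinate-style sparse storage, the Python sketch below shows the general idea behind a COO ("coordinate list") layout: keep only the nonzero values together with their index tuples. It illustrates the generic technique under stated assumptions, not the internal representation used by COO_SparseArray; the helper names are hypothetical, and indices are 0-based here, whereas R arrays are 1-based.

```python
def dense_to_coo(array, prefix=()):
    """Return (coords, values) for the nonzero entries of a nested-list array.

    coords[i] is the index tuple of values[i]; zero entries are not stored.
    """
    coords, values = [], []
    for i, item in enumerate(array):
        if isinstance(item, list):
            # Recurse into the next dimension, extending the index prefix.
            sub_coords, sub_values = dense_to_coo(item, prefix + (i,))
            coords.extend(sub_coords)
            values.extend(sub_values)
        elif item != 0:
            coords.append(prefix + (i,))
            values.append(item)
    return coords, values


def coo_get(coords, values, index):
    """Look up one element; anything not stored is an implicit zero."""
    try:
        return values[coords.index(tuple(index))]
    except ValueError:
        return 0


# A 2 x 2 x 2 array with two nonzero entries.
dense = [[[0, 0], [5, 0]], [[0, 7], [0, 0]]]
coords, values = dense_to_coo(dense)
```

Storage grows with the number of nonzero entries rather than with the full array size, at the cost of slower element lookup in this naive linear-search form.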
This package provides several Bayesian survival models for spatial/non-spatial survival data: proportional hazards (PH), accelerated failure time (AFT), proportional odds (PO), and accelerated hazards (AH), a super model that includes PH, AFT, PO and AH as special cases, Bayesian nonparametric nonproportional hazards (LDDPM), generalized accelerated failure time (GAFT), and spatially smoothed Polya tree density estimation. The spatial dependence is modeled via frailties under PH, AFT, PO, AH and GAFT, and via copulas under LDDPM and PH. Model choice is carried out via the logarithm of the pseudo marginal likelihood (LPML), the deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC). See Zhou, Hanson and Zhang (2020) <doi:10.18637/jss.v092.i09>.