This package provides a fast parallelized alternative to R's native dist function for calculating distance matrices for continuous, binary, and multi-dimensional input matrices. It supports a broad variety of predefined distance functions from other R packages, as well as user-defined functions written in C++. For ease of use, the parDist function extends the signature of the dist function and uses the same parameter naming conventions as distance methods of existing R packages.
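As a rough illustration of what a distance matrix with a pluggable distance function looks like, here is a minimal sequential sketch in Python. It is not the package's API and does not parallelize anything; the names `distance_matrix`, `euclidean`, and `manhattan` are hypothetical and chosen only for this example.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    """Manhattan (city-block) distance, standing in for a user-defined metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(
    rows: Sequence[Vector],
    metric: Callable[[Vector, Vector], float] = euclidean,
) -> List[List[float]]:
    """Return the symmetric matrix of pairwise distances between rows."""
    n = len(rows)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is mirrored.
        for j in range(i + 1, n):
            result[i][j] = result[j][i] = metric(rows[i], rows[j])
    return result


if __name__ == "__main__":
    points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    for row in distance_matrix(points):
        print(row)
    for row in distance_matrix(points, metric=manhattan):
        print(row)
```

Passing the metric as a function mirrors the idea of swapping in a user-defined distance; each pair of rows is independent, which is what makes this kind of computation amenable to parallelization.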
Estimate and return the parameters needed for visualisations designed for OpenBudgets <http://openbudgets.eu/> data. Calculate cluster analysis measures for budget data of municipalities across Europe, according to the OpenBudgets data model. Cluster analysis involves a set of techniques and algorithms used to find and divide the data into groups of similar observations. The package can also be used more generally to extract visualisation parameters, convert them to JSON format, and use them as input to a different graphical interface.
Analysing data from evaluations of educational interventions using a randomised controlled trial design. Various analytical tools to perform sensitivity analysis using different methods are supported (e.g. frequentist models with bootstrapping and permutation options, Bayesian models). The included commands can be used for simple randomised trials, cluster randomised trials and multisite trials. The methods can also be used more widely beyond education trials. This package can be used to evaluate other intervention designs using frequentist and Bayesian multilevel models.
Implementation of the Future API <doi:10.32614/RJ-2021-048> on top of the mirai package <doi:10.5281/zenodo.7912722>. This allows you to process futures, as defined by the future package, in parallel out of the box, on your local machine or across remote machines. Contrary to back-ends relying on the parallel package (e.g. multisession) and socket connections, mirai_cluster and mirai_multisession, provided here, can run more than 125 parallel R processes.
This package implements marker-based estimation of heritability when observations on genetically identical replicates are available. These can be either observations on individual plants or plot-level data in a field trial. Heritability can then be estimated using a mixed model for the individual plant or plot data. For comparison, mixed-model based estimation using genotypic means and estimation of repeatability with ANOVA are also implemented. For illustration, the package contains several datasets for the model species Arabidopsis thaliana.
This package provides a suite of functions for conducting and interpreting analysis of statistical interaction in regression models that was formerly part of the jtools package. Functionality includes visualization of two- and three-way interactions among continuous and/or categorical variables as well as calculation of "simple slopes" and Johnson-Neyman intervals (see e.g., Bauer & Curran, 2005 <doi:10.1207/s15327906mbr4003_5>). These capabilities are implemented for generalized linear models in addition to the standard linear regression context.
This package provides functions to generate incidence matrices and bipartite graphs that have (1) a fixed fill rate, (2) given marginal sums, (3) marginal sums that follow given distributions, or (4) represent bill sponsorships in the US Congress <doi:10.31219/osf.io/ectms>. It can also generate an incidence matrix from an adjacency matrix, or bipartite graph from a unipartite graph, via a social process mirroring team, group, or organization formation <doi:10.48550/arXiv.2204.13670>.
The package compiles functions for calculating prices of American put options with the Least Squares Monte Carlo method. The option types are plain vanilla American put, Asian American put, and Quanto American put. The pricing algorithms include variance reduction techniques such as Antithetic Variates and Control Variates. Additional functions are given to derive "price surfaces" at different volatilities and strikes, create 3-D plots, quickly generate Geometric Brownian motion, and calculate prices of European options with the Black & Scholes analytical solution.
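For readers unfamiliar with the Black & Scholes analytical solution mentioned above, the following Python sketch computes European call and put prices for a non-dividend-paying asset. It is an independent illustration of the standard closed-form formula, not the package's code; the function and parameter names are hypothetical.

```python
import math
from typing import Tuple


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_european(
    spot: float, strike: float, rate: float, sigma: float, maturity: float
) -> Tuple[float, float]:
    """Return (call, put) prices under Black-Scholes, assuming no dividends.

    rate is the continuously compounded risk-free rate, sigma the annualised
    volatility, and maturity the time to expiry in years.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    discount = math.exp(-rate * maturity)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


if __name__ == "__main__":
    call, put = black_scholes_european(
        spot=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0
    )
    print(f"call = {call:.4f}, put = {put:.4f}")
```

A European price of this kind is a natural benchmark for the American put prices obtained by simulation, since an American put is worth at least as much as its European counterpart.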
Analysis and visualisation of synchrony, interaction, and joint movements from audio and video movement data of a group of music performers. The demo data is described in Clayton, Leante, and Tarsitani (2021) <doi:10.17605/OSF.IO/KS325>, while example analyses can be found in Clayton, Jakubowski, and Eerola (2019) <doi:10.1177/1029864919844809>. Additionally, wavelet analysis techniques have been applied to examine movement-related musical interactions, as shown in Eerola et al. (2018) <doi:10.1098/rsos.171520>.
This package provides functions for computing fit indices for evaluating the path component of latent variable structural equation models. Available fit indices include RMSEA-P and NSCI-P originally presented and evaluated by Williams and O'Boyle (2011) <doi:10.1177/1094428110391472> and demonstrated by O'Boyle and Williams (2011) <doi:10.1037/a0020539> and Williams, O'Boyle, & Yu (2020) <doi:10.1177/1094428117736137>. Also included are fit indices described by Hancock and Mueller (2011) <doi:10.1177/0013164410384856>.
This package provides methods for generating .dat files for use with the AMPL software using spatial data, particularly rasters. It includes support for various spatial data formats and different problem types. By automating the process of generating AMPL datasets, this package can help streamline optimization workflows and make it easier to solve complex optimization problems. The methods implemented in this package are described in detail in a publication by Fourer et al. (<doi:10.1287/mnsc.36.5.519>).
This package supports the computation of an F-test for the association between expression values and clinical entities. In many cases a two-way layout with gene and a dichotomous group as factors will be considered. However, adjustment for other covariates and the analysis of arbitrary clinical variables, interactions, gene co-expression, time series data and so on is also possible. The test is carried out by comparison of corresponding linear models via the extra sum of squares principle.
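As a reminder of the extra sum of squares principle in its textbook form, consider a reduced linear model with $p_0$ parameters nested in a full model with $p_1 > p_0$ parameters, both fitted to $n$ observations, with residual sums of squares $\mathrm{RSS}_0$ and $\mathrm{RSS}_1$. The symbols here are introduced only for this illustration and are not taken from the package documentation.

```latex
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1) / (p_1 - p_0)}
         {\mathrm{RSS}_1 / (n - p_1)}
```

Under the reduced model and the usual normal-error assumptions, $F$ follows an $F$ distribution with $p_1 - p_0$ and $n - p_1$ degrees of freedom, so a large value indicates that the additional terms explain more variation than expected by chance.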
To help you access, transform, analyze, and visualize ForestGEO data, we developed a collection of R packages (<https://forestgeo.github.io/fgeo/>). This package, in particular, helps you to implement analyses of plot species distributions, topography, demography, and biomass. It also includes a torus translation test to determine habitat associations of tree species as described by Zuleta et al. (2018) <doi:10.1007/s11104-018-3878-0>. To learn more about ForestGEO visit <https://forestgeo.si.edu/>.
This package provides a method that allows for the use of a collection of non-matched normal tissue samples. Our approach uses a non-parametric bootstrap subsampling of the available reference samples to estimate the distribution of read counts from targeted sequencing. As inspired by random forest, this is combined with a procedure that subsamples the amplicons associated with each of the targeted genes. The obtained information allows us to reliably classify the copy number aberrations on the gene level.
Aids in analysing data from a food frequency questionnaire known as the Harvard Service Food Frequency Questionnaire (HSFFQ). Functions from this package use answers from the HSFFQ to generate estimates of daily consumed micronutrients, calories, and macronutrients on an individual level. The package also calculates food quotients on individual and group levels. Food quotient calculation is an often tedious step in the calculation of total human energy expenditure (TEE) using the doubly labeled water method, which is the gold standard for measuring TEE.
Determine a prototype from a number of runs of Latent Dirichlet Allocation (LDA) by measuring their similarities with S-CLOP: a procedure to select the LDA run with the highest mean pairwise similarity, which is measured by S-CLOP (Similarity of multiple sets by Clustering with Local Pruning), to all other runs. LDA runs are specified by their assignments, which lead to estimators for distribution parameters. Repeated runs lead to different results, which we counter by choosing the most representative LDA run as the prototype.
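The selection step described above (pick the run with the highest mean pairwise similarity to all other runs) can be sketched in a few lines of Python. This sketch assumes the pairwise similarities have already been computed and are supplied as a symmetric matrix; it does not implement S-CLOP itself, and the function name `select_prototype` and the example values are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_prototype(similarity: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Return the index of the most representative run and all mean similarities.

    similarity[i][j] is the similarity between run i and run j. The diagonal
    (a run compared with itself) is ignored when averaging.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("at least two runs are needed to compare similarities")
    means = [
        sum(similarity[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    # Ties are resolved in favour of the earliest run.
    best = max(range(n), key=lambda i: means[i])
    return best, means


if __name__ == "__main__":
    # Hypothetical similarities between three runs.
    sims = [
        [1.0, 0.8, 0.6],
        [0.8, 1.0, 0.7],
        [0.6, 0.7, 1.0],
    ]
    prototype, mean_sims = select_prototype(sims)
    print(prototype, mean_sims)
```

Averaging over the other runs only, rather than including the diagonal, keeps the criterion a measure of how well one run represents the rest.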
The package includes functions to retrieve the sequences around the peak, obtain enriched Gene Ontology (GO) terms, find the nearest gene, exon, miRNA or custom features such as most conserved elements and other transcription factor binding sites supplied by users. Starting with version 2.0.5, new functions have been added for finding the peaks with bi-directional promoters with summary statistics (peaksNearBDP), for summarizing the occurrence of motifs in peaks (summarizePatternInPeaks) and for adding other IDs to annotated peaks or enrichedGO (addGeneIDs).
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, and ordinary arrays and data frames.
The ArcGIS Places service is a ready-to-use location service that can search for businesses and geographic locations around the world. It allows you to find, locate, and discover detailed information about each place. Query for places near a point, within a bounding box, filter based on categories, or provide search text. arcgisplaces integrates with sf for out of the box compatibility with other spatial libraries. Learn more in the Places service API reference <https://developers.arcgis.com/rest/places/>.
This package provides tools for calculating evolvability parameters from estimated G-matrices as defined in Hansen and Houle (2008) <doi:10.1111/j.1420-9101.2008.01573.x> and fits phylogenetic comparative models that link the rate of evolution of a trait to the state of another evolving trait (see Hansen et al. 2021 Systematic Biology <doi:10.1093/sysbio/syab079>). The package was released with Bolstad et al. (2014) <doi:10.1098/rstb.2013.0255>, which contains some examples of use.
The main function, ProtectTable(), performs table suppression according to a frequency rule with a data set as the only required input. Within this function, protectTable(), protect_linked_tables() or runArgusBatchFile() in package sdcTable is called. Lists of level-hierarchy (parameter dimList) and other required input to these functions are created automatically. The suppression method Gauss (default) is implemented independently of sdcTable. The function PTgui() starts a graphical user interface based on the shiny package.
Simple plotting function(s) for exploratory data analysis with flexible options allowing for easy plot customisation. The goal is to make it easy for beginners to start exploring a dataset through simple R function calls, as well as provide a similar interface to summary statistics and inference information. Includes functionality to generate interactive HTML-driven graphs. Used by iNZight, a graphical user interface providing easy exploration and visualisation of data for students of statistics, available in both desktop and online versions.
Facilities to work with vector and raster data in an efficient, repeatable, and systematic workflow. Missing functionality in existing packages is included here to allow extraction from raster data with simple features and Spatial types and to make extraction consistent and straightforward. Extract cell numbers from raster data and return the cells as a data frame rather than as lists of matrices or vectors. The functions here allow spatial data to be used without special handling for the format currently in use.
The aims of TCGAbiolinks are to:
facilitate GDC open-access data retrieval;
prepare the data using the appropriate pre-processing strategies;
provide the means to carry out different standard analyses; and
easily reproduce earlier research results.
In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.