An add-on to the party package, with a faster implementation of the partial-conditional permutation importance for random forests. The standard permutation importance is implemented exactly as in the party package. The conditional permutation importance can be computed faster, with an option to be backward compatible with the party implementation. The package is compatible with random forests fit using the party and randomForest packages. The methods are described in Strobl et al. (2007) <doi:10.1186/1471-2105-8-25> and Debeer and Strobl (2020) <doi:10.1186/s12859-020-03622-2>.
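As a rough illustration of the idea behind the standard (unconditional) permutation importance, the following Python sketch shuffles one predictor at a time and records the average increase in prediction error. It is not the party implementation: the function names, the use of mean squared error, and the toy model are all assumptions made for illustration, and the conditional variant (permuting within strata defined by correlated predictors) is not shown.

```python
import random

def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Hypothetical helper: mean increase in MSE after shuffling each column.

    predict: function mapping one row (list of floats) to a prediction.
    X: list of rows; y: list of observed responses.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link with the response.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy data (assumed): the response depends on the first predictor only.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [2.0 * row[0] for row in X]

def toy_predict(row):
    # Stand-in for a fitted model; it ignores the second predictor.
    return 2.0 * row[0]

importances = permutation_importance(toy_predict, X, y)
```

In this toy setting the second predictor receives an importance of zero because the stand-in model never uses it, while shuffling the first predictor increases the error.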
Two versions of sample variance plots, Sv-plot1 and Sv-plot2, are provided to illustrate the squared deviations that make up the sample variance. Besides showing the contribution of the squared deviations to the sample variability, these plots can reveal characteristics of the distribution such as symmetry, skewness and outliers. A graphical method based on Sv-plot2 can be used to decide hypothesis tests on one or two population means. In sum, Sv-plots are intended as appealing visualization tools. A complete description of the methodology can be found in Wijesuriya (2020) <doi:10.1080/03610918.2020.1851716>.
This package detects significant differentially methylated regions (for both qualitative and quantitative traits), using a scan statistic with underlying Poisson heuristics. The scan statistic depends on a sequence of window sizes (number of CpGs within each window) and on a threshold for each window size. This threshold can be calculated in three different ways: i) analytically, using the solution of Siegmund et al. (2012) (preferred), ii) by importance sampling, as suggested by Zhang (2008), and iii) by full MCMC modeling of the data, choosing between a number of different options for modeling the dependency between CpGs.
StabMap performs single cell mosaic data integration by first building a mosaic data topology and then, for each reference dataset, traversing the topology to project and predict data onto a common embedding. Mosaic data should be provided in a list format, with all relevant features included in the data matrices within each list object. The output of stabMap is a joint low-dimensional embedding taking into account all available relevant features. Expression imputation can also be performed using the StabMap embedding and any of the original data matrices for given reference and query cell lists.
The `TrIdent` R package automates the analysis of transductomics data by detecting, classifying, and characterizing read coverage patterns associated with potential transduction events. Transductomics is a DNA sequencing-based method for the detection and characterization of transduction events in pure cultures and complex communities. Transductomics relies on mapping sequencing reads from a viral-like particle (VLP)-fraction of a sample to contigs assembled from the metagenome (whole-community) of the same sample. Reads from bacterial DNA carried by VLPs will map back to the bacterial contigs of origin creating read coverage patterns indicative of ongoing transduction.
Structural equation modeling (SEM) has a long history of representing models graphically as path diagrams. The semPlot package for R fills the gap between advanced, but time-consuming, graphical software and the limited graphics produced automatically by SEM software. In addition, semPlot offers more functionality than drawing path diagrams: it can act as a common ground for importing SEM results into R. Any result usable as input to semPlot can also be represented in any of the three popular SEM frameworks, as well as translated to input syntax for the R packages sem and lavaan.
This package provides an R implementation of the Decision Forest algorithm, which combines the predictions of multiple independent decision tree models into a consensus decision. In particular, Decision Forest is a novel pattern-recognition method which can be used to analyze: (1) DNA microarray data; (2) Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (SELDI-TOF-MS) data; and (3) Structure-Activity Relationship (SAR) data. Three fundamental functions are provided: (1) DF_train, (2) DF_pred, and (3) DF_CV. Run Dforest() to see more instructions. Weida Tong (2003) <doi:10.1021/ci020058s>.
This package provides tools for exploration of R package dependencies. The main deepdep() function retrieves the deep dependencies of any package and plots them in an elegant way. It also adds some popularity measures for the packages, e.g. in the form of download counts through the cranlogs package. Uses the CRAN metadata database <http://crandb.r-pkg.org> and Bioconductor metadata <https://bioconductor.org>. Other data acquisition functions are: get_dependencies(), get_downloads() and get_description(). The deepdep_shiny() function runs a shiny application that helps to produce a nice deepdep plot.
Analysis of temporal changes (i.e. dynamics) of ecological entities, defined as trajectories on a chosen multivariate space, by providing a set of trajectory metrics and visual representations [De Caceres et al. (2019) <doi:10.1002/ecm.1350>; and Sturbois et al. (2021) <doi:10.1016/j.ecolmodel.2020.109400>]. Includes functions to estimate metrics for individual trajectories (length, directionality, angles, ...) as well as metrics to relate pairs of trajectories (dissimilarity and convergence). Functions are also provided to estimate the ecological quality of ecosystems with respect to reference conditions [Sturbois et al. (2023) <doi:10.1002/ecs2.4726>].
FASTQC is the most widely used tool for evaluating the quality of high throughput sequencing data. It produces, for each sample, an HTML report and a compressed file containing the raw data. If you have hundreds of samples, you are not going to open up each HTML page. You need some way of looking at these data in aggregate. fastqcr provides helper functions to easily parse, aggregate and analyze FastQC reports for large numbers of samples. It provides a convenient solution for building a Multi-QC report, as well as a one-sample report with result interpretations.
Manages a file system cache. Regular files can be moved or copied to the cache folder. Sub-folders can be created in order to organize the files. Files can be located inside the cache using a glob function. Text contents can be easily stored in and retrieved from the cache using dedicated functions. It can be used for an application or a package, as a global cache, or as a per-user cache, in which case the standard OS user cache folder will be used (e.g.: on Linux $HOME/.cache/R/my_app_or_pkg_cache_folder).
This package provides tools to build and work with bilateral generalized-mean price indexes (and by extension quantity indexes), and indexes composed of generalized-mean indexes (e.g., superlative quadratic-mean indexes, GEKS). Covers the core mathematical machinery for making bilateral price indexes, computing price relatives, detecting outliers, and decomposing indexes, with wrappers for all common (and many uncommon) index-number formulas. Implements and extends many of the methods in Balk (2008, <doi:10.1017/CBO9780511720758>), von der Lippe (2007, <doi:10.3726/978-3-653-01120-3>), and the CPI manual (2020, <doi:10.5089/9781484354841.069>).
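To make the notion of a generalized-mean price index concrete, here is a minimal Python sketch of a weighted generalized mean of order r applied to price relatives. The function names and toy prices are assumptions for illustration only and do not reflect the package's R interface. With r = 1 the mean is arithmetic, with r = 0 geometric, and with r = -1 harmonic; using base-period expenditure shares as weights with r = 1 corresponds to the Laspeyres formula.

```python
import math

def generalized_mean(values, weights, r):
    """Weighted generalized mean of order r (weights are normalized to sum to 1)."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    if r == 0:
        # Limit case r -> 0: the weighted geometric mean.
        return math.exp(sum(wi * math.log(v) for wi, v in zip(w, values)))
    return sum(wi * v ** r for wi, v in zip(w, values)) ** (1.0 / r)

def price_index(p1, p0, weights, r):
    """Hypothetical helper: generalized mean of the price relatives p1 / p0."""
    relatives = [a / b for a, b in zip(p1, p0)]
    return generalized_mean(relatives, weights, r)

# Toy prices (assumed): the first good doubles in price, the second is unchanged.
p0 = [1.0, 2.0]
p1 = [2.0, 2.0]
w = [1.0, 1.0]

arithmetic = price_index(p1, p0, w, 1)   # mean of the relatives 2 and 1
geometric = price_index(p1, p0, w, 0)
harmonic = price_index(p1, p0, w, -1)
```

For any fixed weights the generalized mean is non-decreasing in r, so the harmonic result is no larger than the geometric, which is no larger than the arithmetic.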
This package provides a framework to assist creation of marine ecosystem models, generating either R or C++ code which can then be optimised using the TMB package and standard R tools. Principally designed to reproduce gadget2 models in TMB, but can be extended beyond gadget2's capabilities. Kasper Kristensen, Anders Nielsen, Casper W. Berg, Hans Skaug, Bradley M. Bell (2016) <doi:10.18637/jss.v070.i05> "TMB: Automatic Differentiation and Laplace Approximation.". Begley, J., & Howell, D. (2004) <https://core.ac.uk/download/pdf/225936648.pdf> "An overview of Gadget, the globally applicable area-disaggregated general ecosystem toolbox. ICES.".
Following the common types of measures of uncertainty for parameter estimation, two measures of uncertainty were proposed for model selection, see Liu, Li and Jiang (2020) <doi:10.1007/s11749-020-00737-9>. The first measure is a kind of model confidence set that relates to the variation of model selection, called Mac. The second measure focuses on the error of model selection, called LogP. Both are computed via bootstrapping. This package provides functions to compute these two measures. Furthermore, a similar model confidence set adapted from Bayesian Model Averaging can also be computed using this package.
This package provides a collection of functions for the analysis of archaeological mortality data (on the topic see e.g. Chamberlain 2006 <https://books.google.de/books?id=nG5FoO_becAC&lpg=PA27&ots=LG0b_xrx6O&dq=life%20table%20archaeology&pg=PA27#v=onepage&q&f=false>). It takes demographic data in different formats and displays the result in a standard life table as well as plots the relevant indices (percentage of deaths, survivorship, probability of death, life expectancy, percentage of population). It also checks for possible biases in the age structure and applies corrections to life tables.
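The indices named above can be illustrated with a minimal Python sketch of a standard life table built from counts of deaths per age class. It assumes age classes of equal width and deaths spread evenly within each class; the function name and toy counts are hypothetical and this is not the package's own code, which also handles other input formats and corrections.

```python
def life_table(deaths, width):
    """Hypothetical helper: life table from deaths per age class.

    deaths: number of individuals dying in each consecutive age class.
    width: length of every age class in years (assumed equal for all classes).
    """
    total = sum(deaths)
    # dx: percentage of all deaths falling in each age class.
    dx = [100.0 * d / total for d in deaths]
    # lx: percentage of the cohort still alive at the start of each class.
    lx = []
    remaining = 100.0
    for d in dx:
        lx.append(remaining)
        remaining -= d
    # qx: probability of dying within the class, given survival to its start.
    qx = [d / l if l > 0 else 0.0 for d, l in zip(dx, lx)]
    # Lx: years lived within the class, assuming deaths spread evenly over it.
    lx_next = lx[1:] + [0.0]
    Lx = [width * (a + b) / 2.0 for a, b in zip(lx, lx_next)]
    # Tx: years still to be lived by the survivors at the start of the class.
    Tx = []
    running = 0.0
    for value in reversed(Lx):
        running += value
        Tx.append(running)
    Tx.reverse()
    # ex: remaining life expectancy at the start of the class.
    ex = [t / l if l > 0 else 0.0 for t, l in zip(Tx, lx)]
    return {"dx": dx, "lx": lx, "qx": qx, "Lx": Lx, "Tx": Tx, "ex": ex}

# Toy counts (assumed): four 10-year age classes.
table = life_table([10, 20, 30, 40], width=10)
```

With these toy counts the life expectancy at birth, `table["ex"][0]`, is 25 years.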
This package implements random number generation, plotting, and estimation algorithms for the two-parameter one-sided and two-sided M-Wright (Mainardi-Wright) family. The M-Wright distributions naturally generalize the widely used one-sided (Airy and half-normal or half-Gaussian) and symmetric (Airy and Gaussian or normal) models. These are widely studied in time-fractional differential equations. References: Cahoy and Minkabo (2017) <doi:10.3233/MAS-170388>; Cahoy (2012) <doi:10.1007/s00180-011-0269-x>; Cahoy (2012) <doi:10.1080/03610926.2010.543299>; Cahoy (2011); Mainardi, Mura, and Pagnini (2010) <doi:10.1155/2010/104505>.
This package provides a set of tools for likelihood-based estimation, model selection and testing of two- and three-range shift and migration models for animal movement data as described in Gurarie et al. (2017) <doi:10.1111/1365-2656.12674>. Given movement data (X, Y and Time), including irregularly sampled data, functions estimate the time, duration and location of one or two range shifts, as well as the ranging area and auto-correlation structure of the movement. Tests assess, for example, whether the shift was "significant", and whether a two-shift migration was a true return migration.
Package containing example and annotation data for the Hipathia package. Hipathia is a method for the computation of signal transduction along signaling pathways from transcriptomic data. The method is based on an iterative algorithm which computes the signal intensity passing through the nodes of a network by taking into account the level of expression of each gene and the intensity of the signal arriving at it. It also provides a new approach to functional analysis that allows computing the signal arriving at the functions annotated to each pathway. Hipathia depends on this package to be functional.
Reads files exported from QX Manager or QuantaSoft containing amplitude values from a run of ddPCR (96 well plate) and robustly sets thresholds to determine positive droplets for each channel of each individual well. Concentration and normalized concentration, in addition to other metrics, are then calculated for each well. Results are returned as a table, optionally written to file, and optional plots (scatterplot and histogram) for both channels of each well can be written to file. The package includes a shiny application which provides an interactive and user-friendly interface to the full functionality of PoDCall.
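Once droplets have been classified as positive or negative, a concentration is commonly derived with a Poisson correction. The Python sketch below shows that general ddPCR calculation only; it is not taken from PoDCall, the function name is hypothetical, and the droplet volume of 0.00085 microliters is an assumed, instrument-dependent value.

```python
import math

def ddpcr_concentration(positive, total, droplet_volume_ul=0.00085):
    """Hypothetical helper: target copies per microliter from droplet counts.

    positive: number of droplets above the threshold.
    total: number of accepted droplets in the well.
    droplet_volume_ul: assumed droplet volume in microliters.
    """
    if total <= 0 or positive < 0 or positive > total:
        raise ValueError("counts must satisfy 0 <= positive <= total and total > 0")
    if positive == total:
        raise ValueError("all droplets positive: concentration cannot be estimated")
    if positive == 0:
        return 0.0
    # Poisson correction: mean copies per droplet from the fraction of negatives.
    copies_per_droplet = -math.log(1.0 - positive / total)
    return copies_per_droplet / droplet_volume_ul

# Toy counts (assumed): half of 20000 droplets are positive.
concentration = ddpcr_concentration(10000, 20000)
```

The correction matters because a positive droplet may contain more than one target copy, so simply dividing the positive count by the sampled volume would underestimate the concentration.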
This package provides a system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from three public registers, the European Union Clinical Trials Register (EUCTR), ClinicalTrials.gov (CTGOV) and the ISRCTN. Trial information is downloaded, converted and stored in a database. Functions are included to identify deduplicated records, to easily find and extract variables (fields) of interest even from complex nesting as used by the registers, and to update previous queries. The package can be used for meta-analysis and trend-analysis of the design and conduct as well as for results of clinical trials.
At the Swiss Federal Statistical Office (SFSO), spatial maps of Switzerland are available free of charge as 'Cartographic bases for small-scale thematic mapping'. This package contains convenience functions to import ESRI (Environmental Systems Research Institute) shape files using the package sf and to plot them easily and quickly without having to worry too much about the technical details. It contains utilities to combine multiple areas into one single polygon and to find neighbours for single regions. For any point on a map, a special locator can be used to determine to which municipality, district or canton it belongs.
This package performs simulation-based inference as an alternative to the delta method for obtaining valid confidence intervals and p-values for regression post-estimation quantities, such as average marginal effects and predictions at representative values. This framework for simulation-based inference is especially useful when the resulting quantity is not normally distributed and the delta method approximation fails. The methodology is described in Greifer et al. (2025) <doi:10.32614/RJ-2024-015>. clarify is meant to replace some of the functionality of the archived package Zelig; see the vignette "Translating Zelig to clarify" for replicating this functionality.
Several web services are available that provide access to elevation data. This package provides access to many of those services and returns elevation data either as an sf simple features object from point elevation services or as a raster object from raster elevation services. In future versions, elevatr will drop support for raster and will instead return terra objects. Currently, the package supports access to the Amazon Web Services Terrain Tiles <https://registry.opendata.aws/terrain-tiles/>, the Open Topography Global Datasets API <https://opentopography.org/developers/>, and the USGS Elevation Point Query Service <https://apps.nationalmap.gov/epqs/>.
This package provides a procedure for comparing multivariate samples associated with different groups. It uses principal component analysis to convert multivariate observations into a set of linearly uncorrelated statistical measures, which are then compared using a number of statistical methods. The procedure is independent of the distributional properties of samples and automatically selects features that best explain their differences, avoiding manual selection of specific points or summary statistics. It is appropriate for comparing samples of time series, images, spectrometric measures or similar multivariate observations. This package is described in Fachada et al. (2016) <doi:10.32614/RJ-2016-055>.