A streamgraph is a type of stacked area chart. It represents the evolution of a numeric variable for several groups. Areas are usually displayed around a central axis, and edges are rounded to give a flowing shape. This package provides an htmlwidget for building streamgraph visualizations.
This package provides a collection of functions to search and download street view imagery ('Mapillary' <https://www.mapillary.com/developer/api-documentation>) and to extract, quantify, and visualize visual features. It also provides functions to generate Qualtrics surveys in TXT format from the collected street views for various research purposes.
Offers a comprehensive approach for analysing stratified 2x2 contingency tables. It calculates odds ratios and 95% confidence intervals, and conducts chi-squared, Cochran-Mantel-Haenszel, Mantel-Haenszel, and Breslow-Day-Tarone tests. The package is particularly useful in fields like epidemiology and social sciences where stratified analysis is essential. The package also provides interpretative insights into the results, aiding in the understanding of statistical outcomes.
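As a rough illustration of the kind of quantity computed for stratified 2x2 tables, the Mantel-Haenszel pooled odds ratio can be sketched in Python as follows. The function name and the (a, b, c, d) cell layout are assumptions made for this example; they are not the package's interface.

```python
from typing import Iterable, Tuple

# One stratum as (a, b, c, d): exposed cases, exposed non-cases,
# unexposed cases, unexposed non-cases (layout assumed for this sketch).
Table = Tuple[float, float, float, float]


def mantel_haenszel_or(strata: Iterable[Table]) -> float:
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over all strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ZeroDivisionError("pooled odds ratio is undefined (sum of b*c/n is zero)")
    return numerator / denominator


if __name__ == "__main__":
    tables = [(10, 20, 30, 40), (5, 5, 5, 5)]
    print(round(mantel_haenszel_or(tables), 4))
```

With a single stratum the estimate reduces to the ordinary odds ratio a*d / (b*c).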
This package provides a pilot matching design to automatically stratify and match large datasets. The manual_stratify() function allows users to manually stratify a dataset based on categorical variables of interest, while the auto_stratify() function does so automatically by allocating a held-aside (pilot) data set, fitting a prognostic score (see Hansen (2008) <doi:10.1093/biomet/asn004>) on the pilot set, and stratifying the data set based on prognostic score quantiles. The strata_match() function then does optimal matching of the data set in parallel within strata.
This package provides tools for testing, monitoring and dating structural changes in (linear) regression models. It features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
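To give a feel for what a fluctuation process looks like, here is a minimal Python sketch of an OLS-based CUSUM process for the simplest possible model, a constant mean. The function name and the choice of n - 1 degrees of freedom for the residual standard deviation are assumptions of this sketch, and it does not reproduce the package's interface or its boundary tests.

```python
import math
from typing import List, Sequence


def ols_cusum_mean(y: Sequence[float]) -> List[float]:
    """Scaled cumulative sums of residuals from a constant-mean fit.

    Returns n + 1 values, starting at 0. Under a stable mean the path stays
    near zero; a shift in the mean pushes it away, with the extreme value
    near the change point.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(y) / n
    residuals = [value - mean for value in y]
    # Residual standard deviation with n - 1 degrees of freedom (assumed).
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 1))
    if sigma == 0:
        raise ValueError("series is constant; the process is undefined")
    scale = sigma * math.sqrt(n)
    process = [0.0]
    running = 0.0
    for r in residuals:
        running += r
        process.append(running / scale)
    return process


if __name__ == "__main__":
    series = [0.0] * 20 + [1.0] * 20  # mean shift after observation 20
    path = ols_cusum_mean(series)
    peak = max(range(len(path)), key=lambda i: abs(path[i]))
    print("largest excursion at index", peak)
```

Because the residuals of a fitted mean sum to zero, the path returns to zero at the end; this is the shape a plot of such a process would show.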
This package provides an efficient method to recover the missing block of an approximately low-rank matrix. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design [Cai T, Cai TT, Zhang A (2016) <doi:10.1080/01621459.2015.1021005>]. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. The main function in our package, smc.FUN(), is for recovery of the missing block A22 of an approximately low-rank matrix A given the other blocks A11, A12, A21.
This package provides drop-in replacements for functions from the stringr package, with the same user interface. These functions have no external dependencies and can be copied directly into your package code using the staticimports package.
This package provides functions for stratified sampling and assigning custom labels to data, ensuring randomness within groups. The package supports various sampling methods such as stratified, cluster, and systematic sampling. It allows users to apply transformations and customize the sampling process. This package can be useful for statistical analysis and data preparation tasks.
This package aims to quantify and remove putative double-stranded DNA from a strand-specific RNA sample. There are also options and methods to plot the positive/negative proportions of all sliding windows, which give users an idea of how much the sample was contaminated and of the appropriate threshold to use for filtering.
An open source platform for validation and process control. Tools to analyze data from internal validation of forensic short tandem repeat (STR) kits are provided. The tools are developed to provide the necessary data to conform with guidelines for internal validation issued by the European Network of Forensic Science Institutes (ENFSI) DNA Working Group, and the Scientific Working Group on DNA Analysis Methods (SWGDAM). A front-end graphical user interface is provided. More information about each function can be found in the respective help documentation.
Random Forest-like tree ensemble that works with groups of predictor variables. When building a tree, a number of variables is taken randomly from each group separately, thus ensuring that it considers variables from each group for the splits. Useful when rows contain information about different things (e.g. user information and product information) and it's not sensible to make a prediction with information from only one group of variables, or when there are far more variables from one group than the other and it's desired to have groups appear evenly on trees. Trees are grown using the C5.0 algorithm rather than the usual CART algorithm. Supports parallelization (multithreaded), missing values in predictors, and categorical variables (without doing One-Hot encoding in the processing). Can also be used to create a regular (non-stratified) Random Forest-like model, but made up of C5.0 trees and with some additional control options. As it's built with C5.0 trees, it works only for classification (not for regression).
Regression-based ranking of pathogen strains with respect to their contributions to natural epidemics, using demographic and genetic data sampled in the course of the epidemics. This package also includes the GMCPIC test.
The Structstrings package implements the widely used dot bracket annotation for storing base pairing information in structured RNA. Structstrings uses the infrastructure provided by the Biostrings package and derives the DotBracketString and related classes from the BString class. From these, base pair tables can be produced for in-depth analysis. In addition, the loop indices of the base pairs can be retrieved. For better efficiency, information conversion is implemented in C, inspired to a large extent by the ViennaRNA package.
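The conversion from a dot bracket string to a base pair table can be sketched with a stack, as in the Python example below. This is only an illustration of the annotation: the function name is hypothetical, positions are 1-based by choice, and only the characters "(", ")" and "." are handled.

```python
from typing import List, Tuple


def base_pairs(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return (opening, closing) positions, 1-based, sorted by opening position."""
    open_positions: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for position, symbol in enumerate(dot_bracket, start=1):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {position}")
            # The most recent unmatched '(' pairs with this ')'.
            pairs.append((open_positions.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {position}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    print(base_pairs("((..))"))
```

For "((..))" the outer pair joins positions 1 and 6 and the inner pair joins positions 2 and 5; the two dots are unpaired.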
Implementation of analytical models for estimating streamflow depletion due to groundwater pumping, and other related tools. Functions are broadly split into two groups: (1) analytical streamflow depletion models, which estimate streamflow depletion for a single stream reach resulting from groundwater pumping; and (2) depletion apportionment equations, which distribute estimated streamflow depletion among multiple stream reaches within a stream network. See Zipper et al. (2018) <doi:10.1029/2018WR022707> for more information on depletion apportionment equations and Zipper et al. (2019) <doi:10.1029/2018WR024403> for more information on analytical depletion functions, which combine analytical models and depletion apportionment equations.
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Includes bases for litholog generation: graphical functions based on R base graphics, interval management functions and svg importation functions among others. Also includes stereographic projection functions, and other functions made to deal with large datasets while keeping options to get into the details of the data. When using for publication please cite Sebastien Wouters, Anne-Christine Da Silva, Frederic Boulvain and Xavier Devleeschouwer, 2021. The R Journal 13:2, 153-178. The palaeomagnetism functions are based on: Tauxe, L., 2010. Essentials of Paleomagnetism. University of California Press. <https://earthref.org/MagIC/books/Tauxe/Essentials/>; Allmendinger, R. W., Cardozo, N. C., and Fisher, D., 2013, Structural Geology Algorithms: Vectors & Tensors: Cambridge, England, Cambridge University Press, 289 pp.; Cardozo, N., and Allmendinger, R. W., 2013, Spherical projections with OSXStereonet: Computers & Geosciences, v. 51, no. 0, p. 193 - 205, <doi:10.1016/j.cageo.2012.07.021>.
Computes the reliability of (normal) stress-strength models and builds two-sided or one-sided confidence intervals according to different approximate procedures.
Pass named and unnamed character vectors into specified positions in strings. This represents an attempt to replicate some of Python's string formatting.
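For reference, the Python behaviour being imitated looks like this: str.format() fills empty {} fields from positional (unnamed) arguments in order and {name} fields from keyword (named) arguments. The template and values below are made up for illustration.

```python
# Empty braces take positional arguments in order; named braces take keywords.
template = "{} is {age} years old and lives in {city}; {} is a friend."
result = template.format("Ann", "Bob", age=30, city="Oslo")

if __name__ == "__main__":
    print(result)
```

Here "Ann" and "Bob" fill the two unnamed fields in order, while age and city are matched by name regardless of where they appear in the template.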
The stress addition approach is an alternative to the traditional concentration addition or effect addition models. It allows the modelling of tri-phasic concentration-response relationships either as single toxicant experiments, in combination with an environmental stressor or as mixtures of two toxicants. See Liess et al. (2019) <doi:10.1038/s41598-019-51645-4> and Liess et al. (2020) <doi:10.1186/s12302-020-00394-7>.
Univariate stratification of survey populations with a generalization of the Lavallee-Hidiroglou method of stratum construction. The generalized method takes into account a discrepancy between the stratification variable and the survey variable. The determination of the optimal boundaries also incorporates, if desired, an anticipated non-response, a take-all stratum for large units, a take-none stratum for small units, and a certainty stratum to ensure that some specific units are in the sample. The well-known cumulative root frequency rule of Dalenius and Hodges and the geometric rule of Gunning and Horgan are also implemented.
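The cumulative root frequency rule mentioned above can be sketched as follows: bin the stratification variable into equal-width classes, accumulate the square roots of the class frequencies, and place stratum boundaries where that cumulative total crosses equal fractions of its final value. The Python function below is a simplified illustration; its name, the default of 50 classes, and the use of upper class limits as boundaries are assumptions of this sketch and not the package's interface.

```python
import math
from typing import List, Sequence


def cum_root_freq_boundaries(
    x: Sequence[float], n_strata: int, n_classes: int = 50
) -> List[float]:
    """Return the n_strata - 1 interior boundaries chosen by the cum sqrt(f) rule."""
    if n_strata < 2 or n_classes < n_strata:
        raise ValueError("need n_strata >= 2 and n_classes >= n_strata")
    low, high = min(x), max(x)
    if high == low:
        raise ValueError("stratification variable is constant")
    width = (high - low) / n_classes

    # Frequencies of equal-width classes; the maximum goes into the last class.
    freq = [0] * n_classes
    for value in x:
        k = min(int((value - low) / width), n_classes - 1)
        freq[k] += 1

    # Cumulative sum of square-root frequencies.
    cumulative: List[float] = []
    total = 0.0
    for f in freq:
        total += math.sqrt(f)
        cumulative.append(total)

    # Boundary h is the upper limit of the first class whose cumulative
    # value reaches h / n_strata of the total.
    boundaries: List[float] = []
    for h in range(1, n_strata):
        target = total * h / n_strata
        k = next(i for i, c in enumerate(cumulative) if c >= target)
        boundaries.append(low + (k + 1) * width)
    return boundaries


if __name__ == "__main__":
    population = [i * i for i in range(100)]  # right-skewed toy data
    print(cum_root_freq_boundaries(population, n_strata=3))
```

The result depends on the number of initial classes, which is a known sensitivity of this rule.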
This package provides a fast implementation with additional experimental features for testing, monitoring and dating structural changes in (linear) regression models. strucchangeRcpp features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g. cumulative/moving sum, recursive/moving estimates) and F statistics, respectively. These methods are described in Zeileis et al. (2002) <doi:10.18637/jss.v007.i02>. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals, and their magnitude as well as the model fit can be evaluated using a variety of statistical measures.
Non-proportional hazards (NPH) are commonly observed in immuno-oncology studies, where the survival curves of the treatment and control groups show delayed separation. To properly account for NPH, several statistical methods have been developed. One such method is the Max-Combo test, a straightforward and flexible hypothesis testing method that can simultaneously test for constant, early, middle, and late treatment effects. However, the majority of the Max-Combo tests performed in clinical studies are unstratified, ignoring important prognostic stratification factors. To fill this gap, we have developed an R package for stratified Max-Combo testing that accounts for stratified baseline factors. Our package explores various methods for calculating combined test statistics, estimating joint distributions, and determining the p-values.
This package provides functions to calculate Gross Primary Productivity, Net Ecosystem Production, and Ecosystem Respiration from single-station diurnal oxygen curves.
Identifies individuals in a social network who should be the intervention subjects for a network intervention in which you have a group of targets, a group of avoiders, and a group that is neither.