This package implements methods for inference on potential waning of vaccine efficacy and for estimation of vaccine efficacy at a user-specified time after vaccination based on data from a randomized, double-blind, placebo-controlled vaccine trial in which participants may be unblinded and placebo subjects may be crossed over to the study vaccine. The methods also allow adjustment for possible confounding via inverse probability weighting through specification of models for the trial entry process, unblinding mechanisms, and the probability an unblinded placebo participant accepts study vaccine: Tsiatis, A. A. and Davidian, M. (2022) <doi:10.1111/biom.13509>.
Director is an R package designed to streamline the visualization of molecular effects in regulatory cascades. It utilizes the R package htmltools and a modified Sankey plugin of the JavaScript library D3 to provide a fast and easy, browser-enabled solution to discovering potentially interesting downstream effects of regulatory and/or co-expressed molecules. The diagrams are robust, interactive, and packaged as highly portable HTML files that eliminate the need for third-party viewing software. This gives scientists a straightforward way to interpret the data produced, and gives bioinformatics developers an alternative means to present relevant data.
MiDAS is an R package for immunogenetics data transformation and statistical analysis. MiDAS accepts input data in the form of HLA alleles and KIR types, and can transform it into biologically meaningful variables, enabling HLA amino acid fine mapping, analyses of HLA evolutionary divergence, KIR gene presence, as well as validated HLA-KIR interactions. Further, it allows comprehensive statistical association analysis workflows with phenotypes of diverse measurement scales. MiDAS closes a gap between the inference of immunogenetic variation and its efficient utilization to make relevant discoveries related to T cell, Natural Killer cell, and disease biology.
InferCNV is used to explore tumor single cell RNA-Seq data to identify evidence for somatic large-scale chromosomal copy number alterations, such as gains or deletions of entire chromosomes or large segments of chromosomes. This is done by exploring expression intensity of genes across positions of a tumor genome in comparison to a set of reference "normal" cells. A heatmap is generated illustrating the relative expression intensities across each chromosome, and it often becomes readily apparent which regions of the tumor genome are over-abundant or less abundant compared to normal cells.
This package makes the qhull library available in R, in a similar manner as in Octave. Qhull computes convex hulls, Delaunay triangulations, halfspace intersections about a point, Voronoi diagrams, furthest-site Delaunay triangulations, and furthest-site Voronoi diagrams. It runs in 2-d, 3-d, 4-d, and higher dimensions. It implements the Quickhull algorithm for computing the convex hull. Qhull does not support constrained Delaunay triangulations, or mesh generation of non-convex objects, but the package does include some R functions that allow for this. Currently the package only gives access to Delaunay triangulation and convex hull computation.
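The Quickhull idea mentioned above can be illustrated with a minimal 2-d sketch in pure Python. This is not Qhull's implementation and not this package's R interface; the function names `cross`, `_hull_side`, and `quickhull` are hypothetical and chosen only for illustration. The sketch recursively picks the point farthest from an edge and discards everything inside the resulting triangle.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b; positive when b lies left of the ray o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _hull_side(points, a, b):
    """Hull vertices strictly left of the directed edge a->b, ordered from a to b."""
    left = [p for p in points if cross(a, b, p) > 0]
    if not left:
        return []
    # The point farthest from the edge is guaranteed to be a hull vertex.
    far = max(left, key=lambda p: cross(a, b, p))
    return _hull_side(left, a, far) + [far] + _hull_side(left, far, b)


def quickhull(points):
    """Convex hull of 2-d points as a list of vertices in clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    a, b = pts[0], pts[-1]  # leftmost and rightmost points are always on the hull
    return [a] + _hull_side(pts, a, b) + [b] + _hull_side(pts, b, a)


# Hypothetical sample data: a square with one interior and one edge point.
hull = quickhull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)])
```

Points that lie exactly on a hull edge are dropped in this sketch because only strictly-left points are kept.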
This package provides a versatile interior point solver that solves linear programs (LPs), quadratic programs (QPs), second-order cone programs (SOCPs), semidefinite programs (SDPs), and problems with exponential and power cone constraints (https://clarabel.org/stable/). Unlike interior point solvers based on the standard homogeneous self-dual embedding (HSDE) model, Clarabel handles quadratic objectives without requiring any epigraphical reformulation of the objective function. It can therefore be significantly faster than other HSDE-based solvers for problems with quadratic objective functions. Infeasible problems are detected using a homogeneous embedding technique.
This package provides a collection of miscellaneous helper functions for running multilevel/mixed models in lme4. It aims to provide functions for common tasks when estimating multilevel models, such as computing the intraclass correlation and design effect, centering variables, estimating the proportion of variance explained at each level, pseudo-R squared, random intercept and slope reliabilities, tests for homogeneity of variance at level-1, and cluster robust and bootstrap standard errors. The tests and statistics reported in the package are from Raudenbush & Bryk (2002, ISBN:9780761919049), Hox et al. (2018, ISBN:9781138121362), and Snijders & Bosker (2012, ISBN:9781849202015).
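As a rough illustration of two of the quantities named above, the sketch below computes the intraclass correlation from variance components and the corresponding design effect using the standard textbook formulas. It is written in Python rather than R, and the function names and input values are hypothetical; it does not reproduce this package's interface.

```python
def intraclass_correlation(var_between, var_within):
    """ICC: share of total variance attributable to differences between clusters."""
    return var_between / (var_between + var_within)


def design_effect(icc, avg_cluster_size):
    """Design effect 1 + (n - 1) * ICC for clusters of average size n."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Hypothetical variance components from a random-intercept model.
icc = intraclass_correlation(var_between=1.0, var_within=3.0)
deff = design_effect(icc, avg_cluster_size=21)
```

With these assumed inputs, a quarter of the variance lies between clusters, so clustering inflates sampling variance by a factor of six relative to simple random sampling.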
This package provides tools for data-driven statistical analysis using local polynomial regression and kernel density estimation methods as described in Calonico, Cattaneo and Farrell (2018, <doi:10.1080/01621459.2017.1285776>): lprobust() for local polynomial point estimation and robust bias-corrected inference, lpbwselect() for local polynomial bandwidth selection, kdrobust() for kernel density point estimation and robust bias-corrected inference, kdbwselect() for kernel density bandwidth selection, and nprobust.plot() for plotting results. The main methodological and numerical features of this package are described in Calonico, Cattaneo and Farrell (2019, <doi:10.18637/jss.v091.i08>).
Calculate common types of tables for weighted survey data. Options include topline and (2-way and 3-way) crosstab tables of categorical or ordinal data as well as summary tables of weighted numeric variables. Optionally, include the margin of error at selected confidence levels, including the design effect. The design effect is calculated as described by Kish (1965) <doi:10.1002/bimj.19680100122> beginning on page 257. Output takes the form of tibbles (simple data frames). This package conveniently handles labelled data, such as that commonly used by Stata and SPSS. Complex survey design is not supported at this time.
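The sketch below shows, in Python, the commonly used Kish approximation of the design effect due to weighting and one way it can enter a margin of error for a proportion. How this package combines the two is an assumption here, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def kish_design_effect(weights):
    """Kish approximation: n * sum(w^2) / (sum(w))^2; equals 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def margin_of_error(p, weights, conf_level=0.95):
    """Margin of error for a proportion p, inflated by the design effect (assumed form)."""
    n = len(weights)
    z = NormalDist().inv_cdf(0.5 + conf_level / 2.0)  # two-sided critical value
    return z * sqrt(kish_design_effect(weights) * p * (1.0 - p) / n)


# Hypothetical weights: unequal weights push the design effect above 1.
deff_equal = kish_design_effect([1.0, 1.0, 1.0, 1.0])
deff_unequal = kish_design_effect([1.0, 1.0, 2.0, 2.0])
moe = margin_of_error(0.5, [1.0] * 100)
```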
The software application Praat can be used to annotate waveform data (e.g., to mark intervals of interest or to label events). (See <http://www.fon.hum.uva.nl/praat/> for more information about Praat.) These annotations are stored in a Praat TextGrid object, which consists of a number of interval tiers and point tiers. An interval tier consists of sequential (i.e., not overlapping) labeled intervals. A point tier consists of labeled events that have no duration. The textgRid package provides S4 classes, generics, and methods for accessing information that is stored in Praat TextGrid objects.
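The tier structure described above can be sketched as plain data classes. This is a Python illustration only; the class and method names are hypothetical and do not correspond to the S4 classes that textgRid defines.

```python
from dataclasses import dataclass, field


@dataclass
class Interval:
    start: float
    end: float
    label: str


@dataclass
class Point:
    time: float  # an event with no duration
    label: str


@dataclass
class IntervalTier:
    name: str
    intervals: list = field(default_factory=list)

    def is_sequential(self):
        """True when intervals are in time order, have positive length, and do not overlap."""
        ivs = self.intervals
        return all(iv.start < iv.end for iv in ivs) and all(
            prev.end <= nxt.start for prev, nxt in zip(ivs, ivs[1:])
        )


@dataclass
class PointTier:
    name: str
    points: list = field(default_factory=list)


# Hypothetical annotation of a short recording.
words = IntervalTier("words", [Interval(0.0, 0.4, "hello"), Interval(0.4, 0.9, "world")])
clicks = PointTier("clicks", [Point(0.25, "click")])
overlapping = IntervalTier("bad", [Interval(0.0, 0.5, "a"), Interval(0.4, 0.9, "b")])
```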
This package implements two tests of whether toolmarks come from the same source. The chumbley_non_random() test follows the paper "An Improved Version of a Tool Mark Comparison Algorithm" by Hadler and Morris (2017) <doi:10.1111/1556-4029.13640>. This is an extension of the Chumbley score as previously described in "Validation of Tool Mark Comparisons Obtained Using a Quantitative, Comparative, Statistical Algorithm" by Chumbley et al (2010) <doi:10.1111/j.1556-4029.2010.01424.x>. fixed_width_no_modeling() is based on correlation measures in a diamond shaped area of the toolmark as described in Hadler (2017).
The model, developed at the Vienna University of Technology, is a lumped conceptual rainfall-runoff model, following the structure of the HBV model. The model can also be run in a semi-distributed fashion and with a dual representation of the soil layer. The model runs on a daily or shorter time step and consists of a snow routine, a soil moisture routine and a flow routing routine. See Parajka, J., R. Merz, G. Bloeschl (2007) <DOI:10.1002/hyp.6253> Uncertainty and multiple objective calibration in regional water balance modelling: case study in 320 Austrian catchments, Hydrological Processes, 21, 435-446.
This package provides a spline-based scRNA-seq method for identifying differentially variable (DV) genes across two experimental conditions. Spline-DV constructs a 3D spline from 3 key gene statistics: mean expression, coefficient of variation (CV), and dropout rate. This is done for both conditions. The 3D spline provides the “expected” behavior of genes in each condition. The distance of the observed mean, CV and dropout rate of each gene from the expected 3D spline is used to measure variability. As the final step, the spline-DV method compares the variabilities of each condition to identify differentially variable (DV) genes.
This package provides functions for performing quick observations or evaluations of data, including a variety of ways to list objects by size, class, etc. The functions seqle and reverse.seqle mimic the base rle but can search for linear sequences. The function splatnd allows the user to generate zero-argument commands without the need for makeActiveBinding. Functions are provided to convert from any base to any other base, and to find the n-th greatest max or n-th least min. In addition, there are functions which mimic Unix shell commands, including head, tail, pushd, and popd. Various other goodies are included as well.
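To make the idea of run-length encoding over linear sequences concrete, here is a minimal Python sketch. The function name `linear_runs` and its return format are hypothetical and are not claimed to match the output of seqle; with an increment of 0 the sketch reduces to ordinary run-length encoding.

```python
def linear_runs(values, incr=1):
    """Collapse values into (start, length) runs where each element equals the previous plus incr."""
    runs = []
    for v in values:
        # Extend the current run if v is the next expected term of the sequence.
        if runs and v == runs[-1][0] + incr * runs[-1][1]:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


# Hypothetical input: two ascending runs followed by a singleton.
ascending = linear_runs([1, 2, 3, 7, 8, 10])
plain_rle = linear_runs([5, 5, 2], incr=0)
```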
Single unified interface for end-to-end modelling of regression, categorical and time-to-event (survival) outcomes. Models created using familiar are self-containing, and their use does not require additional information such as baseline survival, feature clustering, or feature transformation and normalisation parameters. Model performance, calibration, risk group stratification, (permutation) variable importance, individual conditional expectation, partial dependence, and more, are assessed automatically as part of the evaluation process and exported in tabular format and plotted, and may also be computed manually using export and plot functions. Where possible, metrics and values obtained during the evaluation process come with confidence intervals.
An interface to the fastText <https://github.com/facebookresearch/fastText> library for efficient learning of word representations and sentence classification. The fastText algorithm is explained in detail in (i) "Enriching Word Vectors with subword Information", Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov, 2017, <doi:10.1162/tacl_a_00051>; (ii) "Bag of Tricks for Efficient Text Classification", Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov, 2017, <doi:10.18653/v1/e17-2068>; (iii) "FastText.zip: Compressing text classification models", Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Herve Jegou, Tomas Mikolov, 2016, <arXiv:1612.03651>.
An optim-style implementation of the Stochastic Quasi-Gradient Differential Evolution (SQG-DE) optimization algorithm first published by Sala, Baldanzini, and Pierini (2018; <doi:10.1007/978-3-319-72926-8_27>). This optimization algorithm fuses the robustness of the population-based global optimization algorithm "Differential Evolution" with the efficiency of gradient-based optimization. The derivative-free algorithm uses population members to build stochastic gradient estimates, without any additional objective function evaluations. Sala, Baldanzini, and Pierini argue this algorithm is useful for difficult optimization problems under a tight function evaluation budget. This package can run SQG-DE in parallel and sequentially.
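For orientation, the sketch below implements classic Differential Evolution (the rand/1/bin variant) in Python. It is not SQG-DE: the stochastic quasi-gradient step of Sala, Baldanzini, and Pierini is deliberately omitted, and the function name, defaults, and test objective are assumptions made for illustration.

```python
import random


def differential_evolution(f, bounds, pop_size=20, generations=200, F=0.8, CR=0.9, seed=0):
    """Minimise f over box bounds with classic DE/rand/1/bin (not SQG-DE)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # ensures at least one mutated coordinate
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    lo, hi = bounds[j]
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(max(v, lo), hi))  # clip to the box
                else:
                    trial.append(pop[i][j])
            t_cost = f(trial)
            if t_cost <= cost[i]:  # greedy selection keeps the better vector
                pop[i], cost[i] = trial, t_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


# Hypothetical test problem: the 2-d sphere function, minimised at the origin.
x_best, f_best = differential_evolution(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)
```

Each generation costs one objective evaluation per population member, which is the budget that a gradient-informed variant such as SQG-DE aims to use more efficiently.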
This package provides a comprehensive collection of tools for creating, manipulating and visualising pedigrees and genetic marker data. Pedigrees can be read from text files or created on the fly with built-in functions. A range of utilities enable modifications like adding or removing individuals, breaking loops, and merging pedigrees. An online tool for creating pedigrees interactively, based on pedtools, is available at <https://magnusdv.shinyapps.io/quickped>. pedtools is the hub of the pedsuite, a collection of packages for pedigree analysis. A detailed presentation of the pedsuite is given in the book Pedigree Analysis in R (Vigeland, 2021, ISBN:9780128244302).
The algorithm implemented in this package was designed to quickly estimate the distribution of the log-rank statistic, especially for heavily unbalanced groups. VALORATE estimates the null distribution and the p-value of the log-rank test based on a recent formulation. For a given number of alterations that define the size of survival groups, the estimation involves a weighted sum of distributions that are conditional on a co-occurrence term where mutations and events are both present. The estimation of conditional distributions is quite fast, allowing the analysis of large datasets in a few minutes <https://bioinformatics.mx/index.php/bioinfo-tools/>.
RNA degradation is monitored through measurement of RNA abundance after inhibiting RNA synthesis. This package has functions and example scripts to facilitate (1) data normalization, (2) data modeling using constant decay rate or time-dependent decay rate models, (3) the evaluation of treatment or genotype effects, and (4) plotting of the data and models. Data Normalization: functions and scripts simplify normalization to the initial (T0) RNA abundance, and provide a method to correct for artificial inflation of Reads per Million (RPM) abundance in global assessments as the total size of the RNA pool decreases. Modeling: Normalized data is then modeled using maximum likelihood to fit parameters. For making treatment or genotype comparisons (up to four), the modeling step models all possible treatment effects on each gene by repeating the modeling with constraints on the model parameters (e.g., the decay rates of treatments A and B are modeled once with them being equal and again allowing them to vary independently). Model Selection: The AICc value is calculated for each model, and the model with the lowest AICc is chosen. Modeling results of selected models are then compiled into a single data frame. Graphical Plotting: functions are provided to easily visualize decay data, models, or half-life distributions using ggplot2 package functions.
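The model selection step described above can be sketched with the standard AICc formula. The Python snippet below is illustrative only: the function names, the log-likelihood values, and the parameter counts are hypothetical and are not taken from this package.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC: 2k - 2 ln L + 2k(k + 1) / (n - k - 1)."""
    k, n = n_params, n_obs
    return 2 * k - 2 * log_likelihood + 2 * k * (k + 1) / (n - k - 1)


def select_model(fits, n_obs):
    """Return the name of the fit with the lowest AICc; fits maps name -> (log-likelihood, k)."""
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n_obs))


# Hypothetical fits for one gene: equal decay rates versus independent decay rates.
fits = {"equal_rates": (-12.0, 2), "free_rates": (-11.5, 3)}
chosen = select_model(fits, n_obs=10)
```

Here the extra parameter of the unconstrained model does not improve the likelihood enough to offset its penalty, so the constrained model is chosen.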
Implementation of the nonparametric bounds for the average causal effect under an instrumental variable model by Balke and Pearl (Bounds on Treatment Effects from Studies with Imperfect Compliance, JASA, 1997, 92, 439, 1171-1176, <doi:10.2307/2965583>). The package can calculate bounds for a binary outcome, a binary treatment/phenotype, and an instrument with either 2 or 3 categories. The package implements bounds for situations where these 3 variables are measured in the same dataset (trivariate data) or where the outcome and instrument are measured in one study and the treatment/phenotype and instrument are measured in another study (bivariate data).
This package provides a Bayesian framework for parameter inference in differential equations. This approach offers a rigorous methodology for parameter inference as well as modeling the link between unobservable model states and parameters, and observable quantities. Provides templates for the DE model, the observation model and data likelihood, and the model parameters and their prior distributions. A Markov chain Monte Carlo (MCMC) procedure processes these inputs to estimate the posterior distributions of the parameters and any derived quantities, including the model trajectories. Further functionality is provided to facilitate MCMC diagnostics and the visualisation of the posterior distributions of model parameters and trajectories.
Easily perform a Monte Carlo simulation to evaluate the cost and carbon, ecological, and water footprints of a set of ideal diets. Pre-processing tools are also available to quickly treat the data, along with basic statistical features to analyze the simulation results, including the ability to establish confidence intervals for selected parameters, such as nutrients and price/emissions. A standard version of the datasets employed is included as well, allowing users easy access to customization. This package brings to R the Python software initially developed by Vandevijvere, Young, Mackay, Swinburn and Gahegan (2018) <doi:10.1186/s12966-018-0648-6>.
Tailored explicitly for Experience Sampling Method (ESM) data, it contains a suite of functions designed to simplify preprocessing steps and create subsequent reporting. It empowers users with capabilities to extract critical insights during preprocessing, conducts thorough data quality assessments (e.g., design and sampling scheme checks, compliance rate, careless responses), and generates visualizations and concise summary tables tailored specifically for ESM data. Additionally, it streamlines the creation of informative and interactive preprocessing reports, enabling researchers to transparently share their dataset preprocessing methodologies. Finally, it is part of a larger ecosystem which includes a framework and a web gallery (<https://preprocess.esmtools.com/>).