Inference by sequential Monte Carlo for dynamic tree regression and classification models, with hooks provided for sequential design and optimization, fully online learning with drift, variable selection, and sensitivity analysis of inputs. Illustrative examples from the original dynamic trees paper (Gramacy, Taddy & Polson (2011); <doi:10.1198/jasa.2011.ap09769>) are facilitated by demos in the package; see demo(package="dynaTree").
Work with the Ecological Community Data Design Pattern. ecocomDP is a flexible data model for harmonizing ecological community surveys, in a research-question-agnostic format, from source data published across repositories, and with methods that keep the derived data up-to-date as the underlying sources change. Described in O'Brien et al. (2021), <doi:10.1016/j.ecoinf.2021.101374>.
This package provides a collection of four datasets based around the population dynamics of migratory fish. Datasets contain both basic size information on a per fish basis, as well as otolith data that contains a per day record of fish growth history. All data in this package was collected by the author, from 2015-2016, in the Wellington region of New Zealand.
To provide a comprehensive analysis of high-dimensional longitudinal data, this package supports any combination of 1) simultaneous variable selection and estimation, 2) mean regression or quantile regression for heterogeneous data, 3) cross-sectional or longitudinal data, 4) balanced or imbalanced data, 5) moderate, high or even ultra-high dimensional data, via computationally efficient implementations of penalized generalized estimating equations.
This package provides a collection of Geoms for R's ggplot2 library. The geom_shadowpath(), geom_shadowline(), geom_shadowstep() and geom_shadowpoint() functions draw a shadow below lines to make busy plots more aesthetically pleasing. geom_glowpath(), geom_glowline(), geom_glowstep() and geom_glowpoint() add a neon glow around lines to get a steampunk style.
Statistical framework for analyzing the heritability of gene expression based on next-generation sequencing data and for simulating sequencing reads. Variance partition coefficients (VPC) are computed using linear mixed effects and generalized linear mixed effects models. Compound Poisson and negative binomial models are included. Reference: Rudra, Pratyaydipta, et al. "Model based heritability scores for high-throughput sequencing data." BMC bioinformatics 18.1 (2017): 143.
Calculate and visualize Healthy Eating Index (HEI) scores from National Health and Nutrition Examination Survey 24-hour dietary recall data utilizing three methods recommended by the National Cancer Institute (2024) <https://epi.grants.cancer.gov/hei/hei-methods-and-calculations.html#:~:text=To%20use%20the%20simple%20HEI,the%20total%20scores%20across%20individuals.>. Effortlessly analyze HEI scores across different demographic groups and years.
Develops a General Equilibrium (GE) Model, which estimates key variables such as wages, the number of residents and workers, the prices of floor space, and its distribution between commercial and residential use, as in Ahlfeldt et al. (2015) <https://onlinelibrary.wiley.com/doi/abs/10.3982/ECTA10876>. In doing so, the model helps in understanding the economic influence of different urban policies.
This package provides methods for fast segmentation of multivariate signals into piecewise constant profiles and for generating realistic copy-number profiles. A typical application is the joint segmentation of total DNA copy numbers and allelic ratios obtained from Single Nucleotide Polymorphism (SNP) microarrays in cancer studies. The methods are described in Pierre-Jean, Rigaill and Neuvial (2015) <doi:10.1093/bib/bbu026>.
This package provides functions to implement K Nearest Neighbor forecasting using a weighted similarity metric tailored to the problem of forecasting univariate time series where recent observations, seasonal patterns, and exogenous predictors are all relevant in predicting future observations of the series in question. For more information on the formulation of this similarity metric please see Trupiano (2021) <arXiv:2112.06266>.
This package provides tools to help store and handle case line list data. The linelist class adds a tagging system to classical data.frame objects to identify key epidemiological data such as dates of symptom onset, epidemiological case definition, age, gender or disease outcome. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines more robust and reliable.
BEAST2 (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAST2 is commonly accompanied by BEAUti 2 (<https://www.beast2.org>), which, among other things, allows one to install BEAST2 packages. This package allows one to work with BEAST2 packages from R.
This package provides a framework based on S3 dispatch for constructing models of mosquito-borne pathogen transmission from submodels of various components (i.e. immature and adult mosquitoes, human populations). A consistent mathematical expression for the distribution of bites on hosts means that different models (stochastic, deterministic, etc.) can be coherently incorporated and updated over a discrete time step.
This package provides an algorithm for creating mandalas. From the perspective of classic mathematical curves and rigid movements on the plane, the package allows you to select curves and produce mandalas from them. The algorithm was developed based on the book by Alcoforado et al. entitled "Art, Geometry and Mandalas with R" (2022), in press by the USP Open Books Portal.
Cross-Entropy optimisation of unconstrained deterministic and noisy functions, as illustrated in Rubinstein and Kroese (2004, ISBN: 978-1-4419-1940-3), through a highly flexible and customisable function which allows the user to define custom variable domains, sampling distributions, updating and smoothing rules, and stopping criteria. Several built-in methods and settings make the package easy to use for standard optimisation problems.
This package provides methods for testing the goodness-of-fit of generalized linear models (GLMs) using random projections. It is specifically designed for high-dimensional scenarios where the number of predictors substantially exceeds the sample size. The statistical methodologies implemented in this package are detailed in the paper by Wen Chen and Falong Tan (2024, <doi:10.48550/arXiv.2412.10721>).
This package provides a set of statistical tools for spatio-temporal data exploration. Includes simple plotting functions, covariance calculations and computations similar to principal component analysis for spatio-temporal data. Can use both dataframes and stars objects for all plots and computations. For more details, refer to Spatio-Temporal Statistics with R (Christopher K. Wikle, Andrew Zammit-Mangion, Noel Cressie, 2019, ISBN:9781138711136).
An implementation of local and global statistical complexity measures (aka Information Theory Quantifiers, ITQ) for time series analysis based on ordinal statistics (Bandt and Pompe (2002) <DOI:10.1103/PhysRevLett.88.174102>). Several distance measures that operate on ordinal pattern distributions, auxiliary functions for ordinal pattern analysis, and generating functions for stochastic and deterministic-chaotic processes for ITQ testing are provided.
Enables drag-and-drop behaviour in Shiny apps, by exposing the functionality of the SortableJS <https://sortablejs.github.io/Sortable/> JavaScript library as an htmlwidget. You can use this in Shiny apps and widgets, learnr tutorials as well as R Markdown. In addition, provides a custom learnr question type - question_rank() - that allows ranking questions with drag-and-drop.
Apache Drill is a low-latency distributed query engine designed to enable data exploration and analysis on both relational and non-relational data stores, scaling to petabytes of data. Methods are provided that enable working with Apache Drill instances via the REST API, DBI methods and using dplyr/dbplyr idioms. Helper functions are included to facilitate using official Drill Docker images/containers.
This package implements the Vector Matching algorithm to match multiple treatment groups based on previously estimated generalized propensity scores. The package includes tools for visualizing initial confounder imbalances, estimating treatment assignment probabilities using various methods, defining the common support region, performing matching across multiple groups, and evaluating matching quality. For more details, see Lopez and Gutman (2017) <doi:10.1214/17-STS612>.
This package provides basic functions for analyzing shallow whole-genome sequencing (~0.3X or more) of cell-free DNA (cfDNA). The package extracts the length of cfDNA fragments and aids the visualization of fragment-length information. The package also extracts fragment-length information per non-overlapping fixed-sized bin and uses it to calculate the ctDNA estimation score (CES).
The package implements GUIDE-seq and PEtag-seq analysis workflows, including functions for filtering UMIs and reads with low coverage, obtaining unique insertion sites (a proxy for cleavage sites), estimating the locations of the insertion sites (also known as peaks), merging estimated insertion sites from the plus and minus strands, and performing an off-target search of the extended regions around insertion sites, allowing mismatches and indels.
MaAsLin 3 refines and extends generalized multivariate linear models for meta-omics association discovery. It finds abundance and prevalence associations between microbiome meta-omics features and complex metadata in population-scale epidemiological studies. The software includes multiple analysis methods (including support for multiple covariates, repeated measures, and ordered predictors), filtering, normalization, and transform options to customize analysis for your specific study.