Estimates heterogeneous treatment effects using tidy semantics on experimental or observational data. Methods are based on the doubly-robust learner of Kennedy (2023) <doi:10.1214/23-EJS2157>. You provide a simple recipe for what machine learning algorithms to use in estimating the nuisance functions and tidyhte will take care of cross-validation, estimation, model selection, diagnostics and construction of relevant quantities of interest about the variability of treatment effects.
There are two new network metrics, RWC (random walk centrality) and CBET (counting betweenness). Also available are the normalized versions of those metrics. These measures of centrality and betweenness are particularly useful for the analysis of very dense weighted networks which include loops. Traditional measures do not work as well for those network characteristics. The main reference is DePaolis et al. (2022) <doi:10.1007/s41109-022-00519-2>.
This package provides tools to analyze sex differences in omics data for complex diseases. It includes functions for differential expression analysis using the limma method <doi:10.1093/nar/gkv007>, interaction testing between sex and disease, pathway enrichment with clusterProfiler <doi:10.1089/omi.2011.0118>, and gene regulatory network (GRN) construction and analysis using igraph. The package enables a reproducible workflow from raw data processing to biological interpretation.
This package provides functions and a command-line user interface to generate allocation sequences by response-adaptive randomization for clinical trials. The package currently supports two families of frequentist response-adaptive randomization procedures, Doubly Adaptive Biased Coin Design ('DBCD') and Sequential Estimation-adjusted Urn Model ('SEU'), for binary and normal endpoints. One-sided proportion (or mean) difference and Chi-square (or ANOVA) hypothesis testing methods are also available in the package to facilitate inference for the treatment effect. Additionally, the package provides comprehensive and efficient tools to evaluate and compare the performance of randomization procedures and tests based on various criteria. For example, plots of the relationship among assumed treatment effects, sample size, and power are provided. Five allocation functions for DBCD and six addition rule functions for SEU are implemented to target allocations such as Neyman, Rosenberger (Rosenberger et al. (2001) <doi:10.1111/j.0006-341X.2001.00909.x>) and Urn allocations.
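As a rough illustration of two of the target allocations named above, the following Python sketch computes the Neyman and Rosenberger et al. (2001) target proportions for arm 1 with binary endpoints. The function names are hypothetical and are not part of the package's interface; the formulas are the standard textbook forms, assumed here rather than taken from the package source.

```python
import math

def neyman_target(p1: float, p2: float) -> float:
    """Neyman target for arm 1: proportional to the outcome standard deviations.

    Assumes binary endpoints with success probabilities p1 and p2,
    and that at least one probability lies strictly between 0 and 1.
    """
    s1 = math.sqrt(p1 * (1.0 - p1))
    s2 = math.sqrt(p2 * (1.0 - p2))
    return s1 / (s1 + s2)

def rosenberger_target(p1: float, p2: float) -> float:
    """Rosenberger et al. (2001) target for arm 1: sqrt(p1) / (sqrt(p1) + sqrt(p2)).

    Assumes at least one success probability is positive.
    """
    r1, r2 = math.sqrt(p1), math.sqrt(p2)
    return r1 / (r1 + r2)
```

With equal success probabilities both targets reduce to balanced allocation (0.5); they diverge as the arms differ, since the Rosenberger target favours the better-performing arm while the Neyman target favours the more variable one.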
This package provides the cumulative distribution function (CDF), quantile, and statistical power calculator for a collection of thresholding Fisher's p-value combination methods, including Fisher's p-value combination method, the truncated product method and, in particular, the soft-thresholding Fisher's p-value combination method, which is proven to be optimal in some contexts of signal detection. The p-value calculator for the omnibus version of these tests is also included.
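For orientation, the classical (unthresholded) Fisher combination that these methods build on can be sketched in a few lines of Python. This is only the baseline test for independent p-values, not the truncated or soft-thresholding variants, and the function name is hypothetical.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Under the global null, -2 * sum(log p_i) follows a chi-square
    distribution with 2n degrees of freedom. For even degrees of freedom
    the upper tail has the closed form exp(-h) * sum_{k<n} h**k / k!,
    where h is half the test statistic.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half_stat / k      # builds h**k / k! incrementally
        total += term
    return math.exp(-half_stat) * total
```

A single p-value is returned unchanged, which is a convenient sanity check on the tail formula.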
This package provides a Bayesian method for quantifying the likelihood that a given plasma mutation arises from clonal hematopoiesis or the underlying tumor. It requires sequencing data of the mutation in plasma and white blood cells with the number of distinct and mutant reads in both tissues. We implement a Monte Carlo importance sampling method to assess the likelihood that a mutation arises from the tumor relative to non-tumor origin.
This package provides a set of tools for working with miRNA affinity models (KdModels), efficiently scanning for miRNA binding sites, and predicting target repression. It supports scanning using miRNA seeds, full miRNA sequences (enabling 3' alignment) and KdModels, and includes the prediction of slicing and TDMD sites. Finally, it includes utility and plotting functions (e.g. for the visual representation of miRNA-target alignment).
STADyUM is a package with functionality for analyzing nascent RNA read counts to infer transcription rates. This includes utilities for processing experimental nascent RNA read counts as well as for simulating PRO-seq data. Rates such as initiation, pause release and landing pad occupancy are estimated from either synthetic or experimental data. There are also options for varying pause sites and including steric hindrance of initiation in the model.
Download data from the Access to Opportunities Project (AOP). The aopdata package brings annual estimates of access to employment, health, education and social assistance services by transport mode, as well as data on the spatial distribution of population, jobs, health care, schools and social assistance facilities at a fine spatial resolution for all cities included in the project. More info on the AOP website <https://www.ipea.gov.br/acessooportunidades/en/>.
This package implements the Bayesian FDR control described by Newton et al. (2004) <doi:10.1093/biostatistics/5.2.155>. Allows optimisation and visualisation of expected error rates based on tail posterior probability tests. Based on code written by Catalina Vallejos for BASiCS, see "Beyond comparisons of means: understanding changes in gene expression at the single-cell level", Vallejos et al. (2016) <doi:10.1186/s13059-016-0930-3>.
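The idea behind controlling an expected FDR from posterior probabilities can be sketched as follows. This is a minimal Python illustration under the usual definition (the expected FDR of a selected set is the average of one minus the posterior probability over that set); the function names are hypothetical and do not mirror the package's interface.

```python
def expected_fdr(post_probs, threshold):
    """Expected FDR when calling every item whose posterior probability exceeds threshold."""
    selected = [p for p in post_probs if p > threshold]
    if not selected:
        return 0.0  # nothing is called, so no false discoveries are expected
    return sum(1.0 - p for p in selected) / len(selected)

def select_at_efdr(post_probs, target=0.05):
    """Indices of the largest top-ranked set whose expected FDR stays at or below target.

    Items are added in order of decreasing posterior probability, so the
    running expected FDR can only increase as the set grows.
    """
    order = sorted(range(len(post_probs)), key=lambda i: post_probs[i], reverse=True)
    chosen, running = [], 0.0
    for i in order:
        running += 1.0 - post_probs[i]
        if running / (len(chosen) + 1) > target:
            break
        chosen.append(i)
    return chosen
```

For example, with posterior probabilities 0.99, 0.9, 0.5 and 0.1, a target of 0.06 keeps only the first two items, because adding the third would push the expected FDR above 0.2.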
This package provides a ggplot2-centric approach to bivariate mapping. This is a technique that maps two quantities simultaneously rather than the single value that most thematic maps display. The package provides a suite of tools for calculating breaks using multiple different approaches, a selection of palettes appropriate for bivariate mapping and scale functions for ggplot2 calls that add those palettes to maps. Tools for creating bivariate legends are also included.
This package implements non-parametric analyses for clustered binary and multinomial data. The elements of the cluster are assumed exchangeable, and identical joint distribution (also known as marginal compatibility, or reproducibility) is assumed for clusters of different sizes. A trend test based on stochastic ordering is implemented. Szabo A, George EO. (2010) <doi:10.1093/biomet/asp077>; George EO, Cheon K, Yuan Y, Szabo A (2016) <doi:10.1093/biomet/asw009>.
While autoregressive distributed lag (ARDL) models allow for extremely flexible dynamics, interpreting substantive significance of complex lag structures remains difficult. This package is designed to assist users in dynamically simulating and plotting the results of various ARDL models. It also contains post-estimation diagnostics, including a test for cointegration when estimating the error-correction variant of the autoregressive distributed lag model (Pesaran, Shin, and Smith 2001 <doi:10.1002/jae.616>).
This package provides various tools for analysing density profiles obtained by resistance drilling. It can load individual or multiple files and trim the starting and ending parts of each density profile. Tools are also provided to trim profiles manually, to remove the trend from measurements using several methods, to plot the profiles and to detect tree rings automatically. Written with a focus on forestry use of resistance drilling in standing trees.
Fit and explore Drift Diffusion Models (DDMs), a common tool in psychology for describing decision processes in simple tasks. It can handle both time-independent and time-dependent DDMs. You can either choose prebuilt models or create your own, and the package takes care of model predictions and parameter estimation. Model predictions are derived via the numerical solutions provided by Richter, Ulrich, and Janczyk (2023, <doi:10.1016/j.jmp.2023.102756>).
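To make the decision process that a DDM describes more concrete, here is a minimal Python sketch that simulates a single trial of a time-independent DDM with the Euler-Maruyama scheme. This is a Monte Carlo illustration only: the package itself relies on the numerical solutions cited above, not on simulation, and the function and parameter names are hypothetical.

```python
import math
import random

def simulate_ddm_trial(drift, boundary, noise=1.0, dt=0.001, t_max=5.0, rng=None):
    """Simulate one trial of a time-independent drift diffusion process.

    Evidence starts at 0 and accumulates until it reaches +boundary
    ("upper") or -boundary ("lower"). Returns (choice, decision_time);
    choice is None if no boundary is reached before t_max.
    """
    rng = rng or random.Random()
    x, t = 0.0, 0.0
    sqrt_dt = math.sqrt(dt)
    while t < t_max:
        # Euler-Maruyama step: deterministic drift plus scaled Gaussian noise
        x += drift * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        if x >= boundary:
            return "upper", t
        if x <= -boundary:
            return "lower", t
    return None, t
```

Setting `noise=0.0` removes the stochastic part, so a positive drift reaches the upper boundary at roughly `boundary / drift`; passing a seeded `random.Random` makes noisy runs reproducible.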
Streamlines Quarto workflows by providing tools for consistent project setup and documentation. Enables portability through reusable metadata, automated project structure creation, and standardized templates. Features include enhanced project initialization, pre-formatted Quarto documents, inclusion of Quarto brand functionality, comprehensive data protection settings, custom styling, and structured documentation generation. Designed to improve efficiency and collaboration in R data science projects by reducing repetitive setup tasks while maintaining consistent formatting across multiple documents.
Estimation of life expectancy and Life Years Lost (LYL, or lillies for short) for a given population, for example those with a given disease or condition. In addition, the package can be used to compare estimates from different populations, or to estimate confidence intervals. Technical details of the method are available in Plana-Ripoll et al. (2020) <doi:10.1371/journal.pone.0228073> and Andersen (2017) <doi:10.1002/sim.7357>.
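As a simplified illustration of the quantity involved, life years lost before a maximum age can be read as the area above a survival curve up to that age. The Python sketch below assumes a survival curve given on an age grid, conditional on being alive at the first age, and uses the trapezoidal rule; it ignores cause-specific decomposition and confidence intervals, and the function names are hypothetical.

```python
def restricted_life_expectancy(ages, survival):
    """Trapezoidal area under the survival curve between ages[0] and ages[-1]."""
    area = 0.0
    for i in range(1, len(ages)):
        width = ages[i] - ages[i - 1]
        area += width * (survival[i] + survival[i - 1]) / 2.0
    return area

def life_years_lost(ages, survival):
    """Years lost before ages[-1]: the time span minus the expected years lived in it."""
    return (ages[-1] - ages[0]) - restricted_life_expectancy(ages, survival)
```

If everyone survives to the maximum age the function returns 0; a survival curve falling linearly from 1 to 0 over ten years gives 5 years lived and 5 years lost.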
This package provides tools to quantify ecological memory in long time-series with Random Forest models (Breiman 2001 <doi:10.1023/A:1010933404324>) fitted with the ranger library (Wright and Ziegler 2017 <doi:10.18637/jss.v077.i01>). Particularly oriented to palaeoecological datasets and simulated pollen curves produced by the virtualPollen package, but also applicable to other long time-series involving a set of environmental drivers and a biotic response.
Estimating the force of infection from time varying, age varying, or constant serocatalytic models from population based seroprevalence studies using a Bayesian framework, including data simulation functions enabling the generation of serological surveys based on these models. This tool also provides a flexible prior specification syntax for the force of infection and the seroreversion rate, as well as methods to assess model convergence and comparison criteria along with useful visualisation functions.
Calculates a modified Simplified Surface Energy Balance Index (SSEBI) and the Evaporative Fraction (EF) using geospatial raster data such as albedo and surface-air temperature difference (TS - TA). The SSEBI is computed from albedo and TS - TA to estimate surface moisture and evaporative dynamics, providing a robust assessment of surface dryness while accounting for atmospheric variations. Based on Roerink, Su, and Menenti (2000) <doi:10.1016/S1464-1909(99)00128-8>.
Implementation of the SSR-Algorithm. The Sign-Simplicity-Regression model is a nonparametric statistical model which is based on residual signs and simplicity assumptions on the regression function. The goal is to calculate the most parsimonious regression function satisfying the statistical adequacy requirements. Theory and functions are specified in Metzner (2020, ISBN: 979-8-68239-420-3, "Trendbasierte Prognostik") and Metzner (2021, ISBN: 979-8-59347-027-0, "Adäquates Maschinelles Lernen").
The functions sp() and sp_seq() compute the support points in Mak and Joseph (2018) <doi:10.1214/17-AOS1629>. Support points can be used as a representative sample of a desired distribution, or a representative reduction of a big dataset (e.g., an "optimal" thinning of Markov-chain Monte Carlo sample chains). This work was supported by USARO grant W911NF-14-1-0024 and NSF DMS grant 1712642.
Includes: (i) tests and visualisations that can help the modeller explore time series components and perform decomposition; (ii) modelling shortcuts, such as functions to construct lag matrices and seasonal dummy variables of various forms; (iii) an implementation of the Theta method; (iv) tools to facilitate the design of the forecasting process, such as ABC-XYZ analyses; and (v) "quality of life" functions, such as treating time series for trailing and leading values.
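The modelling shortcuts in item (ii) can be pictured with a small Python sketch that builds a lag matrix and a set of binary seasonal dummies. This is only an illustration of the two constructions under simple assumptions (missing values marked with None, the last season used as the reference level); the function names are hypothetical and do not mirror the package's interface.

```python
def lag_matrix(series, lags):
    """Build a matrix whose column j holds the series shifted by lags[j].

    Row t contains series[t - lag] for each lag, or None where the shifted
    index falls outside the series. Negative lags produce leads.
    """
    n = len(series)
    return [
        [series[t - lag] if 0 <= t - lag < n else None for lag in lags]
        for t in range(n)
    ]

def seasonal_dummies(n, period):
    """Build n rows of (period - 1) binary dummies; the last season is the reference."""
    return [
        [1 if t % period == s else 0 for s in range(period - 1)]
        for t in range(n)
    ]
```

For instance, `lag_matrix([1, 2, 3, 4], [0, 1, 2])` places the original series in the first column and its first and second lags in the next two, with None filling the rows where no earlier observation exists.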
Fit species distribution models (SDMs) using the tidymodels framework, which provides a standardised interface to define models and process their outputs. tidysdm expands tidymodels by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2024) <doi:10.1111/2041-210X.14406>.