This package provides a Shiny application to access the functionalities and datasets of the archeofrag package for spatial analysis in archaeology from refitting data. Quick and seamless exploration of archaeological refitting datasets, focusing on physical refits only. Features include: built-in documentation and convenient workflow, plot generation and exports, exploration of spatial units merging solutions, simulation of archaeological site formation processes, support for parallel computing, R code generation to re-execute simulations and ensure reproducibility, code generation for the openMOLE model exploration software. A demonstration of the app is available at <https://analytics.huma-num.fr/Sebastien.Plutniak/archeofrag/>.
Unified and user-friendly framework for using new distributional representations of biosensor data in different statistical modeling tasks: regression models, hypothesis testing, cluster analysis, visualization, and descriptive analysis. Distributional representations are a functional extension of compositional time-range metrics and we have used them successfully so far in modeling glucose profiles and accelerometer data. However, these functional representations can be used to represent any biosensor data such as ECG or medical imaging such as fMRI. Matabuena M, Petersen A, Vidal JC, Gude F. "Glucodensities: A new representation of glucose profiles using distributional data analysis" (2021) <doi:10.1177/0962280221998064>.
Predictive scores must be updated with care, because actions taken on the basis of existing risk scores cause bias in risk estimates from the updated score. A holdout set is a straightforward way to manage this problem: a proportion of the population is held out from computation of the previous risk score. This package provides tools to estimate a size for this holdout set and associated errors. Comprehensive vignettes are included. Please see: Haidar-Wehbe S, Emerson SR, Aslett LJM, Liley J (2022) <doi:10.48550/arXiv.2202.06374> (to appear in Annals of Applied Statistics) for details of methods.
PACTA (Paris Agreement Capital Transition Assessment) for Banks is a tool that allows banks to calculate the climate alignment of their corporate lending portfolios. This package is designed to make it easy to install and load multiple PACTA for Banks packages in a single step. It also provides thorough documentation - the PACTA for Banks cookbook at <https://rmi-pacta.github.io/pacta.loanbook/articles/cookbook_overview.html> - on how to run a PACTA for Banks analysis. This covers prerequisites for the analysis, the separate steps of running the analysis, the interpretation of PACTA for Banks results, and advanced use cases.
Fits non-crossing regression quantiles as a function of linear covariates and multiple smooth terms, including varying coefficients, via B-splines with L1-norm difference penalties. Random intercepts and variable selection are allowed via the lasso penalties. The smoothing parameters are estimated as part of the model fitting, see Muggeo and others (2021) <doi:10.1177/1471082X20929802>. Monotonicity and concavity constraints on the fitted curves are allowed, see Muggeo and others (2013) <doi:10.1007/s10651-012-0232-1>, and also <doi:10.13140/RG.2.2.12924.85122> or <doi:10.13140/RG.2.2.29306.21445> for some code examples.
This package provides a function extrapolate that extrapolates a given function f(x) to f(x0), evaluating f only at a geometric sequence of points > x0 (or optionally < x0). The key algorithm is Richardson extrapolation using a Neville–Aitken tableau, which adaptively increases the degree of an extrapolation polynomial until convergence is achieved to a desired tolerance (or convergence stalls due to e.g. floating-point errors). This allows one to obtain f(x0) to high-order accuracy, assuming that f(x0+h) has a Taylor series or some other power series in h.
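The extrapolation scheme described above can be sketched in a few lines. This is only an illustrative Python sketch under stated assumptions, not the package's implementation: the function name `richardson_extrapolate` and its parameters (`h0`, `ratio`, `rtol`, `max_evals`) are hypothetical, it assumes f(x0+h) has a power series in h with leading power 1, and it handles stalled convergence simply by returning the best estimate found within the evaluation budget.

```python
import math


def richardson_extrapolate(f, h0, x0=0.0, ratio=2.0, rtol=1e-10, max_evals=12):
    """Estimate f(x0) from evaluations at x0 + h0 / ratio**k, k = 0, 1, ...

    Returns (estimate, error_estimate). All names here are illustrative.
    """
    prev_row = []                      # previous row of the Neville-Aitken tableau
    best, best_err = None, math.inf
    h = h0
    for k in range(max_evals):
        # Column 0 of row k is the raw function value at the k-th point.
        row = [f(x0 + h)]
        # Column j eliminates the h**j error term using the row above.
        for j in range(1, k + 1):
            row.append(row[j - 1] + (row[j - 1] - prev_row[j - 1]) / (ratio ** j - 1.0))
        if k > 0:
            # Change in the highest-order estimate serves as the error estimate.
            err = abs(row[-1] - prev_row[-1])
            if err < best_err:
                best, best_err = row[-1], err
            if best_err <= rtol * abs(best):
                break                  # converged to the requested tolerance
        prev_row = row
        h /= ratio                     # next point of the geometric sequence
    return best, best_err


if __name__ == "__main__":
    # sin(h)/h is undefined at h = 0, but its limit there is 1.
    value, err = richardson_extrapolate(lambda h: math.sin(h) / h, h0=1.0)
    print(value, err)
```

Each added row raises the degree of the extrapolating polynomial by one, so the loop stops as soon as the estimate stabilises rather than fixing the order in advance.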
Estimate and plot confounder-adjusted survival curves using either Direct Adjustment, Direct Adjustment with Pseudo-Values, various forms of Inverse Probability of Treatment Weighting, two forms of Augmented Inverse Probability of Treatment Weighting, Empirical Likelihood Estimation or Targeted Maximum Likelihood Estimation. Also includes a significance test for the difference between two adjusted survival curves and the calculation of adjusted restricted mean survival times. Additionally enables the user to estimate and plot cause-specific confounder-adjusted cumulative incidence functions in the competing risks setting using the same methods (with some exceptions). For details, see Denz et al. (2023) <doi:10.1002/sim.9681>.
This package provides tools for Bayesian parameter estimation of adsorption isotherm models using Markov Chain Monte Carlo (MCMC) methods. This package enables users to fit non-linear and linear adsorption isotherm models (Freundlich, Langmuir, and Temkin) within a probabilistic framework, capturing uncertainty and parameter correlations. It provides posterior summaries, 95% credible intervals, convergence diagnostics (Gelman-Rubin), and visualizations through trace and density plots. With this R package, researchers can rigorously analyze adsorption behavior in environmental and chemical systems using robust Bayesian inference. For more details, see Gilks et al. (1995) <doi:10.1201/b14835>, and Gamerman & Lopes (2006) <doi:10.1201/9781482296426>.
Enhances the functionality of the mvbutils::foodweb() program. The matrix-format output of the original program contains identical row names and column names, each name representing a retrieved function. This format is enhanced by using the find_funs() program [see Sebastian (2017) <https://sebastiansauer.github.io/finds_funs/>] to concatenate the package name to the function name. Each package is assigned a unique color that is used to color code the text naming the packages and the functions. This color coding is extended to the entries of value "1" within the matrix, indicating the pattern of ancestor and descendant functions.
Plots traced ultrasound tongue imaging data according to a polar coordinate system. There is currently support for plotting means and standard deviations of each category's trace; Smoothing Splines Analysis of Variance (SSANOVA) could be implemented as well. The origin of the polar coordinates may be defined manually or automatically determined based on different algorithms. Currently ultrapolaRplot supports ultrasound tongue imaging trace data from UltraTrace (<https://github.com/SwatPhonLab/UltraTrace>). UltraTrace is capable of importing data from Articulate Instruments AAA. read_textgrid.R is required for opening TextGrids to determine category and alignment information of ultrasound traces.
This package provides a variety of Network Scale-up Models for researchers to analyze Aggregated Relational Data, mostly through the use of Stan. In this version, the package implements models from Laga, I., Bao, L., and Niu, X. (2021) <arXiv:2109.10204>, Zheng, T., Salganik, M. J., and Gelman, A. (2006) <doi:10.1198/016214505000001168>, Killworth, P. D., Johnsen, E. C., McCarty, C., Shelley, G. A., and Bernard, H. R. (1998) <doi:10.1016/S0378-8733(96)00305-X>, and Killworth, P. D., McCarty, C., Bernard, H. R., Shelley, G. A., and Johnsen, E. C. (1998) <doi:10.1177/0193841X9802200205>.
This package implements an adaptively weighted group Lasso procedure for simultaneous variable selection and structure identification in varying coefficient quantile regression models and additive quantile regression models with ultra-high dimensional covariates. The methodology, grounded in a strong sparsity condition, establishes selection consistency under certain weight conditions. To address the challenge of tuning parameter selection in practice, a BIC-type criterion named high-dimensional information criterion (HDIC) is proposed. The Lasso procedure, guided by HDIC-determined tuning parameters, maintains selection consistency. Theoretical findings are strongly supported by simulation studies. For details, see Honda, Ing, and Wu (2019) <doi:10.3150/18-BEJ1091>.
Efficient tools for parsing and standardizing Australian addresses from textual data. It utilizes optimized algorithms to accurately identify and extract components of addresses, such as street names, types, and postcodes, especially for large batched data in contexts where sending addresses to internet services may be slow or inappropriate. The core functionality is built on fast string processing techniques to handle variations in address formats and abbreviations commonly found in Australian address data. Designed for data scientists, urban planners, and logistics analysts, the package facilitates the cleaning and normalization of address information, supporting better data integration and analysis in urban studies, geography, and related fields.
Estimates the relative transmission probabilities between cases in an infectious disease outbreak or cluster using naive Bayes. Included are various functions to use these probabilities to estimate transmission parameters such as the generation/serial interval and reproductive number as well as finding the contribution of covariates to the probabilities and visualizing results. The ideal use is for an infectious disease dataset with metadata on the majority of cases but more informative data such as contact tracing or pathogen whole genome sequencing on only a subset of cases. For a detailed description of the methods see Leavitt et al. (2020) <doi:10.1093/ije/dyaa031>.
The user has the option to utilize the two-dimensional density estimation techniques called smoothed density published by Eilers and Goeman (2004) <doi:10.1093/bioinformatics/btg454>, and pareto density which was evaluated for univariate data by Thrun, Gehlert and Ultsch (2020) <doi:10.1371/journal.pone.0238835>. Moreover, it provides visualizations of the density estimation in the form of two-dimensional scatter plots in which the points are color-coded based on increasing density. Colors are defined by the one-dimensional clustering technique called 1D distribution cluster algorithm (DDCAL) published by Lux and Rinderle-Ma (2023) <doi:10.1007/s00357-022-09428-6>.
Programming vaccine-specific Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. A flat model is followed as per Center for Biologics Evaluation and Research (CBER) guidelines for creating vaccine-specific domains. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team (2021), <https://www.cdisc.org/standards/foundational/adam/adamig-v1-3-release-package>). The package is an extension package of the admiral package.
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The crew.aws.batch package extends the mirai-powered crew package with a worker launcher plugin for AWS Batch. Inspiration also comes from packages mirai by Gao (2023) <https://github.com/r-lib/mirai>, future by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, rrq by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, clustermq by Schubert (2019) <doi:10.1093/bioinformatics/btz284>, and batchtools by Lang, Bischl, and Surmann (2017) <doi:10.21105/joss.00135>.
Sampler and post-processing functions for semi-parametric Bayesian infinite factor models, motivated by the Multiplicative Gamma Shrinkage Prior of Bhattacharya and Dunson (2011) <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3419391/>. Contains component C++ functions for building samplers for linear and 2-way interaction factor models using the multiplicative gamma and Dirichlet-Laplace shrinkage priors. The package also contains post-processing functions that resolve the rotational ambiguity of the sampled matrices, restoring identifiability through successive application of orthogonalization procedures and resolution of column label and sign switching. This package was developed with the support of the National Institute of Environmental Health Sciences grant 1R01ES028804-01.
The package provides a consistent way of producing references throughout a project. Enough flexibility is provided to make local changes to a single reference. The user can configure their own setup. The package offers a direct interface to varioref (for use, for example, in large projects such as a series of books, or a multivolume thesis written as a series of documents), and name references from the nameref package may be incorporated with ease. For large projects such as a series of books or a multivolume thesis, written as freestanding documents, a facility is provided to interface to the xr package for external document references.
Ordinal patterns describe the dynamics of a time series by looking at the ranks of subsequent observations. By comparing ordinal patterns of two time series, Schnurr (2014) <doi:10.1007/s00362-013-0536-8> defines a robust and non-parametric dependence measure: the ordinal pattern coefficient. Functions to calculate this and a method to detect a change in the pattern coefficient proposed in Schnurr and Dehling (2017) <doi:10.1080/01621459.2016.1164706> are provided. Furthermore, the package contains a function for calculating the ordinal pattern frequencies. Generalized ordinal patterns as proposed by Schnurr and Fischer (2022) <doi:10.1016/j.csda.2022.107472> are also considered.
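To make the notion of ordinal patterns concrete, the following Python sketch extracts the patterns of a series, tabulates their relative frequencies, and reports how often two series show the same pattern at the same position. It is an illustration only and not the package's code: all function names are hypothetical, a pattern is encoded as the index order that sorts each window (ties broken by position), and `coincidence_rate` is a raw agreement fraction, not the standardized ordinal pattern coefficient of Schnurr (2014).

```python
from collections import Counter


def ordinal_pattern(window):
    """Indices of the window ordered by value; ties are broken by position."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def ordinal_patterns(series, order=3):
    """Patterns of all consecutive windows of length `order`."""
    return [ordinal_pattern(series[i:i + order])
            for i in range(len(series) - order + 1)]


def pattern_frequencies(series, order=3):
    """Relative frequency of each pattern that occurs in the series."""
    patterns = ordinal_patterns(series, order)
    counts = Counter(patterns)
    return {p: c / len(patterns) for p, c in counts.items()}


def coincidence_rate(x, y, order=3):
    """Fraction of window positions where x and y share the same pattern."""
    px, py = ordinal_patterns(x, order), ordinal_patterns(y, order)
    if len(px) != len(py) or not px:
        raise ValueError("series must have equal length of at least `order`")
    return sum(a == b for a, b in zip(px, py)) / len(px)


if __name__ == "__main__":
    x = [1, 2, 3, 2, 1]
    print(pattern_frequencies(x))
    print(coincidence_rate(x, [2 * v + 1 for v in x]))
```

Because only the ordering within each window matters, any strictly increasing transformation of a series leaves its patterns unchanged, which is the source of the robustness mentioned above.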
The SPARRA risk score (Scottish Patients At Risk of admission and Re-Admission) estimates yearly risk of emergency hospital admission using electronic health records on a monthly basis for most of the Scottish population. This package implements a suite of functions used to analyse the behaviour and performance of the score, focusing particularly on differential performance over demographically-defined groups. It includes useful utility functions to plot receiver-operator-characteristic, precision-recall and calibration curves, draw stock human figures, estimate counterfactual quantities without the need to re-compute risk scores, and simulate a semi-realistic dataset. Our manuscript can be found at: <doi:10.1371/journal.pdig.0000675>.
This package implements functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the spatstat family of packages. Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Implementation of single-source capture-recapture methods for population size estimation using zero-truncated, zero-one truncated and zero-truncated one-inflated Poisson, Geometric and Negative Binomial regression as well as Zelterman's and Chao's regression. Package includes point and interval estimators for the population size with variances estimated using analytical or bootstrap methods. Details can be found in: van der Heijden et al. (2003) <doi:10.1191/1471082X03st057oa>, Böhning and van der Heijden (2019) <doi:10.1214/18-AOAS1232>, Böhning et al. (2020) Capture-Recapture Methods for the Social and Medical Sciences or Böhning and Friedl (2021) <doi:10.1007/s10260-021-00556-8>.
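For orientation, the classical covariate-free forms of Chao's and Zelterman's estimators can be written directly from the frequencies of units observed once (f1) and twice (f2). The Python sketch below is an assumption-laden illustration, not the package's interface: the package fits regression versions of these estimators, whereas the hypothetical helpers here use only the textbook formulas N = n + f1^2 / (2 f2) for Chao and N = n / (1 - exp(-2 f2 / f1)) for Zelterman.

```python
import math
from collections import Counter


def capture_frequencies(counts):
    """Return (n, f1, f2): units observed, and units seen exactly once or twice."""
    if any(c < 1 for c in counts):
        raise ValueError("only observed units (count >= 1) may be supplied")
    freq = Counter(counts)
    return len(counts), freq[1], freq[2]


def chao_estimate(counts):
    """Chao's lower-bound estimate of the population size."""
    n, f1, f2 = capture_frequencies(counts)
    if f2 == 0:
        raise ValueError("Chao's estimator needs at least one unit seen twice")
    return n + f1 ** 2 / (2.0 * f2)


def zelterman_estimate(counts):
    """Zelterman's estimate, based on the Poisson rate 2 * f2 / f1."""
    n, f1, f2 = capture_frequencies(counts)
    if f1 == 0 or f2 == 0:
        raise ValueError("Zelterman's estimator needs units seen once and twice")
    rate = 2.0 * f2 / f1
    # Divide the observed total by the estimated probability of being seen.
    return n / (1.0 - math.exp(-rate))


if __name__ == "__main__":
    # Hypothetical data: 60 units seen once, 30 twice, 10 three times.
    counts = [1] * 60 + [2] * 30 + [3] * 10
    print(chao_estimate(counts), zelterman_estimate(counts))
```

Both estimators rely only on the low counts, which makes them less sensitive to heterogeneity among frequently observed units than a fully parametric fit.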
Implementations of various robust and flexible model-based clustering methods for data sets with values missing at random. Two main models are: Multivariate Contaminated Normal Mixture (MCNM, Tong and Tortora, 2022, <doi:10.1007/s11634-021-00476-1>) and Multivariate Generalized Hyperbolic Mixture (MGHM, Wei et al., 2019, <doi:10.1016/j.csda.2018.08.016>). Mixtures via some special or limiting cases of the multivariate generalized hyperbolic distribution are also included: Normal-Inverse Gaussian, Symmetric Normal-Inverse Gaussian, Skew-Cauchy, Cauchy, Skew-t, Student's t, Normal, Symmetric Generalized Hyperbolic, Hyperbolic Univariate Marginals, Hyperbolic, and Symmetric Hyperbolic. Funding: This work was partially supported by the National Science Foundation (NSF) Grant No. 2209974.