Density-based clustering methods are well adapted to the clustering of high-dimensional data and enable the discovery of core groups of various shapes despite large amounts of noise. This package provides a novel density-based cluster extraction method, OPTICS k-Xi, and a framework to compare k-Xi models using distance-based metrics to investigate datasets with an unknown number of clusters. The vignette first introduces density-based algorithms with simulated datasets, then presents and evaluates the k-Xi cluster extraction method. Finally, the model comparison framework is described and applied to 2 genetic datasets to identify groups and their discriminating features. The k-Xi algorithm is a novel OPTICS cluster extraction method that specifies the number of clusters directly and, unlike the OPTICS Xi method, does not require fine-tuning of the steepness parameter. Combined with a framework that compares models with varying parameters, the OPTICS k-Xi method can identify groups in noisy datasets with an unknown number of clusters. Results on summarized genetic data of 1,200 patients are in Charlon T. (2019) <doi:10.13097/archive-ouverte/unige:161795>. A short video tutorial can be found at <https://www.youtube.com/watch?v=P2XAjqI5Lc4/>.
Generates predictive distributions based on calibrating priors for various commonly used statistical models, including models with predictors. Routines for densities, probabilities, quantiles, random deviates and the parameter posterior are provided. The predictions are generated from the Bayesian prediction integral, with priors chosen to give good reliability (also known as calibration). For homogeneous models, the prior is set to the right Haar prior, giving predictions which are exactly reliable. As a result, in repeated testing, the frequencies of out-of-sample outcomes and the probabilities from the predictions agree. For other models, the prior is chosen to give good reliability. Where possible, the Bayesian prediction integral is solved exactly. Where exact solutions are not possible, the Bayesian prediction integral is solved using the Datta-Mukerjee-Ghosh-Sweeting (DMGS) asymptotic expansion. Optionally, the prediction integral can also be solved using posterior samples generated using Paul Northrop's ratio of uniforms sampling package ('rust'). Results are also generated based on maximum likelihood, for comparison purposes. Various model selection diagnostics and testing routines are included. Based on "Reducing reliability bias in assessments of extreme weather risk using calibrating priors", Jewson, S., Sweeting, T. and Jewson, L. (2024); <doi:10.5194/ascmo-11-1-2025>.
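Schematically (generic notation, not the package's own), the Bayesian prediction integral referred to above combines the sampling model f with the calibrating prior pi:

\[ p(y_{\mathrm{new}} \mid y) = \int f(y_{\mathrm{new}} \mid \theta)\, \pi(\theta \mid y)\, d\theta, \qquad \pi(\theta \mid y) \propto f(y \mid \theta)\, \pi(\theta), \]

where pi(theta) is the right Haar prior for homogeneous models and a prior chosen to give good reliability otherwise; when no exact solution exists, this integral is approximated by the DMGS asymptotic expansion or by posterior sampling via 'rust', as described above.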
A powerful yet simple graphical tool in the field of psychometrics is the Wright Map (also known as an item map or item-person map), which presents the location of both respondents and items on the same scale. Wright Maps are commonly used to present the results of dichotomous or polytomous item response models. The WrightMap package provides functions to create these plots from item parameters and person estimates stored as R objects. Although the package can be used in conjunction with any software used to estimate the IRT model (e.g. 'TAM', 'mirt', 'eRm' or 'IRToys' in 'R', or 'Stata', 'Mplus', etc.), WrightMap features special integration with 'ConQuest' to facilitate reading and plotting its output directly. The wrightMap function creates Wright Maps based on person estimates and item parameters produced by an item response analysis. The CQmodel function reads output files created using 'ConQuest' software and creates a set of data frames for easy data manipulation, bundled in a CQmodel object. The wrightMap function can take a CQmodel object as input, or it can be used to create Wright Maps directly from data frames of person and item parameters.
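A minimal sketch of the direct data-frame/vector route described above, assuming wrightMap() accepts a vector of person estimates and a matrix of item parameters as its first two arguments (the simulated values are illustrative only):

```r
library(WrightMap)
set.seed(1)
person_estimates <- rnorm(200, mean = 0, sd = 1)        # person ability estimates (logits)
item_thresholds  <- matrix(sort(rnorm(15)), ncol = 1)   # difficulty estimates for 15 dichotomous items
wrightMap(person_estimates, item_thresholds)            # persons and items plotted on the same scale
```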
Contains code to work with a C struct, in short cgeneric, to define a Gaussian Markov random field (GMRF) model. The cgeneric contains code to specify GMRF elements such as the graph and the precision matrix, as well as the initial values and priors for its parameters, useful for model inference. It can be accessed from a C program and is the recommended way to implement new GMRF models in the INLA package (<https://www.r-inla.org>). The INLAtools package implements functions to evaluate each of the model specifications from R. The implemented functionalities leverage the use of cgeneric models and provide a way to debug the code, as well as to work with the prior for the model parameters and to sample from it. A very useful functionality is the Kronecker product method that creates a new model from multiple cgeneric models. It also works with the rgeneric, the R version of the cgeneric, intended to make it easy to try implementations of new GMRF models. The Kronecker product between two cgeneric models was used in Sterrantino et al. (2024) <doi:10.1007/s10260-025-00788-y>, and can be used to build spatio-temporal intrinsic interaction models for which the needed constraints are automatically set.
The identification of novel compound-protein interactions (CPIs) is important in drug discovery. Revealing unknown compound-protein interactions is useful for designing a new drug for a target protein by screening candidate compounds. Accurate CPI prediction assists in an effective drug discovery process. To identify potential CPIs effectively, prediction methods based on machine learning and deep learning have been developed. Data for sequences are provided as discrete symbolic data. In the data, compounds are represented as SMILES (simplified molecular-input line-entry system) strings and proteins are sequences in which the characters are amino acids. The outcome is defined as a variable that indicates how strongly two molecules interact with each other or whether there is an interaction between them. In this package, a deep-learning-based model that takes only sequence information of both compounds and proteins as input and the outcome as output is used to predict CPIs. The model is implemented using compound and protein encoders with useful features. The CPI model also supports other modeling tasks, including protein-protein interaction (PPI), chemical-chemical interaction (CCI), or single compounds and proteins. Although the model is designed for proteins, DNA and RNA can be used if they are represented as sequences.
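To make the expected inputs concrete, the sketch below builds a toy table of the sequence representations described above; it illustrates the data layout only, not the package's API, and the column names are arbitrary:

```r
cpi_data <- data.frame(
  compound = c("CC(=O)OC1=CC=CC=C1C(=O)O",        # aspirin as a SMILES string
               "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"),   # caffeine as a SMILES string
  protein  = c("MKTAYIAKQRQISFVKSHFSRQ",           # toy amino-acid sequences
               "MGSSHHHHHHSSGLVPRGSHM"),
  outcome  = c(1, 0)                               # interaction label, or a continuous interaction strength
)
```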
This package provides functions for the import, transformation, and analysis of data from muscle physiology experiments. The work loop technique is used to evaluate the mechanical work and power output of muscle. Josephson (1985) <doi:10.1242/jeb.114.1.493> modernized the technique for application in comparative biomechanics. Although our initial motivation was to provide functions to analyze work loop experiment data, as we developed the package we incorporated the ability to analyze data from experiments that are often complementary to work loops. There are currently three supported experiment types: work loops, simple twitches, and tetanus trials. Data can be imported directly from .ddf files or via an object constructor function. Through either method, data can then be cleaned or transformed via methods typically used in studies of muscle physiology. Data can then be analyzed to determine the timing and magnitude of force development and relaxation (for isometric trials) or the magnitude of work, net power, and instantaneous power among other things (for work loops). Although we do not provide plotting functions, all resultant objects are designed to be friendly to visualization via either base-R plotting or tidyverse functions. This package has been peer-reviewed by rOpenSci (v. 1.1.0).
For Bayesian and classical inference and prediction with count-valued data, Simultaneous Transformation and Rounding (STAR) Models provide a flexible, interpretable, and easy-to-use approach. STAR models the observed count data using a rounded continuous data model and incorporates a transformation for greater flexibility. Implicitly, STAR formalizes the commonly-applied yet incoherent procedure of (i) transforming count-valued data and subsequently (ii) modeling the transformed data using Gaussian models. STAR is well-defined for count-valued data, which is reflected in predictive accuracy, and is designed to account for zero-inflation, bounded or censored data, and over- or underdispersion. Importantly, STAR is easy to combine with existing MCMC or point estimation methods for continuous data, which allows seamless adaptation of continuous data models (such as linear regressions, additive models, BART, random forests, and gradient boosting machines) for count-valued data. The package also includes several methods for modeling count time series data, namely via warped Dynamic Linear Models. For more details and background on these methodologies, see the works of Kowal and Canale (2020) <doi:10.1214/20-EJS1707>, Kowal and Wu (2022) <doi:10.1111/biom.13617>, King and Kowal (2022) <arXiv:2110.14790>, and Kowal and Wu (2023) <arXiv:2110.12316>.
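Schematically (generic notation, not the package's own), the STAR construction described above couples a rounding operator h, a transformation g, and a Gaussian model for a latent variable:

\[ y = h(y^{*}), \qquad y^{*} = g^{-1}(z), \qquad z \sim \mathcal{N}(\mu(x), \sigma^{2}), \]

so that, for example, taking h to be the floor operator and mu(x) a linear predictor yields a rounded, transformed Gaussian linear regression that remains coherent for count-valued data.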
Read CODAR's SeaSonde High-Frequency Radar spectra files, compute radial metrics, and generate plots for spectra and antenna pattern data. Implementation is based on technical manuals, publications, and patents; please refer to the following documents for more information: Barrick and Lipa (1999) <https://codar.com/images/about/patents/05990834.PDF>; CODAR Ocean Sensors (2002) <http://support.codar.com/Technicians_Information_Page_for_SeaSondes/Docs/Informative/FirstOrder_Settings.pdf>; Lipa et al. (2006) <doi:10.1109/joe.2006.886104>; Paolo et al. (2007) <doi:10.1109/oceans.2007.4449265>; CODAR Ocean Sensors (2009a) <http://support.codar.com/Technicians_Information_Page_for_SeaSondes/Docs/GuidesToFileFormats/File_AntennaPattern.pdf>; CODAR Ocean Sensors (2009b) <http://support.codar.com/Technicians_Information_Page_for_SeaSondes/Docs/GuidesToFileFormats/File_CrossSpectraReduced.pdf>; CODAR Ocean Sensors (2016a) <http://support.codar.com/Technicians_Information_Page_for_SeaSondes/Manuals_Documentation_Release_8/File_Formats/File_Cross_Spectra_V6.pdf>; CODAR Ocean Sensors (2016b) <http://support.codar.com/Technicians_Information_Page_for_SeaSondes/Manuals_Documentation_Release_8/File_Formats/FIle_Reduced_Spectra.pdf>; CODAR Ocean Sensors (2016c) <http://support.codar.com/Technicians_Information_Page_for_SeaSondes/Manuals_Documentation_Release_8/Application_Guides/Guide_SpectraPlotterMap.pdf>; Bushnell and Worthington (2022) <doi:10.25923/4c5x-g538>.
Test for association between the observed data and their estimated latent variables. The jackstraw package provides a resampling strategy and testing scheme to estimate statistical significance of association between the observed data and their latent variables. Depending on the data type and the analysis aim, the latent variables may be estimated by principal component analysis (PCA), factor analysis (FA), K-means clustering, and related unsupervised learning algorithms. The jackstraw methods learn the over-fitting characteristics inherent in this circular analysis, where the observed data are used to estimate the latent variables and used again to test against those estimated latent variables. When latent variables are estimated by PCA, the jackstraw enables statistical testing for association between observed variables and latent variables, as estimated by low-dimensional principal components (PCs). This essentially leads to identifying variables that are significantly associated with PCs. Similarly, unsupervised clustering, such as K-means clustering, partitioning around medoids (PAM), and others, finds coherent groups in high-dimensional data. The jackstraw estimates statistical significance of cluster membership by testing association between data and cluster centers. Cluster membership can be improved by using the resulting jackstraw p-values and posterior inclusion probabilities (PIPs), with an application to unsupervised evaluation of cell identities in single-cell RNA-seq (scRNA-seq).
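As a conceptual sketch of the resampling scheme described above (this is not the package's API; the F-statistic, the number of synthetic null variables s, and all names are illustrative assumptions), the idea for PCA is to break the association for a small number of variables, re-estimate the latent variable, and use those variables to build the null distribution:

```r
## Conceptual jackstraw-style test for association between variables and the top PC.
set.seed(1)
dat <- matrix(rnorm(500 * 20), nrow = 500)                 # 500 variables x 20 observations
dat[1:50, ] <- dat[1:50, ] + rep(rnorm(20), each = 50)     # 50 variables driven by a latent factor

assoc_stat <- function(x, pc) summary(lm(x ~ pc))$fstatistic[1]

r   <- 1                                                   # number of PCs to test against
pc  <- prcomp(t(dat))$x[, 1:r, drop = FALSE]               # estimated latent variable(s)
obs <- apply(dat, 1, assoc_stat, pc = pc)                  # observed association statistics

s <- 10; B <- 100; null <- numeric(0)
for (b in seq_len(B)) {
  idx  <- sample(nrow(dat), s)
  perm <- dat
  perm[idx, ] <- t(apply(perm[idx, , drop = FALSE], 1, sample))  # break association for s variables
  pc_b <- prcomp(t(perm))$x[, 1:r, drop = FALSE]                 # re-estimate the latent variable
  null <- c(null, apply(perm[idx, , drop = FALSE], 1, assoc_stat, pc = pc_b))
}
p_values <- vapply(obs, function(f) mean(null >= f), numeric(1)) # empirical jackstraw-style p-values
```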
This package performs Cox regression and survival distribution function estimation when the survival times are subject to double truncation. In the case that the survival and truncation times are quasi-independent, the estimation procedure for each method involves inverse probability weighting, where the weights correspond to the inverse of the selection probabilities and are estimated using the survival times and truncation times only. A test for checking this independence assumption is also included in this package. The functions available in this package for Cox regression, survival distribution function estimation, and testing independence under double truncation are based on the following methods, respectively: Rennert and Xie (2018) <doi:10.1111/biom.12809>, Shen (2010) <doi:10.1007/s10463-008-0192-2>, and Martin and Betensky (2005) <doi:10.1198/016214504000001538>. When the survival times are dependent on at least one of the truncation times, an EM algorithm is employed to obtain point estimates for the regression coefficients. The standard errors are calculated using the bootstrap method. See Rennert and Xie (2022) <doi:10.1111/biom.13451>. Both the independent and dependent cases assume no censoring is present in the data. Please contact Lior Rennert <liorr@clemson.edu> for questions regarding the function coxDT and Yidan Shi <yidan.shi@pennmedicine.upenn.edu> for questions regarding the function coxDTdep.
The EQ-5D is a widely-used standardized instrument for measuring Health Related Quality Of Life (HRQOL), developed by the EuroQol group <https://euroqol.org/>. It assesses five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression, using either a three-level (EQ-5D-3L) or five-level (EQ-5D-5L) scale. Scores from these dimensions are commonly converted into a single utility index using country-specific value sets, which are critical in clinical and economic evaluations of healthcare and in population health surveys. The eq5dsuite package enables users to calculate utility index values for the EQ-5D instruments, including crosswalk utilities using the original crosswalk developed by van Hout et al. (2012) <doi:10.1016/j.jval.2012.02.008> (mapping EQ-5D-5L responses to EQ-5D-3L index values), or the recently developed reverse crosswalk by van Hout et al. (2021) <doi:10.1016/j.jval.2021.03.009> (mapping EQ-5D-3L responses to EQ-5D-5L index values). Users are allowed to add and/or remove user-defined value sets. Additionally, the package provides tools to analyze EQ-5D data according to the recommended guidelines outlined in "Methods for Analyzing and Reporting EQ-5D data" by Devlin et al. (2020) <doi:10.1007/978-3-030-47622-9>.
This package provides a flexible simulation tool for phylogenetic trees under a general model for speciation and extinction. Trees with a user-specified number of extant tips, or a user-specified stem age, are simulated. It is possible to assume any probability distribution for the waiting time until speciation and extinction. Furthermore, the waiting times to speciation / extinction may be scaled in different parts of the tree, meaning we can simulate trees with clade-dependent diversification processes. At a speciation event, one species splits into two. We allow for two different modes at these splits: (i) symmetric, where for every speciation event new waiting times until speciation and extinction are drawn for both daughter lineages; and (ii) asymmetric, where a speciation event results in one species with new waiting times, and another that carries the extinction time and age of its ancestor. The symmetric mode can be seen as a vicariant or allopatric process where divided populations suffer equal evolutionary forces, while the asymmetric mode could be seen as a peripatric speciation where a mother lineage continues to exist. Reference: O. Hagen and T. Stadler (2017). TreeSimGM: Simulating phylogenetic trees under general Bellman Harris models with lineage-specific shifts of speciation and extinction in R. Methods in Ecology and Evolution. <doi:10.1111/2041-210X.12917>.
A common way of validating a biological assay for precision is through a procedure where m levels of an analyte are measured with n replicates at each level, and if all m estimates of the coefficient of variation (CV) are less than some prespecified level, then the assay is declared validated for precision within the range of the m analyte levels. Two limitations of this procedure are: there is no clear statistical statement of precision upon passing, and it is unclear how to modify the procedure for assays with constant standard deviation. We provide tools to convert such a procedure into a set of m hypothesis tests. This reframing motivates the m:n:q procedure, which upon completion delivers a 100q% upper confidence limit on the CV. Additionally, for a post-validation assay output of y, the method gives an "effective standard deviation interval" of log(y) plus or minus r, which is a 68% confidence interval on log(mu), where mu is the expected value of the assay output for that sample. Further, the m:n:q procedure can be straightforwardly applied to constant standard deviation assays. We illustrate these tools by applying them to a growth inhibition assay. This is an implementation of the methods described in Fay, Sachs, and Miura (2018) <doi:10.1002/sim.7528>.
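As an illustrative calculation only (not the package's API), one way to see the "effective standard deviation interval" in action: suppose the m:n:q procedure has delivered an upper confidence limit on the CV, and use the lognormal relation sd(log Y) = sqrt(log(1 + CV^2)) (an assumption made here for the sketch) to form the interval for a post-validation output y:

```r
cv_upper <- 0.20                 # 100q% upper confidence limit on the CV from the m:n:q procedure
r <- sqrt(log(1 + cv_upper^2))   # effective standard deviation on the log scale (lognormal relation)
y <- 42                          # post-validation assay output
exp(log(y) + c(-r, r))           # ~68% confidence interval for mu on the original scale
```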
Developed for the following tasks: 1) computing the probability density function, cumulative distribution function, random generation, and estimating the parameters of the eleven mixture models; 2) point estimation of the parameters of the two-parameter Weibull distribution using twelve methods and the three-parameter Weibull distribution using nine methods; 3) Bayesian inference for the three-parameter Weibull distribution; 4) estimating parameters of the three-parameter Birnbaum-Saunders, generalized exponential, and Weibull distributions fitted to grouped data using three methods, including approximated maximum likelihood, expectation maximization, and maximum likelihood; 5) estimating the parameters of the gamma, log-normal, and Weibull mixture models fitted to grouped data through the EM algorithm; 6) estimating parameters of the nonlinear height curve fitted to height-diameter observations; 7) estimating parameters, computing the probability density function, cumulative distribution function, and generating realizations from the gamma shape mixture model introduced by Venturini et al. (2008) <doi:10.1214/07-AOAS156>; 8) Bayesian inference, computing the probability density function, cumulative distribution function, and generating realizations from the univariate and bivariate Johnson SB distributions; 9) robust multiple linear regression analysis when the error term follows a skewed t distribution; 10) estimating parameters of a given distribution fitted to grouped data using the method of maximum likelihood; and 11) estimating parameters of the Johnson SB distribution through Bayesian, method of moments, conditional maximum likelihood, and two-percentile methods.
Family planning programs and initiatives typically use nationally representative surveys to estimate key indicators of a country's family planning progress. However, in recent years, routinely collected family planning services data (service statistics) have been used as a supplementary data source to bridge gaps in the surveys. The use of service statistics comes with the caveat that adjustments need to be made for missing private sector contributions to the contraceptive method supply chain. Evaluating the supply source of modern contraceptives often relies on Demographic and Health Surveys (DHS), where many countries do not have recent data beyond 2015/16. Fortunately, in the absence of recent surveys we can rely on statistical model-based estimates and projections to fill the knowledge gap. We present a Bayesian, hierarchical, penalized-spline model with multivariate-normal spline coefficients, to account for across-method correlations, to produce country-specific, annual estimates for the proportion of modern contraceptive methods coming from the public and private sectors. This package provides a quick and convenient way for users to access the DHS modern contraceptive supply share data at national and subnational administration levels, and to estimate, evaluate, and plot annual estimates with uncertainty for a sample of low- and middle-income countries. Methods for the estimation of method supply shares at the national level are described in Comiskey, Alkema, and Cahill (2022) <arXiv:2212.03844>.
Based on individual market shares of all participants in a market or space, the package offers a set of different structural and concentration measures that are frequently (and not so frequently) used in research and in practice. Measures can be calculated in groups or individually. The calculated measure or the resulting vector in table format should help practitioners make more informed decisions. Methods used in this package are from: 1. Chang, E. J., Guerra, S. M., de Souza Penaloza, R. A. & Tabak, B. M. (2005) "Banking concentration: the Brazilian case". 2. Cobham, A. and A. Summer (2013). "Is It All About the Tails? The Palma Measure of Income Inequality". 3. Garcia Alba Idunate, P. (1994). "Un Indice de dominancia para el analisis de la estructura de los mercados". 4. Ginevicius, R. and S. Cirba (2009). "Additive measurement of market concentration" <doi:10.3846/1611-1699.2009.10.191-198>. 5. Herfindahl, O. C. (1950), "Concentration in the steel industry" (PhD thesis). 6. Hirschman, A. O. (1945), "National power and structure of foreign trade". 7. Melnik, A., O. Shy, and R. Stenbacka (2008), "Assessing market dominance" <doi:10.1016/j.jebo.2008.03.010>. 8. Palma, J. G. (2006). "Globalizing Inequality: Centrifugal and Centripetal Forces at Work". 9. Shannon, C. E. (1948). "A Mathematical Theory of Communication". 10. Simpson, E. H. (1949). "Measurement of Diversity" <doi:10.1038/163688a0>.
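For illustration (plain base R, not the package's functions), two of the measures listed above can be computed directly from a vector of market shares:

```r
shares <- c(0.40, 0.25, 0.15, 0.10, 0.10)  # individual market shares of all participants (sum to 1)
sum(shares^2)                              # Herfindahl(-Hirschman) index
-sum(shares * log(shares))                 # Shannon entropy
```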
This package provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development (DPIRD) of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science (DES). It also retrieves the Australian government Bureau of Meteorology's (BOM) precis and coastal forecasts, and downloads and imports radar and satellite imagery files. DPIRD weather data are accessed through public APIs provided by DPIRD, <https://www.dpird.wa.gov.au/online-tools/apis/>, providing access to weather station data from the DPIRD weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology (BOM) and accessed through SILO (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. DPIRD data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. BOM data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence (PAL) as appropriate; see <http://www.bom.gov.au/other/copyright.shtml> for further details.
This package contains functions to compute p-values for the one-sample and two-sample Kolmogorov-Smirnov (KS) tests and the two-sample Kuiper test for any fixed critical level and arbitrary (possibly very large) sample sizes. For the one-sample KS test, this package implements a novel, accurate and efficient method named Exact-KS-FFT, which allows the pre-specified cumulative distribution function under the null hypothesis to be continuous, purely discrete or mixed. In the two-sample case, it is assumed that both samples come from an unspecified (unknown) continuous, purely discrete or mixed distribution, i.e. ties (repeated observations) are allowed, and exact p-values of the KS and the Kuiper tests are computed. Note that the two-sample Kuiper test is often used when data samples are on the line or on the circle (circular data). To cite this package in publications: (for the use of the one-sample KS test) Dimitrina S. Dimitrova, Vladimir K. Kaishev, and Senren Tan. Computing the Kolmogorov-Smirnov Distribution When the Underlying CDF is Purely Discrete, Mixed, or Continuous. Journal of Statistical Software. 2020; 95(10): 1--42. <doi:10.18637/jss.v095.i10>. (for the use of the two-sample KS and Kuiper tests) Dimitrina S. Dimitrova, Yun Jia and Vladimir K. Kaishev (2024). The R functions KS2sample and Kuiper2sample: Efficient Exact Calculation of P-values of the Two-sample Kolmogorov-Smirnov and Kuiper Tests. Submitted.
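A minimal sketch of the two-sample usage, assuming the functions named in the citation above, KS2sample() and Kuiper2sample(), take the two sample vectors as their first arguments (the package name in the library() call is inferred from the cited JSS article, and any further arguments are omitted here; both are assumptions):

```r
library(KSgeneral)
x <- c(rep(1, 5), rep(2, 7), rep(3, 8))   # discrete samples with ties are allowed
y <- c(rep(1, 8), rep(2, 6), rep(3, 6))
KS2sample(x, y)        # exact p-value of the two-sample Kolmogorov-Smirnov test
Kuiper2sample(x, y)    # exact p-value of the two-sample Kuiper test
```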
This package provides functions for Gaussian and non-Gaussian (bivariate) spatial and spatio-temporal data analysis: a) (fast) simulation of random fields; b) inference for random fields using standard likelihood and a likelihood approximation method called weighted composite likelihood, based on pairs; and c) prediction using (local) best linear unbiased prediction. Weighted composite likelihood can be very efficient for estimating massive datasets. Both regression and spatial (temporal) dependence analysis can be jointly performed. Flexible covariance models for spatial and spatial-temporal data on Euclidean domains and spheres are provided. There are also many useful functions for plotting and performing diagnostic analysis. Different non-Gaussian random fields can be considered in the analysis, among them random fields with marginal distributions such as Skew-Gaussian, Student-t, Tukey-h, Sinh-Arcsinh, Two-piece, Weibull, Gamma, Log-Gaussian, Binomial, Negative Binomial and Poisson. See the URL for the papers associated with this package, for instance, Bevilacqua and Gaetan (2015) <doi:10.1007/s11222-014-9460-6>, Bevilacqua et al. (2016) <doi:10.1007/s13253-016-0256-3>, Vallejos et al. (2020) <doi:10.1007/978-3-030-56681-4>, Bevilacqua et al. (2020) <doi:10.1002/env.2632>, Bevilacqua et al. (2021) <doi:10.1111/sjos.12447>, Bevilacqua et al. (2022) <doi:10.1016/j.jmva.2022.104949>, Morales-Navarrete et al. (2023) <doi:10.1080/01621459.2022.2140053>, and a large class of examples and tutorials.
The presence of outliers in a dataset can substantially bias the results of statistical analyses. To correct for outliers, micro edits are manually performed on all records. A set of constraints and decision rules is typically used to aid the editing process. However, straightforward decision rules might overlook anomalies arising from disruption of linear relationships. Computationally efficient methods are provided to identify historical, tail, and relational anomalies at the data-entry level (Sartore et al., 2024; <doi:10.6339/24-JDS1136>). A score statistic is developed for each anomaly type, using a distribution-free approach motivated by the Bienaymé-Chebyshev inequality, and fuzzy logic is used to detect cellwise outliers resulting from different types of anomalies. Each data entry is individually scored and the individual scores are combined into a final score to determine anomalous entries. In contrast to fuzzy logic, a Bayesian bootstrap and a Bayesian test based on empirical likelihoods are also provided, as studied by Sartore et al. (2024; <doi:10.3390/stats7040073>). These algorithms allow for a more nuanced approach to outlier detection, as they can identify outliers at the data-entry level that are not obviously distinct from the rest of the data. --- This research was supported in part by the U.S. Department of Agriculture, National Agricultural Statistics Service. The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA or U.S. Government determination or policy.
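For reference, the distribution-free bound motivating these scores is the Bienaymé-Chebyshev inequality: for any random variable X with finite mean mu and variance sigma^2, and any k > 0,

\[ \Pr\bigl(|X - \mu| \ge k\,\sigma\bigr) \le \frac{1}{k^{2}}, \]

so large standardized deviations are rare whatever the underlying distribution, which is what justifies scoring data entries by the size of their deviations without assuming normality.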
This package provides a tuneable and interpretable method for relaxing the instrumental variables (IV) assumptions to infer treatment effects in the presence of unobserved confounding. For a treatment-associated covariate to be a valid IV, it must be (a) unconfounded with the outcome and (b) have a causal effect on the outcome that is exclusively mediated by the exposure. There is no general test of the validity of these IV assumptions for any particular pre-treatment covariate. However, if different pre-treatment covariates give differing causal effect estimates when treated as IVs, then we know at least some of the covariates violate these assumptions. budgetIVr exploits this fact by taking as input a minimum budget of pre-treatment covariates assumed to be valid IVs and identifying the set of causal effects that are consistent with the user's data and budget assumption. The following generalizations of this principle can be used in this package: (1) a vector of multiple budgets can be assigned alongside corresponding thresholds that model degrees of IV invalidity; (2) budgets and thresholds can be chosen using specialist knowledge or varied in a principled sensitivity analysis; (3) treatment effects can be nonlinear and/or depend on multiple exposures (at a computational cost). The methods in this package require only summary statistics. Confidence sets are constructed under the "no measurement error" (NOME) assumption from the Mendelian randomization literature. For further methodological details, please refer to Penn et al. (2024) <doi:10.48550/arXiv.2411.06913>.
Assess the agreement in method comparison studies by tolerance intervals and errors-in-variables (EIV) regressions. The Ordinary Least Squares regressions (OLSv and OLSh), the Deming Regression (DR), and the (Correlated)-Bivariate Least Square regressions (BLS and CBLS) can be used with unreplicated or replicated data. BLS() and CBLS() are the two main functions to estimate a regression line, while XY.plot() and MD.plot() are the two main graphical functions to display, respectively, an (X,Y) plot or an (M,D) plot with the BLS or CBLS results. Four hyperbolic statistical intervals are provided: the Confidence Interval (CI), the Confidence Bands (CB), the Prediction Interval, and the Generalized Prediction Interval. Assuming no proportional bias, the (M,D) plot (Bland-Altman plot) may be simplified by calculating univariate tolerance intervals (beta-expectation (type I) or beta-gamma content (type II)). Major updates from the last version 1.0.0 are: the title was shortened, and the new functions BLS.fit() and CBLS.fit() were added as shortcuts for the functions BLS() and CBLS(), respectively. References: B.G. Francq, B. Govaerts (2016) <doi:10.1002/sim.6872>, B.G. Francq, B. Govaerts (2014) <doi:10.1016/j.chemolab.2014.03.006>, B.G. Francq, B. Govaerts (2014) <http://publications-sfds.fr/index.php/J-SFdS/article/view/262>, B.G. Francq (2013), PhD Thesis, UCLouvain, Errors-in-variables regressions to assess equivalence in method comparison studies, <https://dial.uclouvain.be/pr/boreal/object/boreal%3A135862/datastream/PDF_01/view>.
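A minimal sketch of a method-comparison fit with unreplicated data, assuming BLS.fit() takes the two methods' measurements as numeric vectors named x and y, and that the fitted object can be passed to XY.plot() (the argument names and that plotting step are assumptions here):

```r
library(BivRegBLS)
method_x <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.7, 12.5, 9.5)   # measurements by method X
method_y <- c(10.5, 11.2, 10.1, 12.4, 11.0, 11.9, 12.8, 9.9)  # measurements by method Y
fit <- BLS.fit(x = method_x, y = method_y)   # Bivariate Least Square regression line
XY.plot(fit)                                 # (X,Y) plot with the BLS results
```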
This package provides a collection of user-friendly functions for assessing and visualizing fragility of individual studies (Walsh et al., 2014 <doi:10.1016/j.jclinepi.2013.10.019>; Lin, 2021 <doi:10.1111/jep.13428>), conventional pairwise meta-analyses (Atal et al., 2019 <doi:10.1016/j.jclinepi.2019.03.012>), and network meta-analyses of multiple treatments with binary outcomes (Xing et al., 2020 <doi:10.1016/j.jclinepi.2020.07.003>). The included functions are designed to: 1) calculate the fragility index (i.e., the minimal event status modifications that can alter the significance or non-significance of the original result) and fragility quotient (i.e., fragility index divided by sample size) at a specific significance level; 2) give the cases of event status modifications for altering the result's significance or non-significance and visualize these cases; 3) visualize the trend of statistical significance as event status is modified; 4) efficiently derive fragility indexes and fragility quotients at multiple significance levels, and visualize the relationship between these fragility measures and the significance levels; and 5) calculate fragility indexes and fragility quotients of multiple datasets (e.g., a collection of clinical trials or meta-analyses) and produce plots of their overall distributions. The outputs from these functions may inform the robustness of clinical results in terms of statistical significance and aid the interpretation of fragility measures. The usage of this package is illustrated in Lin et al. (2023 <doi:10.1016/j.ajog.2022.08.053>) and detailed in Lin and Chu (2022 <doi:10.1371/journal.pone.0268754>).
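As a conceptual illustration of the fragility index for a single two-arm study (plain base R, not the package's functions; the 2x2 counts are made up), the sketch below modifies event status in the treatment arm one case at a time until the Fisher exact test crosses the 0.05 threshold:

```r
tab <- matrix(c(1, 99, 12, 88), nrow = 2,
              dimnames = list(c("event", "no event"), c("treatment", "control")))
fi <- 0
while (fisher.test(tab)$p.value < 0.05) {       # original result is statistically significant
  tab["event", "treatment"]    <- tab["event", "treatment"] + 1   # one non-event becomes an event
  tab["no event", "treatment"] <- tab["no event", "treatment"] - 1
  fi <- fi + 1
}
fi              # fragility index
fi / sum(tab)   # fragility quotient (index divided by total sample size)
```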
This package implements a modification of the preventive vaccine efficacy trial design of Gilbert, Grove et al. (2011, Statistical Communications in Infectious Diseases), with application generally to individual-randomized clinical trials with multiple active treatment groups and a shared control group, and a study endpoint that is a time-to-event endpoint subject to right-censoring. The design accounts for the issues that the efficacy of the treatment/vaccine groups may take time to accrue while the multiple treatment administrations/vaccinations are given; that there is interest in assessing the durability of treatment efficacy over time; and that group sequential monitoring of each treatment group for potential harm, non-efficacy/efficacy futility, and high efficacy is warranted. The design divides the trial into two stages of time periods, where each treatment is first evaluated for efficacy in the first stage of follow-up, and, if and only if it shows significant treatment efficacy in stage one, it is evaluated for longer-term durability of efficacy in stage two. The package produces plots and tables describing operating characteristics of a specified design, including unconditional power for intention-to-treat and per-protocol/as-treated analyses; trial duration; probabilities of the different possible trial monitoring outcomes (e.g., stopping early for non-efficacy); unconditional power for comparing treatment efficacies; and distributions of numbers of endpoint events occurring after the treatments/vaccinations are given, useful as input parameters for the design of studies of the association of biomarkers with a clinical outcome (surrogate endpoint problem). The code can be used for a single active treatment versus control design and for a single-stage design.