In the observational study design stage, matching and/or weighting methods are applied. However, when many background variables are present, deciding which variables to prioritize for matching/weighting is not trivial. The joint treatment-outcome variable importance plots are therefore created to guide variable selection. The joint variable importance plots enhance variable comparisons via unadjusted bias curves derived under the omitted variable bias framework. The plots translate variable importance into recommended values for tuning parameters in existing methods. Post-matching and/or weighting plots can also be used to visualize and assess the quality of the observational study design. The motivation and derivation of the method are presented in "Prioritizing Variables for Observational Study Design using the Joint Variable Importance Plot" by Liao et al. (2024) <doi:10.1080/00031305.2024.2303419>. See the package paper by Liao and Pimentel (2024) <doi:10.21105/joss.06093> for a beginner-friendly user introduction.
This package contains functions for multiple imputation which complements existing functionality in R. In particular, several imputation methods for the mice package (van Buuren & Groothuis-Oudshoorn, 2011, <doi:10.18637/jss.v045.i03>) are implemented. Main features of the miceadds package include plausible value imputation (Mislevy, 1991, <doi:10.1007/BF02294457>), multilevel imputation for variables at any level or with any number of hierarchical and non-hierarchical levels (Grund, Luedtke & Robitzsch, 2018, <doi:10.1177/1094428117703686>; van Buuren, 2018, Ch.7, <doi:10.1201/9780429492259>), imputation using partial least squares (PLS) for high dimensional predictors (Robitzsch, Pham & Yanagida, 2016), nested multiple imputation (Rubin, 2003, <doi:10.1111/1467-9574.00217>), substantive model compatible imputation (Bartlett et al., 2015, <doi:10.1177/0962280214521348>), and features for the generation of synthetic datasets (Reiter, 2005, <doi:10.1111/j.1467-985X.2004.00343.x>; Nowok, Raab, & Dibben, 2016, <doi:10.18637/jss.v074.i11>).
This package provides functions to compute various clinical scores used in healthcare. These include the Charlson Comorbidity Index (CCI), predicting 10-year survival in patients with multiple comorbidities; the EPICES score, an individual indicator of precariousness considering its multidimensional nature; the MELD score for chronic liver disease severity; the Alternative Fistula Risk Score (a-FRS) for postoperative pancreatic fistula risk; and the Distal Pancreatectomy Fistula Risk Score (D-FRS) for risk following distal pancreatectomy. For detailed methodology, refer to Charlson et al. (1987) <doi:10.1016/0021-9681(87)90171-8>, Sass et al. (2006) <doi:10.1007/s10332-006-0131-5>, Kamath et al. (2001) <doi:10.1053/jhep.2001.22172>, Kim et al. (2008) <doi:10.1056/NEJMoa0801209>, Kim et al. (2021) <doi:10.1053/j.gastro.2021.08.050>, Mungroop et al. (2019) <doi:10.1097/SLA.0000000000002620>, and de Pastena et al. (2023) <doi:10.1097/SLA.0000000000005497>.
Extensive functions for bivariate copula (bicopula) computations and related operations for bicopula theory. The lower, upper, product, and select other bicopula are implemented along with operations including the diagonal, survival copula, dual of a copula, co-copula, and numerical bicopula density. Level sets, horizontal and vertical sections are supported. Numerical derivatives and inverses of a bicopula are provided through which simulation is implemented. Bicopula composition, convex combination, asymmetry extension, and products also are provided. Support extends to the Kendall Function as well as the Lmoments thereof. Kendall Tau, Spearman Rho and Footrule, Gini Gamma, Blomqvist Beta, Hoeffding Phi, Schweizer-Wolff Sigma, tail dependency, tail order, skewness, and bivariate Lmoments are implemented, and positive/negative quadrant dependency, left (right) increasing (decreasing) are available. Other features include Kullback-Leibler Divergence, Vuong Procedure, spectral measure, and Lcomoments for fit and inference, Lcomoment ratio diagrams, maximum likelihood, and AIC, BIC, and RMSE for goodness-of-fit.
Computes fungible coefficients and Monte Carlo data. Underlying theory for these functions is described in the following publications: Waller, N. (2008). Fungible Weights in Multiple Regression. Psychometrika, 73(4), 691-703, <DOI:10.1007/s11336-008-9066-z>. Waller, N. & Jones, J. (2009). Locating the Extrema of Fungible Regression Weights. Psychometrika, 74(4), 589-602, <DOI:10.1007/s11336-008-9087-7>. Waller, N. G. (2016). Fungible Correlation Matrices: A Method for Generating Nonsingular, Singular, and Improper Correlation Matrices for Monte Carlo Research. Multivariate Behavioral Research, 51(4), 554-568. Jones, J. A. & Waller, N. G. (2015). The normal-theory and asymptotic distribution-free (ADF) covariance matrix of standardized regression coefficients: theoretical extensions and finite sample behavior. Psychometrika, 80, 365-378, <DOI:10.1007/s11336-013-9380-y>. Waller, N. G. (2018). Direct Schmid-Leiman transformations and rank-deficient loadings matrices. Psychometrika, 83, 858-870. <DOI:10.1007/s11336-017-9599-0>.
Approximate Bayesian regularization using Gaussian approximations. The input is a vector of estimates and a Gaussian error covariance matrix of the key parameters. Bayesian shrinkage is then applied to obtain parsimonious solutions. The method is described in Karimova, van Erp, Leenders, and Mulder (2024) <DOI:10.31234/osf.io/2g8qm>. Gibbs samplers are used for model fitting. The shrinkage priors that are supported are Gaussian (ridge) priors, Laplace (lasso) priors (Park and Casella, 2008 <DOI:10.1198/016214508000000337>), and horseshoe priors (Carvalho et al., 2010 <DOI:10.1093/biomet/asq017>). These priors include an option for grouped regularization of different subsets of parameters (Meier et al., 2008 <DOI:10.1111/j.1467-9868.2007.00627.x>). F priors are used for the penalty parameters lambda^2 (Mulder and Pericchi, 2018 <DOI:10.1214/17-BA1092>). This corresponds to half-Cauchy priors on lambda (Carvalho, Polson, Scott, 2010 <DOI:10.1093/biomet/asq017>).
The msPurity R package was developed to: 1) Assess the spectral quality of fragmentation spectra by evaluating the "precursor ion purity". 2) Process fragmentation spectra. 3) Perform spectral matching. What is precursor ion purity? What we call "precursor ion purity" is a measure of the contribution of a selected precursor peak in an isolation window used for fragmentation. The simple calculation involves dividing the intensity of the selected precursor peak by the total intensity of the isolation window. When assessing MS/MS spectra, this calculation is done before and after the MS/MS scan of interest and the purity is interpolated at the recorded time of the MS/MS acquisition. Additionally, isotopic peaks can be removed, low-abundance peaks that are thought to contribute little to the resulting MS/MS spectra can be removed, and the isolation efficiency of the mass spectrometer can be used to normalise the intensities used for the calculation.
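The purity calculation described above can be sketched in a few lines. This is a minimal illustration in Python, not the msPurity implementation: the function names, the `tol` matching tolerance, and the use of simple linear interpolation between the two neighbouring scans are assumptions made for the example.

```python
def precursor_purity(mzs, intensities, precursor_mz, win_low, win_high, tol=0.01):
    """Share of the isolation-window intensity that belongs to the precursor peak.

    `tol` (in m/z units) is a hypothetical tolerance for matching the precursor.
    """
    in_window = [(mz, i) for mz, i in zip(mzs, intensities)
                 if win_low <= mz <= win_high]
    total = sum(i for _, i in in_window)
    if total == 0:
        return 0.0
    target = sum(i for mz, i in in_window if abs(mz - precursor_mz) <= tol)
    return target / total


def interpolate_purity(t_before, p_before, t_after, p_after, t_msms):
    """Linearly interpolate purity at the MS/MS acquisition time (assumed scheme)."""
    if t_after == t_before:
        return p_before
    frac = (t_msms - t_before) / (t_after - t_before)
    return p_before + frac * (p_after - p_before)


# Two full scans bracketing an MS/MS scan acquired at t = 10.5
mzs = [499.8, 500.0, 500.3]
p_before = precursor_purity(mzs, [100, 800, 100], 500.0, 499.5, 500.5)
p_after = precursor_purity(mzs, [200, 600, 200], 500.0, 499.5, 500.5)
p_msms = interpolate_purity(10.0, p_before, 12.0, p_after, 10.5)
```

Isotope removal, low-abundance filtering, and isolation-efficiency weighting, which the description mentions as optional refinements, are left out of this sketch.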
Functionality for reliability estimates. For unidimensional tests: Coefficient alpha, Guttman's lambda-2/-4/-6, the greatest lower bound and coefficient omega_u ('unidimensional') in a Bayesian and a frequentist version. For multidimensional tests: omega_t (total) and omega_h (hierarchical). The results include confidence and credible intervals, the probability of a coefficient being larger than a cutoff, and a check for the factor models, necessary for the omega coefficients. The method for the Bayesian unidimensional estimates, except for omega_u, is sampling from the posterior inverse Wishart for the covariance matrix based measures (see Murphy, 2007, <https://groups.seas.harvard.edu/courses/cs281/papers/murphy-2007.pdf>). The Bayesian omegas (u, t, and h) are obtained by Gibbs sampling from the conditional posterior distributions of (1) the single factor model, (2) the second-order factor model, (3) the bi-factor model, (4) the correlated factor model (Lee, 2007, <doi:10.1002/9780470024737>).
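For orientation, the frequentist point estimate of coefficient alpha can be computed directly from the item variances and the variance of the sum score. The Python sketch below shows only that textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total score); it is not the package's code, and the function name `coefficient_alpha` is made up for the example. The Bayesian version described above would instead evaluate the coefficient on posterior draws of the covariance matrix.

```python
from statistics import variance


def coefficient_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of the sum score
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Three perfectly consistent items give alpha = 1
identical = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha_identical = coefficient_alpha(identical)

# Noisier items give a lower value
noisy = [[1, 2, 1], [2, 1, 3], [3, 4, 2], [4, 3, 4]]
alpha_noisy = coefficient_alpha(noisy)
```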
Programmatic connection to the OpenAltimetry API <https://openaltimetry.earthdatacloud.nasa.gov/data/openapi/swagger-ui/index.html/> to download and process ATL03 (Global Geolocated Photon Data), ATL06 (Land Ice Height), ATL07 (Sea Ice Height), ATL08 (Land and Vegetation Height), ATL10 (Sea Ice Freeboard), ATL12 (Ocean Surface Height) and ATL13 (Inland Water Surface Height) ICESat-2 Altimeter Data. The user has the option to download the data by selecting a bounding box from a 1- or 5-degree grid globally utilizing a shiny application. The ICESat-2 mission collects altimetry data of the Earth's surface. The sole instrument on ICESat-2 is the Advanced Topographic Laser Altimeter System (ATLAS) instrument that measures ice sheet elevation change and sea ice thickness, while also generating an estimate of global vegetation biomass. ICESat-2 continues the important observations of ice-sheet elevation change, sea-ice freeboard, and vegetation canopy height begun by ICESat in 2003.
This package provides a comprehensive framework for calculating unbiased distances in datasets containing mixed-type variables (numerical and categorical). The package implements a general formulation that ensures multivariate additivity and commensurability, meaning that variables contribute equally to the overall distance regardless of their type, scale, or distribution. Supports multiple distance measures including Gower's distance, Euclidean distance, Manhattan distance, and various categorical variable distances such as simple matching, Eskin, occurrence frequency, and association-based distances. Provides tools for variable scaling (standard deviation, range, robust range, and principal component scaling), and handles both independent and association-based category dissimilarities. Implements methods to correct for biases that typically arise from different variable types, distributions, and number of categories. Particularly useful for cluster analysis, data visualization, and other distance-based methods when working with mixed data. Methods based on van de Velden et al. (2024) <doi:10.48550/arXiv.2411.00429> "Unbiased mixed variables distance".
This package provides a set of functions designed to calculate the standardised precipitation and standardised precipitation evapotranspiration indices using NASA POWER data as described in Blain et al. (2023) <doi:10.2139/ssrn.4442843>. These indices are calculated using a reference data source. The functions verify whether the index estimates meet the assumption of normality and how well NASA POWER estimates represent real-world data. Indices are calculated in a routine mode. Potential evapotranspiration amounts and the difference between rainfall and potential evapotranspiration are also calculated. The functions adopt a basic time scale that splits each month into four periods: days 1 to 7, days 8 to 14, days 15 to 21, and days 22 to 28, 29, 30, or 31. Here TS=4 corresponds to a 1-month moving window (calculated 4 times per month) and TS=48 corresponds to a 12-month moving window (calculated 4 times per month).
The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSAT's Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The cmsafops R-package provides a collection of R-operators for the analysis and manipulation of CM SAF NetCDF formatted data. Other CF-conforming NetCDF data with time, longitude and latitude dimensions should be applicable, but there is no guarantee of an error-free application. CM SAF climate data records are provided for free via (<https://wui.cmsaf.eu/safira>). Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).
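The quasi-weekly time scale is easy to misread, so here is a small Python sketch of the bookkeeping it implies. The helper names `period_of_day` and `ts_to_months` are hypothetical and do not come from the package; the only facts used are the four day ranges and the TS=4 / TS=48 correspondences stated above.

```python
def period_of_day(day):
    """Map a day of the month (1-31) to its quasi-weekly period (1-4)."""
    if not 1 <= day <= 31:
        raise ValueError("day must be between 1 and 31")
    if day <= 7:
        return 1
    if day <= 14:
        return 2
    if day <= 21:
        return 3
    return 4  # days 22 to the end of the month, whatever its length


def ts_to_months(ts):
    """Length in months of a moving window spanning `ts` quasi-weekly periods."""
    return ts / 4


one_month = ts_to_months(4)      # TS=4  -> 1-month window
twelve_month = ts_to_months(48)  # TS=48 -> 12-month window
```

Because the fourth period absorbs days 22 to 31, the periods have unequal lengths, but every month always contributes exactly four of them.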
This package implements fractional differencing with Autoregressive Moving Average models to analyse long-memory time series data. Traditional ARIMA models typically use integer values for differencing, which are suitable for time series with short memory or anti-persistent behaviour. In contrast, the Fractional ARIMA model allows fractional differencing, enabling it to effectively capture long memory characteristics in time series data. The 'fracARMA' package is user-friendly and allows users to manually input the fractional differencing parameter, which can be obtained using various estimators such as the GPH estimator, Sperio method, Wavelet method, and many others. Additionally, the package enables users to directly feed the time series data, AR order, MA order, fractional differencing parameter, and the proportion of training data as a split ratio, all in a single command. The package is based on the paper of Irshad and others (2024, <doi:10.22271/maths.2024.v9.i6b.1906>).
Plots U-Pb data on Wetherill and Tera-Wasserburg concordia diagrams. Calculates concordia and discordia ages. Performs linear regression of measurements with correlated errors using the York, Titterington, Ludwig and Omnivariant Generalised Least-Squares ('OGLS') approaches. Generates Kernel Density Estimates (KDEs) and Cumulative Age Distributions (CADs). Produces Multidimensional Scaling (MDS) configurations and Shepard plots of multi-sample detrital datasets using the Kolmogorov-Smirnov distance as a dissimilarity measure. Calculates 40Ar/39Ar ages, isochrons, and age spectra. Computes weighted means accounting for overdispersion. Calculates U-Th-He (single grain and central) ages, logratio plots and ternary diagrams. Processes fission track data using the external detector method and LA-ICP-MS, calculates central ages and plots fission track and other data on radial (a.k.a. Galbraith) plots. Constructs total Pb-U, Pb-Pb, Th-Pb, K-Ca, Re-Os, Sm-Nd, Lu-Hf, Rb-Sr and 230Th-U isochrons as well as 230Th-U evolution plots.
This package provides functions to facilitate prior elicitation for Bayesian generalised linear models using independent conditional means priors. The package supports the elicitation of multivariate normal priors for generalised linear models. The approach can be applied to indirect elicitation for a generalised linear model that is linear in the parameters. The package is designed such that the facilitator executes functions within the R console during the elicitation session to provide graphical and numerical feedback at each design point. Various methodologies for eliciting fractiles (equivalently, percentiles or quantiles) are supported, including versions of the approach of Hosack et al. (2017) <doi:10.1016/j.ress.2017.06.011>. For example, experts may be asked to provide central credible intervals that correspond to a certain probability. Alternatively, experts may be allowed to vary the probability allocated to the central credible interval for each design point. Additionally, a median may or may not be elicited.
Large-scale matrix-variate data are now widely observed in research areas such as finance, signal processing and medical imaging. Modelling matrix-valued data by the matrix-elliptical family not only provides a flexible way to handle heavy tails and tail dependencies, but also maintains the intrinsic row and column structure of random matrices. We propose a new tool named matrix Kendall's tau, which is efficient for analyzing random elliptical matrices. By applying this new type of Kendall's tau to the matrix elliptical factor model, we propose a Matrix-type Robust Two-Step (MRTS) method to estimate the loading and factor spaces. See the details in He et al. (2022) <arXiv:2207.09633>. In this package, we provide the algorithms for calculating the sample matrix Kendall's tau, the MRTS method and the Matrix Kendall's tau Eigenvalue-Ratio (MKER) method, which is used for determining the number of factors.
This package provides tools to visualize the results of a classification or a regression. The graphical displays include stacked plots, silhouette plots, quasi residual plots, class maps, predictions plots, and predictions correlation plots. Implements the techniques described and illustrated in Raymaekers J., Rousseeuw P.J., Hubert M. (2022). Class maps for visualizing classification results. Technometrics, 64(2), 151-165. <doi:10.1080/00401706.2021.1927849> (open access), Raymaekers J., Rousseeuw P.J. (2022). Silhouettes and quasi residual plots for neural nets and tree-based classifiers. Journal of Computational and Graphical Statistics, 31(4), 1332-1343. <doi:10.1080/10618600.2022.2050249>, and Rousseeuw, P.J. (2025). Explainable Linear and Generalized Linear Models by the Predictions Plot. <doi:10.48550/arXiv.2412.16980> (open access). Examples can be found in the vignettes: "Discriminant_analysis_examples", "K_nearest_neighbors_examples", "Support_vector_machine_examples", "Rpart_examples", "Random_forest_examples", "Neural_net_examples", and "predsplot_examples".
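To make the idea of fractional differencing concrete, the operator (1 - B)^d can be expanded into an infinite series of lag weights via the binomial recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k. The Python sketch below applies a truncated version of that filter; it is a generic illustration with hypothetical function names, not the package's code, and it takes d as given rather than estimating it.

```python
def frac_diff_weights(d, n):
    """First n weights of the binomial expansion of (1 - B)^d."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Fractionally difference a series, truncating the filter at the sample start."""
    w = frac_diff_weights(d, len(series))
    return [sum(w[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]


# d = 1 recovers ordinary first differencing (after the first observation)
first_diff = frac_diff([1, 3, 6, 10], 1)

# d = 0.5 gives slowly decaying weights, the signature of long memory
half_weights = frac_diff_weights(0.5, 4)
```

For integer d the weights vanish after lag d, while for fractional d they decay hyperbolically, which is why distant observations keep a small influence.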
Estimates a variety of Dynamic Conditional Correlation (DCC) models. In more detail, the dccmidas package allows the estimation of the corrected DCC (cDCC) of Aielli (2013) <doi:10.1080/07350015.2013.771027>, the DCC-MIDAS of Colacito et al. (2011) <doi:10.1016/j.jeconom.2011.02.013>, the Asymmetric DCC of Cappiello et al. <doi:10.1093/jjfinec/nbl005>, and the Dynamic Equicorrelation (DECO) of Engle and Kelly (2012) <doi:10.1080/07350015.2011.652048>. dccmidas offers the possibility of including standard GARCH <doi:10.1016/0304-4076(86)90063-1>, GARCH-MIDAS <doi:10.1162/REST_a_00300> and Double Asymmetric GARCH-MIDAS <doi:10.1016/j.econmod.2018.07.025> models in the univariate estimation. The scalar and diagonal BEKK <doi:10.1017/S0266466600009063> models can also be estimated. Finally, the package also calculates the variance-covariance matrix under two non-parametric models: the Moving Covariance and the RiskMetrics specifications.
Includes tools to calculate statistical power, minimum detectable effect size (MDES), MDES difference (MDESD), and minimum required sample size for various multilevel randomized experiments (MRE) with continuous outcomes. Accommodates 14 types of MRE designs to detect main treatment effect, seven types of MRE designs to detect moderated treatment effect (2-1-1, 2-1-2, 2-2-1, 2-2-2, 3-3-1, 3-3-2, and 3-3-3 designs; <total.lev> - <trt.lev> - <mod.lev>), five types of MRE designs to detect mediated treatment effects (2-1-1, 2-2-1, 3-1-1, 3-2-1, and 3-3-1 designs; <trt.lev> - <med.lev> - <out.lev>), four types of partially nested (PN) design to detect main treatment effect, and three types of PN designs to detect mediated treatment effects (2/1, 3/1, 3/2; <trt.arm.lev> / <ctrl.arm.lev>). See PowerUp! Excel series at <https://www.causalevaluation.org/>.
Generation of domain variables, linearization of several non-linear population statistics (the ratio of two totals, weighted income percentile, relative median income ratio, at-risk-of-poverty rate, at-risk-of-poverty threshold, Gini coefficient, gender pay gap, the aggregate replacement ratio, median income below at-risk-of-poverty gap, income quintile share ratio, relative median at-risk-of-poverty gap), computation of regression residuals in case of weight calibration, variance estimation of sample surveys by the ultimate cluster method (Hansen, Hurwitz and Madow, Sample Survey Methods And Theory, vol. I: Methods and Applications; vol. II: Theory. 1953, New York: John Wiley and Sons), variance estimation for longitudinal, cross-sectional measures and measures of change for single and multistage cluster sampling designs (Berger, Y. G., 2015, <doi:10.1111/rssa.12116>). Several other precision measures are derived: standard error, the coefficient of variation, the margin of error, confidence interval, design effect.
For emulating multifidelity computer models. The major methods include univariate autoregressive cokriging and multivariate autoregressive cokriging. The autoregressive cokriging methods are implemented for both hierarchically nested design and non-nested design. For hierarchically nested design, the model parameters are estimated via standard optimization algorithms; for non-nested design, the model parameters are estimated via Monte Carlo expectation-maximization (MCEM) algorithms. In both cases, the priors are chosen such that the posterior distributions are proper. Note that uniform priors on the range parameters in the correlation function lead to improper posteriors and should be avoided when Bayesian analysis is adopted. The development of objective priors for autoregressive cokriging models can be found in Pulong Ma (2020) <DOI:10.1137/19M1289893>. The development of the multivariate autoregressive cokriging models with possibly non-nested design can be found in Pulong Ma, Georgios Karagiannis, Bledar A Konomi, Taylor G Asher, Gabriel R Toro, and Andrew T Cox (2019) <arXiv:1909.01836>.
Digital simulation of electrochemical processes. Each function allows for implicit and explicit solution of the differential equation using methods like Euler, Backwards implicit, Runge Kutta 4, Crank-Nicolson and Backward differentiation formula, as well as different numbers of points for derivative approximation. Several electrochemical processes can be simulated, such as: Chronoamperometry, Potential Step, Linear Sweep, Cyclic Voltammetry, Cyclic Voltammetry with an electrochemical reaction followed by a chemical reaction (EC mechanism) and CV with two successive electrochemical reactions (EE mechanism). Update 1.1.0 added a general purpose CV function that allows simulating up to 4 EE mechanisms combined with a chemical reaction for each species. Update 1.2.0 improved the accuracy of the measurements and allows personalized data resolution for simulation. Bibliography regarding these methods can be found in the following texts. Dieter Britz, Jorg Strutwolf (2016) <ISBN:978-3-319-30292-8>. Allen J. Bard, Larry R. Faulkner (2000) <ISBN:978-0-471-04372-0>.
The definition of fuzzy random variables and methods for simulating from them have been two challenging statistical problems of the past three decades. This package is organized around a special definition of fuzzy random variable and simulates fuzzy random variables by Piecewise Linear Fuzzy Numbers (PLFNs); see Coroianu et al. (2013) <doi:10.1016/j.fss.2013.02.005> for details about PLFNs. Some important statistical functions are provided for obtaining the membership function of the main statistics, such as mean, variance, summation, standard deviation and coefficient of variation. Some of the practical advantages of the Sim.PLFN package are: (1) easily generating/simulating a random sample of PLFNs, (2) drawing the membership functions of the simulated PLFNs or the membership function of the statistical result, and (3) using the simulated PLFNs in arithmetic operations or importing them into statistical computations. Finally, note that the Sim.PLFN package is built on the FuzzyNumbers package.
Design and evaluate choice-based conjoint survey experiments. Generate a variety of survey designs, including random designs, frequency-based designs, and D-optimal designs, as well as "labeled" designs (also known as "alternative-specific designs"), designs with "no choice" options, and designs with dominant alternatives removed. Conveniently inspect and compare designs using a variety of metrics, including design balance, overlap, and D-error, and simulate choice data for a survey design either randomly or according to a utility model defined by user-provided prior parameters. Conduct a power analysis for a given survey design by estimating the same model on different subsets of the data to simulate different sample sizes. Bayesian D-efficient designs using the cea and modfed methods are obtained using the idefix package by Traets et al. (2020) <doi:10.18637/jss.v096.i03>. Choice simulation and model estimation in power analyses are handled using the logitr package by Helveston (2023) <doi:10.18637/jss.v105.i10>.
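As an illustration of the simplest of these specifications, a RiskMetrics-style covariance is usually written as the exponentially weighted recursion H_t = lambda * H_{t-1} + (1 - lambda) * r_{t-1} r_{t-1}'. The Python sketch below implements that generic recursion only. It is not taken from dccmidas: the function name, the default lambda = 0.94 (the value conventionally used for daily data), and the initialisation with the outer product of the first observation are all assumptions for the example.

```python
def outer(v):
    """Outer product v v' of a vector given as a list."""
    return [[a * b for b in v] for a in v]


def riskmetrics_cov(returns, lam=0.94):
    """Exponentially weighted covariance after processing all return vectors.

    `returns` is a list of equal-length lists (one vector per period).
    Initialisation with the first outer product is an assumed convention.
    """
    h = outer(returns[0])
    for r in returns[1:]:
        rr = outer(r)
        h = [[lam * h[i][j] + (1 - lam) * rr[i][j]
              for j in range(len(r))] for i in range(len(r))]
    return h


# With lambda = 0.5 the two observations are weighted equally
h_equal = riskmetrics_cov([[1.0, 0.0], [0.0, 1.0]], lam=0.5)
h_same = riskmetrics_cov([[1.0, 2.0], [1.0, 2.0]], lam=0.5)
```

A larger lambda makes the estimate smoother and slower to react to new shocks, since past covariances receive more weight.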