Joint frailty models have been widely used to study the associations between recurrent events and a survival outcome. However, existing joint frailty models consider only one or a few recurrent events and cannot deal with high-dimensional recurrent events. This package can be used to fit our recently developed penalized joint frailty model that can handle high-dimensional recurrent events. Specifically, an adaptive lasso penalty is imposed on the parameters for the effects of the recurrent events on the survival outcome, which allows for variable selection. The algorithm is also computationally efficient because it is based on the Gaussian variational approximation method.
Generates and evaluates D, I, A, Alias, E, T, and G optimal designs. Supports generation and evaluation of blocked and split/split-split/.../N-split plot designs. Includes parametric and Monte Carlo power evaluation functions, and supports calculating power for censored responses. Provides a framework to evaluate power using functions provided in other packages or written by the user. Includes a Shiny graphical user interface that displays the underlying code used to create and evaluate the design to improve ease-of-use and make analyses more reproducible. For details, see Morgan-Wall et al. (2021) <doi:10.18637/jss.v099.i01>.
Format dates and times flexibly and to whichever locales make sense. This package parses dates, times, and date-times in various formats (including string-based ISO 8601 constructions). The formatting syntax gives the user many options for formatting the date and time output in a precise manner. Time zones in the input can be expressed in multiple ways and there are many options for formatting time zones in the output as well. Several of the provided helper functions allow for automatic generation of locale-aware formatting patterns based on date/time skeleton formats and standardized date/time formats with varying specificity.
We introduce a novel ensemble-based explainable machine learning model using the Model Confidence Set (MCS) and a two-stage Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) algorithm. The model combines the predictive capabilities of different machine learning models and integrates the interpretability of explainability methods. The proposed algorithm is built on the two-stage TOPSIS framework. The package has been developed using the algorithms of Paul et al. (2023) <doi:10.1007/s40009-023-01218-x> and Yeasin and Paul (2024) <doi:10.1007/s11227-023-05542-3>.
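To make the TOPSIS ranking idea concrete, here is a minimal Python sketch of the standard single-stage TOPSIS procedure. It is not the package's two-stage algorithm or its API; the function name `topsis`, the example metrics and the weights are all hypothetical. The sketch assumes every criterion column has at least one non-zero value and that the alternatives are not all identical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each row (alternative).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, else False
    """
    n_crit = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal and anti-ideal points depend on the direction of each criterion.
    cols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)    # distance to the ideal point
        d_worst = math.dist(row, worst)  # distance to the anti-ideal point
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical example: three models scored on RMSE (cost) and R^2 (benefit).
scores = topsis(
    matrix=[[0.50, 0.90], [0.80, 0.70], [0.65, 0.80]],
    weights=[0.5, 0.5],
    benefit=[False, True],
)
print(scores)
```

A higher score means the alternative lies closer to the ideal point, so sorting by score in descending order gives the ranking.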
The HMS (Hierarchic Memetic Strategy) is a composite global optimization strategy consisting of a multi-population evolutionary strategy and some auxiliary methods. The HMS makes use of a dynamically-evolving data structure that provides an organization among the component populations. It is a tree with a fixed maximal height and variable internal node degree. Each component population is governed by a particular evolutionary engine. This package provides a simple R implementation with examples of using different genetic algorithms as the population engines. References: J. Sawicki, M. Łoś, M. Smołka, J. Alvarez-Aramberri (2022) <doi:10.1007/s11047-020-09836-w>.
This package provides a framework that offers a standardized computational environment for specialist work via object classes that represent data coded by samples, taxa and segments (i.e. subpopulations, repeated measures). It supports easy processing of the data along with cross tabulation and relational data tables for samples and taxa. An object of class 'mefa' is a project-specific compendium of the data and can be easily used in further analyses. Methods are provided for extraction, aggregation, conversion, plotting, summary and reporting of 'mefa' objects. Reports can be generated in plain text or LaTeX format. The vignette contains worked examples.
Features tools for network data analysis and community detection. Provides multiple methods for fitting, model selection and goodness-of-fit testing in degree-corrected stochastic block models. Most of the computations are fast and scalable for sparse networks, especially for Poisson versions of the models. Implements the following: Amini, Chen, Bickel and Levina (2013) <doi:10.1214/13-AOS1138>, Bickel and Sarkar (2015) <doi:10.1111/rssb.12117>, Lei (2016) <doi:10.1214/15-AOS1370>, Wang and Bickel (2017) <doi:10.1214/16-AOS1457>, Zhang and Amini (2020) <arXiv:2012.15047>, Le and Levina (2022) <doi:10.1214/21-EJS1971>.
Automates many of the tasks associated with quantitative discourse analysis of transcripts, including frequency counts of sentence types, words, sentences, turns of talk and syllables, along with other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. qdap is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.
This package provides a terribly simple database for numeric time series, written purely in R, so no external database software is needed. Series are stored in plain-text files (the most portable and enduring file type) in CSV format. Timestamps are encoded using R's native numeric representation for 'Date'/'POSIXct', which makes them fast to parse, but keeps them accessible with other software. The package provides tools for saving and updating series in this standardised format, for retrieving and joining data, for summarising files and directories, and for coercing series from and to other data types (such as zoo series).
The Tweedie compound Poisson distribution is a mixture of a degenerate distribution at the origin and a continuous distribution on the positive real line. It has been applied in a wide range of fields in which continuous data with exact zeros regularly arise. The cplm package provides likelihood based and Bayesian procedures for fitting common Tweedie compound Poisson linear models. In particular, models with hierarchical structures or extra zero inflation can be handled. Further, the package implements the Gini index based on an ordered version of the Lorenz curve as a robust model comparison tool involving zero-inflated and highly skewed distributions.
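The mixture structure described above (a point mass at zero plus a continuous positive part) can be illustrated by simulating a compound Poisson-gamma variable: a Poisson number of gamma-distributed terms is summed, and a count of zero yields an exact zero. The following Python sketch is only an illustration of that construction, not the cplm interface; the function names and the parameter values are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_compound_poisson_gamma(rng, lam, shape, scale):
    """Sum a Poisson(lam) number of Gamma(shape, scale) terms.

    When the Poisson count is zero the sum is empty, giving an exact zero.
    """
    n_events = sample_poisson(rng, lam)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_events))


rng = random.Random(42)  # fixed seed so the run is repeatable
lam, shape, scale = 1.0, 2.0, 1.5  # hypothetical parameter values
draws = [sample_compound_poisson_gamma(rng, lam, shape, scale)
         for _ in range(20000)]

zero_share = sum(d == 0 for d in draws) / len(draws)
print("share of exact zeros:", zero_share, "theory:", math.exp(-lam))
print("sample mean:", sum(draws) / len(draws), "theory:", lam * shape * scale)
```

The share of exact zeros should be close to exp(-lam), and the sample mean close to lam * shape * scale, which is what makes this family convenient for continuous data with many zeros.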
This package provides utilities to calculate the probabilities of various dice-rolling events, such as the probability of rolling a four-sided die six times and getting a 4, a 3, and either a 1 or 2 among the six rolls (in any order); the probability of rolling two six-sided dice three times and getting a 10 on the first roll, followed by a 4 on the second roll, followed by anything but a 7 on the third roll; or the probabilities of each possible sum of rolling five six-sided dice, dropping the lowest two rolls, and summing the remaining dice.
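As an illustration of the last example above, the distribution of the sum of five six-sided dice with the lowest two dropped can be obtained by brute-force enumeration of all 6^5 = 7776 equally likely rolls. The Python sketch below does exactly that; it does not reflect how the package computes its results, and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def keep_highest_sum_distribution(n_dice, n_sides, n_drop):
    """Exact distribution of the sum after dropping the n_drop lowest dice."""
    counts = Counter()
    for roll in product(range(1, n_sides + 1), repeat=n_dice):
        kept = sorted(roll)[n_drop:]  # discard the n_drop lowest values
        counts[sum(kept)] += 1
    total = n_sides ** n_dice  # number of equally likely ordered rolls
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


dist = keep_highest_sum_distribution(n_dice=5, n_sides=6, n_drop=2)
for total, prob in dist.items():
    print(total, prob, round(float(prob), 5))
```

Using `Fraction` keeps the probabilities exact, so they sum to exactly 1. Enumeration is only practical for small numbers of dice, since the number of rolls grows as n_sides ** n_dice.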
This package contains methods for fitting Generalized Linear Models (GLMs) and Generalized Additive Models (GAMs). Generalized regression models are common methods for handling data for which assuming Gaussian-distributed errors is not appropriate. For instance, if the response of interest is binary, count, or proportion data, one can instead model the expectation of the response based on an appropriate data-generating distribution. This package provides methods for fitting GLMs and GAMs under Beta regression, Poisson regression, Gamma regression, and Binomial regression (currently GLM only) settings. Models are fit using local scoring algorithms described in Hastie and Tibshirani (1990) <doi:10.1214/ss/1177013604>.
This package provides functions for fitting and doing predictions with Gaussian process models using Vecchia's (1988) approximation. The package also includes functions for reordering input locations, finding ordered nearest neighbors (with help from the FNN package), grouping operations, and conditional simulations. Covariance functions for spatial and spatial-temporal data on Euclidean domains and spheres are provided. The original approximation is due to Vecchia (1988) <http://www.jstor.org/stable/2345768>, and the reordering and grouping methods are from Guinness (2018) <doi:10.1080/00401706.2018.1437476>. Model fitting employs a Fisher scoring algorithm described in Guinness (2019) <doi:10.48550/arXiv.1905.08374>.
Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distribution). In particular, it contains fitting procedures, an AIC-based model selection routine, and functions for the computation of density, quantile, probability, random variates, expected shortfall and some portfolio optimization and plotting routines as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution. See Chapter 3 of A. J. McNeil, R. Frey, and P. Embrechts. Quantitative risk management: Concepts, techniques and tools. Princeton University Press, Princeton (2005).
Analysis, imputation, and multiple imputation of count data using covariates. LORI uses a log-linear Poisson model where main row and column effects, as well as effects of known covariates and interaction terms can be fitted. The estimation procedure is based on the convex optimization of the Poisson loss penalized by a Lasso type penalty and a nuclear norm. LORI returns estimates of main effects, covariate effects and interactions, as well as an imputed count table. The package also contains a multiple imputation procedure. The methods are described in Robin, Josse, Moulines and Sardy (2019) <doi:10.1016/j.jmva.2019.04.004>.
An embedded proximal interior point quadratic programming solver, which can solve dense and sparse quadratic programs, described in Schwan, Jiang, Kuhn, and Jones (2023) <doi:10.48550/arXiv.2304.00290>. Combining an infeasible interior point method with the proximal method of multipliers, the algorithm can handle ill-conditioned convex quadratic programming problems without the need for linear independence of the constraints. The solver is written in header-only C++14, leveraging the Eigen library for vectorized linear algebra. For small dense problems, vectorized instructions and cache locality can be exploited more efficiently. Allocation-free problem updates and re-solves are also provided.
The aim of XINA is to determine which proteins exhibit similar patterns within and across experimental conditions, since proteins with co-abundance patterns may have common molecular functions. XINA imports multiple datasets, tags datasets in silico, and combines the data for subsequent subgrouping into multiple clusters. The result is a single output depicting the variation across all conditions. XINA not only extracts co-abundance profiles within and across experiments, but also incorporates protein-protein interaction databases and integrative resources such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) to infer interactors and molecular functions, respectively, and produces intuitive graphical outputs.
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in http://doi.org/10.18637/jss.v045.i03. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
This package implements the fast cross-validation via sequential testing (CVST) procedure. CVST is an improved cross-validation procedure which uses non-parametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. In addition to CVST, the package contains an implementation of ordinary k-fold cross-validation with a flexible and powerful set of helper objects and methods to handle the overall model selection process. The implementations of Cochran's Q test with permutations and the sequential testing framework of Wald are generic and can therefore also be used in other contexts.
Rclone is a command line program to sync files and directories to and from different cloud storage providers.
Features include:
MD5/SHA1 hashes checked at all times for file integrity
Timestamps preserved on files
Partial syncs supported on a whole file basis
Copy mode to just copy new/changed files
Sync (one way) mode to make a directory identical
Check mode to check for file hash equality
Can sync to and from network, e.g., two different cloud accounts
Optional encryption (Crypt)
Optional cache (Cache)
Optional FUSE mount (rclone mount)
The rema package implements a permutation-based approach for binary meta-analyses of 2x2 tables, founded on conditional logistic regression, that provides more reliable statistical tests when heterogeneity is observed in rare event data (Zabriskie et al. 2021 <doi:10.1002/sim.9142>). To adjust for the effect of heterogeneity, this method conditions on the sufficient statistic of a proxy for the heterogeneity effect as opposed to estimating the heterogeneity variance. While this results in the model not strictly falling under the random-effects framework, it is akin to a random-effects approach in that it assumes differences in variability due to treatment. Further, this method does not rely on large-sample approximations or continuity corrections for rare event data. Instead of asymptotic approximations, it uses the permutational distribution of the test statistic for inference. The number of observed events drives the computational complexity of creating this permutational distribution. Accordingly, for this method to be computationally feasible, it should only be applied to meta-analyses with a relatively low number of observed events. To create this permutational distribution, a network algorithm, based on the work of Mehta et al. (1992) <doi:10.2307/1390598> and Corcoran et al. (2001) <doi:10.1111/j.0006-341x.2001.00941.x>, is implemented in C++ and integrated into the package.
Estimate, assess, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon's approach), as well as several tests and typical postestimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit etc.).
Provides six different algorithms that can be used to split the available data into training, test and validation subsets with similar distributions for hydrological model development. The dataSplit() function divides the data according to specific requirements, and the par.default() function can be used to set the parameters for data splitting. The getAUC() function measures the similarity of distribution features between the data subsets. For more information about the data splitting algorithms, please refer to: Chen et al. (2022) <doi:10.1016/j.jhydrol.2022.128340>, Zheng et al. (2022) <doi:10.1029/2021WR031818>.
Implements many univariate and multivariate permutation (and rotation) tests. Allowed tests: one- and two-sample t-tests, ANOVA, linear models, Chi-squared test, rank tests (i.e. Wilcoxon, Mann-Whitney, Kruskal-Wallis), sign test and McNemar test. Tests on linear models can also be performed in the presence of covariates (i.e. nuisance parameters). Both permutation and rotation methods for obtaining the null distribution of the test statistics are available. It also implements methods for multiplicity control such as the Westfall & Young minP procedure, Closed Testing (Marcus, 1976) and k-FWER. Moreover, it allows testing for fixed effects in mixed effects models.
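To show how a permutation null distribution is obtained in the simplest case, the Python sketch below runs an exact two-sample permutation test on the absolute difference in means by enumerating every way of relabelling the pooled observations. It is a generic textbook construction under the assumption of exchangeability, not the package's implementation; the function name and the data are hypothetical.

```python
from itertools import combinations


def exact_permutation_pvalue(x, y):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    total = sum(pooled)

    def abs_mean_diff(sum_x):
        # The second group's sum follows from the pooled total.
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_mean_diff(sum(x))
    extreme = 0
    n_perm = 0
    # Each choice of n_x indices is one relabelling of the pooled data.
    for idx in combinations(range(len(pooled)), n_x):
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        if stat >= observed - 1e-12:  # small tolerance for float ties
            extreme += 1
        n_perm += 1
    return extreme / n_perm


p_value = exact_permutation_pvalue([1, 2, 3], [4, 5, 6])
print(p_value)
```

With three observations per group there are only 20 relabellings, so the smallest attainable two-sided p-value is 2/20. For larger samples, full enumeration becomes infeasible and a random subset of permutations is typically drawn instead.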