Stochastic collapsed variational inference on mixed-membership stochastic blockmodel for networks, incorporating node-level predictors of mixed-membership vectors, as well as dyad-level predictors. For networks observed over time, the model defines a hidden Markov process that allows the effects of node-level predictors to evolve in discrete, historical periods. In addition, the package offers a variety of utilities for exploring results of estimation, including tools for conducting posterior predictive checks of goodness-of-fit and several plotting functions. The package implements methods described in Olivella, Pratt and Imai (2019) 'Dynamic Stochastic Blockmodel Regression for Social Networks: Application to International Conflicts', available at <https://www.santiagoolivella.info/pdfs/socnet.pdf>.
Calculates various functions needed for design and monitoring survival trials accounting for complex situations such as delayed treatment effect, treatment crossover, non-uniform accrual, and different censoring distributions between groups. The event time distribution is assumed to be piecewise exponential (PWE) distribution and the entry time is assumed to be piecewise uniform distribution. As compared with Version 1.2.1, two more types of hybrid crossover are added. A bug is corrected in the function "pwecx" that calculates the crossover-adjusted survival, distribution, density, hazard and cumulative hazard functions. Also, to generate the crossover-adjusted event time random variable, a more efficient algorithm is used and the output includes crossover indicators.
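As an illustration of the piecewise exponential assumption mentioned above, the following is a minimal Python sketch of the survival function S(t) = exp(-H(t)), where H(t) accumulates a constant hazard over each interval. The function name `pwe_survival` and its arguments are hypothetical and are not part of the package's interface; crossover adjustment is not covered.

```python
import math


def pwe_survival(t, breaks, rates):
    """Survival probability S(t) under a piecewise exponential model.

    breaks: increasing interior change points, e.g. [6, 12]
    rates:  constant hazard on each interval; len(rates) == len(breaks) + 1
    (Both argument names are illustrative assumptions.)
    """
    cum_hazard = 0.0
    start = 0.0
    for i, rate in enumerate(rates):
        # The last interval extends to infinity.
        end = breaks[i] if i < len(breaks) else math.inf
        if t <= start:
            break
        # Add hazard * time spent in this interval up to t.
        cum_hazard += rate * (min(t, end) - start)
        start = end
    return math.exp(-cum_hazard)


# Hazard 0.10 per month before month 6, 0.05 afterwards.
print(pwe_survival(10, [6], [0.10, 0.05]))
```

Here the cumulative hazard at month 10 is 0.10 * 6 + 0.05 * 4 = 0.8, so the call evaluates exp(-0.8).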
Resources, tutorials, and code snippets dedicated to exploring the intersection of quantum computing and artificial intelligence (AI) in the context of analyzing Cluster of Differentiation 4 (CD4) lymphocytes and optimizing antiretroviral therapy (ART) for human immunodeficiency virus (HIV). With the emergence of quantum artificial intelligence and the development of small-scale quantum computers, there's an unprecedented opportunity to revolutionize the understanding of HIV dynamics and treatment strategies. This project leverages the R package qsimulatR (Ostmeyer and Urbach, 2023, <https://CRAN.R-project.org/package=qsimulatR>), a quantum computer simulator, to explore these applications in quantum computing techniques, addressing the challenges in studying CD4 lymphocytes and enhancing ART efficacy.
The SAVVY (Survival Analysis for AdVerse Events with VarYing Follow-Up Times) project is a consortium of academic and pharmaceutical industry partners that aims to improve the analyses of adverse event (AE) data in clinical trials through the use of survival techniques appropriately dealing with varying follow-up times and competing events, see Stegherr, Schmoor, Beyersmann, et al. (2021) <doi:10.1186/s13063-021-05354-x>. Although statistical methodologies have advanced, in AE analyses often the incidence proportion, the incidence density or a non-parametric Kaplan-Meier estimator are used, which either ignore censoring or competing events. This package contains functions to easily conduct the proposed improved AE analyses.
This package provides support for all calendars as specified in the Climate and Forecast (CF) Metadata Conventions for climate and forecasting data. The CF Metadata Conventions is widely used for distributing files with climate observations or projections, including the Coupled Model Intercomparison Project (CMIP) data used by climate change scientists and the Intergovernmental Panel on Climate Change (IPCC). This package specifically allows the user to work with any of the CF-compliant calendars (many of which are not compliant with POSIXt). The CF time coordinate is formally defined in the CF Metadata Conventions document.
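To show why some CF calendars cannot be mapped onto POSIXt, the sketch below converts an integer "days since" offset into a date in the CF `360_day` calendar, where every month has exactly 30 days. It is written in Python for illustration only; the function name `days_to_360_day_date` and the default reference date are assumptions and do not reflect the package's own functions.

```python
def days_to_360_day_date(offset_days, ref=(2000, 1, 1)):
    """Convert an integer day offset to (year, month, day) in the
    CF 360_day calendar (12 months of 30 days each).

    ref is the reference date of a "days since ..." unit; the default
    here is an arbitrary choice for the example.
    """
    ref_year, ref_month, ref_day = ref
    # Count days from a notional year 0 so that divmod does the work.
    total = ref_year * 360 + (ref_month - 1) * 30 + (ref_day - 1) + offset_days
    year, remainder = divmod(total, 360)
    month, day = divmod(remainder, 30)
    return year, month + 1, day + 1


print(days_to_360_day_date(59))   # (2000, 2, 30)
print(days_to_360_day_date(360))  # (2001, 1, 1)
```

The first result, 30 February 2000, is a valid date in the `360_day` calendar but has no counterpart in the standard Gregorian calendar.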
This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq reads. It covers raw sequencing reads preprocessing (FASTQ files), reads alignment (Rbowtie2), aligned reads file operations (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and a quality control report. The package is managed by a dataflow graph, which makes it easy for users to pass variables seamlessly between processes and to understand the workflow. Users can process FASTQ files through the end-to-end preset pipeline, which produces an HTML report with quality control and preliminary statistical results, or customize the workflow starting from any intermediate stage with esATAC functions.
This package provides a collection of functions for computing centrographic statistics (e.g., standard distance, standard deviation ellipse, standard deviation box) for observations taken at point locations. Separate plotting functions have been developed for each measure. Users interested in writing results to ESRI shapefiles can do so by using results from aspace functions as inputs to the convert.to.shapefile() and write.shapefile() functions in the shapefiles library. We intend to provide terra integration for geographic data in a future release. The aspace package was originally conceived to aid in the analysis of spatial patterns of travel behaviour (see Buliung and Remmel 2008 <doi:10.1007/s10109-008-0063-7>).
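As a rough illustration of one of the centrographic statistics named above, the Python sketch below computes the standard distance of a set of points around their mean centre. It uses the population form (dividing by n); the package may use a different denominator or weighting, and the function name `standard_distance` is hypothetical.

```python
import math


def standard_distance(points):
    """Standard distance of (x, y) points around their mean centre.

    Population form: sqrt(sum of squared distances to the centre / n).
    The denominator is an assumption; other conventions exist.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    squared = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)
    return math.sqrt(squared / n)


# Four corners of a 2 x 2 square: each lies sqrt(2) from the centre (1, 1).
print(standard_distance([(0, 0), (2, 0), (0, 2), (2, 2)]))
```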
This package contains functions to perform Bayesian inference using a spectral analysis of Gaussian process priors. Gaussian processes are represented with a Fourier series based on cosine basis functions. Currently the package includes parametric linear models, partial linear additive models with/without shape restrictions, generalized linear additive models with/without shape restrictions, and a density estimation model. To maximize computational efficiency, the actual Markov chain Monte Carlo sampling for each model is done using code written in Fortran 90. This software has been developed using funding supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (no. NRF-2016R1D1A1B03932178 and no. NRF-2017R1D1A3B03035235).
Method for fitting a cellwise robust linear M-regression model (CRM, Filzmoser et al. (2020) <DOI:10.1016/j.csda.2020.106944>) that yields both a map of cellwise outliers consistent with the linear model, and a vector of regression coefficients that is robust against vertical outliers and leverage points. As a by-product, the method yields an imputed data set that contains estimates of what the values in cellwise outliers would need to amount to if they had fit the model. The package also provides diagnostic tools for analyzing casewise and cellwise outliers using sparse directions of maximal outlyingness (SPADIMO, Debruyne et al. (2019) <DOI:10.1007/s11222-018-9831-5>).
Bindings to edlib, a lightweight performant C/C++ library for exact pairwise sequence alignment using edit distance (Levenshtein distance). The algorithm computes the optimal alignment path, but also can be used to find only the start and/or end of the alignment path for convenience. Edlib was designed to be ultrafast and require little memory, with the capability to handle very large sequences. Three alignment methods are supported: global (Needleman-Wunsch), infix (Hybrid Wunsch), and prefix (Semi-Hybrid Wunsch). The original C/C++ library is described in "Edlib: a C/C++ library for fast, exact sequence alignment using edit distance", M. Šošić, M. Šikić, <doi:10.1093/bioinformatics/btw753>.
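For readers unfamiliar with edit distance, the following Python sketch computes the global (Needleman-Wunsch style) Levenshtein distance with the textbook dynamic programme. This is not edlib's algorithm, which is considerably faster; it only illustrates the quantity being computed, and the function name `edit_distance` is hypothetical.

```python
def edit_distance(a, b):
    """Global Levenshtein distance between sequences a and b.

    Textbook O(len(a) * len(b)) dynamic programme keeping two rows.
    """
    # prev[j] = distance between the empty prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if char_a == char_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```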
Estimates generalized additive latent and mixed models using maximum marginal likelihood, as defined in Sorensen et al. (2023) <doi:10.1007/s11336-023-09910-z>, which is an extension of Rabe-Hesketh and Skrondal (2004)'s unifying framework for multilevel latent variable modeling <doi:10.1007/BF02295939>. Efficient computation is done using sparse matrix methods, Laplace approximation, and automatic differentiation. The framework includes generalized multilevel models with heteroscedastic residuals, mixed response types, factor loadings, smoothing splines, crossed random effects, and combinations thereof. Syntax for model formulation is close to lme4 (Bates et al. (2015) <doi:10.18637/jss.v067.i01>) and PLmixed (Rockwood and Jeon (2019) <doi:10.1080/00273171.2018.1516541>).
Probabilistic models describing the behavior of workload and queue on a High Performance Cluster and computing GRID under FIFO service discipline, based on a modified Kiefer-Wolfowitz recursion. Sample data for inter-arrival times, service times, number of cores per task and waiting times of the HPC of Karelian Research Centre are also included; measurements took place from 06/03/2009 to 02/30/2011. Functions are provided to import/export workload traces in Standard Workload Format (swf). The stability condition of the model may be verified either exactly or approximately. Stability analysis: see Rumyantsev and Morozov (2017) <doi:10.1007/s10479-015-1917-2>, workload recursion: see Rumyantsev (2014) <doi:10.1109/PDCAT.2014.36>.
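The following Python sketch shows one common form of a Kiefer-Wolfowitz-type workload recursion for a FIFO cluster in which each task occupies several cores simultaneously. Whether it matches the package's exact variant is an assumption, and the function name `simulate_waits` and its arguments are hypothetical. The state is the ascending vector of residual workloads on the m cores; a task needing n cores waits until the n least-loaded cores are all free.

```python
def simulate_waits(interarrival, service, cores, m):
    """Waiting times under a multi-core FIFO workload recursion.

    interarrival[i]: time between the arrivals of task i and task i + 1
    service[i]:      run time of task i
    cores[i]:        number of cores task i needs at the same time (<= m)
    m:               total number of cores in the cluster
    """
    workload = [0.0] * m  # residual work per core, kept sorted ascending
    waits = []
    for gap, run_time, n in zip(interarrival, service, cores):
        # The task starts when the n-th least-loaded core becomes free.
        delay = workload[n - 1]
        waits.append(delay)
        # Those n cores are then busy until delay + run_time.
        workload = [delay + run_time] * n + workload[n:]
        # Advance time to the next arrival and restore the ordering.
        workload = sorted(max(w - gap, 0.0) for w in workload)
    return waits


# Two single-core tasks followed by a task that needs both cores.
print(simulate_waits([1, 1, 1], [5, 5, 1], [1, 1, 2], m=2))  # [0.0, 0.0, 4.0]
```

In the example the third task arrives at time 2 but both cores are only free at time 6, hence the wait of 4.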
Many data science problems reduce to operations on very tall, skinny matrices. However, sometimes these matrices can be so tall that they are difficult to work with, or do not even fit into main memory. One strategy to deal with such objects is to distribute their rows across several processors. To this end, we offer an S4 class for tall, skinny, distributed matrices, called the 'shaq'. We also provide many useful numerical methods and statistics operations for operating on these distributed objects. The naming is a bit "tongue-in-cheek", with the class a play on the fact that Shaquille O'Neal ('Shaq') is very tall, and he starred in the film 'Kazaam'.
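The reason row distribution works well for tall, skinny matrices can be sketched with the cross product X^T X: each processor computes a small p x p matrix from its own block of rows, and the results are simply summed. The Python sketch below imitates this on a single machine with a list of row blocks; it is not the package's S4 interface, and both function names are hypothetical.

```python
def local_crossprod(block):
    """p x p cross product of one block of rows (list of equal-length rows)."""
    p = len(block[0])
    out = [[0.0] * p for _ in range(p)]
    for row in block:
        for i in range(p):
            for j in range(p):
                out[i][j] += row[i] * row[j]
    return out


def distributed_crossprod(blocks):
    """Sum the per-block cross products, as a reduction across processors would."""
    p = len(blocks[0][0])
    total = [[0.0] * p for _ in range(p)]
    for block in blocks:
        part = local_crossprod(block)
        for i in range(p):
            for j in range(p):
                total[i][j] += part[i][j]
    return total


# A 3 x 2 matrix split into two row blocks.
blocks = [[[1, 2], [3, 4]], [[5, 6]]]
print(distributed_crossprod(blocks))  # [[35.0, 44.0], [44.0, 56.0]]
```

Only the small p x p partial results need to be communicated, regardless of how many rows each block holds.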
An extension of the 'lmom' R package: the 'pel...()', 'cdf...()', and 'qua...()' function families are each wrapped in a single function per family, in order to create robust automatic tools to fit data with different probability distributions and then to estimate probability values and return periods. The implemented functions are able to manage time series with constant and/or missing values without stopping the execution with error messages. The package also contains tools to calculate several indices based on variability (e.g. SPI, Standardized Precipitation Index, see <https://climatedataguide.ucar.edu/climate-data/standardized-precipitation-index-spi> and <http://spei.csic.es/>) for multiple time series or spatially gridded values.
Selecting the optimal multidimensional scaling (MDS) procedure for metric data via metric MDS (ratio, interval, mspline) and nonmetric MDS (ordinal). Selecting the optimal multidimensional scaling (MDS) procedure for interval-valued data via metric MDS (ratio, interval, mspline). Selecting the optimal multidimensional scaling procedure for interval-valued data by varying all combinations of normalization and optimization methods. Selecting the optimal MDS procedure for statistical data referring to the evaluation of tourist attractiveness of Lower Silesian counties. (Borg, I., Groenen, P.J.F., Mair, P. (2013) <doi:10.1007/978-3-642-31848-1>, Walesiak, M. (2016) <doi:10.15611/ekt.2016.2.01>, Walesiak, M. (2017) <doi:10.15611/ekt.2017.3.01>).
This package provides a programmatic interface to <http://sp2000.org.cn>, re-written based on an accompanying Species 2000 API. Access tables describing the catalogue of known Chinese species of animals, plants, fungi, micro-organisms, and more. This package also supports access to catalogue of life global <http://catalogueoflife.org>, China animal scientific database <http://zoology.especies.cn> and catalogue of life Taiwan <https://taibnet.sinica.edu.tw/home_eng.php>. The development of the SP2000 package was supported by the Biodiversity Survey and Assessment Project of the Ministry of Ecology and Environment, China <2019HJ2096001006>, Yunnan University's "Double First Class" Project <C176240405> and Yunnan University's Research Innovation Fund for Graduate Students <2019227>.
Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. survex provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al., (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al., (2020) <doi:10.1016/j.knosys.2020.106164> as well as extensions of existing ones described in Biecek et al., (2021) <doi:10.1201/9780429027192>.
EQ-5D value set estimation can be done using the hybrid model likelihood as described by Oppe and van Hout (2010) <doi:10.1002/hec.3560> and Ramos-Goñi et al. (2017) <doi:10.1097/MLR.0000000000000283>. The package is based on flexmix and among others contains an M-step-driver as described by Leisch (2004) <doi:10.18637/jss.v011.i08>. Users can estimate latent classes and address preference heterogeneity. Both uncensored and censored data are supported. Furthermore, heteroscedasticity can be taken into account. It is possible to control for different covariates on the continuous and dichotomous parts of the data and start values can differ between the expected latent classes.
This package provides two methods of estimating income inequality statistics from binned income data, such as the income data provided in the Census. These methods use different interpolation techniques to infer the distribution of incomes within income bins. One method is an implementation of Jargowsky and Wheeler's mean-constrained integration over brackets (MCIB). The other method is based on a new technique, Lorenz interpolation, which estimates income inequality by constructing an interpolated Lorenz curve based on the binned income data. These methods can be used to estimate three income inequality measures: the Gini (the default measure returned), the Theil, and the Atkinson's index. Jargowsky and Wheeler (2018) <doi:10.1177/0081175018782579>.
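To make the link between a Lorenz curve and the Gini coefficient concrete, the Python sketch below builds a piecewise-linear Lorenz curve from bin counts and bin mean incomes and integrates it with the trapezoid rule. This is a simplification and not the package's Lorenz interpolation or MCIB method: straight-line segments ignore inequality within bins, so the result is a lower bound. The function name `gini_from_bins` and the requirement that bin means are known are assumptions.

```python
def gini_from_bins(counts, means):
    """Gini coefficient from binned data via a piecewise-linear Lorenz curve.

    counts[i]: number of households in bin i
    means[i]:  mean income in bin i (assumed known for this sketch)
    """
    total_n = sum(counts)
    total_income = sum(c * m for c, m in zip(counts, means))
    gini = 1.0
    cum_pop = 0.0
    cum_inc = 0.0
    # Walk the Lorenz curve from the poorest bin to the richest.
    for mean, count in sorted(zip(means, counts)):
        next_pop = cum_pop + count / total_n
        next_inc = cum_inc + count * mean / total_income
        # Trapezoid rule: G = 1 - sum (p_i - p_{i-1}) * (L_i + L_{i-1}).
        gini -= (next_pop - cum_pop) * (next_inc + cum_inc)
        cum_pop, cum_inc = next_pop, next_inc
    return gini


# Two equally sized bins with mean incomes 1 and 3.
print(gini_from_bins([1, 1], [1, 3]))  # 0.25
```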
Vector AutoRegressive (VAR) type models with tailored regularisation structures are provided to uncover network type structures in the data, such as influential time series (influencers). Currently the package implements the LISAR model from Zhang and Trimborn (2023) <doi:10.2139/ssrn.4619531>. The package automatically derives the required regularisation sequences and refines them during the estimation to provide the optimal model. The package allows for model optimisation under various loss functions such as Mean Squared Forecasting Error (MSFE), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). It provides a dedicated class, allowing for summary prints of the optimal model and a plotting function to conveniently analyse the optimal model via heatmaps.
Extremely efficient procedures for fitting the entire group lasso and group elastic net regularization path for GLMs, multinomial, the Cox model and multi-task Gaussian models. Similar to the R package glmnet in scope of models, and in computational speed. This package provides R bindings to the C++ code underlying the corresponding Python package 'adelie'. These bindings offer a general purpose group elastic net solver, a wide range of matrix classes that can exploit special structure to allow large-scale inputs, and an assortment of generalized linear model classes for fitting various types of data. The package is an implementation of Yang, J. and Hastie, T. (2024) <doi:10.48550/arXiv.2405.08631>.
This package provides a set of functions for conducting cognitive diagnostic computerized adaptive testing applications (Chen (2009) <DOI:10.1007/s11336-009-9123-2>). It includes different item selection rules such as the global discrimination index (Kaplan, de la Torre, and Barrada (2015) <DOI:10.1177/0146621614554650>) and the nonparametric selection method (Chang, Chiu, and Tsai (2019) <DOI:10.1177/0146621618813113>), as well as several stopping rules. Functions for generating item banks and responses are also provided. To guide item bank calibration, model comparison at the item level can be conducted using the two-step likelihood ratio test statistic by Sorrel, de la Torre, Abad and Olea (2017) <DOI:10.1027/1614-2241/a000131>.
Efficient algorithms for fitting generalized linear and additive models with group elastic net penalties as described in Helwig (2025) <doi:10.1080/10618600.2024.2362232>. Implements group LASSO, group MCP, and group SCAD with an optional group ridge penalty. Computes the regularization path for linear regression (gaussian), multivariate regression (multigaussian), smoothed support vector machines (svm1), squared support vector machines (svm2), logistic regression (binomial), multinomial logistic regression (multinomial), log-linear count regression (poisson and negative.binomial), and log-linear continuous regression (gamma and inverse gaussian). Supports default and formula methods for model specification, k-fold cross-validation for tuning the regularization parameters, and nonparametric regression via tensor product reproducing kernel (smoothing spline) basis function expansion.
Enables the user to calculate Value at Risk (VaR) and Expected Shortfall (ES) by means of various types of historical simulation. Currently plain-, age-, volatility-weighted- and filtered historical simulation are implemented in this package. Volatility weighting can be carried out via an exponentially weighted moving average model (EWMA) or other GARCH-type models. The performance can be assessed via Traffic Light Test, Coverage Tests and Loss Functions. The methods of the package are described in Gurrola-Perez, P. and Murphy, D. (2015) <https://EconPapers.repec.org/RePEc:boe:boeewp:0525> as well as McNeil, J., Frey, R., and Embrechts, P. (2015) <https://ideas.repec.org/b/pup/pbooks/10496.html>.
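As a minimal illustration of plain (unweighted) historical simulation, the Python sketch below takes the empirical quantile of past losses as the VaR and the mean of the losses at or beyond it as the ES. The quantile convention (an order statistic without interpolation) is an assumption and may differ from the package's; the function name `historical_var_es` is hypothetical, and none of the weighted or filtered variants are shown.

```python
import math


def historical_var_es(returns, level=0.99):
    """Plain historical-simulation VaR and ES, reported as positive losses.

    returns: past portfolio returns
    level:   confidence level, e.g. 0.99
    """
    losses = sorted(-r for r in returns)  # ascending, largest loss last
    # Index of the order statistic used as the empirical quantile.
    k = max(math.ceil(level * len(losses)), 1)
    var = losses[k - 1]
    tail = losses[k - 1:]  # losses at or beyond the VaR
    es = sum(tail) / len(tail)
    return var, es


var, es = historical_var_es([-0.04, 0.01, -0.02, 0.03], level=0.75)
print(var, es)
```

With four observations and a 75% level, the VaR is the third-smallest loss (0.02) and the ES averages it with the single larger loss (0.04), giving roughly 0.03.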