In the context of paid research studies and clinical trials, budget considerations and patient sampling from available populations are subject to inherent constraints. We introduce the CDsampling package, which integrates optimal design theories within the framework of constrained sampling. This package makes it possible to find both D-optimal approximate and exact allocations for sampling with or without constraints. Additionally, it provides functions to find constrained uniform sampling as a robust sampling strategy with limited model information. Our package offers functions for the computation of the Fisher information matrix under generalized linear models (including the regular linear regression model) and multinomial logistic models. To demonstrate the applications, we also provide a simulated dataset and a real dataset embedded in the package. Yifei Huang, Liping Tong, and Jie Yang (2025) <doi:10.5705/ss.202022.0414>.
A decorator is a function that receives a function, extends its behaviour, and returns the altered function. Any caller that uses the decorated function uses the same interface as if it were the original, undecorated function. Decorators serve two primary uses: (1) Enhancing the response of a function as it sends data to a second component; (2) Supporting multiple optional behaviours. An example of the first use is a timer decorator that runs a function, outputs its execution time on the console, and returns the original function's result. An example of the second use is an input type validation decorator that tests at run time whether the caller has passed input arguments of a particular class. Decorators can reduce execution time, say by memoization, or reduce bugs by adding defensive programming routines.
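The two uses described above can be sketched in a few lines. The example below is written in Python purely for illustration and does not show this package's own interface; the names `timer`, `validate_types`, and `add` are hypothetical.

```python
import functools
import time


def timer(func):
    """Print how long each call takes, then return the original result."""
    @functools.wraps(func)  # keep the decorated function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} s")
        return result
    return wrapper


def validate_types(*expected):
    """Check positional arguments against the expected classes at run time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for value, cls in zip(args, expected):
                if not isinstance(value, cls):
                    raise TypeError(
                        f"{func.__name__}: expected {cls.__name__}, "
                        f"got {type(value).__name__}"
                    )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@timer
@validate_types(int, int)
def add(a, b):
    """Add two integers."""
    return a + b
```

Calling `add(2, 3)` behaves like the undecorated function from the caller's point of view, while `add("2", 3)` raises a `TypeError` before the function body runs.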
An implementation of logistic normal multinomial (LNM) clustering. It is an extension of the LNM mixture model proposed by Fang and Subedi (2020) <arXiv:2011.06682>, and is designed for clustering compositional data. The package includes 3 extended models: LNM Factor Analyzer (LNM-FA), LNM Bicluster Mixture Model (LNM-BMM) and Penalized LNM Factor Analyzer (LNM-FA). There are several advantages of LNM models: 1. LNM provides a more flexible covariance structure; 2. The factor analyzer can reduce the number of parameters to estimate; 3. The bicluster model can simultaneously cluster subjects and taxa, and provides significant biological insights; 4. The penalty term allows sparse estimation in the covariance matrix. Details of the model assumptions and interpretation can be found in the papers Tu and Subedi (2021) <arXiv:2101.01871> and Tu and Subedi (2022) <doi:10.1002/sam.11555>.
Systematic reviews should be described in a high degree of methodological detail. The PRISMA Statement calls for a high level of reporting detail in systematic reviews and meta-analyses. An integral part of the methodological description of a review is a flow diagram. This package produces an interactive flow diagram that conforms to the PRISMA2020 preprint. When made interactive, the reader/user can click on each box and be directed to another website or file online (e.g. a detailed description of the screening methods, or a list of excluded full texts), with a mouse-over tool tip that describes the information linked to in more detail. Interactive versions can be saved as HTML files, whilst static versions for inclusion in manuscripts can be saved as HTML, PDF, PNG, SVG, PS or WEBP files.
Rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. Rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership, modification times, extended attributes, acls, and resource forks. Also, rdiff-backup can operate in a bandwidth efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back a hard drive up to a remote location, and only the differences will be transmitted. Finally, rdiff-backup is easy to use and settings have sensible defaults.
This package provides several functions to explore miRNA sponge (also called ceRNA or miRNA decoy) regulation from putative miRNA-target interactions and/or transcriptomics data (including bulk, single-cell and spatial gene expression data). It provides eight popular methods for identifying miRNA sponge interactions and an integrative method to combine miRNA sponge interactions from different methods, as well as functions to validate miRNA sponge interactions, infer miRNA sponge modules, and conduct enrichment analysis and survival analysis of miRNA sponge modules. By using a sample control variable strategy, it provides a function to infer sample-specific miRNA sponge interactions. Based on the sample-specific miRNA sponge interactions, it implements three similarity methods to construct a sample-sample correlation network.
Implementation of a probabilistic method to calculate nicheROVER (_niche_ _r_egion and niche _over_lap) metrics using multidimensional niche indicator data (e.g., stable isotopes, environmental variables, etc.). The niche region is defined as the joint probability density function of the multidimensional niche indicators at a user-defined probability alpha (e.g., 95%). Uncertainty is accounted for in a Bayesian framework, and the method can be extended to three or more indicator dimensions. It provides directional estimates of niche overlap, accounts for species-specific distributions in multivariate niche space, and produces unique and consistent bivariate projections of the multivariate niche region. The article by Swanson et al. (2015) <doi:10.1890/14-0235.1> provides a detailed description of the methodology. See the package vignette for a worked example using fish stable isotope data.
This package provides a function to calculate multiple performance metrics for actual and predicted values. In total, eight metrics are calculated for a given pair of actual and predicted series. They help to describe a statistical model's performance in predicting data and to compare the performance of different models. The metrics are Root Mean Squared Error (RMSE), Relative Root Mean Squared Error (RRMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE), Nash-Sutcliffe Efficiency (NSE), Willmott's Index (WI), and Legates and McCabe Index (LME). For the first five, lower values indicate better performance, whereas for the last three, higher values are better. More details can be found in Garai and Paul (2023) <doi:10.1016/j.iswa.2023.200202> and Garai et al. (2024) <doi:10.1007/s11063-024-11552-w>.
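As an illustration of how such metrics are computed, the following minimal Python sketch implements four of the eight (RMSE, MAE, MAPE, and NSE) using their standard textbook definitions. It is not the package's code; the function names are hypothetical, and the remaining metrics are omitted because their definitions vary between sources.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: lower is better."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: lower is better."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def nse(actual, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, higher is better."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Small illustrative series (made-up numbers, not from the package).
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 5.0, 5.0, 9.0]
print(rmse(actual, predicted), mae(actual, predicted), nse(actual, predicted))
```

The error-type metrics (RMSE, MAE, MAPE) shrink towards zero as predictions improve, while NSE approaches 1.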
Procedures for simulating biomes by equilibrium vegetation models, with a special focus on paleoenvironmental applications. Three widely used equilibrium biome models are currently implemented in the package: the Holdridge Life Zone (HLZ) system (Holdridge 1947, <doi:10.1126/science.105.2727.367>), the Köppen-Geiger classification (KGC) system (Köppen 1936, <https://koeppen-geiger.vu-wien.ac.at/pdf/Koppen_1936.pdf>) and the BIOME model (Prentice et al. 1992, <doi:10.2307/2845499>). Three climatic forest-steppe models are also implemented. An approach for estimating monthly time series of relative sunshine duration from temperature and precipitation data (Yin 1999, <doi:10.1007/s007040050111>) is also adapted, allowing process-based biome models to be combined with high-resolution paleoclimate simulation datasets (e.g., CHELSA-TraCE21k v1.0 dataset: <https://chelsa-climate.org/chelsa-trace21k/>).
It computes two frequently applied actuarial measures, the expected shortfall and the value at risk. Seven well-known classical distributions in connection to the Bell generalized family are used as follows: Bell-exponential distribution, Bell-extended exponential distribution, Bell-Weibull distribution, Bell-extended Weibull distribution, Bell-Lomax distribution, Bell-Burr-12 distribution, and Bell-Burr-X distribution. Related works include: a) Fayomi, A., Tahir, M. H., Algarni, A., Imran, M., & Jamal, F. (2022). "A new useful exponential model with applications to quality control and actuarial data". Computational Intelligence and Neuroscience, 2022. <doi:10.1155/2022/2489998>. b) Alsadat, N., Imran, M., Tahir, M. H., Jamal, F., Ahmad, H., & Elgarhy, M. (2023). "Compounded Bell-G class of statistical models with applications to COVID-19 and actuarial data". Open Physics, 21(1), 20220242. <doi:10.1515/phys-2022-0242>.
The Genie algorithm (Gagolewski, 2021 <DOI:10.1016/j.softx.2021.100722>) is a robust and outlier-resistant hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016 <DOI:10.1016/j.ins.2016.05.003>). This package features its faster and more powerful version. It allows clustering with respect to mutual reachability distances, enabling it to act as a noise point detector or a version of HDBSCAN* that can identify a predefined number of clusters. The package also features an implementation of the Gini and Bonferroni inequality indices, external cluster validity measures (e.g., the normalised clustering accuracy, the adjusted Rand index, the Fowlkes-Mallows index, and normalised mutual information), and internal cluster validity indices (e.g., the Calinski-Harabasz, Davies-Bouldin, Ball-Hall, Silhouette, and generalised Dunn indices). The Python version of genieclust is available via PyPI.
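To make the Gini inequality index mentioned above concrete, here is a minimal Python sketch of one common normalised form, in which the sum of pairwise absolute differences is divided by (n - 1) times the total. Whether this matches the package's exact definition is an assumption, and the function name `gini_index_sketch` is hypothetical rather than part of genieclust.

```python
def gini_index_sketch(values):
    """Normalised Gini index of non-negative values.

    Returns 0 when all values are equal and 1 when a single element
    holds the whole total. Assumes at least two values and a positive sum.
    """
    n = len(values)
    total = sum(values)
    pairwise = sum(
        abs(values[i] - values[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return pairwise / ((n - 1) * total)


print(gini_index_sketch([1, 1, 1, 1]))  # perfectly balanced sizes
print(gini_index_sketch([0, 0, 0, 1]))  # maximally unbalanced sizes
```

Applied to cluster sizes, a value near 0 indicates balanced clusters and a value near 1 indicates that one cluster dominates.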
An updated implementation of the R package ranger by Wright et al. (2017) <doi:10.18637/jss.v077.i01> for training and predicting from random forests, particularly suited to high-dimensional data, and for embedding in Multiple Imputation by Chained Equations (MICE) by van Buuren (2007) <doi:10.1177/0962280206074463>. Ensembles of classification and regression trees are currently supported. Sparse data of class dgCMatrix (R package Matrix) can be directly analyzed. Conventional bagged predictions are available alongside an efficient prediction for MICE via the algorithm proposed by Doove et al. (2014) <doi:10.1016/j.csda.2013.10.025>. Trained forests can be written to and read from storage. Survival and probability forests are not supported in the update, nor is data of class gwaa.data (R package GenABEL); use the original ranger package for these analyses.
This package performs drug demand forecasting by modeling drug dispensing data while taking into account predicted enrollment and treatment discontinuation dates. The gap time between randomization and the first drug dispensing visit is modeled using interval-censored exponential, Weibull, log-logistic, or log-normal distributions (Anderson-Bergman (2017) <doi:10.18637/jss.v081.i12>). The number of skipped visits is modeled using Poisson, zero-inflated Poisson, or negative binomial distributions (Zeileis, Kleiber & Jackman (2008) <doi:10.18637/jss.v027.i08>). The gap time between two consecutive drug dispensing visits given the number of skipped visits is modeled using linear regression based on least squares or least absolute deviations (Birkes & Dodge (1993, ISBN:0-471-56881-3)). The number of dispensed doses is modeled using linear or linear mixed-effects models (McCulloch & Searle (2001, ISBN:0-471-19364-X)).
Compute correlation and other association matrices from small to high-dimensional datasets with relatively simple functions and sensible defaults. Includes options for shrinkage and robustness to improve results in noisy or high-dimensional settings (p >= n), plus convenient print/plot methods for inspection. Implemented with optimised C++ backends using BLAS/OpenMP and memory-aware symmetric updates. Works with base matrices and data frames, returning standard R objects via a consistent S3 interface. Useful across genomics, agriculture, and machine-learning workflows. Supports Pearson, Spearman, Kendall, distance correlation, partial correlation, and robust biweight mid-correlation; Bland-Altman analyses and Lin's concordance correlation coefficient (including repeated-measures extensions). Methods based on Ledoit and Wolf (2004) <doi:10.1016/S0047-259X(03)00096-4>; Schäfer and Strimmer (2005) <doi:10.2202/1544-6115.1175>; Lin (1989) <doi:10.2307/2532051>.
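To show how Lin's concordance correlation coefficient differs from Pearson correlation, the following Python sketch computes both for a small pair of series, using the standard formula with 1/n moments. It is an illustration only, not the package's implementation, and the function names are hypothetical.

```python
import math


def _moments(x, y):
    """Means, population variances (1/n), and covariance of two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cxy


def pearson(x, y):
    """Pearson correlation: measures linear association only."""
    _, _, vx, vy, cxy = _moments(x, y)
    return cxy / math.sqrt(vx * vy)


def lin_ccc(x, y):
    """Lin's concordance correlation: also penalises shifts in location and scale."""
    mx, my, vx, vy, cxy = _moments(x, y)
    return 2.0 * cxy / (vx + vy + (mx - my) ** 2)


x = [1.0, 2.0, 3.0]
y = [2.0, 3.0, 4.0]  # same shape as x, shifted up by one unit
print(pearson(x, y), lin_ccc(x, y))
```

Here the two series are perfectly linearly related, so the Pearson correlation is 1, but the constant offset lowers the concordance coefficient to 4/7, which is why the coefficient is used for agreement rather than association.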
Via Foundry API provides streamlined tools for interacting with and extracting data from structured responses, particularly for use cases involving hierarchical data from Foundry's API. It includes functions to fetch and parse process-level and file-level metadata, allowing users to efficiently query and manipulate nested data structures. Key features include the ability to list all unique process names, retrieve file metadata for specific or all processes, and dynamically load or download files based on their type. With built-in support for handling various file formats (e.g., tabular and non-tabular files) and authenticated access to the API, this package is designed to enhance workflows involving large-scale data management and analysis. Robust error handling and flexible configuration ensure reliable performance across diverse data environments. Please consult the documentation for the API endpoint for your installation.
This package contains basic structures and operations used frequently in quantum computing. Intended to be a convenient tool to help learn quantum mechanics and algorithms. Can create arbitrarily sized kets and bras and implements quantum gates, inner products, and tensor products. Creates arbitrarily controlled versions of all gates and can simulate complete or partial measurements of kets. Has functionality to convert functions into equivalent quantum gates and model quantum noise. Includes larger applications, such as Steane error correction <DOI:10.1103/physrevlett.77.793>, Quantum Fourier Transform and Shor's algorithm (Shor 1999), Grover's algorithm (1996), Quantum Approximate Optimization Algorithm (QAOA) (Farhi, Goldstone, and Gutmann 2014) <arXiv:1411.4028>, and a variational quantum classifier (Schuld 2018) <arXiv:1804.00633>. Can be used with the gridsynth algorithm <arXiv:1212.6253> to perform decomposition into the Clifford+T set.
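The basic objects named above (kets, gates, inner products, and tensor products) can be sketched with plain Python lists of complex amplitudes. This is an illustrative sketch only and does not reproduce the package's functions; all names below are hypothetical.

```python
import math

# Single-qubit basis kets as lists of complex amplitudes.
KET0 = [1 + 0j, 0j]
KET1 = [0j, 1 + 0j]

# Hadamard gate as a 2x2 matrix (list of rows).
_S = 1 / math.sqrt(2)
HADAMARD = [[_S, _S], [_S, -_S]]


def tensor(a, b):
    """Tensor (Kronecker) product of two kets."""
    return [x * y for x in a for y in b]


def inner(a, b):
    """Inner product <a|b>: conjugate the amplitudes of the bra side."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def apply_gate(gate, ket):
    """Apply a gate (square matrix) to a ket by matrix-vector multiplication."""
    return [sum(g * k for g, k in zip(row, ket)) for row in gate]


def probabilities(ket):
    """Measurement probabilities in the computational basis."""
    return [abs(amplitude) ** 2 for amplitude in ket]


plus = apply_gate(HADAMARD, KET0)  # equal superposition of |0> and |1>
ket01 = tensor(KET0, KET1)         # two-qubit basis state |01>
print(probabilities(plus), ket01)
```

Applying the Hadamard gate to |0> gives equal measurement probabilities for both outcomes, and the tensor product of two single-qubit kets is a four-amplitude two-qubit ket.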
This package implements a set of methodological tools that enable researchers to apply matching methods to time-series cross-sectional data. Imai, Kim, and Wang (2023) <http://web.mit.edu/insong/www/pdf/tscs.pdf> propose a nonparametric generalization of the difference-in-differences estimator, which does not rely on the linearity assumption often made in practice. Researchers first select a method of matching each treated observation for a given unit in a particular time period with control observations from other units in the same time period that have a similar treatment and covariate history. These methods include standard matching methods based on propensity score and Mahalanobis distance, as well as weighting methods. Once matching and refinement are done, treatment effects can be estimated with standard errors. The package also offers diagnostics for researchers to assess the quality of their results.
This package provides a fast, interactive, cross-platform, and easy-to-share WebGL-based 3D brain viewer that visualizes FreeSurfer and/or AFNI/SUMA surfaces. The viewer widget can be either standalone or embedded into R-shiny applications. The standalone version only requires a web browser with WebGL2 support (for example, Chrome, Firefox, Safari), and can be inserted into any website. The R-shiny support allows the 3D viewer to be dynamically generated from reactive user inputs. Please check the publication by Wang, Magnotti, Zhang, and Beauchamp (2023, <doi:10.1523/ENEURO.0328-23.2023>) for electrode localization. This viewer has been fully adopted by RAVE <https://openwetware.org/wiki/RAVE>, an interactive toolbox to analyze iEEG data by Magnotti, Wang, and Beauchamp (2020, <doi:10.1016/j.neuroimage.2020.117341>). Please check citation("threeBrain") for details.
This package provides functions to implement model selection and multimodel inference based on Akaike's information criterion (AIC) and the second-order AIC (AICc), as well as their quasi-likelihood counterparts (QAIC, QAICc) from various model object classes. The package implements classic model averaging for a given parameter of interest or predicted values, as well as a shrinkage version of model averaging parameter estimates or effect sizes. The package includes diagnostics and goodness-of-fit statistics for certain model types including those of unmarkedFit classes estimating demographic parameters after accounting for imperfect detection probabilities. Some functions also allow the creation of model selection tables for Bayesian models of the bugs, rjags, and jagsUI classes. Functions also implement model selection using BIC. Objects following model selection and multimodel inference can be formatted to LaTeX using xtable methods included in the package.
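The quantities behind this kind of model selection are short formulas. The Python sketch below computes AIC, AICc, and Akaike weights from a log-likelihood, the number of estimated parameters k, and the sample size n, using the standard definitions. It is an illustration rather than the package's code, and the function names are hypothetical.

```python
import math


def aic(loglik, k):
    """Akaike's information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik


def aicc(loglik, k, n):
    """Second-order AIC, adding a small-sample correction (requires n > k + 1)."""
    return aic(loglik, k) + (2 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Convert information-criterion scores into weights that sum to 1."""
    best = min(scores)
    relative = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(relative)
    return [r / total for r in relative]


# Three hypothetical candidate models fitted to n = 30 observations.
candidates = [(-52.1, 2), (-50.3, 3), (-50.1, 5)]  # (log-likelihood, k)
scores = [aicc(ll, k, 30) for ll, k in candidates]
print(scores, akaike_weights(scores))
```

The model with the lowest score receives the largest weight, and these weights are what classic model averaging uses to combine estimates across candidate models.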
This package implements a class of univariate and multivariate spatio-temporal generalised linear mixed models for areal unit data, with inference in a Bayesian setting using Markov chain Monte Carlo (MCMC) simulation. The response variable can be binomial, Gaussian, or Poisson, but for some models only the binomial and Poisson data likelihoods are available. The spatio-temporal autocorrelation is modelled by random effects, which are assigned conditional autoregressive (CAR) style prior distributions. A number of different random effects structures are available, including models similar to Rushworth et al. (2014) <doi:10.1016/j.sste.2014.05.001>. Full details are given in the vignette accompanying this package. The creation and development of this package was supported by the Engineering and Physical Sciences Research Council (EPSRC) grants EP/J017442/1 and EP/T004878/1 and the Medical Research Council (MRC) grant MR/L022184/1.
Analyze data from a crossover design using generalized estimating equations (GEE), including carryover effects and various correlation structures based on the Kronecker product. It contains functions for semiparametric estimates of carryover effects in repeated measures and allows estimation of complex carryover effects. Related work includes: a) Cruz N.A., Melo O.O., Martinez C.A. (2023). "CrossCarry: An R package for the analysis of data from a crossover design with GEE". <doi:10.48550/arXiv.2304.02440>. b) Cruz N.A., Melo O.O., Martinez C.A. (2023). "A correlation structure for the analysis of Gaussian and non-Gaussian responses in crossover experimental designs with repeated measures". <doi:10.1007/s00362-022-01391-z> and c) Cruz N.A., Melo O.O., Martinez C.A. (2023). "Semiparametric generalized estimating equations for repeated measurements in cross-over designs". <doi:10.1177/09622802231158736>.
Linkage disequilibrium visualizations of up to several hundred single nucleotide polymorphisms (SNPs), annotated with chromosomal positions and gene names. Two types of plots are available, one for small numbers of SNPs (<40) and one for large numbers (tested up to 500). Both can be extended by combining other ggplots, e.g. association study results, and functions make it possible to directly visualize the effect of SNP selection methods, such as minor allele frequency filtering and TagSNP selection, with a second correlation heatmap. The SNP correlations are computed on Genotype Data objects from the GWASTools package using the SNPRelate package, and the plots are customizable ggplot2 and gtable objects and are annotated using the biomaRt package. Usage is detailed in the vignette with example data, and results from up to 500 SNPs of 1,200 scans are in Charlon T. (2019) <doi:10.13097/archive-ouverte/unige:161795>.
This package implements functionality for working with RabbitMQ Streams using asyncio.
It is designed and implemented with the following qualities in mind:
asynchronous Pythonic API with type annotations
use of AMQP 1.0 message format to enable interoperability between RabbitMQ Streams clients
auto reconnection to RabbitMQ broker with lazily created connection objects
It supports many RabbitMQ Streams broker features:
publishing single messages, or in batches, with confirmation
subscribing to a stream at a specific point in time, from a specific offset, or using offset reference
stream message filtering
writing stream offset reference
message deduplication
integration with AMQP 1.0 ecosystem at message format level
This package provides two methods for segmentation and joint segmentation/clustering of bivariate time-series. Originally intended for ecological segmentation (home-range and behavioural modes) but easily applied to other series, the package also provides tools for analysing outputs from the R packages moveHMM and marcher. The segmentation method is a bivariate extension of Lavielle's method available in adehabitatLT (Lavielle, 1999 <doi:10.1016/S0304-4149(99)00023-X> and 2005 <doi:10.1016/j.sigpro.2005.01.012>). This method relies on dynamic programming for efficient segmentation. The segmentation/clustering method alternates steps of dynamic programming with an Expectation-Maximization algorithm. This is an extension of the method of Picard et al. (2007) <doi:10.1111/j.1541-0420.2006.00729.x> (formerly available in the cghseg package) to the bivariate case. The method is fully described in Patin et al. (2018) <doi:10.1101/444794>.