This package provides a client library for Vipul's Razor. Vipul's Razor is a distributed, collaborative, spam detection and filtering network. Through user contribution, Razor establishes a distributed and constantly updating catalogue of spam in propagation that is consulted by email clients to filter out known spam. Detection is done with statistical and randomized signatures that efficiently spot mutating spam content. User input is validated through reputation assignments based on consensus on report and revoke assertions, which in turn are used to compute confidence values associated with individual signatures.
Supports propensity score-based methods, including matching, stratification, and weighting, for estimating causal treatment effects. It also implements calibration using negative control outcomes to enhance robustness. debiasedTrialEmulation facilitates effect estimation for both binary and time-to-event outcomes, supporting risk ratio (RR), odds ratio (OR), and hazard ratio (HR) as effect measures. It integrates statistical modeling and visualization tools to assess covariate balance, equipoise, and bias calibration. Additional methods, including approaches to address immortal time bias, information bias, selection bias, and informative censoring, are under development. Users interested in these extended features are encouraged to contact the package authors.
This package provides functions for evaluating and visualizing predictive model performance (specifically: binary classifiers) in the field of customer scoring. These metrics include lift, lift index, gain percentage, top-decile lift, F1-score, expected misclassification cost and absolute misclassification cost. See Berry & Linoff (2004, ISBN:0-471-47064-3), Witten and Frank (2005, ISBN:0-12-088407-0) and Blattberg, Kim & Neslin (2008, ISBN:978-0-387-72578-9) for details. Visualization functions are included for lift charts and gain percentage charts. All metrics that require class predictions offer the possibility to dynamically determine cutoff values for transforming real-valued probability predictions into class predictions.
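As a rough illustration of the top-decile lift metric named above, the following Python sketch ranks cases by predicted probability and compares the positive rate in the top 10% with the overall positive rate. The function name `top_decile_lift` and its arguments are hypothetical and do not reflect the package's own API.

```python
def top_decile_lift(y_true, scores):
    """Positive rate among the 10% highest-scored cases divided by the overall positive rate.

    y_true: sequence of 0/1 class labels (1 = positive, e.g. a responding customer).
    scores: sequence of real-valued predictions, higher meaning more likely positive.
    Both names are illustrative assumptions, not part of any package API.
    """
    if len(y_true) != len(scores) or len(y_true) == 0:
        raise ValueError("y_true and scores must be non-empty and of equal length")

    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    # Size of the top decile (at least one case).
    n_top = max(1, len(ranked) // 10)

    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    overall_rate = sum(y_true) / len(y_true)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")

    return top_rate / overall_rate
```

A lift of 1 means the top decile contains positives at the same rate as the whole population; larger values indicate that the model concentrates positives among its highest-scored cases.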
This package implements the calibrated sensitivity analysis approach for matched observational studies. Our sensitivity analysis framework views matched sets as drawn from a super-population. The unmeasured confounder is modeled as a random variable. We combine matching and model-based covariate-adjustment methods to estimate the treatment effect. The hypothesized unmeasured confounder enters the picture as a missing covariate. We adopt a state-of-the-art Expectation Maximization (EM) algorithm to handle this missing covariate problem in generalized linear models (GLMs). As our method also estimates the effect of each observed covariate on the outcome and treatment assignment, we are able to calibrate the unmeasured confounder to observed covariates. Zhang, B., Small, D. S. (2018). <arXiv:1812.00215>.
Time series forecasting faces challenges due to the non-stationarity, nonlinearity, and chaotic nature of the data. Traditional deep learning models like Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) process data sequentially but are inefficient for long sequences. To overcome these limitations, a transformer-based deep learning architecture is proposed that uses an attention mechanism for parallel processing, improving prediction accuracy and efficiency. This package provides user-friendly code implementing that architecture. References: Nayak et al. (2024) <doi:10.1007/s40808-023-01944-7> and Nayak et al. (2024) <doi:10.1016/j.simpa.2024.100716>.
DifferentialRegulation is a method for detecting differentially regulated genes between two groups of samples (e.g., healthy vs. disease, or treated vs. untreated samples), by targeting differences in the balance of spliced and unspliced mRNA abundances, obtained from single-cell RNA-sequencing (scRNA-seq) data. From a mathematical point of view, DifferentialRegulation accounts for the sample-to-sample variability, and embeds multiple samples in a Bayesian hierarchical model. Furthermore, our method also deals with two major sources of mapping uncertainty: i) ambiguous reads, compatible with both spliced and unspliced versions of a gene, and ii) reads mapping to multiple genes. In particular, ambiguous reads are treated separately from spliced and unspliced reads, while reads that are compatible with multiple genes are allocated to the gene of origin. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques (Metropolis-within-Gibbs).
Sensitivity analysis for case-control studies in which some cases meet a narrower case definition while other cases meet only a broad definition. The sensitivity analyses are described in Small, Cheng, Halloran and Rosenbaum (2013, "Case Definition and Sensitivity Analysis", Journal of the American Statistical Association, 1457-1468). The functions sens.analysis.mh and sens.analysis.aberrant.rank provide sensitivity analyses based on the Mantel-Haenszel test statistic and aberrant rank test statistic as described in Rosenbaum (1991, "Sensitivity Analysis for Matched Case Control Studies", Biometrics); see also Section 1 of Small et al. The function adaptive.case.test provides adaptive inferences as described in Section 5 of Small et al. The function adaptive.noether.brown provides a sensitivity analysis for a matched cohort study based on an adaptive test. The other functions in the package are internal functions.
This package provides functions to delineate temporal dataset shifts in Electronic Health Records through the projection and visualization of dissimilarities among temporal batches of data. This is done by estimating the statistical distributions of the data over time and projecting them onto non-parametric statistical manifolds, uncovering the patterns of latent temporal variability in the data. EHRtemporalVariability is particularly suitable for multi-modal data and categorical variables with a high number of values, common features of biomedical data where traditional statistical process control or time-series methods may not be appropriate. EHRtemporalVariability allows you to explore and identify dataset shifts through visual analytics formats such as Data Temporal heatmaps and Information Geometric Temporal (IGT) plots. An additional EHRtemporalVariability Shiny app can be used to load and explore the package results, and makes these functions available to users without R coding experience. (Sáez et al. 2020) <doi:10.1093/gigascience/giaa079>.
This package implements a regularized Bayesian estimator that optimizes the estimation of between-group coefficients for multilevel latent variable models by minimizing mean squared error (MSE) and balancing variance and bias. The package provides more reliable estimates in scenarios with limited data, offering a robust solution for accurate parameter estimation in two-level latent variable models. It is designed for researchers in psychology, education, and related fields who face challenges in estimating between-group effects under small sample sizes and low intraclass correlation coefficients. The package includes comprehensive S3 methods for result objects: print(), summary(), coef(), se(), vcov(), confint(), as.data.frame(), dim(), length(), names(), and update() for enhanced usability and integration with standard R workflows. Dashuk et al. (2025a) <doi:10.1017/psy.2025.10045> derived the optimal regularized Bayesian estimator; Dashuk et al. (2025b) <doi:10.1007/s41237-025-00264-7> extended it to the multivariate case; and Luedtke et al. (2008) <doi:10.1037/a0012869> formalized the two-level latent variable framework.
This package contains functions to implement automated covariate selection using methods described in the high-dimensional propensity score (HDPS) algorithm by Schneeweiss et al. Covariate adjustment in real-world observational data (RWD) is important for estimating adjusted outcomes, and it can be done using methods such as, but not limited to, propensity score matching, propensity score weighting and regression analysis. While these methods strive to statistically adjust for confounding, the major challenge is selecting the potential covariates that can bias the outcome comparison estimates in observational RWD. This is where automated covariate selection is useful. The functions in this package help to implement the three major steps of automated covariate selection as described by Schneeweiss et al. These three functions, in the order of the steps required to execute automated covariate selection, are get_candidate_covariates(), get_recurrence_covariates() and get_prioritised_covariates(). In addition to these functions, a sample real-world dataset from publicly available de-identified medical claims data is included for running examples and for further exploration. The algorithm is described in the original article by Schneeweiss et al. (2009) <doi:10.1097/EDE.0b013e3181a663cc>.
Determination of rainfall-runoff erosivity factor.
Documentation at https://melpa.org/#/run-command-recipes
Documentation at https://melpa.org/#/replace-from-region
Binary package needed by the iai-callgrind library
This package provides an ESS-like binding to send lines or regions to a REPL from Racket buffers.
Escape RegExp special characters
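To illustrate why escaping matters, here is a minimal Python sketch using the standard library's `re.escape`, which is a stand-in for the same idea rather than this package's own interface. The helper name `literal_search` is hypothetical.

```python
import re


def literal_search(needle, haystack):
    """Return True if `needle` occurs in `haystack` as literal text.

    re.escape() backslash-escapes regex metacharacters such as '.', '(' and '+',
    so the compiled pattern matches the characters themselves.
    """
    pattern = re.compile(re.escape(needle))
    return pattern.search(haystack) is not None
```

Without escaping, a pattern such as `a.b` would also match `axb`, because `.` matches any character; with escaping, only the literal text `a.b` matches.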
This gem is used to handle HTML sanitization in Rails applications. If you need similar functionality in non-Rails apps, consider using Loofah directly.
Escape RegExp special characters
Escape RegExp special characters
An implementation of r6rs bytevectors
Documentation at https://melpa.org/#/rainbow-identifiers