The queueing model of visual search describes accuracy and response-time data from a visual search experiment using queueing models with a finite customer population, where the process stops once service has been completed for a fixed number of customers. It implements the hybrid model conceptualized by Moore and Wolfe (2001), in which visual stimuli enter processing one after the other and are then identified in parallel. This package provides functions that simulate the specified queueing process and calculate the Wasserstein distance between the empirical response times and the model prediction.
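Since the comparison is between one-dimensional response-time distributions, the Wasserstein distance has a simple closed form; assuming the first-order (p = 1) distance is meant, with F the empirical distribution function of the observed response times and G the model-predicted one,

\[ W_1(F, G) = \int_{-\infty}^{\infty} \lvert F(t) - G(t) \rvert \, dt = \int_0^1 \lvert F^{-1}(u) - G^{-1}(u) \rvert \, du . \]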
REDUCE is a portable general-purpose computer algebra system. It is a system for doing scalar, vector and matrix algebra by computer, which also supports arbitrary precision numerical approximation and interfaces to gnuplot to provide graphics. It can be used interactively for simple calculations but also provides a full programming language, with a syntax similar to other modern programming languages. REDUCE supports alternative user interfaces including Run-REDUCE, TeXmacs and GNU Emacs. This package provides the Codemist Standard Lisp (CSL) version of REDUCE. It uses the gnuplot program, if installed, to draw figures.
Four methods for mediation analysis with missing data: listwise deletion, pairwise deletion, multiple imputation (MI), and the two-stage maximum likelihood (TS-ML) algorithm. For MI and TS-ML, auxiliary variables can be included. Bootstrap confidence intervals for mediation effects are obtained. A robust method is also implemented for TS-ML. Since version 1.4, bmem adds the capability to conduct power analysis for mediation models. Details about the methods used can be found in these articles: Zhang and Wang (2013) <doi:10.1007/s11336-012-9301-5> and Zhang (2014) <doi:10.3758/s13428-013-0424-0>.
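For orientation, the single-mediator model underlying such analyses (the package also handles more general mediation models) is

\[ M = i_M + a X + e_M, \qquad Y = i_Y + c' X + b M + e_Y , \]

with the mediation (indirect) effect a b, for which the bootstrap confidence intervals are computed.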
The multivariable fractional polynomial algorithm simultaneously selects variables and functional forms in both generalized linear models and Cox proportional hazards models. Key references for this algorithm are Royston and Altman (1994) <doi:10.2307/2986270> and Sauerbrei and Royston (2008, ISBN:978-0-470-02842-1). In addition, it can model a sigmoid relationship between a variable x and an outcome variable y using the approximate cumulative distribution transformation proposed by Royston (2014) <doi:10.1177/1536867X1401400206>, a feature that standard fractional polynomial functions cannot provide.
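As a reminder of the usual fractional-polynomial convention (not specific to this package), a degree-2 fractional polynomial of a positive variable x chooses powers p_1, p_2 from the set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, with x^0 read as log x:

\[ \mathrm{FP2}(x) = \beta_1 x^{p_1} + \beta_2 x^{p_2}, \qquad \text{and for } p_1 = p_2 = p: \; \beta_1 x^{p} + \beta_2 x^{p} \log x . \]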
Calculation of predictive Moran's eigenvector maps (pMEM), as defined by Guénard and Legendre (in press), "Spatially-explicit predictions using spatial eigenvector maps" <doi:10.5281/zenodo.13356457>, Methods in Ecology and Evolution. This method enables scientists to predict the values of spatially-structured environmental variables. Multiple types of pMEM are defined, each implemented on the basis of a spatial weighting function that takes a range parameter and sometimes also a shape parameter. The code's modular nature enables programmers to implement new pMEM by defining new spatial weighting functions.
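As an illustration of the kind of spatial weighting function meant here (the functions actually shipped with the package may differ), a concave-down function of the inter-site distance d with range parameter d_max and shape parameter alpha is

\[ w(d) = \bigl(1 - (d / d_{\max})^{\alpha}\bigr)\,\mathbf{1}\{d \le d_{\max}\} . \]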
Implements a GAM-based (generalized additive model) spatial surplus production model (spatial SPM), aimed at modeling the northern shrimp population in Atlantic Canada but potentially applicable to any stock in any location. The package is opinionated in its implementation of SPMs, as it internally makes the choice to use penalized spatial GAMs with time lags. However, it also aims to provide options for the user to customize the model. The methods are described in Pedersen et al. (2022, <https://www.dfo-mpo.gc.ca/csas-sccs/Publications/ResDocs-DocRech/2022/2022_062-eng.html>).
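For context, the classical discrete-time (Schaefer) surplus production relation that SPMs build on is shown below; in this package the production term is instead represented by penalized spatial GAMs with time lags, so the formula is only a reference point, not the fitted model:

\[ B_{t+1} = B_t + r B_t \Bigl(1 - \frac{B_t}{K}\Bigr) - C_t , \]

with B_t the biomass, r the intrinsic growth rate, K the carrying capacity, and C_t the catch.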
Monte Carlo and MCMC sampling algorithms for semiparametric Bayesian regression analysis. These models feature a nonparametric (unknown) transformation of the data paired with widely-used regression models including linear regression, spline regression, quantile regression, and Gaussian processes. The transformation enables broader applicability of these key models, including for real-valued, positive, and compactly-supported data with challenging distributional features. The samplers prioritize computational scalability and, for most cases, Monte Carlo (not MCMC) sampling for greater efficiency. Details of the methods and algorithms are provided in Kowal and Wu (2023) <arXiv:2306.05498>.
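Schematically, and in generic form rather than the exact parameterization of the package, the linear-regression case posits an unknown monotone transformation g of the response,

\[ g(y_i) = x_i^{\top} \beta + \epsilon_i, \qquad \epsilon_i \sim \mathrm{N}(0, \sigma^2), \]

with g inferred nonparametrically alongside the regression parameters; the spline, quantile, and Gaussian-process regressions replace the linear predictor accordingly.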
This package provides a collection of procedures for analysing, visualising, and managing single-case data. These include piecewise linear regression models, multilevel models, overlap indices ('PND', 'PEM', 'PAND', 'PET', 'tau-u', 'baseline corrected tau', 'CDC'), and randomization tests. Data preparation functions support outlier detection, handling missing values, scaling, and custom transformations. An export function helps to generate HTML, Word, and LaTeX tables in a publication-friendly style. More details can be found in the online book 'Analyzing single-case data with R and scan', Juergen Wilbert (2025) <https://jazznbass.github.io/scan-Book/>.
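As an example of the overlap indices listed above, the percentage of non-overlapping data (PND) for an intervention expected to raise the target behaviour is, in its standard definition,

\[ \mathrm{PND} = 100 \times \frac{\#\{\text{intervention-phase points above the highest baseline point}\}}{\#\{\text{intervention-phase points}\}} . \]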
We provide functionality to implement penalized PCA with an option to smooth the objective function using Nesterov smoothing. Two functions are available to compute a user-specified number of eigenvectors. The function unsmoothed_penalized_EV() computes a penalized PCA without smoothing and has three parameters (the input matrix, the Lasso penalty, and the number of desired eigenvectors). The function smoothed_penalized_EV() computes a smoothed penalized PCA using the same parameters and additionally requires the specification of a smoothing parameter. Both functions return a matrix having the desired eigenvectors as columns.
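A minimal usage sketch based on the parameter lists above; the function names and parameter counts come from the description, but the positional order of the arguments and the chosen values are assumptions:

# assuming the package providing these functions is attached
X  <- matrix(rnorm(200 * 10), nrow = 200)        # input data matrix (200 observations, 10 variables)
V1 <- unsmoothed_penalized_EV(X, 0.1, 3)         # Lasso penalty 0.1, first 3 eigenvectors
V2 <- smoothed_penalized_EV(X, 0.1, 3, 0.01)     # same, plus a Nesterov smoothing parameter
dim(V1)                                          # 10 x 3: eigenvectors returned as columns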
Using the Free Evocation of Words Technique, this package performs social representation analysis and related analyses. The Free Evocation of Words Technique consists of collecting the words evoked by a subject when exposed to an inducer term. The purpose of this technique is to understand the relationships between the words evoked by the individual and the inducer term. The technique belongs to the theory of social representations and, from the information transmitted by individuals, seeks to build a profile that defines a social group.
Feature selection aims to identify and remove redundant, irrelevant and noisy variables from high-dimensional datasets. Selecting informative features improves the overall performance of subsequent classification and regression analyses. Several methods have been proposed to perform feature selection: most of them rely on univariate statistics, correlation, entropy measurements or backward/forward regressions. Herein, we propose an efficient, robust and fast method that adopts stochastic optimization approaches for high-dimensional data. GARS is an innovative implementation of a genetic algorithm that selects robust features in high-dimensional and challenging datasets.
Protein Group Code Algorithm (PGCA) is a computationally inexpensive algorithm for merging protein summaries from multiple quantitative proteomics experiments. The algorithm connects two or more groups with overlapping accession numbers. In some cases, pairwise groups are mutually exclusive but they may still be connected by another group (or set of groups) with overlapping accession numbers. Thus, groups created by PGCA from multiple experimental runs (i.e., global groups) are called "connected" groups. These identified global protein groups enable the analysis of quantitative data available for protein groups instead of unique protein identifiers.
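The merging step amounts to computing connected components over groups that share accession numbers; the following base-R sketch illustrates the idea on toy data and is not the PGCA interface itself:

# groups 1 and 2 share nothing, but both overlap group 3, so PGCA-style
# merging places all four accessions in one global "connected" group
groups <- list(c("P001", "P002"), c("P003", "P004"), c("P002", "P003"))
merged <- list()
for (g in groups) {
  hits <- which(vapply(merged, function(m) any(g %in% m), logical(1)))
  keep <- if (length(hits)) merged[-hits] else merged
  merged <- c(keep, list(unique(c(unlist(merged[hits]), g))))
}
merged  # a single group containing P001, P002, P003, P004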
This package provides methods for estimating the area under the concentration versus time curve (AUC) and its standard error in the presence of Below the Limit of Quantification (BLOQ) observations. Two approaches are implemented: direct estimation using censored maximum likelihood, and a two-step approach that first imputes BLOQ values using various methods and then computes the AUC using the imputed data. Technical details are described in Barnett et al. (2020), "Methods for Non-Compartmental Pharmacokinetic Analysis With Observations Below the Limit of Quantification," Statistics in Biopharmaceutical Research. <doi:10.1080/19466315.2019.1701546>.
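For reference, the quantity both approaches target is the non-compartmental AUC; with sampling times t_1 < ... < t_n and concentrations C_1, ..., C_n, the usual linear trapezoidal estimate is

\[ \widehat{\mathrm{AUC}}_{0\text{--}t_n} = \sum_{i=1}^{n-1} \frac{(C_i + C_{i+1})(t_{i+1} - t_i)}{2} , \]

where the imputation-based approach applies this rule to the imputed data, while the censored maximum likelihood approach estimates the AUC directly from the likelihood.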
Fits a constrained regression model for an ordinal response with ordinal predictors and possibly other covariates, following Espinosa and Hennig (2019) <DOI:10.1007/s11222-018-9842-2>. The parameter estimates associated with an ordinal predictor are constrained to be monotonic. If a monotonicity direction (isotonic or antitonic) is not specified for an ordinal predictor by the user, then one of the available methods will either establish it or drop the monotonicity assumption. Two monotonicity tests are also available to test the null hypothesis of monotonicity over a set of parameters associated with an ordinal predictor.
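Concretely, for an ordinal predictor with categories 1, ..., q coded against the first category (so beta_1 = 0), the isotonic constraint on the estimates reads

\[ 0 = \beta_1 \le \beta_2 \le \cdots \le \beta_q , \]

with the inequalities reversed in the antitonic case; this coding is stated as a typical convention, and the package's exact parameterization may differ.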
Train a Gaussian stochastic process model of an unknown function, possibly observed with error, via maximum likelihood or maximum a posteriori (MAP) estimation, run model diagnostics, and make predictions, following Sacks, J., Welch, W.J., Mitchell, T.J., and Wynn, H.P. (1989) "Design and Analysis of Computer Experiments", Statistical Science, <doi:10.1214/ss/1177012413>. Perform sensitivity analysis and visualize low-order effects, following Schonlau, M. and Welch, W.J. (2006), "Screening the Input Variables to a Computer Model Via Analysis of Variance and Visualization", <doi:10.1007/0-387-28014-6_14>.
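A correlation family typical of this computer-experiments literature, given here only to illustrate the model class, is the power-exponential correlation

\[ R(\mathbf{x}, \mathbf{x}') = \prod_{j=1}^{d} \exp\bigl(-\theta_j \lvert x_j - x'_j \rvert^{p_j}\bigr), \qquad \theta_j \ge 0, \; 1 \le p_j \le 2 , \]

whose range and smoothness parameters would be among those estimated by maximum likelihood or MAP.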
This package provides a streamlined tool for eplet analysis of donor and recipient HLA (human leukocyte antigen) mismatch. Messy, low-resolution HLA typing data is cleaned and imputed to high resolution using the NMDP (National Marrow Donor Program) haplotype reference database <https://haplostats.org/haplostats>. High-resolution data is analyzed for overall or single-antigen eplet mismatch using a reference table (currently supporting HLAMatchMaker <http://www.epitopes.net> versions 2 and 3). Data can enter or exit the workflow at different points depending on the user's aims and initial data quality.
This package provides basic tools for computing clusters of instances described by multiple time-to-event censored endpoints. From long-format datasets, where one instance is described by one or more records of events, a procedure is used to compute state matrices. Then, from state matrices, a procedure provides optimised computation of the Jaccard distance between instances. The library is currently in development, and more options and tools allowing graphical representation of typologies are expected. For methodological details, see Delord M, Douiri A (2025) <doi:10.1186/s12874-025-02476-7>.
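For two instances whose state matrices A and B have the same dimensions, with entries of 1 marking occupied states, the Jaccard distance between them takes the standard set form

\[ d_J(A, B) = 1 - \frac{\lvert \{k : A_k = 1 \text{ and } B_k = 1\} \rvert}{\lvert \{k : A_k = 1 \text{ or } B_k = 1\} \rvert} , \]

where k ranges over the cells of the matrices; the exact encoding of the state matrices in the package may differ.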
The oblique decision tree (ODT) uses linear combinations of predictors as partitioning variables in a decision tree. Oblique Decision Random Forest (ODRF) is an ensemble of multiple ODTs generated by feature bagging. Oblique Decision Boosting Tree (ODBT) applies feature bagging during the training process of ODT-based boosting trees to ensemble multiple boosting trees. All three methods can be used for classification and regression, and ODT and ODRF serve as supplements to the classical CART of Breiman (1984) <DOI:10.1201/9781315139470> and Random Forest of Breiman (2001) <DOI:10.1023/A:1010933404324> respectively.
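The difference from CART lies in the form of the split rule: an axis-aligned tree partitions on a single predictor, whereas an ODT partitions on a linear combination of predictors,

\[ \text{CART: } x_j \le c \qquad \text{versus} \qquad \text{ODT: } \mathbf{a}^{\top}\mathbf{x} \le c , \]

with the coefficient vector a and cut point c chosen at each node.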
Implementation of prediction and inference procedures for Synthetic Control methods using least squares, lasso, ridge, or simplex-type constraints. Uncertainty is quantified with prediction intervals as developed in Cattaneo, Feng, and Titiunik (2021) <https://nppackages.github.io/references/Cattaneo-Feng-Titiunik_2021_JASA.pdf> for a single treated unit and in Cattaneo, Feng, Palomba, and Titiunik (2023) <doi:10.48550/arXiv.2210.05026> for multiple treated units and staggered adoption. More details about the software implementation can be found in Cattaneo, Feng, Palomba, and Titiunik (2024) <doi:10.48550/arXiv.2202.05984>.
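Schematically, the synthetic control weights solve a constrained least-squares problem over the pre-treatment period: with A the pre-treatment features of the treated unit and B the corresponding donor-unit features,

\[ \widehat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathcal{W}} \lVert A - B\mathbf{w} \rVert_2^2 , \]

where the feasible set W encodes the chosen constraint, e.g. the simplex (w_j >= 0, sum_j w_j = 1) or a norm bound for the lasso- and ridge-type variants; prediction intervals are then built around the post-treatment counterfactual formed from these weights. This is a generic sketch rather than the exact estimator of the cited papers.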
Set of tools to fit a semi-parametric regression model suitable for analysis of data sets in which the response variable is continuous, strictly positive, asymmetric and possibly censored. Under this setup, both the median and the skewness of the response variable distribution are explicitly modeled by using semi-parametric functions, whose non-parametric components may be approximated by natural cubic splines or P-splines. Supported distributions for the model error include log-normal, log-Student-t, log-power-exponential, log-hyperbolic, log-contaminated-normal, log-slash, Birnbaum-Saunders and Birnbaum-Saunders-t distributions.
Elaboration of vehicular emissions inventories, consisting of four stages: pre-processing activity data, preparing emission factors, estimating the emissions, and post-processing the emissions into maps and databases. More details in Ibarra-Espinosa et al. (2018) <doi:10.5194/gmd-11-2209-2018>. Before using VEIN you need to know the vehicular composition of your study area, in other words, the combination of vehicle types, sizes and fuels in the fleet. Then, it is recommended to start by downloading the project template, which creates a structure of directories and scripts.
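The estimation stage rests on the generic emission-inventory identity, applied across the vehicle types, sizes and fuels mentioned above:

\[ E = \sum_{\text{categories}} (\text{number of vehicles}) \times (\text{activity, e.g. km travelled}) \times (\text{emission factor, e.g. g/km}) . \]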
Estimates the standard and weighted Elo (WElo, Angelini et al., 2022 <doi:10.1016/j.ejor.2021.04.011>) rates. The current version provides Elo and WElo rates for tennis, according to different systems of weights (games or sets) and scale factors (constant, proportional to the number of matches, with more weight on Grand Slam matches or matches played on a specific surface). Moreover, the package gives the possibility of estimating the (bootstrap) standard errors for the rates. Finally, the package includes betting functions that automatically select the matches on which to place a bet.
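For reference, the standard Elo update after a match between players A and B, with ratings R_A and R_B, outcome S_A (1 for a win, 0 for a loss) and scale factor K, is

\[ E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}, \qquad R_A' = R_A + K\,(S_A - E_A) ; \]

per Angelini et al. (2022), the WElo variant weights this update using the share of games or sets won rather than the binary outcome alone (the exact weighting formula is not reproduced here).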
This package provides a parallel implementation of Weighted Subspace Random Forest. The Weighted Subspace Random Forest algorithm was proposed in the International Journal of Data Warehousing and Mining by Baoxun Xu, Joshua Zhexue Huang, Graham Williams, Qiang Wang, and Yunming Ye (2012) <DOI:10.4018/jdwm.2012040103>. The algorithm can classify very high-dimensional data with random forests built using small subspaces. A novel variable weighting method is used for variable subspace selection in place of the traditional random variable sampling. This new approach is particularly useful in building models from high-dimensional data.
Supports a structured approach for exploring PKPD data <https://opensource.nibr.com/xgx/>. It also contains helper functions for enabling the modeler to follow best R practices (by appending the program name, figure name location, and draft status to each plot). In addition, it enables the modeler to follow best graphical practices (by providing a theme that reduces chart ink, and by providing time-scale, log-scale, and reverse-log-transform-scale functions for more readable axes). Finally, it provides some data checking and summarizing functions for rapidly exploring pharmacokinetics and pharmacodynamics (PKPD) datasets.