Test for no adverse shift in two-sample comparison when we have a training set, the reference distribution, and a test set. The approach is flexible and relies on a robust and powerful test statistic, the weighted AUC. Technical details are in Kamulete, V. M. (2021) <arXiv:1908.04000>. Modern notions of outlyingness such as trust scores and prediction uncertainty can be used as the underlying scores for example.
Elastic net regression models are controlled by two parameters, lambda, a measure of shrinkage, and alpha, a metric defining the model's location on the spectrum between ridge and lasso regression. glmnet provides tools for selecting lambda via cross validation but no automated methods for selection of alpha. Elastic Net SearcheR automates the simultaneous selection of both lambda and alpha. Developed, in part, with support by NICHD R03 HD094912.
Analysis of dichotomous and polytomous response data using the explanatory item response modeling framework, as described in Bulut, Gorgun, & Yildirim-Erbasli (2021) <doi:10.3390/psych3030023>, Stanke & Bulut (2019) <doi:10.21449/ijate.515085>, and De Boeck & Wilson (2004) <doi:10.1007/978-1-4757-3990-9>. Generalized linear mixed modeling is used for estimating the effects of item-related and person-related variables on dichotomous and polytomous item responses.
Simulate and analyze multistate models with general hazard functions. gems provides functionality for the preparation of hazard functions and parameters, simulation from a general multistate model and predicting future events. The multistate model is not required to be a Markov model and may take the history of previous events into account. In the basic version, it allows to simulate from transition-specific hazard function, whose parameters are multivariable normally distributed.
This tool identifies hydropeaking events from raw time-series flow record, a rapid flow variation induced by the hourly-adjusted electricity market. The novelty of HEDA is to use vector angle instead of the first-order derivative to detect change points which not only largely improves the computing efficiency but also accounts for the rate of change of the flow variation. More details <doi:10.1016/j.jhydrol.2021.126392>.
Efficient Bayesian multinomial logistic regression based on heavy-tailed (hyper-LASSO, non-convex) priors. The posterior of coefficients and hyper-parameters is sampled with restricted Gibbs sampling for leveraging the high-dimensionality and Hamiltonian Monte Carlo for handling the high-correlation among coefficients. A detailed description of the method: Li and Yao (2018), Journal of Statistical Computation and Simulation, 88:14, 2827-2851, <doi:10.48550/arXiv.1405.3319>.
Inference approach for jointly modeling correlated count and binary outcomes. This formulation allows simultaneous modeling of zero inflation via the Bernoulli component while providing a more accurate assessment of the Hierarchical Zero-Inflated Poisson's parsimony (Lizandra C. Fabio, Jalmar M. F. Carrasco, Victor H. Lachos and Ming-Hui Chen, Likelihood-based inference for joint modeling of correlated count and binary outcomes with extra variability and zeros, 2025, under submission).
Interpretable nonparametric modeling of longitudinal data using additive Gaussian process regression. Contains functionality for inferring covariate effects and assessing covariate relevances. Models are specified using a convenient formula syntax, and can include shared, group-specific, non-stationary, heterogeneous and temporally uncertain effects. Bayesian inference for model parameters is performed using Stan'. The modeling approach and methods are described in detail in Timonen et al. (2021) <doi:10.1093/bioinformatics/btab021>.
This package provides tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.
Efficient statistical inference of two-sample MR (Mendelian Randomization) analysis. It can account for the correlated instruments and the horizontal pleiotropy, and can provide the accurate estimates of both causal effect and horizontal pleiotropy effect as well as the two corresponding p-values. There are two main functions in the PPMR package. One is PMR_individual() for individual level data, the other is PMR_summary() for summary data.
Option pricing (financial derivatives) techniques mainly following textbook Options, Futures and Other Derivatives', 9ed by John C.Hull, 2014. Prentice Hall. Implementations are via binomial tree option model (BOPM), Black-Scholes model, Monte Carlo simulations, etc. This package is a result of Quantitative Financial Risk Management course (STAT 449 and STAT 649) at Rice University, Houston, TX, USA, taught by Oleg Melnikov, statistics PhD student, as of Spring 2015.
Analyse species-habitat associations in R. Therefore, information about the location of the species (as a point pattern) is needed together with environmental conditions (as a categorical raster). To test for significance habitat associations, one of the two components is randomized. Methods are mainly based on Plotkin et al. (2000) <doi:10.1006/jtbi.2000.2158> and Harms et al. (2001) <doi:10.1111/j.1365-2745.2001.00615.x>.
This package provides tools for 3D point cloud voxelisation, projection, geometrical and morphological description of trees (DBH, height, volume, crown diameter), analyses of temporal changes between different measurement times, distance based clustering and visualisation of 3D voxel clouds and 2D projection. Most analyses and algorithms provided in the package are based on the concept of space exploration and are described in Lecigne et al. (2018, <doi:10.1093/aob/mcx095>).
This package provides functions to allow users to build and analyze design consistent tree and random forest models using survey data from a complex sample design. The tree model algorithm can fit a linear model to survey data in each node obtained by recursively partitioning the data. The splitting variables and selected splits are obtained using a randomized permutation test procedure which adjusted for complex sample design features used to obtain the data. Likewise the model fitting algorithm produces design-consistent coefficients to any specified least squares linear model between the dependent and independent variables used in the end nodes. The main functions return the resulting binary tree or random forest as an object of "rpms" or "rpms_forest" type. The package also provides methods modeling a "boosted" tree or forest model and a tree model for zero-inflated data as well as a number of functions and methods available for use with these object types.
Thisp package enables you to track and report code coverage for your package and (optionally) upload the results to a coverage service. Code coverage is a measure of the amount of code being exercised by a set of tests. It is an indirect measure of test quality and completeness. This package is compatible with any testing methodology or framework and tracks coverage of both R code and compiled C/C++/FORTRAN code.
This package provides a function to make gene presence/absence calls based on distance from negative strand matching probesets (NSMP) which are derived from Affymetrix annotation. PANP is applied after gene expression values are created, and therefore can be used after any preprocessing method such as MAS5 or GCRMA, or PM-only methods like RMA. NSMP sets have been established for the HGU133A and HGU133-Plus-2.0 chipsets to date.
It provides access to and information about the most important Brazilian economic time series - from the Getulio Vargas Foundation <http://portal.fgv.br/en>, the Central Bank of Brazil <http://www.bcb.gov.br> and the Brazilian Institute of Geography and Statistics <http://www.ibge.gov.br>. It also presents tools for managing, analysing (e.g. generating dynamic reports with a complete analysis of a series) and exporting these time series.
Bayesian hierarchical methods for the detection of differences in rates of related outcomes for multiple treatments for clustered observations (Carragher et al. (2020) <doi:10.1002/sim.8563>). This software was developed for the Precision Drug Theraputics: Risk Prediction in Pharmacoepidemiology project as part of a Rutherford Fund Fellowship at Health Data Research (UK), Medical Research Council (UK) award reference MR/S003967/1 (<https://gtr.ukri.org/>). Principal Investigator: Raymond Carragher.
Cuddy-Della valle index gives the degree of instability present in the data by accommodating the effect of a trend. The adjusted R squared value of the best fitted model is chosen. The index is obtained by multiplying the coefficient of variation with square root of one minus the adjusted R-squared value. This package has been developed using concept of Shankar et al. (2022)<doi:10.3389/fsufs.2023.1208898>.
This package provides object-oriented database management tools for working with large datasets across multiple database systems. Features include robust connection management for SQL Server and PostgreSQL databases, advanced table operations with bulk data loading and upsert functionality, comprehensive data validation through customizable field type and content validators, efficient index management, and cross-database compatibility. Designed for high-performance data operations in surveillance systems and large-scale data processing workflows.
This package provides a collection of functions and jamovi module for the estimation approach to inferential statistics, the approach which emphasizes effect sizes, interval estimates, and meta-analysis. Nearly all functions are based on statpsych and metafor'. This package is still under active development, and breaking changes are likely, especially with the plot and hypothesis test functions. Data sets are included for all examples from Cumming & Calin-Jageman (2024) <ISBN:9780367531508>.
Meteorological Tools following the FAO56 irrigation paper of Allen et al. (1998) [1]. Functions for calculating: reference evapotranspiration (ETref), extraterrestrial radiation (Ra), net radiation (Rn), saturation vapor pressure (satVP), global radiation (Rs), soil heat flux (G), daylight hours, and more. [1] Allen, R. G., Pereira, L. S., Raes, D., & Smith, M. (1998). Crop evapotranspiration-Guidelines for computing crop water requirements-FAO Irrigation and drainage paper 56. FAO, Rome, 300(9).
Impute the covariance matrix of incomplete data so that factor analysis can be performed. Imputations are made using multiple imputation by Multivariate Imputation with Chained Equations (MICE) and combined with Rubin's rules. Parametric Fieller confidence intervals and nonparametric bootstrap confidence intervals can be obtained for the variance explained by different numbers of principal components. The method is described in Nassiri et al. (2018) <doi:10.3758/s13428-017-1013-4>.
This package provides functions to calculate hazard and survival function of Multi-Stage Clonal Expansion Models used in cancer epidemiology. For the Two-Stage Clonal Expansion Model an exact solution is implemented assuming piecewise constant parameters, see Heidenreich, Luebeck, Moolgavkar (1997) <doi:10.1111/j.1539-6924.1997.tb00878.x>. Numerical solutions are provided for its extensions, see also Little, Vineis, Li (2008) <doi:10.1016/j.jtbi.2008.05.027>.