The package computes maximum likelihood estimates (MLE) via the Expectation-Maximization (EM) algorithm for Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for Finite Mixture of Accelerated Failure Time Regression (FMAFTR) models subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for the Weibull distribution). More importantly, the package computes the maximum penalized likelihood estimates (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, the package provides Ridge Regression and Elastic Net.
This package provides methods for manipulating regression models and for describing these in a style adapted for medical journals. It contains functions for generating an HTML table with crude and adjusted estimates, plotting hazard ratios, plotting model estimates and confidence intervals using forest plots, and extending this to comparing multiple models in a single forest plot. In addition to the descriptive methods, there are functions for the robust covariance matrix provided by the sandwich package, a function for adding non-linearities to a model, and a wrapper around the Epi package's Lexis() functions for time-splitting a dataset when modeling non-proportional hazards in Cox regressions.
Calculates nonparametric pointwise confidence intervals for the survival distribution for right censored data, and for medians [Fay and Brittain <DOI:10.1002/sim.6905>]. Has two-sample tests for dissimilarity (e.g., difference, ratio or odds ratio) in survival at a fixed time, and differences in medians [Fay, Proschan, and Brittain <DOI:10.1111/biom.12231>]. Basically, the package gives exact inference methods for one- and two-sample exact inferences for Kaplan-Meier curves (e.g., generalizing Fisher's exact test to allow for right censoring), which are especially important for latter parts of the survival curve, small sample sizes or heavily censored data. Includes mid-p options.
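As a point of reference for the description above, the following is a minimal sketch of the plain Kaplan-Meier product-limit estimate for right-censored data. It is not the package's exact inference procedure; the function name `kaplan_meier` and the toy data are assumptions made only for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (event or censored) leaves the risk set.
        at_risk -= sum(1 for u in times if u == t)
    return curve


# Toy data (hypothetical): five subjects, two of them censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

With few subjects left at risk late in follow-up, each step of the curve rests on very little data, which is the situation in which the description says exact methods matter most.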
This package provides a collection of functions that perform jump regression and image analysis such as denoising, deblurring and jump detection. The implemented methods are based on the following research: Qiu, P. (1998) <doi:10.1214/aos/1024691468>, Qiu, P. and Yandell, B. (1997) <doi:10.1080/10618600.1997.10474746>, Qiu, P. (2009) <doi:10.1007/s10463-007-0166-9>, Kang, Y. and Qiu, P. (2014) <doi:10.1080/00401706.2013.844732>, Qiu, P. and Kang, Y. (2015) <doi:10.5705/ss.2014.054>, Kang, Y., Mukherjee, P.S. and Qiu, P. (2018) <doi:10.1080/00401706.2017.1415975>, Kang, Y. (2020) <doi:10.1080/10618600.2019.1665536>.
Set of tools to fit linear multiple or semi-parametric regression models with the possibility of non-informative random right-censoring. Under this setup, the location parameter of the response variable distribution is modeled by using linear multiple regression or semi-parametric functions, whose non-parametric components may be approximated by natural cubic splines or P-splines. The supported distribution for the model error is a generalized log-gamma distribution which includes the generalized extreme value and standard normal distributions as important special cases. Inference is based on penalized likelihood and bootstrap methods. Also, some numerical and graphical devices for diagnostics of the fitted models are offered.
Parameter inference methods for models defined implicitly using a random simulator. Inference is carried out using simulation-based estimates of the log-likelihood of the data. The inference methods implemented in this package are explained in Park, J. (2025) <doi:10.48550/arxiv.2311.09446>. These methods are built on a simulation metamodel which assumes that the estimates of the log-likelihood are approximately normally distributed with the mean function that is locally quadratic around its maximum. Parameter estimation and uncertainty quantification can be carried out using the ht() function (for hypothesis testing) and the ci() function (for constructing a confidence interval for one-dimensional parameters).
Implementation in a simple and efficient way of fully customisable population genetics simulations, considering multiple loci that have epistatic interactions. Specifically suited to the modelling of multilocus nucleocytoplasmic systems (with both diploid and haploid loci), it is nevertheless possible to simulate purely diploid (or purely haploid) genetic models. Examples of models that can be simulated with Ease are numerous, for example models of genetic incompatibilities as presented by Marie-Orleach et al. (2022) <doi:10.1101/2022.07.25.501356>. Many others are conceivable, although few are actually explored, Ease having been developed in particular to provide a solution so that these kinds of models can be simulated simply.
Analysis of experimental multi-parent populations to detect regions of the genome (called quantitative trait loci, QTLs) influencing phenotypic traits measured in unique and multiple environments. The population must be composed of crosses between a set of at least three parents (e.g. factorial design, diallel, or nested association mapping). The functions cover data processing, QTL detection, and results visualization. The implemented methodology is described in Garin, Wimmer, Mezmouk, Malosetti and van Eeuwijk (2017) <doi:10.1007/s00122-017-2923-3>, in Garin, Malosetti and van Eeuwijk (2020) <doi:10.1007/s00122-020-03621-0>, and in Garin, Diallo, Tekete, Thera, ..., and Rami (2024) <doi:10.1093/genetics/iyae003>.
This package performs multiple empirical likelihood tests. It offers an easy-to-use interface and flexibility in specifying hypotheses and calibration methods, extending the framework to simultaneous inferences. The core computational routines are implemented using the Eigen C++ library and RcppEigen interface, with OpenMP for parallel computation. Details of the testing procedures are provided in Kim, MacEachern, and Peruggia (2023) <doi:10.1080/10485252.2023.2206919>. A companion paper by Kim, MacEachern, and Peruggia (2024) <doi:10.18637/jss.v108.i05> is available for further information. This work was supported by the U.S. National Science Foundation under Grants No. SES-1921523 and DMS-2015552.
Nonparametric efficiency measurement and statistical inference via DEA type estimators (see Färe, Grosskopf, and Lovell (1994) <doi:10.1017/CBO9780511551710>, Kneip, Simar, and Wilson (2008) <doi:10.1017/S0266466608080651> and Badunenko and Mozharovskyi (2020) <doi:10.1080/01605682.2019.1599778>) as well as Stochastic Frontier estimators for both cross-sectional data and 1st, 2nd, and 4th generation models for panel data (see Kumbhakar and Lovell (2003) <doi:10.1017/CBO9781139174411>, Badunenko and Kumbhakar (2016) <doi:10.1016/j.ejor.2016.04.049>). The stochastic frontier estimators can handle both half-normal and truncated normal models with conditional mean and heteroskedasticity. The marginal effects of determinants can be obtained.
Evaluate or optimize designs for nonlinear mixed effects models using the Fisher Information matrix. Methods used in the package refer to Mentré F, Mallet A, Baccar D (1997) <doi:10.1093/biomet/84.2.429>, Retout S, Comets E, Samson A, Mentré F (2007) <doi:10.1002/sim.2910>, Bazzoli C, Retout S, Mentré F (2009) <doi:10.1002/sim.3573>, Le Nagard H, Chao L, Tenaillon O (2011) <doi:10.1186/1471-2148-11-326>, Combes FP, Retout S, Frey N, Mentré F (2013) <doi:10.1007/s11095-013-1079-3> and Seurat J, Tang Y, Mentré F, Nguyen TT (2021) <doi:10.1016/j.cmpb.2021.106126>.
Analyze the default risk of credit portfolios. Commonly known models, like CreditRisk+ or the CreditMetrics model are implemented in their very basic settings. The portfolio loss distribution can be achieved either by simulation or analytically in case of the classic CreditRisk+ model. Models are only implemented to respect losses caused by defaults, i.e. migration risk is not included. The package structure is kept flexible especially with respect to distributional assumptions in order to quantify the sensitivity of risk figures with respect to several assumptions. Therefore the package can be used to determine the credit risk of a given portfolio as well as to quantify model sensitivities.
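To illustrate what obtaining a portfolio loss distribution by simulation can look like, here is a minimal sketch under a strong simplifying assumption: obligors default independently and the full exposure is lost on default. This is neither CreditRisk+ nor CreditMetrics as implemented in the package; the function names, exposures and default probabilities are hypothetical.

```python
import random


def simulate_losses(exposures, default_probs, n_sims=20000, seed=1):
    """Simulate default-only portfolio losses (independent defaults assumed)."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_sims):
        # An obligor defaults when its uniform draw falls below its default probability.
        loss = sum(
            exposure
            for exposure, pd in zip(exposures, default_probs)
            if rng.random() < pd
        )
        losses.append(loss)
    return losses


def quantile(sorted_values, level):
    """Empirical quantile of an already sorted list."""
    index = min(int(level * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


# Hypothetical three-obligor portfolio.
exposures = [100.0, 200.0, 300.0]
default_probs = [0.02, 0.05, 0.01]

losses = sorted(simulate_losses(exposures, default_probs))
expected_loss = sum(losses) / len(losses)
var_99 = quantile(losses, 0.99)  # 99% loss quantile (value at risk)
```

Re-running the simulation with different default probabilities or a different dependence assumption is one simple way to see how sensitive the risk figures are to the model assumptions.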
Theories are one of the most important tools of science. Although psychologists discussed problems of theory in their discipline for a long time, weak theories are still widespread in most subfields. One possible reason for this is that psychologists lack the tools to systematically assess the quality of their theories. Previously a computational model for formal theory evaluation based on the concept of explanatory coherence was developed (Thagard, 1989, <doi:10.1017/S0140525X00057046>). However, there are possible improvements to this model and it is not available in software that psychologists typically use. Therefore, a new implementation of explanatory coherence based on the Ising model is available in this R-package.
In randomized studies involving severely ill patients, functional outcomes are often unobserved due to missed clinic visits, premature withdrawal or death. It is well known that if these unobserved functional outcomes are not handled properly, biased treatment comparisons can be produced. In this package, we implement a procedure for comparing treatments that is based on the composite endpoint of both the functional outcome and survival. The procedure was proposed in Wang et al. (2016) <DOI:10.1111/biom.12594> and Wang et al. (2020) <DOI:10.18637/jss.v093.i12>. It considers missing data imputation with different sensitivity analysis strategies to handle the unobserved functional outcomes not due to death.
This package provides functions for performing stochastic search variable selection (SSVS) for binary and continuous outcomes and visualizing the results. SSVS is a Bayesian variable selection method used to estimate the probability that individual predictors should be included in a regression model. Using MCMC estimation, the method samples thousands of regression models in order to characterize the model uncertainty regarding both the predictor set and the regression parameters. For details see Bainter, McCauley, Wager, and Losin (2020) Improving practices for selecting a subset of important predictors in psychology: An application to predicting pain, Advances in Methods and Practices in Psychological Science 3(1), 66-80 <DOI:10.1177/2515245919885617>.
This package provides a problem solving environment (PSE) for fitting separable nonlinear models to measurements arising in physics and chemistry experiments, as described by Mullen & van Stokkum (2007) <doi:10.18637/jss.v018.i03> for its use in fitting time resolved spectroscopy data, and as described by Laptenok et al. (2007) <doi:10.18637/jss.v018.i08> for its use in fitting Fluorescence Lifetime Imaging Microscopy (FLIM) data, in the study of Förster Resonance Energy Transfer (FRET). `TIMP` also serves as the computation backend for the `GloTarAn` software, a graphical user interface for the package, as described in Snellenburg et al. (2012) <doi:10.18637/jss.v049.i03>.
This package contains functions for applying the T^2-test for equivalence. The T^2-test for equivalence is a multivariate two-sample equivalence test. The distance measure of the test is the Mahalanobis distance. For multivariate normally distributed data the T^2-test for equivalence is exact and UMPI. The function T2EQ() implements the T^2-test for equivalence according to Wellek (2010) <DOI:10.1201/ebk1439808184>. The function T2EQ.dissolution.profiles.hoffelder() implements a variant of the T^2-test for equivalence according to Hoffelder (2016) <http://www.ecv.de/suse_item.php?suseId=Z|pi|8430> for the equivalence comparison of highly variable dissolution profiles.
This package provides a wrapper for machine learning (ML) methods to select among a portfolio of algorithms based on the value of a key performance indicator (KPI). A set of features is used to fit a model that predicts the value of the KPI for each algorithm; then, for a new value of the features, the KPI is estimated and the algorithm with the best predicted KPI is chosen. For learning, it can use the regression methods in the caret package or a custom function defined by the user. Several graphics are available to analyze the results obtained. This library has been used in Ghaddar et al. (2023) <doi:10.1287/ijoc.2022.0090>.
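The selection logic described above can be sketched as follows. This is a minimal illustration, not the package's interface: a one-nearest-neighbour predictor stands in for the regression methods, and the names `nearest_neighbor_kpi`, `select_algorithm`, `solver_a` and `solver_b` are hypothetical.

```python
def nearest_neighbor_kpi(rows, features):
    """Predict a KPI as the value observed at the closest training point.

    rows : list of (feature_tuple, kpi_value) pairs for one algorithm
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    closest = min(rows, key=lambda row: squared_distance(row[0], features))
    return closest[1]


def select_algorithm(training_data, features, higher_is_better=True):
    """Predict the KPI of every algorithm and return the best one."""
    predictions = {
        name: nearest_neighbor_kpi(rows, features)
        for name, rows in training_data.items()
    }
    pick = max if higher_is_better else min
    return pick(predictions, key=predictions.get), predictions


# Toy training data (hypothetical): one feature, two candidate algorithms.
training_data = {
    "solver_a": [((1.0,), 10.0), ((5.0,), 2.0)],
    "solver_b": [((1.0,), 4.0), ((5.0,), 8.0)],
}
best, predicted = select_algorithm(training_data, (4.5,))
```

Keeping one predictor per algorithm means the KPI models can be trained and replaced independently, and the `higher_is_better` flag covers KPIs such as running time where smaller values are preferred.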
This package provides a comprehensive framework for early epidemic detection through school absenteeism surveillance. The package offers three core functionalities: (1) simulation of population structures, epidemic spread, and resulting school absenteeism patterns; (2) implementation of surveillance models that generate alerts for impending epidemics based on absenteeism data and (3) evaluation of alert timeliness and accuracy through alert time quality metrics to optimize model parameters. These tools enable public health officials and researchers to develop and assess early warning systems before implementation. Methods are based on research published in Vanderkruk et al. (2023) <doi:10.1186/s12889-023-15747-z> and Ward et al. (2019) <doi:10.1186/s12889-019-7521-7>.
This package performs genomic mediation analysis with adaptive confounding adjustment (GMAC) proposed by Yang et al. (2017) <doi:10.1101/078683>. It implements large scale mediation analysis and adaptively selects potential confounding variables to adjust for each mediation test from a pool of candidate confounders. The package is tailored for but not limited to genomic mediation analysis (e.g., cis-gene mediating trans-gene regulation pattern where an eQTL, its cis-linking gene transcript, and its trans-gene transcript play the roles of treatment, mediator and outcome, respectively), restricting to scenarios with the presence of cis-association (i.e., treatment-mediator association) and random eQTL (i.e., treatment).
Higher order likelihood inference is a promising approach for analyzing small sample size data. The holi package provides web applications for higher order likelihood inference. It currently supports linear, logistic, and Poisson generalized linear models through the rstar_glm() function, based on Pierce and Bellio (2017) <doi:10.1111/insr.12232> and likelihoodAsy. The package offers two main features: LA_rstar(), which launches an interactive shiny application allowing users to fit models with rstar_glm() through their web browser, and sim_rstar_glm_pgsql(), which streamlines the process of launching a web-based shiny simulation application that saves results to a user-created PostgreSQL database.
This package provides a collection of statistical hypothesis tests and other techniques for identifying certain spatial relationships/phenomena in DNA sequences. In particular, it provides tests and graphical methods for determining whether or not DNA sequences comply with Chargaff's second parity rule or exhibit purine-pyrimidine parity. In addition, there are functions for efficiently simulating discrete state space Markov chains and testing arbitrary symbolic sequences for the presence of first-order Markovianness. Also, it has functions for counting words/k-mers (and cylinder patterns) in arbitrary symbolic sequences. Functions which take a DNA sequence as input can handle sequences stored as SeqFastadna objects from the seqinr package.
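As a small illustration of k-mer counting and of what Chargaff's second parity rule compares, the sketch below counts overlapping words in a sequence and reports, for each word, the difference between its count and the count of its reverse complement on the same strand. It is a descriptive sketch only, not one of the package's hypothesis tests; the function names and the toy sequence are assumptions.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_kmers(seq, k):
    """Count all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def reverse_complement(word):
    """Complement every base, then reverse the word."""
    return word.translate(COMPLEMENT)[::-1]


def parity_gaps(seq, k):
    """For each observed k-mer: count(word) - count(reverse complement)."""
    counts = count_kmers(seq, k)
    return {w: counts[w] - counts[reverse_complement(w)] for w in counts}


# Toy sequence (hypothetical).
seq = "ACGTACGT"
counts = count_kmers(seq, 2)
gaps = parity_gaps(seq, 2)
```

Under the second parity rule the gaps should be small relative to the counts in a long single-stranded sequence; deciding how small is small enough is what a formal test is for.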
This package provides a flexible and robust joint test of the single nucleotide polymorphism (SNP) main effect and genotype-by-treatment interaction effect for continuous and binary endpoints. Two analytic procedures, Cauchy weighted joint test (CWOT) and adaptively weighted joint test (AWOT), are proposed to accurately calculate the joint test p-value. The proposed methods are evaluated through extensive simulations under various scenarios. The results show that the proposed AWOT and CWOT control type I error well and outperform existing methods in detecting the most interesting signal patterns in pharmacogenetics (PGx) association studies. For reference, see Hong Zhang, Devan Mehrotra and Judong Shen (2022) <doi:10.13140/RG.2.2.28323.53280>.
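For orientation, the generic Cauchy combination rule for merging several p-values can be sketched as below. Whether the package's CWOT procedure uses exactly this form and these weights is an assumption; the function name `cauchy_combine` and the example p-values are hypothetical.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine p-values with the generic Cauchy combination rule.

    Each p-value is mapped to a standard Cauchy quantile, the quantiles are
    averaged with weights normalised to sum to one, and the average is mapped
    back to a p-value through the Cauchy survival function.
    """
    if weights is None:
        weights = [1.0] * len(pvalues)
    total = sum(weights)
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, pvalues)
    )
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical p-values for a main effect and an interaction effect.
combined = cauchy_combine([0.01, 0.8])
```

A single small p-value dominates the weighted sum because the tangent transform grows very quickly near zero, so the combined value stays close to the strongest signal.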
The gamma-Orthogonal Matching Pursuit (gamma-OMP) is a recently suggested modification of the OMP feature selection algorithm for a wide range of response variables. The package offers many alternative regression models, such as linear, robust, survival, multivariate etc., including k-fold cross-validation. References: Tsagris M., Papadovasilakis Z., Lakiotaki K. and Tsamardinos I. (2018). "Efficient feature selection on gene expression data: Which algorithm to use?" BioRxiv. <doi:10.1101/431734>. Tsagris M., Papadovasilakis Z., Lakiotaki K. and Tsamardinos I. (2022). "The gamma-OMP algorithm for feature selection with application to gene expression data". IEEE/ACM Transactions on Computational Biology and Bioinformatics 19(2): 1214--1224. <doi:10.1109/TCBB.2020.3029952>.