This package provides various functions for reading and preparing the Panel Study of Income Dynamics (PSID) for longitudinal analysis, including functions that read the PSID's fixed-width format files directly into R, rename all of the PSID's longitudinal variables so that recurring variables have consistent names across years, simplify assembling longitudinal datasets from cross sections of the PSID Family Files, and export the resulting PSID files into file formats common among other statistical programming languages ('SAS', 'STATA', and 'SPSS').
Simultaneously estimates sparse regression coefficients and response network structure in multivariate models with missing data. Unlike traditional approaches requiring imputation, handles missingness natively through unbiased estimating equations (MCAR/MAR compatible). Employs dual L1 regularization with automated selection via cross-validation or information criteria. Includes parallel computation, warm starts, adaptive grids, publication-ready visualizations, and prediction methods. Ideal for genomics, neuroimaging, and multi-trait studies with incomplete high-dimensional outcomes. See Zeng et al. (2025) <doi:10.48550/arXiv.2507.05990>.
This R package calculates the between-cases AUC estimate, together with the corresponding covariance and variance estimates, in the nested data problem. The package also includes a function to simulate nested data. The calculated between-cases AUC estimate is used to evaluate the reader's diagnostic performance in clinical tasks with nested data. For more details on these methods, please refer to the paper by H Du, S Wen, Y Guo, F Jin, BD Gallas (2022) <doi:10.1177/09622802221111539>.
This package provides tools for the analysis of land use and cover (LUC) time series. It includes support for loading spatiotemporal raster data and synthesized spatial plotting. Several LUC change (LUCC) metrics in regular or irregular time intervals can be extracted and visualized through one-step and multistep Sankey and chord diagrams. A complete intensity analysis according to Aldwaik and Pontius (2012) <doi:10.1016/j.landurbplan.2012.02.010> is implemented, including tools for the generation of standardized multilevel output graphics.
This package contains functions to fit a proportional hazards (PH) model to partly interval-censored (PIC) data (Pan et al. (2020) <doi:10.1177/0962280220921552>), a PH model with spatial frailty to spatially dependent PIC data (Pan and Cai (2021) <doi:10.1080/03610918.2020.1839497>), and a mixed effects PH model to clustered PIC data. Each random intercept/random effect can follow either a normal prior or a Dirichlet process mixture prior. It also includes the corresponding functions for general interval-censored data.
Creation of linkage maps in polyploid species from marker dosage scores of an F1 cross from two heterozygous parents. Currently works for outcrossing diploid, autotriploid, autotetraploid and autohexaploid species, as well as segmental allotetraploids. Methods are described in a manuscript of Bourke et al. (2018) <doi:10.1093/bioinformatics/bty371>. Since version 1.1.0, both discrete and probabilistic genotypes are acceptable input; for more details on the latter see Liao et al. (2021) <doi:10.1007/s00122-021-03834-x>.
Various functions for discrete time survival analysis and longitudinal analysis. SIMEX method for correcting for bias for errors-in-variables in a mixed effects model. Asymptotic mean and variance of different proportional hazards test statistics using different ties methods given two survival curves and censoring distributions. Score test and Wald test for regression analysis of grouped survival data. Calculation of survival curves for events defined by the response variable in a mixed effects model crossing a threshold with or without confirmation.
The tcplfit2 R package performs basic concentration-response curve fitting. The original tcplFit() function in the tcpl R package performed basic concentration-response curve fitting to three models. With tcplfit2, the core tcpl concentration-response functionality has been expanded to process diverse high-throughput screen (HTS) data generated at the US Environmental Protection Agency, including targeted ToxCast, high-throughput transcriptomics (HTTr) and high-throughput phenotypic profiling (HTPP). tcplfit2 can be used independently to support analysis for diverse chemical screening efforts.
Implementation of four extensions of the Zipf distribution: the Marshall-Olkin Extended Zipf (MOEZipf) of Pérez-Casany, M., & Casellas, A. (2013) <arXiv:1304.4540>, the Zipf-Poisson Extreme (Zipf-PE), the Zipf-Poisson Stopped Sum (Zipf-PSS) and the Zipf-Polylog distributions. On a log-log scale, the first two extensions allow for top-concavity and top-convexity, while the third allows only for top-concavity. All the extensions maintain the linearity associated with the Zipf model in the tail.
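As a rough illustration of the first extension, the Marshall-Olkin transformation maps a base survival function S(x) to beta * S(x) / (1 - (1 - beta) * S(x)) for beta > 0, so that beta = 1 recovers the base distribution. The Python sketch below applies it to a Zipf distribution truncated at a finite support, which is a simplifying assumption made here to avoid evaluating the zeta function. The function names are hypothetical and are not the package's API.

```python
def truncated_zipf_pmf(alpha, n_max):
    """Zipf probabilities on 1..n_max (finite truncation is an assumption)."""
    weights = [k ** -alpha for k in range(1, n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def marshall_olkin_pmf(base_pmf, beta):
    """Apply the Marshall-Olkin transformation to a discrete pmf on 1..n."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    pmf = []
    cumulative = 0.0
    previous_survival = 1.0  # transformed survival at x = 0 is always 1
    for p in base_pmf:
        cumulative += p
        base_survival = max(0.0, 1.0 - cumulative)  # P(X > x) under the base
        survival = beta * base_survival / (1.0 - (1.0 - beta) * base_survival)
        pmf.append(previous_survival - survival)  # P(X = x) after transform
        previous_survival = survival
    return pmf


base = truncated_zipf_pmf(alpha=2.0, n_max=1000)
moe = marshall_olkin_pmf(base, beta=2.0)
```

With beta greater than 1, the transformed distribution places less probability on the value 1 than the base distribution does, which changes the shape at the top of the distribution.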
Hipathia is a method for the computation of signal transduction along signaling pathways from transcriptomic data. The method is based on an iterative algorithm that computes the signal intensity passing through the nodes of a network by taking into account the level of expression of each gene and the intensity of the signal arriving at it. It also provides a new approach to functional analysis that allows computing the signal arriving at the functions annotated to each pathway.
This package aims to analyse count-based methylation data on predefined genomic regions, such as those obtained by targeted sequencing, and thus to identify differentially methylated regions (DMRs) that are associated with phenotypes or traits. The method is built on a rich, flexible model that allows for the effects, on the methylation levels, of multiple covariates to vary smoothly along genomic regions. At the same time, this method also allows for sequencing errors and can adjust for variability in cell type mixture.
This package provides tools to design best-worst scaling designs (i.e., balanced incomplete block designs) and to analyze data from these designs, using aggregate and individual methods such as: difference scores, Louviere, Lings, Islam, Gudergan, & Flynn (2013) <doi:10.1016/j.ijresmar.2012.10.002>; analytical estimation, Lipovetsky & Conklin (2014) <doi:10.1016/j.jocm.2014.02.001>; empirical Bayes, Lipovetsky & Conklin (2015) <doi:10.1142/S1793536915500028>; Elo, Hollis (2018) <doi:10.3758/s13428-017-0898-2>; and network-based measures.
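As a rough illustration of the simplest aggregate method named above, an item's difference score can be taken as the number of times it was chosen as best minus the number of times it was chosen as worst, here divided by the number of times it was shown. The sketch is in Python rather than R. The trial format and the function name `difference_scores` are assumptions made for illustration and are not the package's API.

```python
from collections import Counter


def difference_scores(trials):
    """Return {item: (best_count - worst_count) / appearances}.

    `trials` uses a hypothetical format: an iterable of (items, best, worst)
    tuples, where `items` is the block shown and `best`/`worst` are the picks.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, b, w in trials:
        if b not in items or w not in items or b == w:
            raise ValueError("best and worst must be distinct items from the block")
        shown.update(items)
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


# Three blocks of three items drawn from four items A-D (made-up data).
trials = [
    (("A", "B", "C"), "A", "C"),
    (("A", "B", "D"), "A", "B"),
    (("B", "C", "D"), "D", "C"),
]
scores = difference_scores(trials)
```

Scores range from -1 (always chosen as worst) to 1 (always chosen as best).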
The backfill Bayesian optimal interval design using efficacy and toxicity outcomes for dose optimization (BF-BOIN-ET) design is a novel clinical trial design to allow patients to be backfilled at lower doses during a dose-finding trial while prioritizing the dose-escalation cohort to explore a higher dose. Its advantages over other designs lie in the percentage of correct optimal dose (OD) selection, the reduction in sample size, and the shortening of the trial duration, in various realistic settings.
Cluster analysis is performed using pairwise distance information and a random partition distribution. The method is implemented for two random partition distributions. It draws samples and then obtains and plots clustering estimates. An implementation of a selection algorithm is provided for the mass parameter of the partition distribution. Since pairwise distances are the principal input to this procedure, it is most comparable to the hierarchical and k-medoids clustering methods. The method is described in Dahl, Andros, Carter (2022+) <doi:10.1002/sam.11602>.
This package provides a covariate-dependent approach to Gaussian graphical modeling as described in Dasgupta et al. (2022). Employs a novel weighted pseudo-likelihood approach to model the conditional dependence structure of data as a continuous function of an extraneous covariate. The main function, covdepGE::covdepGE(), estimates a graphical representation of the conditional dependence structure via a block mean-field variational approximation, while several auxiliary functions (inclusionCurve(), matViz(), and plot.covdepGE()) are included for visualizing the resulting estimates.
This package provides a simple way of fitting detection functions to distance sampling data for both line and point transects. Adjustment term selection, left and right truncation as well as monotonicity constraints and binning are supported. Abundance and density estimates can also be calculated (via a Horvitz-Thompson-like estimator) if survey area information is provided. See Miller et al. (2019) <doi:10.18637/jss.v089.i01> for more information on methods and <https://distancesampling.org/resources/vignettes.html> for example analyses.
This is a collection of assorted functions and examples collected from various projects. Currently we have functionalities for simplifying overlapping time intervals, Charlson comorbidity score constructors for Danish data, getting frequency for multiple variables, getting standardized output from logistic and log-linear regressions, sibling design linear regression functionalities, a method for calculating the confidence intervals for functions of parameters from a GLM, Bayes equivalent for hypothesis testing with asymptotic Bayes factor, and several helper functions for generalized random forest analysis using 'grf'.
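To make "simplifying overlapping time intervals" concrete, the Python sketch below merges intervals that overlap or touch into the smallest set of disjoint intervals. It is a generic sort-and-sweep illustration. The function name `merge_intervals` and the tuple representation are assumptions, not the package's interface.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals.

    Returns a sorted list of disjoint intervals covering the same time span.
    """
    merged = []
    for start, end in sorted(intervals):
        if start > end:
            raise ValueError("interval start must not exceed its end")
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up example: day numbers of five registered periods.
periods = [(10, 12), (1, 5), (3, 8), (12, 15), (20, 21)]
simplified = merge_intervals(periods)
```

Sorting first means each interval only needs to be compared with the most recently merged one.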
This package provides functions to perform exploratory factor analysis (EFA) procedures and compare their solutions. The goal is to provide state-of-the-art factor retention methods and a high degree of flexibility in the EFA procedures. This way, for example, implementations from R psych and SPSS can be compared. Moreover, functions for Schmid-Leiman transformation and the computation of omegas are provided. To speed up the analyses, some of the iterative procedures, like principal axis factoring (PAF), are implemented in C++.
Estimation, forecasting, and simulation of generalized autoregressive score (GAS) models of Creal, Koopman, and Lucas (2013) <doi:10.1002/jae.1279> and Harvey (2013) <doi:10.1017/cbo9781139540933>. Model specification allows for various data types and distributions, different parametrizations, exogenous variables, joint and separate modeling of exogenous variables and dynamics, higher score and autoregressive orders, custom and unconditional initial values of time-varying parameters, fixed and bounded values of coefficients, and missing values. Model estimation is performed by the maximum likelihood method.
This package performs the main multiple comparison procedures in the literature, Scott-Knott (1974) <http://www.jstor.org/stable/2529204> and Batista (2016) <http://repositorio.ufla.br/jspui/handle/1/11466>, including graphical representations and export of the results to different file formats. The package also evaluates the performance of the tests (Type I error per experiment and power), which helps the user decide which test to use.
This package provides tools for the analysis of psychophysical data in R. It allows estimating the Point of Subjective Equivalence (PSE) and the Just Noticeable Difference (JND), either from a psychometric function or from a Generalized Linear Mixed Model (GLMM). Additionally, the package allows plotting the fitted models and the response data, simulating psychometric functions of different shapes, and simulating data sets. For a description of the use of GLMMs applied to psychophysical data, refer to Moscatelli et al. (2012).
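As a rough illustration of how the PSE and JND follow from a fitted psychometric function, assume a probit model in which the response probability is Phi(b0 + b1 * x). The PSE is then the stimulus level at which the probability is 0.5, and the JND, under the common convention used here, is the distance from the PSE to the 75% point. The Python sketch below uses only the standard library. The function name and the example coefficients are assumptions, not the package's API or real data.

```python
from statistics import NormalDist


def pse_jnd(intercept, slope):
    """PSE and JND for a probit psychometric function Phi(intercept + slope * x)."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    pse = -intercept / slope                   # stimulus where P(response) = 0.5
    jnd = NormalDist().inv_cdf(0.75) / slope   # distance from PSE to the 75% point
    return pse, jnd


# Hypothetical fitted coefficients.
pse, jnd = pse_jnd(intercept=-8.0, slope=2.0)
```

A steeper slope gives a smaller JND, which corresponds to finer discrimination.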
This package provides a graphical user interface to integrate, visualize and explore results from linkage and quantitative trait loci analysis, together with genomic information for autopolyploid species. The app is meant for interactive use and allows users to optionally upload different sources of information, including gene annotation and alignment files, enabling the exploitation and search for candidate genes in a genome browser. In its current version, VIEWpoly supports inputs from 'MAPpoly', 'polymapR', 'diaQTL', 'QTLpoly', 'polyqtlR', 'GWASpoly', and 'HIDECAN' packages.
This package provides a comprehensive data analysis framework for NIH-funded research that streamlines workflows for both data cleaning and preparing NIH Data Archive ('NDA') submission templates. Provides unified access to multiple data sources ('REDCap', 'MongoDB', 'Qualtrics') through interfaces to their APIs, with specialized functions for data cleaning, filtering, merging, and parsing. Features automatic validation, field harmonization, and memory-aware processing to enhance reproducibility in multi-site collaborative research as described in Mittal et al. (2021) <doi:10.20900/jpbs.20210011>.
This package is a rasterization preprocessing framework that aggregates cellular information into spatial pixels to reduce resource requirements for spatial omics data analysis. SEraster reduces the number of points in spatial omics datasets for downstream analysis through a process of rasterization where single cells' gene expression or cell-type labels are aggregated into equally sized pixels based on a user-defined resolution. SEraster can be incorporated with other packages to conduct downstream analyses for spatial omics datasets, such as detecting spatially variable genes.
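The rasterization idea described above can be sketched in a few lines: each cell is assigned to a square pixel by dividing its coordinates by the resolution, and values within a pixel are aggregated. The Python sketch below handles a single numeric value per cell (for example, one gene's expression). The function name `rasterize`, the input format, and the `how` argument are assumptions made for illustration and are not SEraster's interface.

```python
import math
from collections import defaultdict


def rasterize(cells, resolution, how="mean"):
    """Aggregate (x, y, value) cells into square pixels of side `resolution`.

    Returns {(column, row): aggregated value} for pixels containing cells.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in cells:
        pixel = (math.floor(x / resolution), math.floor(y / resolution))
        sums[pixel] += value
        counts[pixel] += 1
    if how == "sum":
        return dict(sums)
    if how == "mean":
        return {pixel: sums[pixel] / counts[pixel] for pixel in sums}
    raise ValueError("how must be 'mean' or 'sum'")


# Made-up cells: (x, y, expression of one gene).
cells = [(0.5, 0.5, 2.0), (1.5, 0.2, 4.0), (2.5, 0.5, 1.0),
         (3.9, 1.9, 3.0), (0.1, 2.1, 5.0)]
pixels = rasterize(cells, resolution=2.0)
```

Here five cells are reduced to three pixels. A larger resolution reduces the number of points further, at the cost of spatial detail.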