This package provides significance-controlled variable selection algorithms with different directions (forward, backward, stepwise) based on diverse criteria (AIC, BIC, adjusted R-squared, PRESS, or p-value). The algorithm selects a final model containing only significant variables, defined as those with significant p-values after a multiple testing correction such as Bonferroni or False Discovery Rate. See Zambom and Kim (2018) <doi:10.1002/sta4.210>.
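The multiple testing corrections named in the description above can be illustrated with a minimal Python sketch. The function names `bonferroni` and `benjamini_hochberg` and the example p-values are assumptions made for illustration and are not part of the package; the sketch only shows how adjusted p-values might be computed before deciding which variables count as significant.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg (FDR) adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four candidate variables.
pvals = [0.01, 0.04, 0.03, 0.005]
alpha = 0.05
keep_bonf = [p <= alpha for p in bonferroni(pvals)]
keep_fdr = [p <= alpha for p in benjamini_hochberg(pvals)]
```

With these example values the Bonferroni rule retains fewer variables than the FDR rule, which reflects the usual trade-off between the two corrections.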
This package implements a Bayesian hierarchical model designed to identify skips in menstrual cycle self-tracking on mobile apps. Future developments will allow for the inclusion of covariates affecting cycle mean and regularity, as well as extra information regarding tracking non-adherence. The main methods will be outlined in a forthcoming paper, with alternative models from Li et al. (2022) <doi:10.1093/jamia/ocab182>.
This package provides a programmatic interface in R for the US Department of Transportation (DOT) National Highway Traffic Safety Administration (NHTSA) vehicle identification number (VIN) API, located at <https://vpic.nhtsa.dot.gov/api/>. The API can decode up to 50 vehicle identification numbers in one call, and provides manufacturer information about the vehicles, including make, model, model year, and gross vehicle weight rating (GVWR).
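Because the API accepts at most 50 VINs per call, a longer list has to be split before it is submitted. The following Python sketch shows only that batching step; `batch_vins` is a hypothetical helper, the VIN strings are placeholders, and no request is sent to the API.

```python
def batch_vins(vins, batch_size=50):
    """Split a list of VINs into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [vins[i:i + batch_size] for i in range(0, len(vins), batch_size)]


# Hypothetical placeholder strings, not real VINs.
vins = [f"VIN{i:04d}" for i in range(120)]
batches = batch_vins(vins)
sizes = [len(b) for b in batches]  # 50, 50, 20
```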
This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.
This R package makes use of the exhaustive RESTful Web service API that has been implemented for the Cellabase database. It enables researchers to query and obtain a wealth of biological information from a single database, saving a lot of time. Another benefit is that researchers can easily make queries about different biological topics and link all this information together, as all information is integrated.
MethylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from Reduced representation bisulfite sequencing (RRBS) and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq.
This package provides procedures for model-based trees for subgroup analyses in clinical trials and model-based forests for the estimation and prediction of personalised treatment effects. Currently partitioning of linear models, lm(), generalised linear models, glm(), and Weibull models, survreg(), are supported. Advanced plotting functionality is supported for the trees and a test for parameter heterogeneity is provided for the personalised models.
Manage the life cycle of your exported functions with shared conventions, documentation badges, and non-invasive deprecation warnings. The lifecycle package defines four development stages (experimental, maturing, stable, and questioning) and three deprecation stages (soft-deprecated, deprecated, and defunct). It makes it easy to insert badges corresponding to these stages in your documentation. Usage of deprecated functions is signalled with increasing levels of non-invasive verbosity.
An implementation of the RainFARM (Rainfall Filtered Autoregressive Model) stochastic precipitation downscaling method (Rebora et al. (2006) <doi:10.1175/JHM517.1>). Adapted for climate downscaling according to D'Onofrio et al. (2018) <doi:10.1175/JHM-D-13-096.1> and for complex topography as in Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>. The RainFARM method is based on the extrapolation to small scales of the Fourier spectrum of a large-scale precipitation field, using a fixed logarithmic slope and random phases at small scales, followed by a nonlinear transformation of the resulting linearly correlated stochastic field. RainFARM generates ensembles of spatially downscaled precipitation fields which conserve precipitation at large scales and whose statistical properties are consistent with the small-scale statistics of observed precipitation, based only on knowledge of the large-scale precipitation field.
This package provides statistical tools to analyze heterogeneous effects of rare variants within genes that are associated with multiple traits. The package implements methods for assessing pleiotropic effects and identifying allelic heterogeneity, which can be useful in large-scale genetic studies. Methods include likelihood-based statistical tests to assess these effects. For more details, see Lu et al. (2024) <doi:10.1101/2024.10.01.614806>.
This package provides a method for determining groups in multiple curves with an automatic selection of their number based on k-means or k-medians algorithms. The optimal number of groups is selected by bootstrap methods. The methodology can be applied in both regression and survival frameworks. The implemented method for grouping multiple survival curves is described by Villanueva et al. (2018) <doi:10.1002/sim.8016>.
This package implements various estimators for average treatment effects - an inverse probability weighted (IPW) estimator, an augmented inverse probability weighted (AIPW) estimator, and a standard regression estimator - that make use of generalized additive models for the treatment assignment model and/or outcome model. See: Glynn, Adam N. and Kevin M. Quinn. 2010. "An Introduction to the Augmented Inverse Propensity Weighted Estimator." Political Analysis. 18: 36-56.
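The three estimators named above can be written down compactly once the treatment model and outcome model have been fitted. The Python sketch below assumes the propensity scores and the predicted outcomes under treatment and control are already available as plain lists (in the package they would come from generalized additive models); the function names and the toy data are illustrative assumptions, not the package's interface.

```python
def regression_ate(mu1, mu0):
    """Standard regression estimator: mean difference of predicted outcomes."""
    return sum(a - b for a, b in zip(mu1, mu0)) / len(mu1)


def ipw_ate(treated, outcome, propensity):
    """Inverse probability weighted estimator of the average treatment effect."""
    total = 0.0
    for t, y, e in zip(treated, outcome, propensity):
        total += t * y / e - (1 - t) * y / (1 - e)
    return total / len(outcome)


def aipw_ate(treated, outcome, propensity, mu1, mu0):
    """Augmented IPW: regression estimate plus inverse-weighted residual corrections."""
    total = 0.0
    for t, y, e, m1, m0 in zip(treated, outcome, propensity, mu1, mu0):
        total += (m1 - m0) + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return total / len(outcome)


# Toy data: treatment indicator, observed outcome, assumed model outputs.
treated = [1, 0, 1, 0]
outcome = [3.0, 1.0, 5.0, 2.0]
propensity = [0.5, 0.5, 0.5, 0.5]   # assumed fitted P(treated = 1 | covariates)
mu1 = [4.0, 4.0, 4.0, 4.0]          # assumed predicted outcome under treatment
mu0 = [1.5, 1.5, 1.5, 1.5]          # assumed predicted outcome under control

estimates = (
    regression_ate(mu1, mu0),
    ipw_ate(treated, outcome, propensity),
    aipw_ate(treated, outcome, propensity, mu1, mu0),
)
```

The AIPW form combines both models, which is why it remains consistent if either the treatment model or the outcome model is correctly specified.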
An implementation of the decimated two-dimensional complex dual-tree wavelet transform as described in Kingsbury (1999) <doi:10.1098/rsta.1999.0447> and Selesnick et al. (2005) <doi:10.1109/MSP.2005.1550194>. Also includes the undecimated version and spectral bias correction described in Nelson et al. (2018) <doi:10.1007/s11222-017-9784-0>. The code is partly based on the dtcwt Python library.
Density ratio estimation. The estimated density ratio function can be used in many applications such as anomaly detection, change-point detection, and covariate shift adaptation. The implemented methods are uLSIF (Hido et al. (2011) <doi:10.1007/s10115-010-0283-2>), RuLSIF (Yamada et al. (2011) <doi:10.1162/NECO_a_00442>), and KLIEP (Sugiyama et al. (2007) <doi:10.1007/s10463-008-0197-x>).
This package provides a plot overlaying the niches of multiple species. The plot can be used 1) to determine the niche conditions which favor a higher species richness, 2) to create a box plot with the range of environmental variables of the species, 3) to obtain a list of species in an area of the niche selected by the user, and 4) to estimate niche overlap among the species.
This package provides tools for fitting periodic coefficient regression models to data where periodicity plays a crucial role. It allows users to model and analyze relationships between variables that exhibit cyclical or seasonal patterns, offering functions for estimating parameters and testing the periodicity of coefficients in linear regression models. For the simple periodic coefficient regression model, see Regui et al. (2024) <doi:10.1080/03610918.2024.2314662>.
This package provides a system that computes metrics to assess the segmentation accuracy of geospatial data. These metrics calculate the discrepancy between segmented and reference objects, and indicate the segmentation accuracy. For more details on choosing evaluation metrics, see Costa et al. (2018) <doi:10.1016/j.rse.2017.11.024> and Jozdani et al. (2020) <doi:10.1016/j.isprsjprs.2020.01.002>.
Multiple imputation of missing data in a dataset using MICT or MICT-timing methods. The core idea of the algorithms is to fill gaps of missing data, which is the typical form of missing data in a longitudinal setting, recursively from their edges. Prediction is based on either a multinomial or random forest regression model. Covariates and time-dependent covariates can be included in the model.
Can be used to model the fate of soil organic carbon and soil organic nitrogen and to calculate N mineralisation rates. Provides a framework that numerically solves differential equations of soil organic carbon models based on first-order kinetics and extends these models to include the nitrogen component. The name sorcering is an acronym for Soil ORganic Carbon & CN Ratio drIven Nitrogen modellinG framework.
Utilities for handling character vectors that store human-readable text (either plain or with markup, such as HTML or LaTeX). The package provides, in particular, functions that help with the preparation of plain-text reports, e.g. for expanding and aligning strings that form the lines of such reports. The package also provides generic functions for transforming R objects to HTML and to plain text.
Bringing business and financial analysis to the tidyverse. The tidyquant package provides a convenient wrapper to various xts, zoo, quantmod, TTR and PerformanceAnalytics package functions and returns the objects in the tidy tibble format. The main advantage is being able to use quantitative functions with the tidyverse functions including purrr, dplyr, tidyr, ggplot2, lubridate, etc. See the tidyquant website for more information, documentation and examples.
An implementation of Vasicek and Song goodness-of-fit tests. Several functions are provided to estimate differential Shannon entropy, i.e., estimate Shannon entropy of real random variables with density, and test the goodness-of-fit of some family of distributions, including uniform, Gaussian, log-normal, exponential, gamma, Weibull, Pareto, Fisher, Laplace and beta distributions; see Lequesne and Regnault (2020) <doi:10.18637/jss.v096.c01>.
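As an illustration of the entropy estimation step, the following Python sketch computes Vasicek's spacing-based estimate of differential Shannon entropy, H = (1/n) Σ log((n / 2m)(X(i+m) − X(i−m))), where order statistics beyond the sample range are replaced by the minimum or maximum. The function name `vasicek_entropy` and the example sample are assumptions for illustration; the package's own functions and defaults may differ.

```python
import math


def vasicek_entropy(sample, m):
    """Vasicek spacing estimate of differential entropy with window size m.

    Assumes the sample values are distinct; tied values can give a zero
    spacing, for which the logarithm is undefined.
    """
    n = len(sample)
    if not 1 <= m < n / 2:
        raise ValueError("window m must satisfy 1 <= m < n/2")
    x = sorted(sample)
    total = 0.0
    for i in range(n):
        # Clamp indices so that X(j) = X(1) for j < 1 and X(j) = X(n) for j > n.
        upper = x[min(i + m, n - 1)]
        lower = x[max(i - m, 0)]
        total += math.log(n / (2 * m) * (upper - lower))
    return total / n


# Hypothetical evenly spaced sample on [0, 1].
estimate = vasicek_entropy([0.0, 0.25, 0.5, 0.75, 1.0], m=1)
```

A goodness-of-fit test of the Vasicek type then compares such an estimate with the entropy expected under the hypothesized family of distributions.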
epigraHMM provides a set of tools for the analysis of epigenomic data based on hidden Markov models. It contains two separate peak callers, one for consensus peaks from biological or technical replicates, and one for differential peaks from multi-replicate multi-condition experiments. In differential peak calling, epigraHMM provides window-specific posterior probabilities associated with every possible combinatorial pattern of read enrichment across conditions.
scTreeViz provides classes to support interactive data aggregation and visualization of single cell RNA-seq datasets with hierarchies, e.g. cell clusters at different resolutions. The `TreeIndex` class provides methods to manage the hierarchy and split the tree at a given resolution or across resolutions. The `TreeViz` class extends `SummarizedExperiment` and can perform quick aggregations on the count matrix defined by clusters.