This package implements the gene-based segregation test(GESE) and the weighted GESE test for identifying genes with causal variants of large effects for family-based sequencing data. The methods are described in Qiao, D. Lange, C., Laird, N.M., Won, S., Hersh, C.P., et al. (2017). <DOI:10.1002/gepi.22037>. Gene-based segregation method for identifying rare variants for family-based sequencing studies. Genet Epidemiol 41(4):309-319. More details can be found at <http://scholar.harvard.edu/dqiao/gese>.
This package provides functions that (1) fit multivariate discrete distributions, (2) generate random numbers from multivariate discrete distributions, and (3) run regression and penalized regression on the multivariate categorical response data. Implemented models include: multinomial logit model, Dirichlet multinomial model, generalized Dirichlet multinomial model, and negative multinomial model. Making the best of the minorization-maximization (MM) algorithm and Newton-Raphson method, we derive and implement stable and efficient algorithms to find the maximum likelihood estimates. On a multi-core machine, multi-threading is supported.
Uses several Nonlinear Mixed effect (NONMEM) models (as NONMEM control files) to perform bootstrap model averaging and Monte Carlo Simulation for Model Based Bio-Equivalence (MBBE). Power is returned as the fraction of the simulations with successful bioequivalence (BE) test, for maximum concentration (Cmax), Area under the curve to the last observed value (AUClast) and Area under the curve extrapolate to infinity (AUCinf). See Hooker, A. (2020) Improved bioequivalence assessment through model-informed and model-based strategies <https://www.fda.gov/media/138035/download>.
This package provides functions to impute missing values using Gaussian copulas for mixed data types as described by Christoffersen et al. (2021) <arXiv:2102.02642>
. The method is related to Hoff (2007) <doi:10.1214/07-AOAS107> and Zhao and Udell (2019) <arXiv:1910.12845>
but differs by making a direct approximation of the log marginal likelihood using an extended version of the Fortran code created by Genz and Bretz (2002) <doi:10.1198/106186002394> in addition to also support multinomial variables.
Six growth models are fitted using non-linear least squares. These are the Richards, the 3, 4 and 5 parameter logistic, the Gompetz and the Weibull growth models. Reference: Reddy T., Shkedy Z., van Rensburg C. J., Mwambi H., Debba P., Zuma K. and Manda, S. (2021). "Short-term real-time prediction of total number of reported COVID-19 cases and deaths in South Africa: a data driven approach". BMC medical research methodology, 21(1), 1-11. <doi:10.1186/s12874-020-01165-x>.
This package provides an R implementation of the Particle Metropolis within Gibbs sampler for model parameter, covariance matrix and random effect estimation. A more general implementation of the sampler based on the paper by Gunawan, D., Hawkins, G. E., Tran, M. N., Kohn, R., & Brown, S. D. (2020) <doi:10.1016/j.jmp.2020.102368>. An HTML tutorial document describing the package is available at <https://university-of-newcastle-research.github.io/samplerDoc/>
and includes several detailed examples, some background and troubleshooting steps.
Design a Bayesian seamless multi-arm biomarker-enriched phase II/III design with the survival endpoint with allowing sample size re-estimation. James M S Wason, Jean E Abraham, Richard D Baird, Ioannis Gournaris, Anne-Laure Vallier, James D Brenton, Helena M Earl, Adrian P Mander (2015) <doi:10.1038/bjc.2015.278>. Guosheng Yin, Nan Chen, J. Jack Lee (2018) <doi:10.1007/s12561-017-9199-7>. Ying Yuan, Beibei Guo, Mark Munsell, Karen Lu, Amir Jazaeri (2016) <doi:10.1002/sim.6971>.
This package provides tools to simulate multi-omics datasets with predefined signal structures. The generated data can be used for testing, validating, and benchmarking integrative analysis methods such as factor models and clustering approaches. This version includes enhanced signal customization, visualization tools (scatter, histogram, 3D), MOFA-based analysis pipelines, PowerPoint
export, and statistical profiling of datasets. Designed for both method development and teaching, SUMO supports real and synthetic data pipelines with interpretable outputs. Tini, Giulia, et al (2019) <doi:10.1093/bib/bbx167>.
This package provides a standardized workflow to reconstruct spatial configurations of altitude-bounded biogeographic systems over time. For example, tabs can model how island archipelagos expand or contract with changing sea levels or how alpine biomes shift in response to tree line movements. It provides functionality to account for various geophysical processes such as crustal deformation and other tectonic changes, allowing for a more accurate representation of biogeographic system dynamics. For more information see De Groeve et al. (2025) <doi:10.3897/arphapreprints.e151900>.
The wavelet-based variance transformation method is used for system modelling and prediction. It refines predictor spectral representation using Wavelet Theory, which leads to improved model specifications and prediction accuracy. Details of methodologies used in the package can be found in Jiang, Z., Sharma, A., & Johnson, F. (2020) <doi:10.1029/2019WR026962>, Jiang, Z., Rashid, M. M., Johnson, F., & Sharma, A. (2020) <doi:10.1016/j.envsoft.2020.104907>, and Jiang, Z., Sharma, A., & Johnson, F. (2021) <doi:10.1016/J.JHYDROL.2021.126816>.
This package creates complex autoregressive distributed lag (ARDL) models and constructs the underlying unrestricted and restricted error correction model (ECM) automatically, just by providing the order. It also performs the bounds-test for cointegration as described in Pesaran et al. (2001) <doi:10.1002/jae.616> and provides the multipliers and the cointegrating equation. The validity and the accuracy of this package have been verified by successfully replicating the results of Pesaran et al. (2001) in Natsiopoulos and Tzeremes (2022) <doi:10.1002/jae.2919>.
MCMC algorithms & processing functions for: 1. single response multiple regression, see Papageorgiou, G. (2018) <doi: 10.32614/RJ-2018-069>, 2. multivariate response multiple regression, with nonparametric models for the means, the variances and the correlation matrix, with variable selection, see Papageorgiou, G. and Marshall, B. C. (2020) <doi: 10.1080/10618600.2020.1739534>, 3. joint mean-covariance models for multivariate responses, see Papageorgiou, G. (2022) <doi: 10.1002/sim.9376>, and 4.Dirichlet process mixtures, see Papageorgiou, G. (2019) <doi: 10.1111/anzs.12273>.
Detects multiple changes in slope using the CPOP dynamic programming approach of Fearnhead, Maidstone, and Letchford (2019) <doi:10.1080/10618600.2018.1512868>. This method finds the best continuous piecewise linear fit to data under a criterion that measures fit to data using the residual sum of squares, but penalizes complexity based on an L0 penalty on changes in slope. Further information regarding the use of this package with detailed examples can be found in Fearnhead and Grose (2024) <doi:10.18637/jss.v109.i07>.
This package provides a set of functions to perform Raju, van der Linden and Fleer's (1995, <doi:10.1177/014662169501900405>) Differential Functioning of Items and Tests (DFIT) analyses. It includes functions to use the Monte Carlo Item Parameter Replication approach (Oshima, Raju, & Nanda, 2006, <doi:10.1111/j.1745-3984.2006.00001.x>) for obtaining the associated statistical significance tests cut-off points. They may also be used for a priori and post-hoc power calculations (Cervantes, 2017, <doi:10.18637/jss.v076.i05>).
Fits the space-time Epidemic Type Aftershock Sequence ('ETAS') model to earthquake catalogs using a stochastic declustering approach. The ETAS model is a spatio-temporal marked point process model and a special case of the Hawkes process. The package is based on a Fortran program by Jiancang Zhuang (available at <http://bemlar.ism.ac.jp/zhuang/software.html>), which is modified and translated into C++ and C such that it can be called from R. Parallel computing with OpenMP
is possible on supported platforms.
User-friendly and fast set of functions for estimating parameters of hierarchical Bayesian species distribution models (Latimer and others 2006 <doi:10.1890/04-0609>). Such models allow interpreting the observations (occurrence and abundance of a species) as a result of several hierarchical processes including ecological processes (habitat suitability, spatial dependence and anthropogenic disturbance) and observation processes (species detectability). Hierarchical species distribution models are essential for accurately characterizing the environmental response of species, predicting their probability of occurrence, and assessing uncertainty in the model results.
Identify Cancer Dysfunctional Sub-pathway by integrating gene expression, DNA methylation and copy number variation, and pathway topological information. 1)We firstly calculate the gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2)Secondly, we perform a greedy search algorithm to identify the key dysfunctional sub-pathways within the pathways for which the discriminative scores were locally maximal. 3)Finally, the permutation test was used to calculate statistical significance level for these key dysfunctional sub-pathways.
Interpretability of complex machine learning models is a growing concern. This package helps to understand key factors that drive the decision made by complicated predictive model (so called black box model). This is achieved through local approximations that are either based on additive regression like model or CART like model that allows for higher interactions. The methodology is based on Tulio Ribeiro, Singh, Guestrin (2016) <doi:10.1145/2939672.2939778>. More details can be found in Staniak, Biecek (2018) <doi:10.32614/RJ-2018-072>.
To determine the number of quantitative assays needed for a sample of data using pooled testing methods, which include mini-pooling (MP), MP with algorithm (MPA), and marker-assisted MPA (mMPA
). To estimate the number of assays needed, the package also provides a tool to conduct Monte Carlo (MC) to simulate different orders in which the sample would be collected to form pools. Using MC avoids the dependence of the estimated number of assays on any specific ordering of the samples to form pools.
Functions, examples and data from the first and the second edition of "Numerical Methods and Optimization in Finance" by M. Gilli, D. Maringer and E. Schumann (2019, ISBN:978-0128150658). The package provides implementations of optimisation heuristics (Differential Evolution, Genetic Algorithms, Particle Swarm Optimisation, Simulated Annealing and Threshold Accepting), and other optimisation tools, such as grid search and greedy search. There are also functions for the valuation of financial instruments such as bonds and options, for portfolio selection and functions that help with stochastic simulations.
This is the R API for the nfer formalism (<http://nfer.io/>). nfer was developed to specify event stream abstractions for spacecraft telemetry such as the Mars Science Laboratory. Users write rules using a syntax that borrows heavily from Allen's Temporal Logic that, when applied to an event stream, construct a hierarchy of temporal intervals with data. The R API supports loading rules from a file or mining them from historical data. Traces of events or pools of intervals are provided as data frames.
Accesses OpenWeatherMap's
(owm) <https://openweathermap.org/> API. owm itself is a service providing weather data in the past, in the future and now. Furthermore, owm serves weather map layers usable in frameworks like leaflet'. In order to access the API, you need to sign up for an API key. There are free and paid plans. Beside functions for fetching weather data from owm', owmr supplies tools to tidy up fetched data (for fast and simple access) and to show it on leaflet maps.
This software ADAM
is a Gene set enrichment analysis (GSEA) package created to group a set of genes from comparative samples (control versus experiment) belonging to different species according to their respective functions. The corresponding roles are extracted from the default collections like Gene ontology and Kyoto encyclopedia of genes and genomes (KEGG). ADAM
show their significance by calculating the p-values referring to gene diversity and activity. Each group of genes is called Group of functionally associated genes (GFAG).
GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity. The gage package provides functions for basic GAGE analysis, result processing and presentation. In addition, it provides demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These functions and data are also useful for gene set analysis using other methods.