This package implements a likelihood method to present evidence for evaluating bioequivalence (BE). The functions use bioequivalence data [area under the blood concentration-time curve (AUC) and peak concentration (Cmax)] from various crossover designs commonly used in BE studies, including a fully replicated design, a partially replicated design, and a conventional 2x2 crossover design. They calculate the profile likelihoods for the mean difference, total standard deviation ratio, and within-subject standard deviation ratio for a test and a reference drug. A plot of a standardized profile likelihood can be generated along with the maximum likelihood estimate and likelihood intervals, which present evidence for bioequivalence. See Liping Du and Leena Choi (2015) <doi:10.1002/pst.1661>.
Empirical Bayes thresholding using the methods developed by I. M. Johnstone and B. W. Silverman. The basic problem is to estimate a mean vector given a vector of observations of the mean vector plus white noise, taking advantage of possible sparsity in the mean vector. Within a Bayesian formulation, the elements of the mean vector are modelled as having, independently, a distribution that is a mixture of an atom of probability at zero and a suitable heavy-tailed distribution. The mixing parameter can be estimated by a marginal maximum likelihood approach. This leads to an adaptive thresholding approach on the original data. Extensions of the basic method, in particular to wavelet thresholding, are also implemented within the package.
Subgroup analyses are routinely performed in clinical trial analyses. From a methodological perspective, two key issues of subgroup analyses are multiplicity (even if only predefined subgroups are investigated) and the low sample sizes of subgroups, which lead to highly variable estimates; see e.g. Yusuf et al. (1991) <doi:10.1001/jama.1991.03470010097038>. This package implements subgroup estimates based on Bayesian shrinkage priors, see Carvalho et al. (2009) <https://proceedings.mlr.press/v5/carvalho09a.html>. In addition, estimates based on penalized likelihood inference are available, based on Simon et al. (2011) <doi:10.18637/jss.v039.i05>. The corresponding shrinkage-based forest plots address the aforementioned issues and can complement standard forest plots in practical clinical trial analyses.
Google offers public access to global search volumes from its search engine through the Google Trends portal. The package downloads these search volumes provided by Google Trends and uses them to measure and analyze the distribution of search scores across countries or within countries. The package allows researchers and analysts to use these search scores to investigate global trends based on patterns within these scores. This offers insights such as the degree of internationalization of firms and organizations or the dissemination of political, social, or technological trends across the globe or within single countries. An outline of the package's methodological foundations and potential applications is available as a working paper: <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3969013>.
This R package provides a single procedure, guix.install(), which allows users to install R packages via Guix right from within their running R session. If the requested R package does not exist in Guix at this time, the package and all its missing dependencies will be imported recursively and the generated package definitions will be written to ~/.Rguix/packages.scm. This record of imported packages can be used later to reproduce the environment, and to add the packages in question to a proper Guix channel (or Guix itself). guix.install() not only supports installing packages from CRAN, but also from Bioconductor or even arbitrary git or mercurial repositories, replacing the need for installation via devtools.
Differential exon usage test for RNA-Seq data via an empirical Bayes shrinkage method for the dispersion parameter that utilizes inclusion-exclusion data to analyze the propensity to skip an exon across groups. The input data consists of two matrices where each row represents an exon and the columns represent the biological samples. The first matrix is the count of the number of reads expressing the exon for each sample. The second matrix is the count of the number of reads that either express the exon or explicitly skip the exon across the samples, a.k.a. the total count matrix. Dividing the two matrices element-wise yields proportions representing the propensity to express the exon versus skipping the exon for each sample.
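The division of the two count matrices can be sketched in a few lines. This is a minimal illustration in Python, not the package's own code: the function name `inclusion_proportions`, the toy counts, and the choice to return `None` where the total count is zero are all assumptions made for the example.

```python
def inclusion_proportions(inclusion, total):
    """Element-wise inclusion / total for two exon-by-sample count matrices.

    Both arguments are lists of rows (one row per exon, one column per sample).
    Cells with a zero total count have no defined proportion and yield None
    (an assumption of this sketch).
    """
    if len(inclusion) != len(total):
        raise ValueError("matrices must have the same number of rows")
    proportions = []
    for inc_row, tot_row in zip(inclusion, total):
        if len(inc_row) != len(tot_row):
            raise ValueError("rows must have the same number of columns")
        proportions.append(
            [inc / tot if tot > 0 else None for inc, tot in zip(inc_row, tot_row)]
        )
    return proportions


# Hypothetical toy data: 2 exons (rows) x 3 samples (columns).
inclusion = [[30, 45, 10],
             [5, 0, 8]]
total = [[40, 50, 40],
         [10, 0, 16]]

props = inclusion_proportions(inclusion, total)
print(props)
```

Each entry is the fraction of informative reads that express the exon in that sample; the empirical Bayes dispersion shrinkage described above operates on the underlying counts rather than on these raw proportions.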
Implementations of several multiple testing procedures that control the family-wise error rate (FWER) designed specifically for discrete tests. Included are discrete adaptations of the Bonferroni, Holm, Hochberg and Šidák procedures as described in the papers Döhler (2010) "Validation of credit default probabilities using multiple-testing procedures" <doi:10.21314/JRMV.2010.062> and Zhu & Guo (2019) "Family-Wise Error Rate Controlling Procedures for Discrete Data" <doi:10.1080/19466315.2019.1654912>. The main procedures of this package take as input the results of a test procedure from package DiscreteTests or a set of observed p-values and their discrete support under their nulls. A shortcut function to apply discrete procedures directly to data is also provided.
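For orientation, the classical (non-discrete) Bonferroni and Holm procedures that the discrete adaptations start from can be sketched as follows. This Python sketch is only a baseline under stated assumptions: it does not use the discrete supports of the p-values, which is exactly what the discrete versions add, and the function names and example p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Classical Bonferroni: reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Classical Holm step-down procedure.

    Sort the p-values in ascending order and compare the k-th smallest
    (k = 0, 1, ...) with alpha / (m - k); stop at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical p-values for four hypotheses.
p = [0.001, 0.04, 0.015, 0.02]
print(bonferroni_reject(p))  # [True, False, False, False]
print(holm_reject(p))        # [True, True, True, True]
```

Both procedures control the FWER at level alpha; Holm is never less powerful than Bonferroni, as the example shows.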
Presents two methods to estimate the parameters mu, sigma, and tau of an ex-Gaussian distribution. Those methods are Quantile Maximization Likelihood Estimation ('QMLE') and Bayesian. The QMLE method allows a choice between three different estimation algorithms for these parameters: neldermead ('NEMD'), fminsearch ('FMIN'), and nlminb ('NLMI'). For more details about the methods, refer to the following list: Brown, S., & Heathcote, A. (2003) <doi:10.3758/BF03195527>; McCormack, P. D., & Wright, N. M. (1964) <doi:10.1037/h0083285>; Van Zandt, T. (2000) <doi:10.3758/BF03214357>; El Haj, A., Slaoui, Y., Solier, C., & Perret, C. (2021) <doi:10.19139/soic-2310-5070-1251>; Gilks, W. R., Best, N. G., & Tan, K. K. C. (1995) <doi:10.2307/2986138>.
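To make the role of the three parameters concrete: an ex-Gaussian variable is the sum of a normal variable with mean mu and standard deviation sigma and an independent exponential variable with mean tau. The Python sketch below only simulates such data and checks its moments; it does not implement QMLE or the Bayesian estimator, and the function name and parameter values are illustrative assumptions.

```python
import random
import statistics


def sample_ex_gaussian(n, mu, sigma, tau, rng):
    """Draw n ex-Gaussian values as Normal(mu, sigma) + Exponential(mean tau)."""
    # random.expovariate takes the rate, i.e. 1 / mean.
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
rts = sample_ex_gaussian(20000, mu=400.0, sigma=40.0, tau=100.0, rng=rng)

sample_mean = statistics.fmean(rts)
sample_var = statistics.variance(rts)

# Theory: mean = mu + tau, variance = sigma**2 + tau**2.
print(sample_mean, "vs", 400.0 + 100.0)
print(sample_var, "vs", 40.0**2 + 100.0**2)
```

The sample mean and variance should land close to the theoretical values mu + tau and sigma^2 + tau^2, which is a quick sanity check for any estimator applied to simulated data.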
This package provides a simple-to-use, intuitive, and extensible interface to several stochastic simulation algorithms for generating simulated trajectories of finite population continuous-time models. Currently it implements Gillespie's exact stochastic simulation algorithm (Direct method) and several approximate methods (Explicit tau-leap, Binomial tau-leap, and Optimized tau-leap). The package also contains a library of template models that can be run as demo models and can easily be customized and extended. Currently the following models are included: Decaying-Dimerization reaction set, linear chain system, logistic growth model, Lotka predator-prey model, Rosenzweig-MacArthur predator-prey model, Kermack-McKendrick SIR model, and a metapopulation SIRS model. See Pineda-Krch et al. (2008) <doi:10.18637/jss.v025.i12>.
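The Direct method mentioned above can be sketched in a few lines of Python. This is a minimal, self-contained illustration rather than the package's interface: the function name `gillespie_direct`, its arguments, and the logistic growth rates and parameter values (b, d, K) are assumptions chosen for the example.

```python
import random


def gillespie_direct(x0, propensities, state_changes, t_end, rng):
    """Gillespie's Direct method for a continuous-time, finite-population model.

    x0            -- initial state (list of counts)
    propensities  -- list of functions mapping the state to a reaction rate
    state_changes -- list of state-change vectors, one per reaction
    """
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        rates = [a(x) for a in propensities]
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Time to the next event is exponential with rate `total`.
        t_next = t + rng.expovariate(total)
        if t_next > t_end:
            break
        # Pick the reaction with probability proportional to its rate.
        u = rng.random() * total
        cumulative = 0.0
        for j, rate in enumerate(rates):
            cumulative += rate
            if u < cumulative:
                break
        x = [xi + dx for xi, dx in zip(x, state_changes[j])]
        t = t_next
        times.append(t)
        states.append(tuple(x))
    return times, states


# Illustrative logistic growth model: births at rate b*N,
# deaths at rate (d + (b - d) * N / K) * N. Values are hypothetical.
b, d, K = 2.0, 1.0, 1000.0
propensities = [
    lambda x: b * x[0],
    lambda x: (d + (b - d) * x[0] / K) * x[0],
]
state_changes = [[+1], [-1]]

rng = random.Random(1)
times, states = gillespie_direct([500], propensities, state_changes, 10.0, rng)
print(len(times), "events; final population:", states[-1][0])
```

Every event is simulated individually, which is exact but slow for large populations; the tau-leap methods trade exactness for speed by firing many reactions per step.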
This package provides a fast C++ implementation of the design-based Diffusion Decision Model (DDM) and the Linear Ballistic Accumulation (LBA) model. It enables the user to optimise the choice response time model by connecting with the Differential Evolution Markov Chain Monte Carlo (DE-MCMC) sampler implemented in the ggdmc package. The package combines hierarchical modelling, Bayesian inference, choice response time models and factorial designs, allowing users to build their own design-based models. For more information on the underlying models, see the works by Voss, Rothermund, and Voss (2004) <doi:10.3758/BF03196893>, Ratcliff and McKoon (2008) <doi:10.1162/neco.2008.12-06-420>, and Brown and Heathcote (2008) <doi:10.1016/j.cogpsych.2007.12.002>.
An integrative toolbox of word embedding research that provides: (1) a collection of pre-trained static word vectors in the .RData compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a group of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; and (4) a set of training methods to locally train (static) word vectors from text corpora, including Word2Vec <doi:10.48550/arXiv.1301.3781>, GloVe <doi:10.3115/v1/D14-1162>, and FastText <doi:10.48550/arXiv.1607.04606>.
This tool fits a non-parametric Bayesian model called a "hierarchically coupled mixture model with local dependence (HCMM-LD)" to the original microdata in order to generate synthetic microdata for privacy protection. The non-parametric feature of the adopted model is useful for capturing the joint distribution of the original input data in a highly flexible manner, leading to the generation of synthetic data whose distributional features are similar to those of the input data. The package allows the original input data to have missing values and imputes them with the posterior predictive distribution, so no missing values exist in the synthetic data output. The method builds on the work of Murray and Reiter (2016) <doi:10.1080/01621459.2016.1174132>.
Facilitates the analysis of SNP (single nucleotide polymorphism) and silicodart (presence/absence) data. dartR.popgen provides a suite of functions to analyse such data in a population genetics context. It provides several functions to calculate population genetic metrics and to study population structure. Quite a few functions need additional software to be able to run (gl.run.structure(), gl.blast(), gl.LDNe()). The help pages describe in detail how to download and link this software so the functions can run it. dartR.popgen is part of the dartRverse suite of packages. See Gruber et al. (2018) <doi:10.1111/1755-0998.12745> and Mijangos et al. (2022) <doi:10.1111/2041-210X.13918>.
Several statistical methods for analyzing survival data under various forms of dependent censoring are implemented in the package. In addition to accounting for dependent censoring, it offers tools to adjust for unmeasured confounding factors. The implemented approaches allow users to estimate the dependency between survival time and dependent censoring time, based solely on observed survival data. For more details on the methods, refer to Deresa and Van Keilegom (2021) <doi:10.1093/biomet/asaa095>, Czado and Van Keilegom (2023) <doi:10.1093/biomet/asac067>, Crommen et al. (2024) <doi:10.1007/s11749-023-00903-9>, Deresa and Van Keilegom (2024) <doi:10.1080/01621459.2022.2161387>, Rutten et al. (2024+) <doi:10.48550/arXiv.2403.11860> and Ding and Van Keilegom (2024).
This package provides a function that implements the acceptance-rejection method, proposed by von Neumann J. (1951) <https://mcnp.lanl.gov/pdf_files/>, in an optimized manner to generate pseudo-random observations for discrete or continuous random variables. The function is optimized to work in parallel on Unix-based operating systems and performs well on Windows systems. The implemented acceptance-rejection method optimizes the probability of generating observations from the desired random variable; the user simply provides the probability function or probability density function, in the discrete and continuous cases, respectively. The implementation is based on Casella, George et al. (2004) <https://www.jstor.org/stable/4356322>, Neal, Radford M. (2003) <https://www.jstor.org/stable/3448413> and Bishop, Christopher M. (2006, ISBN: 978-0387310732).
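The basic acceptance-rejection step can be sketched as follows. This is a plain, sequential Python illustration of the textbook method, not the package's optimized or parallel implementation; the function name `accept_reject`, the target density f(x) = 6x(1 - x) on [0, 1], the uniform proposal, and the bound c = 1.5 are all assumptions chosen for the example.

```python
import random


def accept_reject(n, target_pdf, proposal_sample, proposal_pdf, c, rng):
    """Draw n values from target_pdf by acceptance-rejection.

    Requires target_pdf(x) <= c * proposal_pdf(x) for all x.
    A candidate x from the proposal is accepted with probability
    target_pdf(x) / (c * proposal_pdf(x)).
    """
    samples = []
    while len(samples) < n:
        x = proposal_sample(rng)
        u = rng.random()
        if u * c * proposal_pdf(x) <= target_pdf(x):
            samples.append(x)
    return samples


def target(x):
    """Density 6x(1 - x) on [0, 1]; its maximum is 1.5 at x = 0.5."""
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


rng = random.Random(42)
draws = accept_reject(
    5000,
    target,
    proposal_sample=lambda r: r.random(),  # Uniform(0, 1) proposal
    proposal_pdf=lambda x: 1.0,
    c=1.5,
    rng=rng,
)
mean_draw = sum(draws) / len(draws)
print(len(draws), mean_draw)
```

On average one candidate in c is accepted (here two out of three), so choosing a proposal that keeps c close to 1 is what makes the method efficient.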
This package provides a framework to help construct R data packages in a reproducible manner. Potentially time-consuming processing of raw data sets into analysis-ready data sets is done in a reproducible manner and decoupled from the usual R CMD build process so that data sets can be processed into R objects in the data package and the data package can then be shared, built, and installed by others without the need to repeat computationally costly data processing. The package maintains data provenance by turning the data processing scripts into package vignettes, as well as enforcing documentation and version checking of included data objects. Data packages can be version controlled on GitHub, and used to share data for manuscripts, collaboration and reproducible research.
Identification of putative causal variants in genome-wide association studies with trio and duo families. The package calculates the W feature statistics from KnockoffTrio and p-values from the family-based association test (FBAT) using trio and/or duo data. Compared to previous versions, a significant improvement has been made in Version 1.1.0 to allow the package to be applied not only to trio families but also to duo families. The package implements the methods in the paper: "Yang, Y., Wang, C., Liu, L., Buxbaum, J., He, Z., & Ionita-Laza, I. (2022). KnockoffTrio: A knockoff framework for the identification of putative causal variants in genome-wide association studies with trio design. The American Journal of Human Genetics, 109(10), 1761-1776.".
Two-stage design for single-arm phase II trials with time-to-event endpoints (e.g., clinical trials on immunotherapies among cancer patients) can be calculated using this package. Two notable advantages of the package: 1) It provides flexible choices from three design methods (optimal, minmax, and admissible), and 2) the power of the design is more accurately calculated using the exact variance in the one-sample log-rank test. The package can be used for 1) planning the sample sizes and other design parameters, and 2) conducting the interim and final analyses for the Go/No-go decisions. More details about the design method can be found in: Wu, J, Chen L, Wei J, Weiss H, Chauhan A. (2020). <doi:10.1002/pst.1983>.
Develop, evaluate, and score multiple choice examinations, psychological scales, questionnaires, and similar types of data involving sequences of choices among one or more sets of answers. This version of the package should be considered as brand new. Almost all of the functions have been changed, including their argument list. See the file NEWS.Rd in the Inst folder for more information. Using the package does not require any formal statistical knowledge beyond what would be provided by a first course in statistics in a social science department. There the user would encounter the concept of probability and how it is used to model data and make decisions, and would become familiar with basic mathematical and statistical notation. Most of the output is in graphical form.
Estimation of bifurcating autoregressive models of any order, p, BAR(p) as well as several types of bias correction for the least squares estimators of the autoregressive parameters as described in Zhou and Basawa (2005) <doi:10.1016/j.spl.2005.04.024> and Elbayoumi and Mostafa (2020) <doi:10.1002/sta4.342>. Currently, the bias correction methods supported include bootstrap (single, double and fast-double) bias correction and linear-bias-function-based bias correction. Functions for generating and plotting bifurcating autoregressive data from any BAR(p) model are also included. This new version includes the calculation of several types of bias-corrected and -uncorrected confidence intervals for the least squares estimators of the autoregressive parameters as described in Elbayoumi and Mostafa (2023) <doi:10.6339/23-JDS1092>.
This package provides a comprehensive set of wrapper functions for the analysis of multiplex metabarcode data. It includes robust wrappers for Cutadapt and DADA2 to trim primers, filter reads, perform amplicon sequence variant (ASV) inference, and assign taxonomy. The package can handle single metabarcode datasets, datasets with two pooled metabarcodes, or multiple datasets simultaneously. The final output is a matrix per metabarcode, containing both ASV abundance data and associated taxonomic assignments. An optional function converts these matrices into phyloseq and taxmap objects. For more information on DADA2, including information on how DADA2 infers sample sequences, see Callahan et al. (2016) <doi:10.1038/nmeth.3869>. For more details on the demulticoder R package see Sudermann et al. (2025) <doi:10.1094/PHYTO-02-25-0043-FI>.
This package provides a simulator for reticulate evolution under a birth-death-hybridization process. Here the birth-death process is extended to consider reticulate evolution by allowing hybridization events to occur. The general-purpose simulator allows the modeling of three different reticulate patterns: lineage generative hybridization, lineage neutral hybridization, and lineage degenerative hybridization. Users can also specify hybridization events to be dependent on a trait value or genetic distance. We also extend some phylogenetic tree utility and plotting functions for networks. We allow two different stopping conditions: simulating to a fixed time or to a fixed number of taxa. When simulating to a fixed number of taxa, the user can simulate under the Generalized Sampling Approach that properly simulates phylogenies when assuming a uniform prior on the root age.
Fast, flexible and user-friendly tools for distribution comparison through direct density ratio estimation. The estimated density ratio can be used for covariate shift adjustment, outlier detection, change-point detection, classification and evaluation of synthetic data quality. The package implements multiple non-parametric estimation techniques (unconstrained least-squares importance fitting, ulsif(); Kullback-Leibler importance estimation procedure, kliep(); spectral density ratio estimation, spectral(); kernel mean matching, kmm(); and least-squares hetero-distributional subspace search, lhss()) with automatic tuning of hyperparameters. Helper functions are available for two-sample testing and visualizing the density ratios. For a general overview of density ratio estimation, see Sugiyama et al. (2012) <doi:10.1017/CBO9781139035613>, and see the help files for references on the specific estimation techniques.
This package provides R with the Glottolog database <https://glottolog.org/> and some more abilities for purposes of linguistic mapping. The Glottolog database contains the catalogue of languages of the world. This package helps researchers to make linguistic maps, following the philosophy of the Cross-Linguistic Linked Data project <https://clld.org/>, which facilitates uniform access to the data across publications. A tutorial for this package is available on GitHub pages <https://docs.ropensci.org/lingtypology/> and in the package vignette. Maps created by this package can be used for both research and linguistic teaching. In addition, the package provides the ability to download data from typological databases such as WALS, AUTOTYP and some others and to create your own database website.