This package provides functions to implement and simulate the partial order continual reassessment method (PO-CRM) of Wages, Conaway and O'Quigley (2011) <doi:10.1177/1740774511408748> for use in Phase I trials of combinations of agents. Provides a function for generating a set of initial guesses (skeleton) for the toxicity probabilities at each combination that correspond to the set of possible orderings of the toxicity probabilities specified by the user.
Power analysis for AB testing. The calculations are based on the Welch's unequal variances t-test, which is generally preferred over the Student's t-test when sample sizes and variances of the two groups are unequal, which is frequently the case in AB testing. In such situations, the Student's t-test will give biased results due to using the pooled standard deviation, unlike the Welch's t-test.
The straightforward filtering index (SFINX) identifies true positive protein interactions in a fast, user-friendly, and highly accurate way. It is not only useful for the filtering of affinity purification - mass spectrometry (AP-MS) data, but also for similar types of data resulting from other co-complex interactomics technologies, such as TAP-MS, Virotrap and BioID. SFINX can also be used via the website interface at <http://sfinx.ugent.be>.
Utilities for rapidly loading specified rows and/or columns of data from large tab-separated value (tsv) files (large: e.g. 1 GB file of 10000 x 10000 matrix). tsvio is an R wrapper to C code that creates an index file for the rows of the tsv file, and uses that index file to collect rows and/or columns from the tsv file without reading the whole file into memory.
Extends standard penalized regression (Lasso, Ridge, and Elastic-net) to allow feature-specific shrinkage based on external information with the goal of achieving a better prediction accuracy and variable selection. Examples of external information include the grouping of predictors, prior knowledge of biological importance, external p-values, function annotations, etc. The choice of multiple tuning parameters is done using an Empirical Bayes approach. A majorization-minimization algorithm is employed for implementation.
The development of high-throughput sequencing led to increased use of co-expression analysis to go beyong single feature (i.e. gene) focus. We propose GWENA (Gene Whole co-Expression Network Analysis) , a tool designed to perform gene co-expression network analysis and explore the results in a single pipeline. It includes functional enrichment of modules of co-expressed genes, phenotypcal association, topological analysis and comparison of networks configuration between conditions.
This package enables the interpretation and analysis of results from a gene set enrichment analysis using network-based and text-mining approaches. Most enrichment analyses result in large lists of significant gene sets that are difficult to interpret. Tools in this package help build a similarity-based network of significant gene sets from a gene set enrichment analysis that can then be investigated for their biological function using text-mining approaches.
This package provides an R wrapper for libnabo, an exact or approximate k nearest neighbour library which is optimised for low dimensional spaces (e.g. 3D). nabor includes a knn function that is designed as a drop-in replacement for the RANN function nn2. In addition, objects which include the k-d tree search structure can be returned to speed up repeated queries of the same set of target points.
R comes with a suite of utilities for linear algebra with "numeric" (double precision) vectors/matrices. However, sometimes single precision (or less!) is more than enough for a particular task. This package extends R's linear algebra facilities to include 32-bit float (single precision) data. Float vectors/matrices have half the precision of their "numeric"-type counterparts but are generally faster to numerically operate on, for a performance vs accuracy trade-off.
This package provides an implementation of the ACME estimator, described in Wolpert (2015), ACME: A Partially Periodic Estimator of Avian & Chiropteran Mortality at Wind Turbines. Unlike most other models, this estimator supports decreasing-hazard Weibull model for persistence; decreasing search proficiency as carcasses age; variable bleed-through at successive searches; and interval mortality estimates. The package provides, based on search data, functions for estimating the mortality inflation factor in Frequentist and Bayesian settings.
This package provides a collection of functions for structure learning of causal networks and estimation of joint causal effects from observational Gaussian data. Main algorithm consists of a Markov chain Monte Carlo scheme for posterior inference of causal structures, parameters and causal effects between variables. References: F. Castelletti and A. Mascaro (2021) <doi:10.1007/s10260-021-00579-1>, F. Castelletti and A. Mascaro (2022) <doi:10.48550/arXiv.2201.12003>.
Supplies higher-order coordinatized data specification and fluid transform operators that include pivot and anti-pivot as special cases. The methodology is describe in Zumel', 2018, "Fluid data reshaping with cdata'", <https://winvector.github.io/FluidData/FluidDataReshapingWithCdata.html> , <DOI:10.5281/zenodo.1173299> . This package introduces the idea of explicit control table specification of data transforms. Works on in-memory data or on remote data using rquery and SQL database interfaces.
Stan based functions to estimate CAR-MM models. These models allow to estimate Generalised Linear Models with CAR (conditional autoregressive) spatial random effects for spatially and temporally misaligned data, provided a suitable Multiple Membership matrix. The main references are Gramatica, Liverani and Congdon (2023) <doi:10.1214/23-BA1370>, Petrof, Neyens, Nuyts, Nackaerts, Nemery and Faes (2020) <doi:10.1002/sim.8697> and Gramatica, Congdon and Liverani <doi:10.1111/rssc.12480>.
We offer an implementation of the series representation put forth in "A series representation for multidimensional Rayleigh distributions" by Wiegand and Nadarajah <DOI: 10.1002/dac.3510>. Furthermore we have implemented an integration approach proposed by Beaulieu et al. for 3 and 4-dimensional Rayleigh densities (Beaulieu, Zhang, "New simplest exact forms for the 3D and 4D multivariate Rayleigh PDFs with applications to antenna array geometrics", <DOI: 10.1109/TCOMM.2017.2709307>).
Interface to the python package dgpsi for Gaussian process, deep Gaussian process, and linked deep Gaussian process emulations of computer models and networks using stochastic imputation (SI). The implementations follow Ming & Guillas (2021) <doi:10.1137/20M1323771> and Ming, Williamson, & Guillas (2023) <doi:10.1080/00401706.2022.2124311> and Ming & Williamson (2023) <doi:10.48550/arXiv.2306.01212>. To get started with the package, see <https://mingdeyu.github.io/dgpsi-R/>.
Generally, most of the packages specify the probability density function, cumulative distribution function, quantile function, and random numbers generation of the probability distributions. The present package allows to compute some important distributional properties, including the first four ordinary and central moments, Pearson's coefficient of skewness and kurtosis, the mean and variance, coefficient of variation, median, and quartile deviation at some parametric values of several well-known and extensively used probability distributions.
This package provides functions for calculating various measures of foreign policy similarity or association commonly used in the study of international relations. These include Signorino and Ritter's S statistic (weighted and unweighted), Cohen's weighted kappa, Scott's pi, and Kendall's tau-b. The package facilitates the generation of dyadic similarity scores for empirical analyses and can also serve as an educational resource for understanding how such measures are derived.
Statistical testing procedures for detecting GxE (gene-environment) interactions. The main focus lies on GRSxE interaction tests that aim at detecting GxE interactions through GRS (genetic risk scores). Moreover, a novel testing procedure based on bagging and OOB (out-of-bag) predictions is implemented for incorporating all available observations at both GRS construction and GxE testing (Lau et al., 2023, <doi:10.1038/s41598-023-28172-4>).
Designed to simplify geospatial data access from the Statistics Finland Web Feature Service API <https://geo.stat.fi/geoserver/index.html>, the geofi package offers researchers and analysts a set of tools to obtain and harmonize administrative spatial data for a wide range of applications, from urban planning to environmental research. The package contains annually updated time series of municipality key datasets that can be used for data aggregation and language translations.
This package provides functions and methods for: splitting large raster objects into smaller chunks, transferring images from a binary format into raster layers, transferring raster layers into an RData file, calculating the maximum gap (amount of consecutive missing values) of a numeric vector, and fitting harmonic regression models to periodic time series. The homoscedastic harmonic regression model is based on G. Roerink, M. Menenti and W. Verhoef (2000) <doi:10.1080/014311600209814>.
This package provides tools to estimate, compare, and visualize healthcare resource utilization using data derived from electronic health records or real-world evidence sources. The package supports pre index and post index analysis, patient cohort comparison, and customizable summaries and visualizations for clinical and health economics research. Methods implemented are based on Scott et al. (2022) <doi:10.1080/13696998.2022.2037917> and Xia et al. (2024) <doi:10.14309/ajg.0000000000002901>.
Generation of synthetic data from a real dataset using the combination of rank normal inverse transformation with the calculation of correlation matrix <doi:10.1055/a-2048-7692>. Completely artificial data may be generated through the use of Generalized Lambda Distribution and Generalized Poisson Distribution <doi:10.1201/9781420038040>. Quantitative, binary, ordinal categorical, and survival data may be simulated. Functionalities are offered to generate synthetic data sets according to user's needs.
Partial informational correlation (PIC) is used to identify the meaningful predictors to the response from a large set of potential predictors. Details of methodologies used in the package can be found in Sharma, A., Mehrotra, R. (2014). <doi:10.1002/2013WR013845>, Sharma, A., Mehrotra, R., Li, J., & Jha, S. (2016). <doi:10.1016/j.envsoft.2016.05.021>, and Mehrotra, R., & Sharma, A. (2006). <doi:10.1016/j.advwatres.2005.08.007>.
Calculates, via simulation, power and appropriate stopping alpha boundaries (and/or futility bounds) for sequential analyses (i.e., group sequential design) as well as for multiple hypotheses (multiple tests included in an analysis), given any specified global error rate. This enables the sequential use of practically any significance test, as long as the underlying data can be simulated in advance to a reasonable approximation. Lukács (2022) <doi:10.21105/joss.04643>.