Models with skew-normally distributed and thus asymmetric error terms, implementing the methods developed in Badunenko and Henderson (2023) "Production analysis with asymmetric noise" <doi:10.1007/s11123-023-00680-5>. The package provides tools to estimate regression models with skew-normal error terms, allowing both the variance and skewness parameters to be heteroskedastic. It also includes a stochastic frontier framework that accommodates both i.i.d. and heteroskedastic inefficiency terms.
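To make the idea of asymmetric noise concrete, here is a minimal Python sketch of the skew-normal density in the common location/scale/shape parameterization. The function name and the parameter names `xi`, `omega` and `alpha` are assumptions chosen for illustration; this is not the package's interface and does not cover the heteroskedastic or stochastic frontier parts.

```python
import math

def skew_normal_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew-normal density with location xi, scale omega and shape alpha."""
    z = (x - xi) / omega
    # Standard normal density evaluated at z.
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Standard normal distribution function evaluated at alpha * z.
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * big_phi

# alpha = 0 gives the symmetric normal density; a positive alpha moves
# probability mass to the right of the location parameter.
symmetric = skew_normal_pdf(0.5)
right_side = skew_normal_pdf(0.5, alpha=3.0)
left_side = skew_normal_pdf(-0.5, alpha=3.0)
```

With `alpha = 0` the error term is symmetric, so a symmetric-noise regression is a special case of the skew-normal setup.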
Utilities for rapidly loading specified rows and/or columns of data from large tab-separated value (tsv) files (large: e.g. 1 GB file of 10000 x 10000 matrix). tsvio is an R wrapper to C code that creates an index file for the rows of the tsv file, and uses that index file to collect rows and/or columns from the tsv file without reading the whole file into memory.
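The row-index idea can be sketched in a few lines of Python. This is only an illustration of the technique, not tsvio's C implementation or file format: the helper names are hypothetical, and the index is kept in memory here rather than written to an index file.

```python
import os
import tempfile

def build_row_index(tsv_path):
    """Return the byte offset at which each row of the file starts."""
    offsets = []
    with open(tsv_path, "rb") as fh:
        while True:
            pos = fh.tell()
            line = fh.readline()
            if not line:
                break
            offsets.append(pos)
    return offsets

def read_rows(tsv_path, offsets, rows):
    """Read only the requested rows (0-based) by seeking to their offsets."""
    out = []
    with open(tsv_path, "rb") as fh:
        for r in rows:
            fh.seek(offsets[r])
            fields = fh.readline().decode("utf-8").rstrip("\r\n").split("\t")
            out.append(fields)
    return out

# Tiny demonstration file; the real use case is a file far too large to load.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("id\ta\tb\nr1\t1\t2\nr2\t3\t4\nr3\t5\t6\n")
    demo_path = tmp.name

index = build_row_index(demo_path)
selected = read_rows(demo_path, index, [0, 3])  # header row and last data row
os.remove(demo_path)
```

Building the index costs one pass over the file. After that, each requested row is a single seek plus one line read, so memory use does not grow with file size.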
Interface to the TA-Lib (Technical Analysis Library) C library, providing access to 150+ indicators (e.g. Average Directional Movement Index (ADX), Moving Average Convergence Divergence (MACD), Relative Strength Index (RSI), Stochastic Oscillator, Bollinger Bands), candlestick pattern recognition, and rolling-window utilities. Core computations are implemented in C for fast Open-High-Low-Close-Volume (OHLCV) time-series feature engineering and rule-based signal generation, with optional interactive visualization via plotly.
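As an illustration of what a rolling-window indicator computes, the following Python sketch derives Bollinger Bands from closing prices. It is a plain reimplementation under stated assumptions (a simple moving average, the population standard deviation, and a hypothetical function name), not a call into TA-Lib or this package.

```python
import statistics

def bollinger_bands(closes, window=20, k=2.0):
    """Return (lower, middle, upper) lists, one entry per full window."""
    lower, middle, upper = [], [], []
    for end in range(window, len(closes) + 1):
        chunk = closes[end - window:end]
        mean = sum(chunk) / window                # simple moving average
        spread = k * statistics.pstdev(chunk)     # k population std. deviations
        lower.append(mean - spread)
        middle.append(mean)
        upper.append(mean + spread)
    return lower, middle, upper

# Toy closing prices; real input would be the close column of OHLCV data.
lower, middle, upper = bollinger_bands([1, 2, 3, 4, 5], window=3)
```

A rule-based signal could then compare each close against the matching `upper` or `lower` value.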
R comes with a suite of utilities for linear algebra with "numeric" (double precision) vectors/matrices. However, sometimes single precision (or less!) is more than enough for a particular task. This package extends R's linear algebra facilities to include 32-bit float (single precision) data. Float vectors/matrices have half the precision of their "numeric"-type counterparts but are generally faster to numerically operate on, for a performance vs accuracy trade-off.
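The accuracy side of that trade-off can be seen with a short Python sketch that rounds double precision values to 32-bit floats using the standard `struct` module. The helper name is hypothetical and the example says nothing about speed.

```python
import struct

def to_float32(x):
    """Round a double precision Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

x = 0.1
x32 = to_float32(x)
rounding_error = abs(x32 - x)  # nonzero: 0.1 is stored less accurately in 32 bits

# A perturbation of 1e-8 survives in double precision but is lost in single.
distinct_in_double = (1.0 + 1e-8) != 1.0
distinct_in_single = to_float32(1.0 + 1e-8) != to_float32(1.0)
```

Single precision carries roughly seven significant decimal digits, so differences smaller than about 1e-7 relative to the value are rounded away.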
This package provides an R wrapper for libnabo, an exact or approximate k nearest neighbour library which is optimised for low dimensional spaces (e.g. 3D). nabor includes a knn function that is designed as a drop-in replacement for the RANN function nn2. In addition, objects which include the k-d tree search structure can be returned to speed up repeated queries of the same set of target points.
This package provides a C++ header library for using the libsoda-cxx library with R. The C++ header reimplements the lsoda function from the ODEPACK library for solving initial value problems for first order ordinary differential equations. The C++ header can be used by other R packages by linking against this package. The C++ functions can be called inline using Rcpp. Finally, the package provides an ode function to call from R.
This package provides an implementation of the ACME estimator, described in Wolpert (2015), ACME: A Partially Periodic Estimator of Avian & Chiropteran Mortality at Wind Turbines. Unlike most other models, this estimator supports a decreasing-hazard Weibull model for persistence; decreasing search proficiency as carcasses age; variable bleed-through at successive searches; and interval mortality estimates. Based on search data, the package provides functions for estimating the mortality inflation factor in Frequentist and Bayesian settings.
Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.
scCB2 is an R package implementing CB2 for distinguishing real cells from empty droplets in droplet-based single-cell RNA-seq experiments (especially for 10x Chromium). It is based on clustering similar barcodes and calculating a Monte Carlo p-value for each cluster to test against the background distribution. This cluster-level test outperforms single-barcode-level tests in dealing with low-count barcodes and homogeneous sequencing libraries, while keeping the FDR well controlled.
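A Monte Carlo p-value in general can be sketched as follows in Python. This is a generic illustration, not the CB2 procedure: the test statistic, the null simulator, and all names are assumptions, and the clustering step is not shown.

```python
import random

def monte_carlo_p_value(observed, simulate_null, n_sim=999, seed=0):
    """Share of null draws at least as extreme as `observed`, with +1 correction."""
    rng = random.Random(seed)
    exceed = sum(1 for _ in range(n_sim) if simulate_null(rng) >= observed)
    return (1 + exceed) / (1 + n_sim)

def null_statistic(rng):
    # Hypothetical background model: the sum of ten uniform(0, 1) draws.
    return sum(rng.random() for _ in range(10))

p_extreme = monte_carlo_p_value(9.9, null_statistic)   # far in the upper tail
p_typical = monte_carlo_p_value(0.0, null_statistic)   # every draw exceeds it
```

The `+1` in numerator and denominator keeps the p-value strictly positive, so its smallest possible value is `1 / (n_sim + 1)`.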
This package provides a collection of functions for structure learning of causal networks and estimation of joint causal effects from observational Gaussian data. The main algorithm consists of a Markov chain Monte Carlo scheme for posterior inference of causal structures, parameters and causal effects between variables. References: F. Castelletti and A. Mascaro (2021) <doi:10.1007/s10260-021-00579-1>, F. Castelletti and A. Mascaro (2022) <doi:10.48550/arXiv.2201.12003>.
Supplies higher-order coordinatized data specification and fluid transform operators that include pivot and anti-pivot as special cases. The methodology is described in Zumel, 2018, "Fluid data reshaping with cdata", <https://winvector.github.io/FluidData/FluidDataReshapingWithCdata.html>, <DOI:10.5281/zenodo.1173299>. This package introduces the idea of explicit control table specification of data transforms. Works on in-memory data or on remote data using rquery and SQL database interfaces.
Stan-based functions to estimate CAR-MM models. These models allow the estimation of Generalised Linear Models with CAR (conditional autoregressive) spatial random effects for spatially and temporally misaligned data, given a suitable Multiple Membership matrix. The main references are Gramatica, Liverani and Congdon (2023) <doi:10.1214/23-BA1370>, Petrof, Neyens, Nuyts, Nackaerts, Nemery and Faes (2020) <doi:10.1002/sim.8697> and Gramatica, Congdon and Liverani <doi:10.1111/rssc.12480>.
We offer an implementation of the series representation put forth in "A series representation for multidimensional Rayleigh distributions" by Wiegand and Nadarajah <DOI:10.1002/dac.3510>. Furthermore, we have implemented an integration approach proposed by Beaulieu et al. for 3- and 4-dimensional Rayleigh densities (Beaulieu, Zhang, "New simplest exact forms for the 3D and 4D multivariate Rayleigh PDFs with applications to antenna array geometrics", <DOI:10.1109/TCOMM.2017.2709307>).
Generally, most packages provide the probability density function, cumulative distribution function, quantile function, and random number generation for probability distributions. The present package computes several important distributional properties, including the first four ordinary and central moments, Pearson's coefficients of skewness and kurtosis, the mean and variance, coefficient of variation, median, and quartile deviation, at given parameter values of several well-known and widely used probability distributions.
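The relationship between ordinary (raw) and central moments, and the summaries built on them, can be sketched in Python as follows. The function names are hypothetical and the moment-based definitions of skewness and kurtosis are an assumption about which variants are meant; the exponential distribution with rate 1 is used because its k-th raw moment is k!.

```python
import math

def central_from_raw(m1, m2, m3, m4):
    """Convert the first four raw moments to the 2nd-4th central moments."""
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4
    return mu2, mu3, mu4

def summarize(m1, m2, m3, m4):
    """Mean, variance, moment-based skewness and kurtosis, and CV."""
    mu2, mu3, mu4 = central_from_raw(m1, m2, m3, m4)
    return {
        "mean": m1,
        "variance": mu2,
        "skewness": mu3 / mu2**1.5,
        "kurtosis": mu4 / mu2**2,
        "cv": math.sqrt(mu2) / m1,
    }

# Exponential distribution with rate 1: raw moments are 1!, 2!, 3!, 4!.
exp_summary = summarize(*(math.factorial(k) for k in range(1, 5)))
```

For this input the summary gives a mean and variance of 1, a skewness of 2 and a kurtosis of 9, which match the known values for the unit-rate exponential distribution.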
Interface to the Python package dgpsi for Gaussian process, deep Gaussian process, and linked deep Gaussian process emulations of computer models and networks using stochastic imputation (SI). The implementations follow Ming & Guillas (2021) <doi:10.1137/20M1323771>, Ming, Williamson, & Guillas (2023) <doi:10.1080/00401706.2022.2124311>, and Ming & Williamson (2023) <doi:10.48550/arXiv.2306.01212>. To get started with the package, see <https://mingdeyu.github.io/dgpsi-R/>.
This package provides functions for calculating various measures of foreign policy similarity or association commonly used in the study of international relations. These include Signorino and Ritter's S statistic (weighted and unweighted), Cohen's weighted kappa, Scott's pi, and Kendall's tau-b. The package facilitates the generation of dyadic similarity scores for empirical analyses and can also serve as an educational resource for understanding how such measures are derived.
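To show how one such measure is derived, here is a minimal Python sketch of Scott's pi for two states' categorical policy positions. The function name and the toy data are hypothetical; this is not the package's interface, and the other measures are not shown.

```python
from collections import Counter

def scotts_pi(a, b):
    """Scott's pi: chance-corrected agreement between two categorical vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed share of items on which both actors take the same position.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Expected agreement from the pooled category proportions of both actors.
    pooled = Counter(a) + Counter(b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (observed - expected) / (1 - expected)

# Toy positions of two states toward four third parties.
state_a = ["ally", "ally", "none", "none"]
state_b = ["ally", "none", "none", "none"]
pi_ab = scotts_pi(state_a, state_b)
```

Here the observed agreement is 3/4 and the expected agreement is 34/64, giving a pi of 7/15. The statistic is undefined when both actors use a single shared category throughout, since expected agreement is then 1.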
Designed to simplify geospatial data access from the Statistics Finland Web Feature Service API <https://geo.stat.fi/geoserver/index.html>, the geofi package offers researchers and analysts a set of tools to obtain and harmonize administrative spatial data for a wide range of applications, from urban planning to environmental research. The package contains annually updated time series of municipality key datasets that can be used for data aggregation and language translations.
Statistical testing procedures for detecting GxE (gene-environment) interactions. The main focus lies on GRSxE interaction tests that aim at detecting GxE interactions through GRS (genetic risk scores). Moreover, a novel testing procedure based on bagging and OOB (out-of-bag) predictions is implemented for incorporating all available observations at both GRS construction and GxE testing (Lau et al., 2023, <doi:10.1038/s41598-023-28172-4>).
The official implementation of the Global Livestock Environmental Assessment Model (GLEAM) of the Food and Agriculture Organization of the United Nations (FAO) in R. GLEAM-X provides a modular, transparent framework for simulating livestock production systems and quantifying their environmental impacts. Methodological background: MacLeod et al. (2017) "Invited review: A position on the Global Livestock Environmental Assessment Model (GLEAM)" <doi:10.1017/S1751731117001847>. Further information: <https://www.fao.org/gleam/en/>.
This package provides functions and methods for: splitting large raster objects into smaller chunks, transferring images from a binary format into raster layers, transferring raster layers into an RData file, calculating the maximum gap (amount of consecutive missing values) of a numeric vector, and fitting harmonic regression models to periodic time series. The homoscedastic harmonic regression model is based on G. Roerink, M. Menenti and W. Verhoef (2000) <doi:10.1080/014311600209814>.
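The maximum-gap calculation can be sketched in a few lines of Python. The function name is hypothetical, and treating both `None` and NaN as missing is an assumption made for this illustration.

```python
import math

def max_gap(values):
    """Length of the longest run of consecutive missing values."""
    longest = current = 0
    for v in values:
        missing = v is None or (isinstance(v, float) and math.isnan(v))
        if missing:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a valid observation ends the current run
    return longest

series = [1.0, float("nan"), float("nan"), 3.0, float("nan"), 5.0]
gap = max_gap(series)
```

A large gap relative to the period of the series is a sign that a harmonic fit may be poorly constrained over that stretch.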
This package provides tools to estimate, compare, and visualize healthcare resource utilization using data derived from electronic health records or real-world evidence sources. The package supports pre-index and post-index analysis, patient cohort comparison, and customizable summaries and visualizations for clinical and health economics research. Methods implemented are based on Scott et al. (2022) <doi:10.1080/13696998.2022.2037917> and Xia et al. (2024) <doi:10.14309/ajg.0000000000002901>.
Generation of synthetic data from a real dataset using the combination of rank normal inverse transformation with the calculation of a correlation matrix <doi:10.1055/a-2048-7692>. Completely artificial data may be generated through the use of the Generalized Lambda Distribution and the Generalized Poisson Distribution <doi:10.1201/9781420038040>. Quantitative, binary, ordinal categorical, and survival data may be simulated. Functionalities are offered to generate synthetic data sets according to the user's needs.
Partial informational correlation (PIC) is used to identify the meaningful predictors of the response from a large set of potential predictors. Details of the methodologies used in the package can be found in Sharma, A., Mehrotra, R. (2014) <doi:10.1002/2013WR013845>, Sharma, A., Mehrotra, R., Li, J., & Jha, S. (2016) <doi:10.1016/j.envsoft.2016.05.021>, and Mehrotra, R., & Sharma, A. (2006) <doi:10.1016/j.advwatres.2005.08.007>.
This package provides a comprehensive suite of tools for analyzing omics data. It includes functionalities for alpha diversity analysis, beta diversity analysis, differential abundance analysis, community assembly analysis, visualization of phylogenetic tree, and functional enrichment analysis. With a progressive approach, the package offers a range of analysis methods to explore and understand the complex communities. It is designed to support researchers and practitioners in conducting in-depth and professional omics data analysis.