Regression inference for multiple populations by integrating summary-level data using stacked imputations. Gu, T., Taylor, J.M.G. and Mukherjee, B. (2021) A synthetic data integration framework to leverage external summary-level information from heterogeneous populations <arXiv:2106.06835>.
Tidies up the forecasting modeling and prediction workflow, extends the broom package with sw_tidy(), sw_glance(), sw_augment(), and sw_tidy_decomp() functions for various forecasting models, and enables converting forecast objects to "tidy" data frames with sw_sweep().
Omics data (e.g. transcriptomics, proteomics, metagenomics...) offer a detailed and multi-dimensional perspective on the molecular components and interactions within complex biological (eco)systems. Analyzing these data requires adapted procedures, which are implemented as steps according to the recipes package.
Visualisation, analysis and quality control of conversational data. Rapid and visual insights into the nature, timing and quality of time-aligned annotations in conversational corpora. For more details, see Dingemanse et al. (2022) <doi:10.18653/v1/2022.acl-long.385>.
This package creates some WebGL shaders. They can be used as the background of a Shiny app. They can also be viewed in the RStudio viewer pane or included in Rmd documents, but this has little use beyond contemplating them.
Wraps the unrtf utility <https://www.gnu.org/software/unrtf/> to extract text from RTF files. Supports document conversion to HTML, LaTeX or plain text. Output in HTML is recommended because unrtf has limited support for converting between character encodings.
Estimates the type of variables in non-quality-controlled data. The prediction is based on a random forest model, trained on over 5000 medical variables with an accuracy of 99%. The accuracy can depend heavily on the type and coding style of the data.
This package contains a mixture of functions and data sets referred to in the introductory e-book "YaRrr!: The Pirate's Guide to R". The latest version of the e-book is available for free at <https://www.thepiratesguidetor.com>.
This package is built under the Bayesian framework of hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) differ between two groups. The package also provides inference and visualization.
Hidden Ising models are implemented to identify enriched genomic regions in ChIP-chip data. They can be used to analyze data from multiple platforms (e.g., Affymetrix, Agilent, and NimbleGen), and data with single to multiple replicates.
Builds hexbin plots for variables and dimension reduction stored in single cell omics data such as SingleCellExperiment. The ideas used in this package are based on the excellent work of Dan Carr, Nicholas Lewin-Koh, Martin Maechler and Thomas Lumley.
The topGO package provides tools for testing gene ontology (GO) terms while accounting for the topology of the GO graph. Different test statistics and different methods for eliminating local similarities and dependencies between GO terms can be implemented and applied.
The biglm package lets you create a linear model object that uses only p^2 memory for p variables. It can be updated with more data using update(). This allows linear regression on data sets larger than memory.
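The idea of a fit whose memory grows with p^2 rather than with the number of rows can be sketched in a few lines of Python. This is an illustration only, not taken from the package: it accumulates the cross-products X'X and X'y chunk by chunk and solves the normal equations, and the package's own internal algorithm is not described here and may well differ. The class name IncrementalOLS and its methods are hypothetical, and the sketch assumes a full-rank design matrix.

```python
# Sketch only: keeps just a p x p matrix and a length-p vector in memory,
# so storage is O(p^2) no matter how many rows are processed.
# All names are hypothetical; this is not the biglm API.

class IncrementalOLS:
    def __init__(self, p):
        self.p = p
        self.xtx = [[0.0] * p for _ in range(p)]  # running X'X
        self.xty = [0.0] * p                      # running X'y

    def update(self, rows, ys):
        """Fold one chunk (rows of length p, responses ys) into the running sums."""
        for x, y in zip(rows, ys):
            for i in range(self.p):
                self.xty[i] += x[i] * y
                for j in range(self.p):
                    self.xtx[i][j] += x[i] * x[j]

    def coef(self):
        """Solve (X'X) b = X'y by Gaussian elimination with partial pivoting.

        Assumes X'X is non-singular (full-rank design).
        """
        p = self.p
        a = [row[:] + [rhs] for row, rhs in zip(self.xtx, self.xty)]
        for col in range(p):
            pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
            a[col], a[pivot] = a[pivot], a[col]
            for r in range(col + 1, p):
                f = a[r][col] / a[col][col]
                for c in range(col, p + 1):
                    a[r][c] -= f * a[col][c]
        b = [0.0] * p
        for i in reversed(range(p)):
            s = a[i][p] - sum(a[i][j] * b[j] for j in range(i + 1, p))
            b[i] = s / a[i][i]
        return b


# Two chunks drawn from y = 1 + 2x; the first column is the intercept.
model = IncrementalOLS(2)
model.update([[1.0, 0.0], [1.0, 1.0]], [1.0, 3.0])
model.update([[1.0, 2.0], [1.0, 3.0]], [5.0, 7.0])
coefficients = model.coef()
```

Each chunk can be discarded after its update call, which is what makes data sets larger than memory tractable. Solving the normal equations directly is numerically less stable than decomposition-based updating, so treat this as a conceptual sketch only.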
This tool generates a large number of both single- and multi-objective test functions. These functions are frequently used for the benchmarking of (numerical) optimization algorithms. Moreover, it offers a set of convenient functions to generate, plot and work with objective functions.
This package lets you build regression models using the techniques in Friedman's papers "Fast MARS" and "Multivariate Adaptive Regression Splines" <doi:10.1214/aos/1176347963>. The term "MARS" is trademarked and thus not used in the name of the package.
This package provides support for measurement units in R vectors, matrices and arrays: automatic propagation, conversion, derivation and simplification of units; raising errors in case of unit incompatibility. It is compatible with the POSIXct, Date and difftime classes.
This package enables variogram modelling, including: simple, ordinary and universal point or block (co)kriging; spatio-temporal kriging; and sequential Gaussian or indicator (co)simulation. It includes variogram and variogram map plotting utility functions, and supports sf and stars.
This package provides R6 abstract classes for building machine learning models with a scikit-learn-like API. Scikit-learn is a popular module for the Python programming language whose design became a de facto standard in industry for machine learning tasks.
Implementation of the RESTK algorithm based on Markov's Inequality from Vilardell, Sergi, Serra, Isabel, Mezzetti, Enrico, Abella, Jaume, Cazorla, Francisco J. and Del Castillo, J. (2022). "Using Markov's Inequality with Power-Of-k Function for Probabilistic WCET Estimation". In 34th Euromicro Conference on Real-Time Systems (ECRTS 2022). Leibniz International Proceedings in Informatics (LIPIcs) 231 20:1-20:24. <doi:10.4230/LIPIcs.ECRTS.2022.20>. This work has been supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 772773).
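For readers unfamiliar with the underlying inequality: for a non-negative random variable X, a threshold t > 0 and a power k > 0, Markov's inequality applied to X^k gives P(X >= t) <= E[X^k] / t^k. The Python sketch below evaluates that bound with the moment estimated from a sample. It is not the RESTK algorithm, the function name markov_power_k_bound is hypothetical, and because the moment is estimated from data, the result is only an estimate of the bound, not a guarantee for the true distribution.

```python
# Sketch only: Markov's inequality with a power-of-k function,
# using an empirical k-th moment. Not the RESTK algorithm.

def markov_power_k_bound(samples, t, k):
    """Estimate the bound E[X^k] / t^k on P(X >= t) from non-negative samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if t <= 0 or k <= 0:
        raise ValueError("t and k must be positive")
    if any(x < 0 for x in samples):
        raise ValueError("Markov's inequality requires non-negative values")
    moment = sum(x ** k for x in samples) / len(samples)
    return min(1.0, moment / t ** k)  # a probability bound above 1 is uninformative


# Hypothetical execution-time measurements (arbitrary units).
times = [1.0, 2.0, 3.0]
bound_k1 = markov_power_k_bound(times, t=6.0, k=1)
bound_k2 = markov_power_k_bound(times, t=6.0, k=2)
```

In this toy sample the k = 2 bound is tighter than the k = 1 bound, which illustrates why the choice of k matters.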
Higher-order spectra or polyspectra of time series, such as bispectrum and bicoherence, have been investigated in abundant literature and applied to problems of signal detection in a wide range of fields. This package aims to provide a simple API to estimate and analyze them. The current implementation is based on Brillinger and Irizarry (1998) <doi:10.1016/S0165-1684(97)00217-X> for estimating bispectrum or bicoherence, Lii and Helland (1981) <doi:10.1145/355958.355961> for cross-bispectrum, and Kim and Powers (1979) <doi:10.1109/TPS.1979.4317207> for cross-bicoherence.
Assesses the robustness of the community structure of a network found by one or more community detection algorithms to give indications about their reliability. It detects whether the community structure found by a set of algorithms is statistically significant and compares the selected detection algorithms on the same network. robin helps to choose, among different community detection algorithms, the one that best fits the network of interest. See Policastro V., Righelli D., Carissimo A., Cutillo L., De Feis I. (2021) <https://journal.r-project.org/archive/2021/RJ-2021-040/index.html>.
KEEL is a popular Java software tool for a large number of different knowledge data discovery tasks. This package combines the advantages of KEEL and R, allowing KEEL algorithms to be used from simple R code. The R code layer between R and KEEL makes it easy both to use KEEL algorithms in R and to implement new algorithms for RKEEL. It includes more than 100 algorithms for classification, regression, preprocessing, association rules and imbalanced learning, which allows a more complete experimentation process. For more information about KEEL, see <http://www.keel.es/>.
Root Expected Proportion Squared Difference (REPSD) is a nonparametric differential item functioning (DIF) method that (a) allows practitioners to explore for DIF related to small, fine-grained focal groups of examinees, and (b) compares the focal group directly to the composite group that will be used to develop the reported test score scale. Using your provided response matrix with a column that identifies focal group membership, this package provides the REPSD values, a simulated null distribution of possible REPSD values, and the simulated p-values identifying items possibly displaying DIF without requiring enormous sample sizes.
This package provides alternatives to the normal adjusted R-squared estimator for the estimation of the multiple squared correlation in regression models, as fitted by the lm() function. The alternative estimators are described in Karch (2020) <DOI:10.1525/collabra.343>.
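For reference, the usual adjusted R-squared that these estimators are alternatives to can be computed as 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors excluding the intercept. The Python sketch below shows only this baseline formula. It does not implement any of the alternative estimators from Karch (2020), and the function name is hypothetical.

```python
# Sketch only: the standard adjusted R-squared, shown as a baseline.
# Not the package's API and not one of the alternative estimators.

def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters (n > p + 1)")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Example: R^2 = 0.5 from a model with 2 predictors fitted to 11 observations.
baseline = adjusted_r_squared(0.5, n=11, p=2)
```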