This package is a collection of baseline correction algorithms. Besides these, it provides a framework and a Tcl/Tk-enabled GUI for optimizing baseline algorithm parameters. A typical use is the removal of background effects from spectra originating from various types of spectroscopy and spectrometry. The correction can also be optimized with regard to regression or classification results. Correction methods include polynomial fitting, weighted local smoothers and many more.
Tree-based algorithms can be improved by introducing boosting frameworks. LightGBM is one such framework, based on Ke, Guolin et al. (2017). This package offers an R interface to work with it. It is designed to be distributed and efficient with the following goals:
Faster training speed and higher efficiency;
lower memory usage;
better accuracy;
parallel learning supported; and
capable of handling large-scale data.
This package aims to make it easy to use various types of fonts (TrueType, OpenType, Type 1, web fonts, etc.) in R graphs, and supports most output formats of R graphics including PNG, PDF and SVG. Text glyphs will be converted into polygons or raster images, hence after the plot has been created, it no longer relies on the font files. No external software such as Ghostscript is needed to use this package.
This package provides a Bayesian data modeling scheme that performs four interconnected tasks: (i) characterizes the uncertainty of the elicited parametric prior; (ii) provides an exploratory diagnostic for checking prior-data conflict; (iii) computes the final statistical prior density estimate; and (iv) executes macro- and micro-inference. The primary reference is Mukhopadhyay, S. and Fletcher, D. (2018), "Generalized Empirical Bayes via Frequentist Goodness of Fit" (<https://www.nature.com/articles/s41598-018-28130-5>).
Cluster Evolution Analytics allows us to ask exploratory what-if questions, in the sense that the present information of an object is plugged into a dataset from a previous time frame so that we can explore its evolution (and that of its neighbors) to the present. See the URL for the papers associated with this package, for instance Morales-Oñate and Morales-Oñate (2024) <doi:10.1016/j.softx.2024.101921>.
Simulate and fit an exponential multivariate Hawkes model. This package simulates a multivariate Hawkes model, introduced by Hawkes (1971) <doi:10.2307/2334319>, with an exponential kernel and fits the parameters from the data. Models with constant parameters, as well as complex dependent structures, can also be simulated and estimated. The estimation is based on the maximum likelihood method, introduced by Ozaki (1979) <doi:10.1007/BF02480272>, using the maxLik package.
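As a rough illustration of what maximum likelihood estimation for an exponential-kernel Hawkes model optimizes, the sketch below evaluates the log-likelihood of a univariate process with intensity mu + alpha * sum(exp(-beta * (t - t_i))) over past events t_i. It is a simplified, single-dimension sketch written in Python, not the package's R interface; the function name and argument names are hypothetical.

```python
import math


def hawkes_exp_loglik(times, mu, alpha, beta, horizon):
    """Log-likelihood of a univariate exponential-kernel Hawkes process
    observed on [0, horizon]. Hypothetical helper for illustration only.

    times must be sorted in increasing order.
    """
    loglik = 0.0
    decay_sum = 0.0  # sum over earlier events j of exp(-beta * (t_i - t_j))
    prev = None
    for t in times:
        if prev is not None:
            # Recursive update: avoids re-summing over all past events.
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        loglik += math.log(mu + alpha * decay_sum)
        prev = t
    # Integral of the intensity over [0, horizon] (the compensator).
    compensator = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - t)) for t in times
    )
    return loglik - compensator


events = [0.5, 1.2, 1.3, 3.0]
print(hawkes_exp_loglik(events, mu=0.4, alpha=0.6, beta=1.5, horizon=4.0))
```

With alpha set to 0 the expression reduces to the homogeneous Poisson log-likelihood, which is a convenient sanity check. A fit would pass the negative of this function to a numerical optimizer.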
This package contains an implementation of an independent component analysis (ICA) for grouped data. The main function groupICA() performs blind source separation by maximizing independence across sources, and allows adjusting for varying confounding for user-specified groups. Additionally, the package contains the function uwedge(), which can be used to approximately jointly diagonalize a list of matrices. For more details see the project website <https://sweichwald.de/groupICA/>.
The hotspots package is designed to look within a set of measured values of a variable and identify values that are disproportionately high based on both the deviance of any given value from a statistical distribution and its similarity to other values. Because this relative magnitude of each value is taken into account, a value that is a statistical outlier may not always be a hot spot if other values are similarly large.
Generic code for estimating treatment effects with panel data. The idea is to break the estimation into separate steps: organizing the data, looping over groups and time periods, computing group-time average treatment effects, and aggregating those group-time average treatment effects. Often, one is able to implement a new identification/estimation procedure by simply replacing the step that estimates group-time average treatment effects. See several different examples of this approach in the package documentation.
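The step-wise design described above can be sketched as follows. This is an illustrative Python sketch, not the package's R interface: all function names, the data layout, and the toy numbers are hypothetical, and the group-time step shown here (a simple difference-in-differences against never-treated units, with the period before treatment as the base) is only one assumed choice among many that could be plugged in.

```python
from statistics import mean


def did_att_gt(outcomes, groups, g, t):
    """Assumed group-time step: 2x2 difference-in-differences for the group
    first treated in period g, evaluated at period t, using never-treated
    units (group None) as the comparison and g - 1 as the base period."""
    base = g - 1
    treated = [u for u, first in groups.items() if first == g]
    control = [u for u, first in groups.items() if first is None]
    d_treated = mean(outcomes[(u, t)] - outcomes[(u, base)] for u in treated)
    d_control = mean(outcomes[(u, t)] - outcomes[(u, base)] for u in control)
    return d_treated - d_control


def estimate_panel(outcomes, groups, periods, att_gt=did_att_gt):
    """Loop over groups and periods, compute group-time effects with the
    pluggable att_gt step, then aggregate weighted by group size."""
    sizes = {}
    for first in groups.values():
        if first is not None:
            sizes[first] = sizes.get(first, 0) + 1
    results = {}
    for g in sorted(sizes):
        for t in periods:
            if t >= g:  # post-treatment periods only
                results[(g, t)] = att_gt(outcomes, groups, g, t)
    total_weight = sum(sizes[g] for (g, _t) in results)
    overall = sum(sizes[g] * v for (g, _t), v in results.items()) / total_weight
    return results, overall


# Toy panel: unit -> first treated period (None = never treated).
groups = {"a": 2, "b": 3, "c": None, "d": None}
outcomes = {
    ("a", 1): 1, ("a", 2): 4, ("a", 3): 5,
    ("b", 1): 0, ("b", 2): 1, ("b", 3): 5,
    ("c", 1): 1, ("c", 2): 2, ("c", 3): 3,
    ("d", 1): 2, ("d", 2): 3, ("d", 3): 4,
}
att, overall = estimate_panel(outcomes, groups, periods=[1, 2, 3])
print(att, overall)
```

Swapping in a different identification strategy only requires passing another function as att_gt; the looping and aggregation code stays unchanged.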
An implementation of the Elston-Stewart algorithm for calculating pedigree likelihoods given genetic marker data (Elston and Stewart (1971) <doi:10.1159/000152448>). The standard algorithm is extended to allow inbred founders. pedprobr is part of the pedsuite, a collection of packages for pedigree analysis in R. In particular, pedprobr depends on pedtools for pedigree manipulations and pedmut for mutation modelling. For more information, see Pedigree Analysis in R (Vigeland, 2021, ISBN:9780128244302).
Interactively explore various dependencies of one or more packages (on Comprehensive R Archive Network-like repositories) and perform analysis following the tidy philosophy. Most of the functions return a tibble object (an enhancement of the data frame) which can be used for further analysis. The package offers functions to produce network and igraph dependency graphs. The plot method produces a static plot based on ggnetwork, and the plotd3 function produces an interactive D3 plot based on networkD3.
This package provides a suite of multivariate methods and data visualization tools to implement profile analysis and cross-validation techniques described in Davison & Davenport (2002) <doi:10.1037/1082-989X.7.4.468>, Bulut (2013), and other published and unpublished resources. The package includes routines to perform criterion-related profile analysis, profile analysis via multidimensional scaling, moderated profile analysis, profile analysis by group, and a within-person factor model to derive score profiles.
Statistical pattern recognition and dating using archaeological artefact assemblages. Package of statistical tools for archaeology. hclustcompro()/perioclust(): Bellanger Lise, Coulon Arthur, Husi Philippe (2021, ISBN:978-3-030-60103-4). mapclust(): Bellanger Lise, Coulon Arthur, Husi Philippe (2021) <doi:10.1016/j.jas.2021.105431>. seriograph(): Desachy Bruno (2004) <doi:10.3406/pica.2004.2396>. cerardat(): Bellanger Lise, Husi Philippe (2012) <doi:10.1016/j.jas.2011.06.031>.
This package provides tools for the stochastic simulation of effectiveness scores to mitigate data-related limitations of Information Retrieval evaluation research, as described in Urbano and Nagler (2018) <doi:10.1145/3209978.3210043>. These tools include: fitting, selecting and plotting distributions to model system effectiveness, transformation towards a prespecified expected value, a proxy to the fitting of copula models based on these distributions, and simulation of new evaluation data from these distributions and copula models.
This package provides an interface to sparsepp, a fast, memory-efficient hash map. It is derived from Google's excellent sparsehash implementation. We believe sparsepp provides an unparalleled combination of performance and memory usage, and will outperform your compiler's unordered_map on both counts. Only Google's dense_hash_map is consistently faster, at the cost of much greater memory usage (especially when the final size of the map is not known in advance).
This package provides a variational Bayesian finite mixture model for the clustering of categorical data, and can implement variable selection and semi-supervised outcome guiding if desired. Incorporates an option to perform model averaging over multiple initialisations to reduce the effects of local optima and improve the automatic estimation of the true number of clusters. For further details, see the paper by Rao and Kirk (2024) <doi:10.48550/arXiv.2406.16227>.
Graphical tools for visualizing high-dimensional data along a path of alternating one- and two-dimensional plots. Includes optional interactive graphics via loon (which uses tcltk from base R). Support is provided for constructing graph structures and, when available, plotting them with Bioconductor packages (e.g., graph, Rgraphviz); these are optional and examples/vignettes are skipped if they are not installed. For algorithms and further details, see <doi:10.18637/jss.v095.i04>.
A BiocBook can be created by authors (e.g. R developers, but also scientists, teachers, communicators, ...) who wish to 1) write (compile a body of biological and/or bioinformatics knowledge), 2) containerize (provide Docker images to reproduce the examples illustrated in the compendium), 3) publish (deploy an online book to disseminate the compendium), and 4) version (automatically generate specific online book versions and Docker images for specific Bioconductor releases).
SNPediaR provides some tools for downloading and parsing data from the SNPedia web site <http://www.snpedia.com>. The implemented functions allow users to import the wiki text available in SNPedia pages and to extract the most relevant information out of them. If some information in the downloaded pages is not automatically processed by the library functions, users can easily implement their own parsers to access it in an efficient way.
For distributions whose probability density functions are log-concave, the adaptive rejection sampling algorithm can be used to build envelope functions for sampling. For others, the modified adaptive rejection sampling algorithm, the concave-convex adaptive rejection sampling algorithm, and the adaptive slice sampling algorithm can be used. This R package provides four main functions, rARS(), rMARS(), rCCARS(), and rASS(), which perform sampling based on the algorithms above.
This package implements Bayesian dynamic factor analysis with Stan. Dynamic factor analysis is a dimension reduction tool for multivariate time series. bayesdfa extends conventional dynamic factor models in several ways. First, extreme events may be estimated in the latent trend by modeling process error with a student-t distribution. Second, alternative constraints (including proportions) are allowed. Third, the estimated dynamic factors can be analyzed with hidden Markov models to evaluate support for latent regimes.
In the context of high-throughput genetic data, CoDaCoRe identifies a set of sparse biomarkers that are predictive of a response variable of interest (Gordon-Rodriguez et al., 2021) <doi:10.1093/bioinformatics/btab645>. More generally, CoDaCoRe can be applied to any regression problem where the independent variable is compositional (CoDa), to derive a set of scale-invariant log-ratios (ILR or SLR) that are maximally associated with a dependent variable.
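To make the scale-invariance property concrete, the sketch below computes two kinds of log-ratio from a composition and checks that rescaling the whole composition leaves them unchanged. It is written in Python and is not the package's API. It assumes that ILR refers to an isometric log-ratio balance (a normalised log-ratio of geometric means) and SLR to a summed log-ratio (a log-ratio of sums); the function names and the choice of numerator and denominator parts are hypothetical.

```python
import math


def balance_logratio(x, numer, denom):
    """ILR-style balance: normalised log of the ratio of geometric means
    of the parts indexed by numer and denom (assumed definition)."""
    r, s = len(numer), len(denom)
    mean_log_num = sum(math.log(x[i]) for i in numer) / r
    mean_log_den = sum(math.log(x[i]) for i in denom) / s
    return math.sqrt(r * s / (r + s)) * (mean_log_num - mean_log_den)


def summed_logratio(x, numer, denom):
    """Summed log-ratio: log of the ratio of the summed parts
    indexed by numer and denom (assumed definition)."""
    return math.log(sum(x[i] for i in numer) / sum(x[i] for i in denom))


composition = [1.0, 2.0, 4.0, 8.0]
rescaled = [10.0 * v for v in composition]  # same composition, different total

# Both log-ratios depend only on relative abundances, so they agree
# (up to floating-point error) before and after rescaling.
print(balance_logratio(composition, [3], [0, 1]),
      balance_logratio(rescaled, [3], [0, 1]))
print(summed_logratio(composition, [0, 1], [2]),
      summed_logratio(rescaled, [0, 1], [2]))
```

Because any common multiplicative factor cancels inside each ratio, features built this way do not depend on the total count or sequencing depth of a sample.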
Easily import multi-frequency acoustic data stored in HAC files (see <doi:10.17895/ices.pub.5482> for more information on the format), and produce echogram visualisations with predefined or customized color palettes. It is also possible to merge consecutive echograms; mask or delete unwanted echogram areas; model and subtract background noise; and, more importantly, develop, test and interpret different combinations of frequencies in order to perform acoustic filtering of the echogram's data.
Interface to Eurostat's API (SDMX 2.1) with fast data.table-based import of data, labels, and metadata. On top of the core functionality, data search and data description/comparison functions are also provided. Use <https://github.com/alekrutkowski/eurodata_codegen>, a point-and-click app for rapid and easy generation of richly-commented R code, to import a Eurostat dataset or its subset (based on the eurodata::importData() function).