The HURRECON model estimates wind speed, wind direction, Enhanced Fujita scale wind damage, and duration of EF0 to EF5 winds as a function of hurricane location and maximum sustained wind speed. Results may be generated for a single site or an entire region. Hurricane track and intensity data may be imported directly from the US National Hurricane Center's HURDAT2 database. For details on the original version of the model, written in Borland Pascal, see Boose, Chamberlin, and Foster (2001) <doi:10.1890/0012-9615(2001)071[0027:LARIOH]2.0.CO;2> and Boose, Serrano, and Foster (2004) <doi:10.1890/02-4057>.
Different algorithms to perform approximate joint diagonalization of a finite set of square matrices. Depending on the algorithm, an orthogonal or a non-orthogonal diagonalizer is found. These algorithms are particularly useful in the context of blind source separation. Original publications of the algorithms can be found in Ziehe et al. (2004), Pham and Cardoso (2001) <doi:10.1109/78.942614>, Souloumiac (2009) <doi:10.1109/TSP.2009.2016997>, and Vollgraff and Obermayer (2006) <doi:10.1109/TSP.2006.877673>. An example of application in the context of Brain-Computer Interface EEG denoising can be found in Gouy-Pailler et al. (2010) <doi:10.1109/TBME.2009.2032162>.
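For concreteness, the approximate joint diagonalization problem these algorithms address can be stated generically (a schematic criterion, not the exact cost function of any single cited method) as

    \min_{B} \sum_{k=1}^{K} \mathrm{off}\left( B\, C_k\, B^{\top} \right),
    \qquad \mathrm{off}(M) = \sum_{i \neq j} m_{ij}^{2},

where C_1, ..., C_K are the input square matrices and B is the diagonalizer, constrained to be orthogonal or left unconstrained depending on the algorithm.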
The primary purpose of lavaan.mi is to extend the functionality of the R package 'lavaan', which implements structural equation modeling (SEM). When incomplete data have been multiply imputed, the imputed data sets can be analyzed by 'lavaan' using complete-data estimation methods, but results must be pooled across imputations (Rubin, 1987, <doi:10.1002/9780470316696>). The lavaan.mi package automates the pooling of point estimates and standard errors, as well as a variety of test statistics, using a familiar interface that allows users to fit an SEM to multiple imputations as they would to a single data set with the 'lavaan' package.
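As a hedged illustration of this workflow, the sketch below fits a small CFA to a list of multiply imputed data sets; the function name cfa.mi() and its acceptance of a list of data frames are assumptions based on the interface described above and should be checked against the package documentation.

    # Hedged sketch: pool a CFA over multiple imputations with lavaan.mi.
    # cfa.mi() and the list-of-data-frames input are assumptions; see the package docs.
    library(mice)
    library(lavaan.mi)

    data("HolzingerSwineford1939", package = "lavaan")
    dat <- HolzingerSwineford1939[paste0("x", 1:9)]
    dat[sample(nrow(dat), 50), "x5"] <- NA             # introduce some missingness
    imp  <- mice(dat, m = 5, printFlag = FALSE)        # multiply impute
    imps <- lapply(1:5, function(i) complete(imp, i))  # list of imputed data sets

    model <- ' visual  =~ x1 + x2 + x3
               textual =~ x4 + x5 + x6
               speed   =~ x7 + x8 + x9 '
    fit <- cfa.mi(model, data = imps)                  # fit each imputation, pool results
    summary(fit)                                       # pooled estimates, SEs, and tests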
Identification of ring borders on scanned image sections from dendrochronological samples. Processing of image reflectances to produce gray matrices and time series of smoothed gray values. Luminance data are plotted on segmented images so that users can either identify ring borders visually or check the automatic detection. Routines are provided to visually include or exclude ring borders on the R graphical devices, or to detect ring borders automatically using a linear detection algorithm. This algorithm detects ring borders according to positive/negative extreme values in the smoothed time series of gray values. Most of the in-package routines can be recursively implemented using the multiDetect() function.
The cartogram heatmaps generated by the included methods are an alternative to choropleth maps for the United States and are based on work by the Washington Post graphics department in their report on "The states most threatened by trade" (<http://www.washingtonpost.com/wp-srv/special/business/states-most-threatened-by-trade/>). "State bins" preserve as much of the geographic placement of the states as possible but have the look and feel of a traditional heatmap. Functions are provided that allow the use of a binned discrete scale, a continuous scale, or manually specified colors, depending on what is needed for the underlying data.
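A hedged usage sketch follows; the statebins() arguments state_col and value_col, and the theme_statebins() helper, reflect the interface described above but should be verified against the package help pages.

    # Hedged sketch: a state-bin heatmap from a two-column data frame.
    # statebins(), state_col/value_col, and theme_statebins() are assumptions; see ?statebins.
    library(statebins)
    library(ggplot2)

    df <- data.frame(state = state.abb,                       # built-in state abbreviations
                     value = runif(length(state.abb), 0, 100))
    statebins(df, state_col = "state", value_col = "value") +
      theme_statebins()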
This package provides functions to filter GPS/Argos locations, as well as to assess the sample size for the analysis of animal distributions. The filters remove temporal and spatial duplicates, fixes located at a given height above the estimated high-tide line, and locations with high error, as described in Shimada et al. (2012) <doi:10.3354/meps09747> and Shimada et al. (2016) <doi:10.1007/s00227-015-2771-0>. Sample size for the analysis of animal distributions can be assessed by the conventional area-based approach or the alternative probability-based approach described in Shimada et al. (2021) <doi:10.1111/2041-210X.13506>.
The XNAString package allows for description of base sequences and associated chemical modifications in a single object. XNAString is able to capture single-stranded as well as double-stranded molecules. Chemical modifications are represented as independent strings associated with different features of the molecules (base sequence, sugar sequence, backbone sequence, modifications) and can be read from or written to HELM notation. The package also enables secondary structure prediction using RNAfold from ViennaRNA. XNAString is designed to be an efficient representation of nucleic-acid-based therapeutics; therefore it stores information about target sequences and provides an interface to matching and alignment functions from the Biostrings and pwalign packages.
Designed to simplify and streamline the process of reading and processing large volumes of data in R, this package offers a collection of functions tailored for bulk data operations. It enables users to efficiently read multiple sheets from Microsoft Excel and Google Sheets workbooks, as well as multiple CSV files from a directory. The data are returned as organized data frames, facilitating further analysis and manipulation. Ideal for handling extensive data sets or batch-processing tasks, bulkreadr helps users manage data in bulk, saving time in data preparation workflows. Additionally, the package works seamlessly with labelled data from SPSS and Stata.
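To make this kind of bulk operation concrete, the following plain base-R sketch reads every CSV file in a directory and stacks the results into one data frame; it illustrates the pattern the package automates rather than bulkreadr's own interface, and the directory path is a placeholder.

    # Base-R illustration of bulk CSV reading (not bulkreadr's own API).
    # "path/to/csv/files" is a placeholder directory; files are assumed to share columns.
    data_dir <- "path/to/csv/files"
    files    <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
    tables   <- lapply(files, read.csv)      # one data frame per file
    combined <- do.call(rbind, tables)       # stack into a single data frame
    str(combined)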
Provides convenient access to more than 17 million records from the Chilean Census 2017 database. The data were imported from the official DVD of the Chilean National Bureau of Statistics (INE) using the REDATAM converter created by Pablo De Grande, and the package also includes the maps accompanying these datasets. The package documentation is intentionally written in ASCII-fied Spanish so that it works without problems on different platforms.
Estimates RxC (R by C) vote transfer matrices (ecological contingency tables) from aggregate data using the model described in Forcina et al. (2012), an extension of the model proposed in Brown and Payne (1986). Allows incorporation of covariates. References: Brown, P. and Payne, C. (1986). "Aggregate data, ecological regression and voting transitions". Journal of the American Statistical Association, 81, 453–460. <doi:10.1080/01621459.1986.10478290>. Forcina, A., Gnaldi, M. and Bracalente, B. (2012). "A revised Brown and Payne model of voting behaviour applied to the 2009 elections in Italy". Statistical Methods & Applications, 21, 109–119. <doi:10.1007/s10260-011-0184-x>.
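Schematically (a generic statement of the ecological-inference setup rather than the exact likelihood of Brown and Payne or Forcina et al.), the estimand is an R x C matrix P of transfer proportions linking the two elections:

    y_{u} \approx P^{\top} x_{u}, \qquad \sum_{c=1}^{C} p_{rc} = 1, \quad p_{rc} \ge 0,

where x_u and y_u are the vote-share vectors of polling unit u at the first and second election, and p_rc is the proportion of option r's earlier voters who move to option c; covariates enter by letting these proportions vary across units.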
This package provides functions to apply spatial fuzzy unsupervised classification and to visualize and interpret the results. This method is well suited when the user wants to analyze data with a fuzzy clustering algorithm and to account for the spatial dimension of the dataset. In addition, indices for estimating the spatial consistency and classification quality are proposed. The methods were originally proposed in the field of brain imagery (see Cai et al. 2007 <doi:10.1016/j.patcog.2006.07.011> and Zhao et al. 2013 <doi:10.1016/j.dsp.2012.09.016>) and recently applied in geography (see Gelb and Apparicio <doi:10.4000/cybergeo.36414>).
This package provides functions to estimate the intrinsic dimension of a dataset via likelihood-based approaches. Specifically, the package implements the TWO-NN and Gride estimators and the Hidalgo Bayesian mixture model. In addition, the first reference contains an extended vignette on the usage of the TWO-NN and Hidalgo models. References: Denti (2023, <doi:10.18637/jss.v106.i09>); Allegra et al. (2020, <doi:10.1038/s41598-020-72222-0>); Denti et al. (2022, <doi:10.1038/s41598-022-20991-1>); Facco et al. (2017, <doi:10.1038/s41598-017-11873-y>); Santos-Fernandez et al. (2021, <doi:10.1038/s41598-022-20991-1>).
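A minimal hedged sketch of the TWO-NN estimator on simulated data follows; the function name twonn() is assumed from the package description and should be verified against the reference manual.

    # Hedged sketch: intrinsic dimension of a 3-D Gaussian cloud via TWO-NN.
    # twonn() is assumed to exist as named; see the intRinsic documentation.
    library(intRinsic)

    set.seed(1)
    X <- matrix(rnorm(500 * 3), ncol = 3)  # 500 points in 3 dimensions
    fit <- twonn(X)                        # likelihood-based TWO-NN estimate
    fit                                    # estimated intrinsic dimension, close to 3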
This package provides modular, graph-based agents powered by large language models (LLMs) for intelligent task execution in R. Supports structured workflows for tasks such as forecasting, data visualization, feature engineering, data wrangling, data cleaning, 'SQL', code generation, weather reporting, and research-driven question answering. Each agent performs iterative reasoning: recommending steps, generating R code, executing, debugging, and explaining results. Includes built-in support for packages such as 'tidymodels', 'modeltime', 'plotly', 'ggplot2', and 'prophet'. Designed for analysts, developers, and teams building intelligent, reproducible AI workflows in R. Compatible with LLM providers such as 'OpenAI', 'Anthropic', 'Groq', and 'Ollama'. Inspired by the Python package 'langagent'.
Use probability theory under the Bayesian framework for calculating the risk of selecting candidates in a multi-environment context. Contained are functions used to fit a Bayesian multi-environment model (based on the available presets), extract posterior values and maximum posterior values, compute the variance components, check the model's convergence, and calculate the probabilities. For both across- and within-environment scopes, the package computes the probability of superior performance and the pairwise probability of superior performance. Furthermore, the probability of superior stability and the pairwise probability of superior stability across environments are estimated. A joint probability of superior performance and stability is also provided.
Compositional data consisting of three parts can be color-mapped with a ternary color scale. Such a scale is provided by the tricolore package, with options for discrete and continuous colors, mean-centering, and scaling. See Jonas Schöley (2021) "The centered ternary balance scheme. A technique to visualize surfaces of unbalanced three-part compositions" <doi:10.4054/DemRes.2021.44.19>, Jonas Schöley and Frans Willekens (2017) "Visualizing compositional data on the Lexis surface" <doi:10.4054/DemRes.2017.36.21>, and Ilya Kashnitsky and Jonas Schöley (2018) "Regional population structures at a glance" <doi:10.1016/S0140-6736(18)31194-2>.
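A hedged sketch of the basic usage on simulated data follows; the Tricolore() function with p1/p2/p3 column-name arguments and a return value containing per-observation colors and a ternary legend reflects the interface described above, but should be verified against the package documentation.

    # Hedged sketch: ternary color coding of a simulated three-part composition.
    # Tricolore() and its return components (rgb, key) are assumptions; see ?Tricolore.
    library(tricolore)

    set.seed(1)
    comp <- prop.table(matrix(runif(300), ncol = 3), margin = 1)  # 100 rows summing to 1
    df   <- data.frame(a = comp[, 1], b = comp[, 2], c = comp[, 3])
    tric <- Tricolore(df, p1 = "a", p2 = "b", p3 = "c")
    head(tric$rgb)  # one hex color per observation, usable as a fill scale
    tric$key        # ternary color key (a ggplot object)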
This package implements the whitening methods (ZCA, PCA, Cholesky, ZCA-cor, and PCA-cor) discussed in Kessy, Lewin, and Strimmer (2018) "Optimal whitening and decorrelation", <doi:10.1080/00031305.2016.1277159>, as well as the whitening approach to canonical correlation analysis allowing negative canonical correlations described in Jendoubi and Strimmer (2019) "A whitening approach to probabilistic canonical correlation analysis for omics data integration", <doi:10.1186/s12859-018-2572-9>. The package also offers functions to simulate random orthogonal matrices and to compute (correlation) loadings and explained variation. It also contains four example data sets (extended UCI wine data, TCGA LUSC data, nutrimouse data, extended pitprops data).
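A hedged sketch of the basic whitening workflow follows; the whiten() function and its method argument reflect the methods listed above, and the simulated data are for illustration only.

    # Hedged sketch: ZCA-cor whitening of simulated correlated data.
    # whiten() with a method argument is assumed from the description above.
    library(whitening)

    set.seed(1)
    X <- matrix(rnorm(200 * 3), ncol = 3)
    X[, 2] <- X[, 1] + 0.5 * X[, 2]     # introduce correlation between columns
    Z <- whiten(X, method = "ZCA-cor")  # whitened data
    round(cov(Z), 2)                    # approximately the identity matrix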
The CellScore package contains functions to evaluate the cell identity of a test sample, given a cell transition defined by a starting (donor) cell type and a desired target cell type. The evaluation is based on a scoring system that uses a set of standard samples of known cell types as the reference set. The functions have been applied to a large set of microarray data from one platform (Affymetrix Human Genome U133 Plus 2.0). In principle, the method could be applied to any expression dataset, provided that there are a sufficient number of standard samples and that the data are normalized.
HiCBricks is a library designed for handling large high-resolution Hi-C datasets. Over the years, the Hi-C field has experienced a rapid increase in the size and complexity of datasets. HiCBricks is meant to overcome the challenges related to the analysis of such large datasets within the R environment, offering user-friendly and efficient solutions for handling them. The package provides an R/Bioconductor framework with the bricks to build more complex data analysis pipelines and algorithms. HiCBricks already incorporates example algorithms for calling domain boundaries and functions for high-quality data visualization.
This package provides functions to fit a stratified Cox proportional hazards model and a proportional subdistribution hazards model by extending Zhang et al. (2007) <doi:10.1016/j.cmpb.2007.07.010> and Zhang et al. (2011) <doi:10.1016/j.cmpb.2010.07.005>, respectively, to clustered right-censored data. The functions also provide estimates of the cumulative baseline hazard along with their standard errors. Furthermore, the adjusted survival and cumulative incidence probabilities are provided along with their standard errors. Finally, estimates of the cumulative incidence and survival probabilities given a vector of covariates are provided along with their standard errors.
DNA methylation (6mA) is a major epigenetic process by which gene expression is altered without changing the DNA sequence. Predicting these sites in vitro is laborious, time-consuming, and costly. The EpiSemble package is an in-silico pipeline for predicting DNA sequences containing 6mA sites. It uses an ensemble-based machine learning approach that combines Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting to predict sequences with 6mA sites in them. This package was developed using the concept of Chen et al. (2019) <doi:10.1093/bioinformatics/btz015>.
Interactive labelling of scatter plots, volcano plots and Manhattan plots using a shiny and plotly interface. Users can hover over points to see where specific points are located and click points on/off to easily label them. Labels can be dragged around the plot to place them optimally. Plots can be exported directly to PDF for publication. For plots with large numbers of points, points can optionally be rasterized as a bitmap, while all other elements (axes, text, labels & lines) are preserved as vector objects. This can dramatically reduce file size for plots with millions of points such as Manhattan plots, and is ideal for publication.
Bayesian model averaging (BMA) algorithms for univariate link latent Gaussian models (ULLGMs). For detailed information, refer to Steel M.F.J. & Zens G. (2024) "Model Uncertainty in Latent Gaussian Models with Univariate Link Function" <doi:10.48550/arXiv.2406.17318>. The package supports various g-priors and a beta-binomial prior on the model space. It also includes auxiliary functions for visualizing and tabulating BMA results. Currently, it offers an out-of-the-box solution for model averaging of Poisson log-normal (PLN) and binomial logistic-normal (BiL) models. The codebase is designed to be easily extendable to other likelihoods, priors, and link functions.
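For reference, the Poisson log-normal case handled out of the box can be written in ULLGM form as a Poisson likelihood with a univariate Gaussian latent link (a schematic statement of the model class, not the package's exact parameterization):

    y_i \mid z_i \sim \mathrm{Poisson}\!\left(e^{z_i}\right), \qquad
    z_i = \alpha + x_i^{\top} \beta + \varepsilon_i, \qquad
    \varepsilon_i \sim \mathrm{N}(0, \sigma^2),

with model averaging over which columns of x_i enter the linear predictor; the binomial logistic-normal (BiL) case replaces the Poisson likelihood with a binomial one and the exponential link with the logistic function.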
Cooperative learning combines the usual squared error loss of predictions with an agreement penalty to encourage the predictions from different data views to agree. By varying the weight of the agreement penalty, we get a continuum of solutions that include the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) in an adaptive manner, using a validation set or cross-validation to estimate test set prediction error. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty (Ding, D., Li, S., Narasimhan, B., Tibshirani, R. (2021) <doi:10.1073/pnas.2202113119>).
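In the two-view regularized linear regression setting described above, the objective being minimized takes the form (following Ding et al., up to the exact placement of constants):

    \frac{1}{2}\,\lVert y - X\theta_x - Z\theta_z \rVert^2
    + \frac{\rho}{2}\,\lVert X\theta_x - Z\theta_z \rVert^2
    + \lambda \left( \lVert \theta_x \rVert_1 + \lVert \theta_z \rVert_1 \right),

where X and Z are the two data views, rho >= 0 is the agreement weight (rho = 0 recovers early fusion on the concatenated views, while larger rho pulls the view-specific predictions toward each other), and lambda is the usual lasso penalty.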
Quantitative RT-PCR data are analyzed using generalized linear mixed models based on lognormal-Poisson error distribution, fitted using MCMC. Control genes are not required but can be incorporated as Bayesian priors or, when template abundances correlate with conditions, as trackers of global effects (common to all genes). The package also implements a lognormal model for higher-abundance data and a "classic" model involving multi-gene normalization on a by-sample basis. Several plotting functions are included to extract and visualize results. The detailed tutorial is available here: <https://matzlab.weebly.com/uploads/7/6/2/2/76229469/mcmc.qpcr.tutorial.v1.2.4.pdf>.
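A heavily hedged sketch of a typical workflow follows; the function names cq2counts(), mcmc.qpcr(), and HPDsummary(), as well as their arguments, are assumptions to be checked against the tutorial linked above, and qpcr_data / amp_eff are placeholder objects.

    # Hedged sketch of an MCMC.qpcr workflow; all names and arguments are assumptions
    # to verify against the package tutorial. qpcr_data (raw Cq values) and amp_eff
    # (amplification efficiencies) are placeholders.
    library(MCMC.qpcr)
    counts <- cq2counts(data = qpcr_data, genecols = 3:6,
                        condcols = 1:2, effic = amp_eff)    # Cq values to molecule counts
    fit    <- mcmc.qpcr(fixed = "treatment", data = counts) # lognormal-Poisson GLMM via MCMC
    HPDsummary(fit, data = counts)                          # posterior summaries and plots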