This package provides several direct search optimization algorithms based on the simplex method. Direct search algorithms do not use the derivative of the cost function; the ones provided here work by updating a simplex. The following algorithms are available: the fixed shape simplex method of Spendley, Hext and Himsworth (unconstrained optimization with a fixed shape simplex, 1962) <doi:10.1080/00401706.1962.10490033>, the variable shape simplex method of Nelder and Mead (unconstrained optimization with a variable shape simplex, 1965) <doi:10.1093/comjnl/7.4.308>, and Box's complex method (constrained optimization with a variable shape simplex, 1965) <doi:10.1093/comjnl/8.1.42>.
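
To illustrate the variable shape simplex idea, here is a minimal Python sketch of a simplified Nelder-Mead iteration (reflection, expansion, inside contraction, shrink). It is not the package's code: the function name `nelder_mead`, its parameters, and the stopping rule on simplex size are assumptions made for this example.

```python
def nelder_mead(f, x0, step=0.5, tol=1e-8, max_iter=5000):
    """Minimise f from x0 with a simplified Nelder-Mead simplex (illustrative only)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)

    def along(origin, away_from, t):
        # Point origin + t * (origin - away_from).
        return [o + t * (o - a) for o, a in zip(origin, away_from)]

    for _ in range(max_iter):
        simplex.sort(key=f)  # best vertex first, worst vertex last
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        size = max(abs(v[i] - best[i]) for v in simplex for i in range(n))
        if size < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflected = along(centroid, worst, 1.0)
        if f(reflected) < f(best):
            # Reflection is the new best point: try to go further.
            expanded = along(centroid, worst, 2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(second_worst):
            simplex[-1] = reflected
        else:
            # Inside contraction: midpoint between the centroid and the worst vertex.
            contracted = along(centroid, worst, -0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


def bowl(p):
    # Smooth test function with its minimum at (1, 2).
    return (p[0] - 1.0) ** 2 + (p[1] - 2.0) ** 2


print(nelder_mead(bowl, [0.0, 0.0]))
```

No derivative of `bowl` is evaluated anywhere; only function values at the simplex vertices drive the search, which is what makes this a direct search method.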
This package implements the framework presented in Cucci, D. A., Voirol, L., Khaghani, M. and Guerrier, S. (2023) <doi:10.1109/TIM.2023.3267360>, which makes it possible to analyze the impact of sensor error modeling on the performance of integrated navigation (sensor fusion) based on inertial measurement unit (IMU), Global Positioning System (GPS), and barometer data. The framework relies on Monte Carlo simulations in which a vanilla extended Kalman filter is coupled with realistic and user-configurable noise generation mechanisms to recover a reference trajectory from noisy measurements. The evaluation of several statistical metrics of the solution, aggregated over hundreds of simulated realizations, provides reasonable estimates of the expected performance of the system in real-world conditions.
An extension to ggplot2 and magick. It contains three groups of functions. Functions in the first group draw ggplot2-based plots: geom_shading_bar() draws bar plots with shading colors in each bar; geom_rect_cm(), geom_circle_cm() and geom_ellipse_cm() draw rectangles, circles and ellipses with centimeter as their unit, so their sizes do not change when the coordinate system or the aspect ratio changes; annotation_transparent_text() draws labels with transparent text; annotation_shading_polygon() draws irregular polygons with shading colors. Functions in the second group generate coordinates for regular shapes and make linear transformations. Functions in the third group are magick-based functions that facilitate image processing.
Analysis of protein expression data can be done through Principal Component Analysis (PCA), and this R package is designed to streamline the analysis. This package enables users to perform PCA and generates biplots and scree plots for graphical visualization. Optionally, it supports grouping/clustering visualization with PCA loadings and confidence ellipses. With this R package, researchers can quickly explore complex protein datasets, interpret variance contributions, and visualize sample clustering through intuitive biplots. For more details, see Jolliffe (2001) <doi:10.1007/b98835>, Gabriel (1971) <doi:10.1093/biomet/58.3.453>, Zhang et al. (2024) <doi:10.1038/s41467-024-53239-9>, and Anandan et al. (2022) <doi:10.1038/s41598-022-07781-5>.
This package provides tools to apply Ensemble Empirical Mode Decomposition (EEMD) for cyclostratigraphy purposes. Mainly: a new algorithm, extricate, that performs EEMD in seconds, a linear interpolation algorithm using the greatest rational common divisor of depth or time, different algorithms to compute instantaneous amplitude, frequency and ratios of frequencies, and functions to verify and visualise the outputs. The functions were developed during the CRASH project (Checking the Reproducibility of Astrochronology in the Hauterivian). When using for publication please cite Wouters, S., Crucifix, M., Sinnesael, M., Da Silva, A.C., Zeeden, C., Zivanovic, M., Boulvain, F., Devleeschouwer, X., 2022, "A decomposition approach to cyclostratigraphic signal processing". Earth-Science Reviews 225 (103894). <doi:10.1016/j.earscirev.2021.103894>.
This package implements fast, scalable optimization algorithms for fitting topic models ("grade of membership" models) and non-negative matrix factorizations to count data. The methods exploit the special relationship between the multinomial topic model (also, "probabilistic latent semantic indexing") and Poisson non-negative matrix factorization. The package provides tools to compare, annotate and visualize model fits, including functions to efficiently create "structure plots" and identify key features in topics. The fastTopics package is a successor to the CountClust package. For more information, see <doi:10.48550/arXiv.2105.13440> and <doi:10.1186/s13059-023-03067-9>. Please also see the GitHub repository for additional vignettes not included in the package on CRAN.
Power and Sample Size for Health Researchers is a Shiny application that brings together a series of functions related to sample size and power calculations for common analyses in the healthcare field. It can calculate the power or the sample size needed to estimate or test hypotheses for means and proportions (including tests for correlated groups, equivalence, non-inferiority and superiority), association, correlation coefficients, regression coefficients (linear, logistic, gamma, and Cox), linear mixed models, Cronbach's alpha, interobserver agreement, intraclass correlation coefficients, limits of agreement on Bland-Altman plots, area under the curve, and sensitivity and specificity incorporating the prevalence of disease. You can also use the online version at <https://hcpa-unidade-bioestatistica.shinyapps.io/PSS_Health/>.
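
As a concrete instance of the kind of calculation described above, the following Python sketch computes the per-group sample size for a two-sided comparison of two means using the standard normal-approximation formula n = 2 (z_{1-alpha/2} + z_{power})^2 sd^2 / delta^2. The function name and arguments are hypothetical and are not taken from the application, which may use exact or t-based methods instead.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group n to detect a mean difference `delta` (normal approximation).

    Assumes two independent groups of equal size with a common standard
    deviation `sd` and a two-sided test at level `alpha`.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Difference of 5 units, standard deviation of 10, 5% level, 80% power.
print(n_per_group_two_means(delta=5, sd=10))
```

Halving `delta` roughly quadruples the required sample size, because the difference enters the formula squared.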
Implements a promising yet little-explored protocol for bioacoustical analysis, the eigensound method by MacLeod, Krieger and Jones (2013) <doi:10.4404/hystrix-24.1-6299>. Eigensound is a multidisciplinary method focused on the direct comparison between stereotyped sounds from different species. SoundShape, in turn, provides the tools required for anyone to go from sound waves to Principal Components Analysis, using tools extracted from traditional bioacoustics (i.e. the tuneR and seewave packages), geometric morphometrics (i.e. the geomorph package) and multivariate analysis (e.g. the stats package). For more information, please see Rocha and Romano (2021) and check the SoundShape repository on GitHub for news and updates <https://github.com/p-rocha/SoundShape>.
Uses the Distorted Wave Born Approximation (DWBA) to compute the acoustic backward scattering. The geometry of the object is formed by a volumetric mesh composed of tetrahedrons. The computation is done efficiently through an analytical 3D integration that yields a solution expressed in terms of elementary functions for each tetrahedron. It is important to note that this method is only valid for objects whose acoustic properties, such as density and sound speed, do not differ significantly from those of the surrounding medium. (See Lavia, Cascallares and Gonzalez, J. D. (2023). TetraScatt model: Born approximation for the estimation of acoustic dispersion of fluid-like objects of arbitrary geometries. arXiv preprint <arXiv:2312.16721>).
This package estimates crop water demand. As an example, data from the TerraClimate dataset (<https://www.climatologylab.org/terraclimate.html>), calibrated with automatic weather stations of the National Meteorological Institute of Brazil, are available at a coarse spatial resolution for estimating crop water demand. The user also has the option to download the variables directly from the TerraClimate repository with the download.terraclimate function and access the original TerraClimate products. If the user believes it is necessary to calibrate the variables, there is another function to do so. Lastly, the crop water demand estimation in this package can be run for the whole Brazilian territory with the TerraClimate dataset.
Calculates marginal effects and conducts process analysis in exponential family random graph models (ERGM). Includes functions to conduct mediation and moderation analyses and to diagnose multicollinearity. URL: <https://github.com/sduxbury/ergMargins>. BugReports: <https://github.com/sduxbury/ergMargins/issues>. Duxbury, Scott W (2021) <doi:10.1177/0049124120986178>. Long, J. Scott, and Sarah Mustillo (2018) <doi:10.1177/0049124118799374>. Mize, Trenton D. (2019) <doi:10.15195/v6.a4>. Karlson, Kristian Bernt, Anders Holm, and Richard Breen (2012) <doi:10.1177/0081175012444861>. Duxbury, Scott W (2018) <doi:10.1177/0049124118782543>. Duxbury, Scott W, Jenna Wertsching (2023) <doi:10.1016/j.socnet.2023.02.003>. Huang, Peng, Carter Butts (2023) <doi:10.1016/j.socnet.2023.07.001>.
Some functions of ade4 and stats are combined in order to obtain a partition of the rows of a data table, with columns representing variables on quantitative, qualitative or frequency scales. First, a principal axes method is performed; then a combination of Ward agglomerative hierarchical classification and K-means is performed, using some of the first coordinates obtained from the principal axes method. To allow different weights for the elements to be clustered, the function kmeansW, programmed in C++, is included. It is a modification of kmeans. Some graphical functions include the option gg=FALSE. When gg=TRUE, they use the ggplot2 and ggrepel packages to avoid overlapping labels.
An implementation of the International Bureau of Weights and Measures (BIPM) generalized consensus estimators used to assign the reference value in a key comparison exercise. This can also be applied to any interlaboratory study. Given a set of different sources, primary laboratories or measurement methods, this package provides an evaluation of the variance components according to the selected statistical method for consensus building. It also implements the comparison among different consensus builders and evaluates the participating methods or sources against the consensus reference value. The methods are based on a diverse set of references, including DerSimonian and Laird (1986) <doi:10.1016/0197-2456(86)90046-2>; for a complete list, see the reference section in the package documentation.
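
For readers unfamiliar with consensus building, the DerSimonian and Laird (1986) reference cited above describes a moment estimator of the between-source variance. The Python sketch below shows that textbook estimator for a set of reported values with known variances. It is an illustration under stated assumptions, not the package's implementation, and the function name `dersimonian_laird` is hypothetical.

```python
def dersimonian_laird(values, variances):
    """Random-effects consensus value via the DerSimonian-Laird moment estimator.

    values    : estimates reported by each laboratory or method
    variances : their squared standard uncertainties (within-source variances)
    Returns (consensus value, between-source variance tau2, standard error).
    """
    k = len(values)
    weights = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(weights)
    fixed_mean = sum(w * x for w, x in zip(weights, values)) / sum_w

    # Cochran's Q: weighted dispersion of the values around the fixed-effect mean.
    q = sum(w * (x - fixed_mean) ** 2 for w, x in zip(weights, values))
    scale = sum_w - sum(w * w for w in weights) / sum_w

    # Moment estimate of the between-source variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / scale)

    # Re-weight with the total variance (within plus between) of each source.
    weights_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(weights_star)
    consensus = sum(w * x for w, x in zip(weights_star, values)) / sum_w_star
    standard_error = (1.0 / sum_w_star) ** 0.5
    return consensus, tau2, standard_error


print(dersimonian_laird([1.0, 3.0], [1.0, 1.0]))
```

When the reported values agree within their stated uncertainties, `tau2` is truncated to zero and the result reduces to the ordinary inverse-variance weighted mean.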
This package provides a collection of functions for processing Gen5 2.06 exported data. Gen5 is an essential data analysis software for BioTek plate readers <https://www.biotek.com/products/software-robotics-software/gen5-microplate-reader-and-imager-software/>. This package contains functions for data cleaning, modeling and plotting using exported data from Gen5 version 2.06. It exports technically correct data, as defined in Edwin de Jonge and Mark van der Loo (2013) <https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf>, for customized analysis. It contains Boltzmann fitting for general kinetic analysis. See <https://www.github.com/yanxianUCSB/gen5helper> for more information, documentation and examples.
Easily construct prompts and associated logic for interacting with large language models (LLMs). tidyprompt introduces the concept of prompt wraps, which are building blocks that you can use to quickly turn a simple prompt into a complex one. Prompt wraps do not just modify the prompt text, but also add extraction and validation functions that will be applied to the response of the LLM. This ensures that the user gets the desired output. tidyprompt can add various features to prompts and their evaluation by LLMs, such as structured output, automatic feedback, retries, reasoning modes, autonomous R function calling, and R code generation and evaluation. It is designed to be compatible with any LLM provider that offers chat completion.
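
The prompt wrap idea can be sketched in a few lines of Python. This is a conceptual illustration only: `PromptWrap`, `answer_as_integer`, `send_prompt` and `make_fake_llm` are hypothetical names invented for this example, not the tidyprompt API, and the "LLM" is a stand-in function that returns canned replies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PromptWrap:
    """Hypothetical building block: edits the prompt and checks the reply."""
    modify: Callable[[str], str]      # adds instructions to the prompt text
    extract: Callable[[str], object]  # parses the reply; raises ValueError as feedback


def answer_as_integer():
    def modify(prompt):
        return prompt + "\n\nReply with a single integer only."

    def extract(reply):
        try:
            return int(reply.strip())
        except ValueError:
            raise ValueError("Your reply was not an integer. Use digits only.") from None

    return PromptWrap(modify, extract)


def send_prompt(prompt, wrap, llm, max_retries=3):
    """Send the wrapped prompt; on invalid replies, retry with feedback appended."""
    message = wrap.modify(prompt)
    for _ in range(max_retries):
        reply = llm(message)
        try:
            return wrap.extract(reply)
        except ValueError as feedback:
            message = f"{message}\n\nPrevious reply: {reply}\nFeedback: {feedback}"
    raise RuntimeError("No valid reply within the allowed number of retries.")


def make_fake_llm(replies):
    """Stand-in for a chat completion call: returns canned replies in order."""
    queue = list(replies)
    return lambda message: queue.pop(0)


# The first reply fails validation, the second one passes.
print(send_prompt("What is 2 + 2?", answer_as_integer(), make_fake_llm(["four", "4"])))
```

Keeping the prompt modification and the extraction logic in one object is what lets the caller receive a validated value rather than raw response text.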
Students learning both econometrics and R may find the introduction to both challenging. The wooldridge data package aims to lighten the task by efficiently loading any data set found in the text with a single command. Data sets have been compressed to a fraction of their original size. Documentation files contain page numbers, the original source, time of publication, and notes from the author suggesting avenues for further analysis and research. If one needs an introduction to R model syntax, a vignette contains solutions to examples from chapters of the text. Data sets are from the 7th edition (Wooldridge 2020, ISBN-13 978-1-337-55886-0), and are backwards compatible with all previous versions of the text.
In computationally demanding data analysis pipelines, the targets R package (2021, <doi:10.21105/joss.02959>) maintains an up-to-date set of results while skipping tasks that do not need to rerun. This process increases speed and increases trust in the final end product. However, it also overwrites old output with new output, and past results disappear by default. To preserve historical output, the gittargets package captures version-controlled snapshots of the data store, and each snapshot links to the underlying commit of the source code. That way, when the user rolls back the code to a previous branch or commit, gittargets can recover the data contemporaneous with that commit so that all targets remain up to date.
A small package containing functions to perform a joint calibration of totals and quantiles. The calibration for totals is based on Deville and Särndal (1992) <doi:10.1080/01621459.1992.10475217>; the calibration for quantiles is based on Harms and Duchesne (2006) <https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X20060019255>. The package uses standard calibration via the survey, sampling or laeken packages. In addition, entropy balancing via the ebal package and empirical likelihood based on codes from Wu (2005) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2005002/article/9051-eng.pdf> can be used. See the paper by Beręsewicz and Szymkowiak (2023) for details <arXiv:2308.13281>.
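
To make the notion of calibration for totals concrete, the Python sketch below implements the simplest case from the Deville and Särndal framework: linear (chi-square distance) calibration with a single auxiliary variable. It only illustrates the idea; the function name is hypothetical, and the package itself delegates calibration to other packages and additionally handles quantiles.

```python
def linear_calibration(design_weights, x, total_x):
    """Adjust design weights so the weighted sum of x matches a known total.

    Linear calibration with one auxiliary variable: each weight becomes
    d_i * (1 + lam * x_i), where lam is chosen so that the constraint
    sum(w_i * x_i) == total_x holds exactly.
    """
    estimated_total = sum(d * xi for d, xi in zip(design_weights, x))
    lam = (total_x - estimated_total) / sum(
        d * xi * xi for d, xi in zip(design_weights, x)
    )
    return [d * (1.0 + lam * xi) for d, xi in zip(design_weights, x)]


# Four sampled units, each representing 10 population units before calibration.
design_weights = [10.0, 10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
calibrated = linear_calibration(design_weights, x, total_x=110.0)
print(calibrated)
print(sum(w * xi for w, xi in zip(calibrated, x)))
```

The calibrated weights stay close to the design weights while reproducing the known population total of the auxiliary variable.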
This package provides functions to classify mass spectra in known categories, and to determine discriminant mass-to-charge values. It includes easy-to-use functions for preprocessing mass spectra, functions to determine discriminant mass-to-charge values (m/z) from a library of mass spectra corresponding to different categories, and functions to predict the category (species, phenotypes, etc.) associated with a mass spectrum from a list of selected mass-to-charge values. If you use this package in your research, please cite the associated publication (<doi:10.1016/j.eswa.2025.128796>). For a comprehensive guide, additional applications, and detailed examples of using this package, please visit our GitHub repository (<https://github.com/agodmer/MSclassifR_examples>).
We propose a pair of summary measures for the predictive power of a prediction function based on a regression model. The regression model can be linear or nonlinear, parametric, semi-parametric, or nonparametric, and correctly specified or mis-specified. The first measure, R-squared, is an extension of the classical R-squared statistic for a linear model, quantifying the prediction function's ability to capture the variability of the response. The second measure, L-squared, quantifies the prediction function's bias for predicting the mean regression function. When used together, they give a complete summary of the predictive power of a prediction function. Please refer to Gang Li and Xiaoyan Wang (2016) <arXiv:1611.03063> for more details.
This is a package for normalization, testing for differential variability and differential methylation and gene set testing for data from Illumina's Infinium HumanMethylation arrays. The normalization procedure is subset-quantile within-array normalization (SWAN), which allows Infinium I and II type probes on a single array to be normalized together. The test for differential variability is based on an empirical Bayes version of Levene's test. Differential methylation testing is performed using RUV, which can adjust for systematic errors of unknown origin in high-dimensional data by using negative control probes. Gene ontology analysis is performed by taking into account the number of probes per gene on the array, as well as taking into account multi-gene associated probes.
This package is designed to work with text written in Bahasa Malaysia. It provides functions and data sets that make working with Bahasa Malaysia text much easier. For word stemming in particular, Malay words are looked up in a dictionary and any "extra suffix" is then removed, as explained in Khan, Rehman Ullah, Fitri Suraya Mohamad, Muh Inam UlHaq, Shahren Ahmad Zadi Adruce, Philip Nuli Anding, Sajjad Nawaz Khan, and Abdulrazak Yahya Saleh Al-Hababi (2017) <https://ijrest.net/vol-4-issue-12.html>. This package includes a dictionary of Malay words that may be used to perform word stemming, a dataset of Malay stop words, a dataset of sentiment words and a dataset of normalized words.
This package contains efficient implementations of Discrete Optimal Transport algorithms for the computation of Kantorovich-Wasserstein distances between pairs of large spatial maps (Bassetti, Gualandi, Veneroni (2020), <doi:10.1137/19M1261195>). All the algorithms are based on an ad-hoc implementation of the Network Simplex algorithm. The package has four main helper functions: compareOneToOne() (to compare two spatial maps), compareOneToMany() (to compare a reference map with a list of other maps), compareAll() (to compute a matrix of distances between a list of maps), and focusArea() (to compute the KWD distance within a focus area). In non-convex maps, the helper functions first build the convex-hull of the input bins and pad the weights with zeros.
This package implements comprehensive test data engineering methods as described in Shojima (2022, ISBN:978-9811699856). Provides statistical techniques for engineering and processing test data: Classical Test Theory (CTT) with reliability coefficients for continuous ability assessment; Item Response Theory (IRT) including Rasch, 2PL, and 3PL models with item/test information functions; Latent Class Analysis (LCA) for nominal clustering; Latent Rank Analysis (LRA) for ordinal clustering with automatic determination of cluster numbers; Biclustering methods including infinite relational models for simultaneous clustering of examinees and items without predefined cluster numbers; and Bayesian Network Models (BNM) for visualizing inter-item dependencies. Features local dependence analysis through LRA and biclustering, parameter estimation, dimensionality assessment, and network structure visualization for educational, psychological, and social science research.