This package provides a toolkit of high-level functions for DNA motif scanning and enrichment analysis built upon Biostrings. The main functionality is PWM enrichment analysis of already known PWMs (e.g. from databases such as MotifDb), but the package also implements high-level functions for PWM scanning and visualisation. The package does not perform "de novo" motif discovery, but is instead focused on using motifs that are either experimentally derived or computationally constructed by other tools.
This package performs a Gene Set Analysis with the approach adopted by PADOG on the genes that are reported as translationally regulated (ie. exhibit a significant change in TE) by the DeltaTE package. It can be used on its own to see the impact of translation regulation on gene sets, but it is also integrated as an additional analysis method within ReactomeGSA, where results are further contextualised in terms of pathways and directionality of the change.
Computational tools for outlier detection and influence diagnostics in meta-analysis (Noma et al. (2025) <doi:10.1101/2025.09.18.25336125>). Bootstrap distributions of influence statistics are computed, and explicit thresholds for identifying outliers are provided. These methods can also be applied to the analysis of influential centers or regions in multicenter or multiregional clinical trials (Aoki, Noma and Gosho (2021) <doi:10.1080/24709360.2021.1921944>, Nakamura and Noma (2021) <doi:10.5691/jjb.41.117>).
This package provides a set of functions for organising and analysing datasets from experiments run using Eyelink eye-trackers. Organising functions help to clean and prepare eye-tracking datasets for analysis, and mark up key events such as display changes and responses made by participants. Analysing functions help to create means for a wide range of standard measures (such as mean fixation durations'), which can then be fed into the appropriate statistical analyses and graphing packages as necessary.
An implementation of the fair data adaptation with quantile preservation described in Plecko & Meinshausen (JMLR 2020, 21(242), 1-44). The adaptation procedure uses the specified causal graph to pre-process the given training and testing data in such a way to remove the bias caused by the protected attribute. The procedure uses tree ensembles for quantile regression. Instructions for using the methods are further elaborated in the corresponding JSS manuscript, see <doi:10.18637/jss.v110.i04>.
An implementation of the Fizz Buzz algorithm, as defined e.g. in <https://en.wikipedia.org/wiki/Fizz_buzz>. It provides the standard algorithm with 3 replaced by Fizz and 5 replaced by Buzz, with the option of specifying start and end numbers, step size and the numbers being replaced by fizz and buzz, respectively. This package gives interviewers the optional answer of "I use fizzbuzzR::fizzbuzz()" when interviewing rather than having to write an algorithm themselves.
Tool for import and process data from Lattes curriculum platform (<http://lattes.cnpq.br/>). The Brazilian government keeps an extensive base of curricula for academics from all over the country, with over 5 million registrations. The academic life of the Brazilian researcher, or related to Brazilian universities, is documented in Lattes'. Some information that can be obtained: professional formation, research area, publications, academics advisories, projects, etc. getLattes package allows work with Lattes data exported to XML format.
This package provides tools for the estimation of Heckman selection models with robust variance-covariance matrices. It includes functions for computing the bread and meat matrices, as well as clustered standard errors for generalized Heckman models, see Fernando de Souza Bastos and Wagner Barreto-Souza and Marc G. Genton (2022, ISSN: <https://www.jstor.org/stable/27164235>). The package also offers cluster-robust inference with sandwich estimators, and tools for handling issues related to eigenvalues in covariance matrices.
This package provides tools for parsing NOAA Integrated Surface Data ('ISD') files, described at <https://www.ncdc.noaa.gov/isd>. Data includes for example, wind speed and direction, temperature, cloud data, sea level pressure, and more. Includes data from approximately 35,000 stations worldwide, though best coverage is in North America/Europe/Australia. Data is stored as variable length ASCII character strings, with most fields optional. Included are tools for parsing entire files, or individual lines of data.
This package provides a hybrid of the K-means algorithm and a Majorization-Minimization method to introduce a robust clustering. The reference paper is: Julien Mairal, (2015) <doi:10.1137/140957639>. The two most important functions in package MajKMeans are cluster_km() and cluster_MajKm(). cluster_km() clusters data without Majorization-Minimization and cluster_MajKm() clusters data with Majorization-Minimization method. Both of these functions calculate the sum of squares (SS) of clustering.
Evaluate a function across a grid of parameters. The function may be evaluated once, or many times for simulation. Parallel computing is facilitated. Utilities aim at performing analyses of power and sample size, allowing for easy search of minimum n (or min/max of any other parameter) to achieve a desired minimal level of power (or maximum of any other objective). Plotting functions are included that present the dependency of n and power in relation to further assumptions.
Introducing a novel and updated database showcasing Peru's endemic plants. This meticulously compiled and revised botanical collection encompasses a remarkable assemblage of over 7,898 distinct species. The data for this resource was sourced from the work of Govaerts, R., Nic Lughadha, E., Black, N. et al., titled The World Checklist of Vascular Plants: A continuously updated resource for exploring global plant diversity', published in Sci Data 8, 215 (2021) <doi:10.1038/s41597-021-00997-6>.
Create sets of variables based on a mutual information approach. In this context, a set is a collection of distinct elements (e.g., variables) that can also be treated as a single entity. Mutual information, a concept from probability theory, quantifies the dependence between two variables by expressing how much information about one variable can be gained from observing the other. Furthermore, you can analyze, and visualize these sets in order to better understand the relationships among variables.
Generates interactive plots for analysing and visualising three-class high dimensional data. It is particularly suited to visualising differences in continuous attributes such as gene/protein/biomarker expression levels between three groups. Differential gene/biomarker expression analysis between two classes is typically shown as a volcano plot. However, with three groups this type of visualisation is particularly difficult to interpret. This package generates 3D volcano plots and 3-way polar plots for easier interpretation of three-class data.
This package provides a collection of functions to compute the Rao-Stirling diversity index (Porter and Rafols, 2009) <DOI:10.1007/s11192-008-2197-2> and its extension to acknowledge missing data (i.e., uncategorized references) by calculating its interval of uncertainty using mathematical optimization as proposed in Calatrava et al. (2016) <DOI:10.1007/s11192-016-1842-4>. The Rao-Stirling diversity index is a well-established bibliometric indicator to measure the interdisciplinarity of scientific publications. Apart from the obligatory dataset of publications with their respective references and a taxonomy of disciplines that categorizes references as well as a measure of similarity between the disciplines, the Rao-Stirling diversity index requires a complete categorization of all references of a publication into disciplines. Thus, it fails for a incomplete categorization; in this case, the robust extension has to be used, which encodes the uncertainty caused by missing bibliographic data as an uncertainty interval. Classification / ACM - 2012: Information systems ~ Similarity measures, Theory of computation ~ Quadratic programming, Applied computing ~ Digital libraries and archives.
This package provides a wrapper for several FFTW functions. It provides access to the two-dimensional FFT, the multivariate FFT, and the one-dimensional real to complex FFT using the FFTW3 library. The package includes the functions fftw() and mvfftw() which are designed to mimic the functionality of the R functions fft() and mvfft(). The FFT functions have a parameter that allows them to not return the redundant complex conjugate when the input is real data.
Coralysis is an R package featuring a multi-level integration algorithm for sensitive integration, reference-mapping, and cell-state identification in single-cell data. The multi-level integration algorithm is inspired by the process of assembling a puzzle - where one begins by grouping pieces based on low-to high-level features, such as color and shading, before looking into shape and patterns. This approach progressively blends the batch effects and separates cell types across multiple rounds of divisive clustering.
This package produces metagene plots to compare coverages of sequencing experiments at selected groups of genomic regions. It can be used for such analyses as assessing the binding of DNA-interacting proteins at promoter regions or surveying antisense transcription over the length of a gene. The metagene2 package can manage all aspects of the analysis, from normalization of coverages to plot facetting according to experimental metadata. Bootstraping analysis is used to provide confidence intervals of per-sample mean coverages.
This package provides pathway enrichment techniques for miRNA expression data. Specifically, the set of methods handles the many-to-many relationship between miRNAs and the multiple genes they are predicted to target (and thus affect.) It also handles the gene-to-pathway relationships separately. Both steps are designed to preserve the additive effects of miRNAs on genes, many miRNAs affecting one gene, one miRNA affecting multiple genes, or many miRNAs affecting many genes.
seqsetvis enables the visualization and analysis of sets of genomic sites in next gen sequencing data. Although seqsetvis was designed for the comparison of mulitple ChIP-seq samples, this package is domain-agnostic and allows the processing of multiple genomic coordinate files (bed-like files) and signal files (bigwig files pileups from bam file). seqsetvis has multiple functions for fetching data from regions into a tidy format for analysis in data.table or tidyverse and visualization via ggplot2.
This package provides functions to conduct title and abstract screening in systematic reviews using large language models, such as the Generative Pre-trained Transformer (GPT) models from OpenAI <https://platform.openai.com/>. These functions can enhance the quality of title and abstract screenings while reducing the total screening time significantly. In addition, the package includes tools for quality assessment of title and abstract screenings, as described in Vembye, Christensen, Mølgaard, and Schytt (2025) <DOI:10.1037/met0000769>.
Extends ACER ConQuest through a family of functions designed to improve graphical outputs and help with advanced analysis (e.g., differential item functioning). Allows R users to call ACER ConQuest from within R and read ACER ConQuest System Files (generated by the command `put` <https://conquestmanual.acer.org/s4-00.html#put>). Requires ACER ConQuest version 5.40 or later. A demonstration version can be downloaded from <https://shop.acer.org/acer-conquest-5.html>.
This package provides R bindings to the dockview JavaScript library <https://dockview.dev/>. Create fully customizable grid layouts (docks) in seconds to include in interactive R reports with R Markdown or Quarto or in shiny apps <https://shiny.posit.co/>. In shiny mode, modify docks by dynamically adding, removing or moving panels or groups of panels from the server function. Choose among 8 stunning themes (dark and light), serialise the state of a dock to restore it later.
Using variational techniques we address some epidemiological problems as the incidence curve decomposition by inverting the renewal equation as described in Alvarez et al. (2021) <doi:10.1073/pnas.2105112118> and Alvarez et al. (2022) <doi:10.3390/biology11040540> or the estimation of the functional relationship between epidemiological indicators. We also propose a learning method for the short time forecast of the trend incidence curve as described in Morel et al. (2022) <doi:10.1101/2022.11.05.22281904>.