Tool for importing and processing data from the Lattes curriculum platform (<http://lattes.cnpq.br/>). The Brazilian government keeps an extensive database of curricula for academics from all over the country, with over 5 million registrations. The academic life of Brazilian researchers, and of researchers affiliated with Brazilian universities, is documented in Lattes. Some of the information that can be obtained: professional training, research area, publications, academic advising, projects, etc. The getLattes package allows working with Lattes data exported to XML format.
This package provides tools for the estimation of Heckman selection models with robust variance-covariance matrices. It includes functions for computing the bread and meat matrices, as well as clustered standard errors for generalized Heckman models; see Fernando de Souza Bastos, Wagner Barreto-Souza, and Marc G. Genton (2022) <https://www.jstor.org/stable/27164235>. The package also offers cluster-robust inference with sandwich estimators, and tools for handling issues related to eigenvalues in covariance matrices.
Identifies chromatin interaction modules by constructing a Hi-C contact network based on statistically significant interactions, followed by network clustering. The method enables comparison of module connectivity across two Hi-C datasets and is capable of detecting cell-type-specific regulatory modules. By integrating network analysis with chromatin conformation data, this approach provides insights into the spatial organization of the genome and its functional implications in gene regulation. Author: Sora Yoon (2025) <https://github.com/ysora/HiCociety>.
This package provides tools for parsing NOAA Integrated Surface Data ('ISD') files, described at <https://www.ncdc.noaa.gov/isd>. The data include, for example, wind speed and direction, temperature, cloud data, sea level pressure, and more. The data come from approximately 35,000 stations worldwide, though the best coverage is in North America, Europe, and Australia. Data is stored as variable-length ASCII character strings, with most fields optional. Included are tools for parsing entire files or individual lines of data.
This package provides a hybrid of the K-means algorithm and a Majorization-Minimization method to obtain a robust clustering. The reference paper is: Julien Mairal (2015) <doi:10.1137/140957639>. The two most important functions in the MajKMeans package are cluster_km() and cluster_MajKm(). cluster_km() clusters data without Majorization-Minimization, and cluster_MajKm() clusters data with the Majorization-Minimization method. Both functions calculate the sum of squares (SS) of the clustering.
Evaluate a function across a grid of parameters. The function may be evaluated once, or many times for simulation. Parallel computing is facilitated. Utilities aim at performing analyses of power and sample size, allowing for easy search of minimum n (or min/max of any other parameter) to achieve a desired minimal level of power (or maximum of any other objective). Plotting functions are included that present the dependency of n and power in relation to further assumptions.
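The grid evaluation and minimum-n search described above can be sketched in a few lines. This is only an illustration in Python, not the package's interface: the names `power_z_test`, `evaluate_grid`, and `min_n_for_power` are hypothetical, and the objective is assumed to be the power of a one-sided, one-sample z-test with known standard deviation.

```python
from itertools import product
from statistics import NormalDist


def power_z_test(n, delta, sd=1.0, alpha=0.05):
    """Power of a one-sided, one-sample z-test (assumed objective function)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)
    return 1 - z.cdf(critical - delta * n ** 0.5 / sd)


def evaluate_grid(func, **params):
    """Evaluate func at every combination of the supplied parameter values."""
    names = list(params)
    rows = []
    for combo in product(*(params[name] for name in names)):
        args = dict(zip(names, combo))
        rows.append({**args, "value": func(**args)})
    return rows


def min_n_for_power(rows, target):
    """For each effect size delta, find the smallest n whose power reaches target."""
    best = {}
    for row in rows:
        if row["value"] >= target:
            d = row["delta"]
            best[d] = min(best.get(d, row["n"]), row["n"])
    return best


# Grid: every n from 5 to 200 crossed with two hypothetical effect sizes.
grid = evaluate_grid(power_z_test, n=range(5, 201), delta=[0.3, 0.5])
result = min_n_for_power(grid, target=0.8)
print(result)
```

Because the power of this test increases with n, the smallest qualifying n per effect size is the sample size needed to reach the target power.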
Introducing a novel and updated database showcasing Peru's endemic plants. This meticulously compiled and revised botanical collection encompasses a remarkable assemblage of over 7,898 distinct species. The data for this resource was sourced from the work of Govaerts, R., Nic Lughadha, E., Black, N. et al., titled 'The World Checklist of Vascular Plants: A continuously updated resource for exploring global plant diversity', published in Sci Data 8, 215 (2021) <doi:10.1038/s41597-021-00997-6>.
This package provides a collection of functions to compute the Rao-Stirling diversity index (Porter and Rafols, 2009) <DOI:10.1007/s11192-008-2197-2> and its extension to acknowledge missing data (i.e., uncategorized references) by calculating its interval of uncertainty using mathematical optimization as proposed in Calatrava et al. (2016) <DOI:10.1007/s11192-016-1842-4>. The Rao-Stirling diversity index is a well-established bibliometric indicator to measure the interdisciplinarity of scientific publications. Apart from the obligatory dataset of publications with their respective references, a taxonomy of disciplines that categorizes references, and a measure of similarity between the disciplines, the Rao-Stirling diversity index requires a complete categorization of all references of a publication into disciplines. Thus, it fails for an incomplete categorization; in this case, the robust extension has to be used, which encodes the uncertainty caused by missing bibliographic data as an uncertainty interval. Classification / ACM - 2012: Information systems ~ Similarity measures, Theory of computation ~ Quadratic programming, Applied computing ~ Digital libraries and archives.
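For a publication whose references are all categorized, the index can be computed directly. The sketch below is in Python and is not the package's interface. It assumes one common form of the index, the sum over ordered pairs of distinct disciplines of p_i * p_j * (1 - s_ij), where p_i is the share of references in discipline i and s_ij is the similarity between disciplines; some authors sum over unordered pairs instead, which halves the value. The function name, reference counts, and similarity matrix are made up for illustration, and the uncertainty-interval extension for missing data is not shown.

```python
def rao_stirling(proportions, similarity):
    """Sum over ordered pairs i != j of p_i * p_j * (1 - s_ij)."""
    k = len(proportions)
    total = 0.0
    for i in range(k):
        for j in range(k):
            if i != j:
                total += proportions[i] * proportions[j] * (1.0 - similarity[i][j])
    return total


# Hypothetical publication: 10 references spread over 3 disciplines.
counts = [6, 3, 1]
n_refs = sum(counts)
p = [c / n_refs for c in counts]

# Hypothetical symmetric similarity matrix between the 3 disciplines.
s = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

index = rao_stirling(p, s)
print(round(index, 3))
```

The index is 0 when all references fall into one discipline or all disciplines are fully similar, and it grows as references spread over more dissimilar disciplines.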
This package provides a wrapper for several FFTW functions. It provides access to the two-dimensional FFT, the multivariate FFT, and the one-dimensional real to complex FFT using the FFTW3 library. The package includes the functions fftw() and mvfftw() which are designed to mimic the functionality of the R functions fft() and mvfft(). The FFT functions have a parameter that allows them to not return the redundant complex conjugate when the input is real data.
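The redundancy mentioned above comes from the conjugate symmetry of the transform of real input: bin k equals the complex conjugate of bin n - k, so only bins 0 to n/2 carry independent information. The Python sketch below illustrates this with a naive O(n^2) discrete Fourier transform written for clarity; it is not FFTW and not the package's code, and the helper names are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(n^2)), for illustration only."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_dft(x):
    """Return only the non-redundant bins 0 .. n//2 of the transform of real input."""
    return dft(x)[: len(x) // 2 + 1]


signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
full = dft(signal)
half = real_dft(signal)

# Rebuild the dropped bins from the kept ones via conjugate symmetry.
n = len(signal)
rebuilt = half + [half[n - k].conjugate() for k in range(len(half), n)]
print(len(full), len(half))
```

For an input of length 6, the full transform has 6 bins and the reduced one has 4; the remaining 2 are recovered by conjugation.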
Coralysis is an R package featuring a multi-level integration algorithm for sensitive integration, reference-mapping, and cell-state identification in single-cell data. The multi-level integration algorithm is inspired by the process of assembling a puzzle - where one begins by grouping pieces based on low- to high-level features, such as color and shading, before looking into shape and patterns. This approach progressively blends the batch effects and separates cell types across multiple rounds of divisive clustering.
This package produces metagene plots to compare coverages of sequencing experiments at selected groups of genomic regions. It can be used for such analyses as assessing the binding of DNA-interacting proteins at promoter regions or surveying antisense transcription over the length of a gene. The metagene2 package can manage all aspects of the analysis, from normalization of coverages to plot faceting according to experimental metadata. Bootstrapping analysis is used to provide confidence intervals of per-sample mean coverages.
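A bootstrap confidence interval for a mean coverage can be sketched as follows. This Python example uses a plain percentile bootstrap, which is an assumption; the package's exact resampling scheme is not described here. The function name and the coverage values (one bin, one value per genomic region) are hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_ci(values, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the regions with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_idx], means[upper_idx]


# Hypothetical coverage at one bin across 12 regions for a single sample.
coverage = [4.0, 7.5, 3.2, 9.1, 5.5, 6.0, 2.8, 8.3, 5.1, 4.7, 6.9, 7.2]
low, high = bootstrap_mean_ci(coverage)
print(low, high)
```

Repeating this for every bin along the regions gives a confidence band around the mean coverage curve.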
This package provides pathway enrichment techniques for miRNA expression data. Specifically, the set of methods handles the many-to-many relationship between miRNAs and the multiple genes they are predicted to target (and thus affect). It also handles the gene-to-pathway relationships separately. Both steps are designed to preserve the additive effects of miRNAs on genes, whether many miRNAs affect one gene, one miRNA affects multiple genes, or many miRNAs affect many genes.
seqsetvis enables the visualization and analysis of sets of genomic sites in next-generation sequencing data. Although seqsetvis was designed for the comparison of multiple ChIP-seq samples, this package is domain-agnostic and allows the processing of multiple genomic coordinate files (bed-like files) and signal files (bigwig files or pileups from bam files). seqsetvis has multiple functions for fetching data from regions into a tidy format for analysis in data.table or tidyverse and visualization via ggplot2.
This package provides functions to conduct title and abstract screening in systematic reviews using large language models, such as the Generative Pre-trained Transformer (GPT) models from OpenAI <https://platform.openai.com/>. These functions can enhance the quality of title and abstract screenings while reducing the total screening time significantly. In addition, the package includes tools for quality assessment of title and abstract screenings, as described in Vembye, Christensen, Mølgaard, and Schytt (2025) <DOI:10.1037/met0000769>.
Extends ACER ConQuest through a family of functions designed to improve graphical outputs and help with advanced analysis (e.g., differential item functioning). Allows R users to call ACER ConQuest from within R and read ACER ConQuest System Files (generated by the command `put` <https://conquestmanual.acer.org/s4-00.html#put>). Requires ACER ConQuest version 5.40 or later. A demonstration version can be downloaded from <https://shop.acer.org/acer-conquest-5.html>.
This package provides R bindings to the dockview JavaScript library <https://dockview.dev/>. Create fully customizable grid layouts (docks) in seconds to include in interactive R reports with R Markdown or Quarto or in shiny apps <https://shiny.posit.co/>. In shiny mode, modify docks by dynamically adding, removing or moving panels or groups of panels from the server function. Choose among 8 stunning themes (dark and light), and serialise the state of a dock to restore it later.
Using variational techniques, we address some epidemiological problems such as the decomposition of the incidence curve by inverting the renewal equation, as described in Alvarez et al. (2021) <doi:10.1073/pnas.2105112118> and Alvarez et al. (2022) <doi:10.3390/biology11040540>, or the estimation of the functional relationship between epidemiological indicators. We also propose a learning method for the short-term forecast of the trend incidence curve as described in Morel et al. (2022) <doi:10.1101/2022.11.05.22281904>.
This package contains four main functions (i.e., four pieces of furniture): table1(), which produces a well-formatted table of descriptive statistics of the kind commonly reported as Table 1 in research articles; tableC(), which produces a well-formatted table of correlations; tableF(), which provides frequency counts; and washer(), which is helpful in cleaning up the data. These furniture-themed functions are designed to simplify common tasks in quantitative analysis. Other data summary and cleaning tools are also available.
This package creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
When you prepare a presentation or a report, you often need to manage a large number of ggplot figures. You need to change the figure size, modify the title, labels, themes, etc. It is inconvenient to go back to the original code to make these changes. This package provides a simple way to manage ggplot figures. You can easily add figures to the database and update them later using the CLI (command line interface) or GUI (graphical user interface).
Bayesian inference analysis for bivariate meta-analysis of diagnostic test studies using integrated nested Laplace approximation with INLA. A purpose-built graphical user interface is available. The R package INLA must be installed for this package to work. The INLA package can be obtained from <https://www.r-inla.org>. We recommend the testing version, which can be downloaded by running: install.packages("INLA", repos=c(getOption("repos"), INLA="https://inla.r-inla-download.org/R/testing"), dep=TRUE).
Offers an easy and automated way to scale up individual-level space use analysis to that of groups. Contains a function from the move package to calculate a dynamic Brownian bridge movement model from movement data for individual animals, as well as functions to visualize and quantify space use for individuals aggregated in groups. Originally written with passive acoustic telemetry in mind, this package also provides functionality to account for unbalanced acoustic receiver array designs, and satellite tag data.
The `scorecard` package makes the development of credit risk scorecards easier and more efficient by providing functions for some common tasks, such as data partition, variable selection, woe binning, scorecard scaling, performance evaluation and report generation. These functions can also be used in the development of machine learning models. References include: 1. Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS. 2. Siddiqi, N. (2006, ISBN: 9780471754510). Credit risk scorecards. Developing and Implementing Intelligent Credit Scoring.
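The weight of evidence (woe) of a bin, and the information value (IV) often used alongside it for variable selection, can be sketched as follows. This Python example is not the package's interface. It assumes the convention woe = ln(share of goods in the bin / share of bads in the bin); the sign convention differs between references and packages. The function name and bin counts are hypothetical, and every bin is assumed to contain at least one good and one bad case.

```python
from math import log


def woe_and_iv(bins):
    """bins: list of (good_count, bad_count) per bin; returns (woe list, IV)."""
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woes = []
    iv = 0.0
    for good, bad in bins:
        dist_good = good / total_good  # share of all goods that fall in this bin
        dist_bad = bad / total_bad     # share of all bads that fall in this bin
        woe = log(dist_good / dist_bad)
        woes.append(woe)
        iv += (dist_good - dist_bad) * woe
    return woes, iv


# Hypothetical counts for a variable binned into three intervals.
bins = [(100, 50), (200, 30), (100, 20)]
woes, iv = woe_and_iv(bins)
print([round(w, 4) for w in woes], round(iv, 4))
```

Under this convention, a negative woe marks a bin with a higher concentration of bads than the population as a whole.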
This package provides a set of functions to build a scoring model from beginning to end, leading the user through an efficient and organized development process and significantly reducing the time spent on data exploration, variable selection, feature engineering, binning and model selection, among other recurrent tasks. The package also incorporates monotonic and customized binning and scaling capabilities that transform logistic coefficients into points for better business understanding, and it calculates and visualizes classic performance metrics of a classification model.
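Scaling logistic output into points is commonly done with a linear map of the log-odds: score = offset + factor * ln(odds), with factor = pdo / ln(2) and offset = base_points - factor * ln(base_odds), where pdo is the number of points that doubles the odds. The Python sketch below follows that textbook scheme, which is assumed here rather than taken from the package; the function names, the default anchor of 600 points at odds of 50:1, and the coefficients are all hypothetical.

```python
from math import log


def scaling_constants(base_points=600, base_odds=50, pdo=20):
    """Return (factor, offset) so that base_odds maps to base_points and
    doubling the odds adds pdo points."""
    factor = pdo / log(2)
    offset = base_points - factor * log(base_odds)
    return factor, offset


def score_from_log_odds(log_odds, factor, offset):
    """Convert the log-odds of a good outcome into scorecard points."""
    return offset + factor * log_odds


factor, offset = scaling_constants()

# Hypothetical logistic model: intercept plus coefficient * value per variable.
intercept = 3.2
coefficients = [0.9, 1.1]
values = [0.5108, -0.6931]
log_odds = intercept + sum(c * v for c, v in zip(coefficients, values))

print(round(score_from_log_odds(log_odds, factor, offset)))
```

Because the map is linear in the log-odds, the total score can also be split into per-variable point contributions, which is what makes a scorecard readable for business users.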