This package provides a tidy interface for integrating large language model (LLM) APIs such as Claude, OpenAI, Gemini, Mistral and local models via Ollama into R workflows. The package supports text and media-based interactions, interactive message history, batch request APIs, and a tidy, pipeline-oriented interface for streamlined integration into data workflows. Web services are available at <https://www.anthropic.com>, <https://openai.com>, <https://aistudio.google.com/>, <https://mistral.ai/> and <https://ollama.com>.
An R API providing easy access to a relational database with macroeconomic, financial and development-related time series data for Uganda. Overall, more than 5000 series at varying frequencies (daily, monthly, quarterly, annual in fiscal or calendar years) can be accessed through the API. The data are provided by the Bank of Uganda, the Ugandan Ministry of Finance, Planning and Economic Development, the IMF and the World Bank. The database is updated once a month.
Various semiparametric and nonparametric statistical tools for immune correlates analysis of vaccine clinical trial data. This includes calculation of summary statistics and estimation of risk, vaccine efficacy, controlled effects (controlled risk and controlled vaccine efficacy), and mediation effects (natural direct effect, natural indirect effect, proportion mediated). See Gilbert P, Fong Y, Kenny A, and Carone M (2022) <doi:10.1093/biostatistics/kxac024> and Fay MP and Follmann DA (2023) <doi:10.48550/arXiv.2208.06465>.
Infectious disease surveillance requires early outbreak detection. This package provides statistical tools for analyzing time-series monitoring data through three core methods: (a) EWMA (Exponentially Weighted Moving Average), (b) Modified-CUSUM (Modified Cumulative Sum), and (c) Adjusted-Serfling models. Methodologies are based on Wang et al. (2010) <doi:10.1016/j.jbi.2009.08.003> and Wang et al. (2015) <doi:10.1371/journal.pone.0119923>. Designed for epidemiologists and public health researchers working with disease surveillance systems.
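As a rough illustration of the first method, an EWMA chart smooths the series as z_t = lam * x_t + (1 - lam) * z_{t-1} and raises an alarm when z_t exceeds an upper control limit. The sketch below is a generic textbook EWMA written in Python, not the package's implementation; the function name `ewma_alarms`, the parameters `lam` and `k`, the baseline-window estimate of the mean and standard deviation, and the example counts are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def ewma_alarms(counts, baseline_n, lam=0.3, k=3.0):
    """Return the indices after the baseline window where the EWMA exceeds its upper limit.

    Hypothetical helper: the first `baseline_n` observations are assumed to be
    outbreak-free and are used to estimate the in-control mean and standard deviation.
    """
    baseline = counts[:baseline_n]
    mu = mean(baseline)
    sigma = stdev(baseline)
    # Upper control limit based on the asymptotic standard deviation of the EWMA statistic
    limit = mu + k * sigma * sqrt(lam / (2.0 - lam))

    z = mu  # start the smoothed series at the baseline mean
    alarms = []
    for t in range(baseline_n, len(counts)):
        z = lam * counts[t] + (1.0 - lam) * z
        if z > limit:
            alarms.append(t)
    return alarms


# Made-up weekly case counts with a jump in the last two weeks
weekly_cases = [10, 12, 9, 11, 10, 12, 11, 9, 30, 35]
print(ewma_alarms(weekly_cases, baseline_n=8))
```

A smaller `lam` gives more weight to history and makes the chart more sensitive to small sustained shifts, at the cost of slower reaction to abrupt jumps.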
This tool enables in-database scoring of XGBoost models built in R by translating trained model objects into SQL queries. XGBoost <https://github.com/dmlc/xgboost> provides parallel tree boosting (also known as gradient boosting machine, or GBM) algorithms in a highly efficient, flexible and portable way. The GBM algorithm was introduced by Friedman (2001) <doi:10.1214/aos/1013203451>, and more details on XGBoost can be found in Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>.
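The general idea of translating a tree ensemble into SQL can be sketched as follows: each tree becomes a nested CASE WHEN expression, and the raw model score is the sum of the per-tree expressions. This Python sketch is illustrative only and does not reproduce the tool's actual output; the nested-dict tree format, the function names `tree_to_sql` and `model_to_sql`, and the column and table names are hypothetical, and missing-value routing and the final link function are ignored.

```python
def tree_to_sql(node):
    """Render one decision tree (hypothetical nested-dict form) as a SQL CASE expression."""
    if "leaf" in node:
        # Leaf node: emit its weight as a numeric literal
        return repr(float(node["leaf"]))
    # Internal node: go to the "yes" branch when the feature is below the threshold
    return (
        f"CASE WHEN {node['feature']} < {node['threshold']} "
        f"THEN {tree_to_sql(node['yes'])} "
        f"ELSE {tree_to_sql(node['no'])} END"
    )


def model_to_sql(trees, table):
    """Sum the per-tree expressions into one raw-score query (no link function applied)."""
    score = " + ".join(f"({tree_to_sql(tree)})" for tree in trees)
    return f"SELECT {score} AS raw_score FROM {table}"


# Two toy trees: one split on a made-up column `age`, and one single-leaf tree
trees = [
    {"feature": "age", "threshold": 30, "yes": {"leaf": 0.5}, "no": {"leaf": -0.2}},
    {"leaf": 0.1},
]
print(model_to_sql(trees, "customers"))
```

Because the whole model collapses into one SELECT expression, scoring runs where the data lives and no rows need to leave the database.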
Fastseg implements a very fast and efficient segmentation algorithm. It can segment data from DNA microarrays and from next-generation sequencing, for example to detect copy number segments. Further, it can segment data from RNA microarrays such as tiling arrays to identify transcripts. Most generally, it can segment data given as a matrix or as a vector. Various data formats can be used as input to fastseg, such as expression set objects for microarrays or GRanges for sequencing data.
This package provides a flexible approach to Bayesian optimization (model-based optimization) building on the bbotk package. mlr3mbo is a toolbox providing both ready-to-use optimization algorithms and their fundamental building blocks, allowing for straightforward implementation of custom algorithms. Single- and multi-objective optimization is supported, as well as mixed continuous, categorical and conditional search spaces. Moreover, using mlr3mbo for hyperparameter optimization of machine learning models within the mlr3 ecosystem is straightforward via mlr3tuning.
HuBMAP provides an open, global bio-molecular atlas of the human body at the cellular level. The `datasets()`, `samples()`, `donors()`, `publications()`, and `collections()` functions retrieve the information for each of these entity types. `*_details()` functions are available for individual entries of each entity type. `*_derived()` functions are available for retrieving derived datasets or samples for individual entries of each entity type. Data files can be accessed using `bulk_data_transfer()`.
Create a pie-like plot to visualise whether one or several aims of a project have been achieved or are close to being achieved; an aim is achieved when its point lies at the center of the pie plot. Think of the plot as a dartboard whose center means 100% completeness or achievement. Achievement can also be understood as 100% coverage. By default, the pie plot marks completeness levels of 50%, 80% and 100%.
Dynamic path analysis with estimation of the corresponding direct, indirect, and total effects, based on Fosen et al. (2006) <doi:10.1007/s10985-006-9004-2>. The main outcome of interest is a counting process from survival analysis (or recurrent events) data. At each event time, ordinary linear regression is used to estimate the relation between the covariates, while Aalen's additive hazard model is used for the regression of the counting process on the covariates.
The main function "decode" is used to decode coded key values to plain text. The function "code" can be used to encode plain text as codes if there is a 1:1 relation between the two. The concept relies on keyvalue objects used for translation. Several keyvalue objects are included, covering geographical regional codes, administrative health care unit codes, diagnosis codes and more. It is also easy to extend the use to arbitrary code sets.
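The role of the 1:1 requirement can be seen in a small sketch: decoding only needs a key-to-value lookup, whereas encoding needs the inverse lookup, which exists only when no two keys share a value. The Python below is a conceptual illustration, not the package's R interface; the helper `invert`, the Python signatures of `decode` and `code`, and the region codes are made up.

```python
def invert(keyvalue):
    """Invert a key -> value mapping, refusing to do so if it is not 1:1."""
    inverse = {}
    for key, value in keyvalue.items():
        if value in inverse:
            raise ValueError(f"value {value!r} maps back to more than one key")
        inverse[value] = key
    return inverse


def decode(keys, keyvalue):
    """Translate coded keys to plain text; unknown keys become None."""
    return [keyvalue.get(key) for key in keys]


def code(values, keyvalue):
    """Translate plain text back to codes; only possible for a 1:1 mapping."""
    inverse = invert(keyvalue)
    return [inverse.get(value) for value in values]


# Hypothetical regional codes, invented for this example
regions = {"01": "North", "02": "South", "03": "West"}
print(decode(["02", "01", "99"], regions))
print(code(["West", "North"], regions))
```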
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre-, core- and post-modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
This package allows running EViews (<https://eviews.com>) programs from R, R Markdown and Quarto documents. EViews (Econometric Views) is statistical software for econometric analysis. The package integrates EViews and R and also serves as an EViews knit engine for the knitr package. Write all your EViews commands in R, R Markdown or Quarto documents. For details, please consult our peer-reviewed article Mati S., Civcir I. and Abba S.I (2023) <doi:10.32614/RJ-2023-045>.
This package provides a consistent, unified and extensible framework for estimation of parameters for probability distributions, including parameter estimation procedures that allow for weighted samples; the distributions currently included are: the standard beta, the four-parameter beta, Burr, gamma, Gumbel, Johnson SB and SU, Laplace, logistic, normal, symmetric truncated normal, truncated normal, symmetric-reflected truncated beta, standard symmetric-reflected truncated beta, triangular, uniform, and Weibull distributions; decision criteria and selections based on these decision criteria.
Download and process public education data from INEP (Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira). Provides functions to access microdata from the School Census (Censo Escolar), ENEM (Exame Nacional do Ensino Médio), IDEB (Índice de Desenvolvimento da Educação Básica), and other educational datasets. Returns data in tidy format ready for analysis. Data source: INEP Open Data Portal <https://www.gov.br/inep/pt-br/acesso-a-informacao/dados-abertos>.
We consider optimal subset selection in the setting where only one data subset may be used to represent the whole data set with minimum information loss, and devise a novel intersection-based criterion for selecting the optimal subset, called the FPC criterion, to handle the optimal sub-estimator in distributed principal component analysis; that is, the FPCdpca. The philosophy of the package is described in Guo G. (2025) <doi:10.1016/j.physa.2024.130308>.
Multifactor nonparametric analysis of variance based on ranks. Builds on the Kruskal-Wallis H test and its 2x2 Scheirer-Ray-Hare extension to handle any factorial design. Provides effect sizes, Dunn-Bonferroni pairwise-comparison matrices, and simple-effects analyses. Tailored for psychology and the social sciences, with beginner-friendly R syntax and outputs that can be dropped into journal reports. Includes helpers to export tab-separated results and compact tables of descriptive statistics (for APA-style reports).
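For orientation, the one-factor Kruskal-Wallis statistic that this approach builds on is H = 12 / (N (N + 1)) * sum_j (R_j^2 / n_j) - 3 (N + 1), where R_j is the rank sum of group j, n_j its size, and N the total sample size. The Python sketch below computes this textbook statistic with midranks and without the tie correction; it is not the package's code, and the function names and example data are assumptions made for illustration.

```python
def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of groups (uncorrected for ties)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = midranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


# Made-up scores for three groups
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Under the null hypothesis, H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom for moderately large groups.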
We provide the monthly number of HIV and antiretroviral therapy (ART) cases among males, females, children and transgender persons, as well as for the whole of Pakistan, reported at various treatment centers in Pakistan from January 2016 to December 2021. Related works include: a) Imran, M., Nasir, J. A., & Riaz, S. (2018). Regional pattern of HIV cases in Pakistan. Journal of Postgraduate Medical Institute, 32(1), 9-13. <https://jpmi.org.pk/index.php/jpmi/article/view/2108>.
Categorization and scoring of injury severity typically involves trained personnel with access to injured persons or their medical records. icdpicr contains a function that provides automated calculation of Abbreviated Injury Scale ('AIS') and Injury Severity Score ('ISS') from International Classification of Diseases ('ICD') codes and may be a useful substitute for manual injury severity scoring. ICDPIC was originally developed in Stata, and icdpicr is an open-access update that accepts both ICD-9 and ICD-10 codes.
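To make the scoring step concrete: under the standard definition, the ISS is the sum of the squares of the three highest AIS severities taken from different body regions, and it is set to 75 whenever any region has an AIS of 6. The Python sketch below applies that definition to per-region AIS values that are assumed to be already known; it does not reproduce how icdpicr maps ICD codes to AIS, and the function name and the example patient are hypothetical.

```python
def injury_severity_score(ais_by_region):
    """Compute the ISS from a dict mapping body region -> highest AIS severity (0-6)."""
    severities = list(ais_by_region.values())
    # An unsurvivable injury (AIS 6) in any region sets the score to the maximum
    if any(severity == 6 for severity in severities):
        return 75
    # Otherwise square and sum the three most severely injured regions
    top_three = sorted(severities, reverse=True)[:3]
    return sum(severity ** 2 for severity in top_three)


# Made-up patient: one maximum AIS value per body region
patient = {"head": 4, "chest": 3, "abdomen": 2, "extremities": 1}
print(injury_severity_score(patient))
```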
Infix functions in R are those that come between their arguments, such as %in%, +, and *. These are useful in R programming when manipulating data, performing logical operations, and making new functions. infixit extends the infix functions found in R to simplify frequent tasks, such as finding elements that are NOT in a set, in-line text concatenation, augmented assignment operations, additional logical and control flow operators, and identifying if a number or date lies between two others.
This package creates a consensus genetic map by merging linkage maps from different populations. The software uses linear programming (LP) to efficiently minimize the mean absolute error between the consensus map and the linkage maps. This minimization is performed subject to linear inequality constraints that ensure the ordering of the markers in the linkage maps is preserved. When marker order is inconsistent between linkage maps, a minimum set of ordinal constraints is deleted to resolve the conflicts.
This package provides access to the LDlink API (<https://ldlink.nih.gov/?tab=apiaccess>) using the R console. This programmatic access helps researchers who are interested in performing batch queries of 1000 Genomes Project (2015) <doi:10.1038/nature15393> data using LDlink. LDlink is an interactive and powerful suite of web-based tools for querying germline variants in human population groups of interest. For more details, please see Machiela et al. (2015) <doi:10.1093/bioinformatics/btv402>.
Lightweight maps of mammals of the world. These maps are a comprehensive collection aligned with the Mammal Diversity Database (MDD) taxonomy of the American Society of Mammalogists. They are generated at low resolution for easy access, consultation and manipulation in shapefile format. The package connects to a binary backup hosted in the Digital Ocean cloud service and allows individual or batch download of any mammal species in the MDD taxonomy by providing the scientific species name.
This package provides a specialized collection of measles epidemiological models built on the epiworldR framework. This package is a spinoff from epiworldR focusing specifically on measles transmission dynamics. It includes models for school settings with quarantine and isolation policies, mixing models with population groups, and risk-based quarantine strategies. The models are agent-based models (ABMs) with a fast C++ backend from the epiworld library. Ideal for studying measles outbreaks, vaccination strategies, and intervention policies.