An R implementation of the cross-platform, language-independent "port4me" algorithm (<https://github.com/HenrikBengtsson/port4me>), which (1) finds a free Transmission Control Protocol ('TCP') port in [1024,65535] that the user can open, (2) is designed to work in multi-user environments, (3) gives different users different ports, (4) gives the user the same port over time with high probability, (5) gives different ports for different software tools, and (6) requires no configuration.
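The properties listed above (a stable per-user, per-tool port that is free and needs no configuration) can be illustrated with a minimal Python sketch. This is not the actual port4me algorithm: the CRC32 seed, Python's `random.Random` generator, and the names `candidate_ports`, `port_is_free` and `find_port` are all assumptions made for illustration only.

```python
import random
import socket
import zlib


def port_is_free(port):
    """Return True if this process can bind the TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", port))
        except OSError:
            return False
        return True


def candidate_ports(user, tool="", lo=1024, hi=65535):
    """Yield a deterministic pseudo-random port sequence for (user, tool).

    Assumption: CRC32 of "user,tool" is used as the seed; the real
    port4me algorithm defines its own hashing and generator.
    """
    seed = zlib.crc32(f"{user},{tool}".encode("utf-8"))
    rng = random.Random(seed)
    while True:
        yield rng.randint(lo, hi)


def find_port(user, tool="", is_free=port_is_free, max_tries=100):
    """Return the first port in the user's sequence that is_free accepts."""
    ports = candidate_ports(user, tool)
    for _ in range(max_tries):
        port = next(ports)
        if is_free(port):
            return port
    raise RuntimeError("no free port found")
```

Because the sequence depends only on the user and tool names, repeated calls return the same port as long as it stays free, and a different user or tool starts from a different sequence.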
This package provides tools to help establish the rank and social hierarchy of gregarious animals using the Si method developed by Kondo & Hurnik (1990) <doi:10.1016/0168-1591(90)90125-W>. It can also determine the number of agonistic interactions between two individuals and build sociometric and dyadic matrices from datasets obtained through electronic bins. In addition, the results can be plotted as a bar plot, box plot, or sociogram.
Performs association tests within a linear mixed model framework, using a score test integrated with Empirical Bayes, for genome-wide association studies. First, a score test is conducted for each marker under the linear mixed model framework, taking into account genetic relatedness and population structure. Then all potentially associated markers are selected using a less stringent criterion. Finally, the selected markers are placed into a multi-locus model to identify the true quantitative trait nucleotides.
This package contains more modern tools for causal inference using regression standardization. Four general classes of models are implemented: generalized linear models, conditional generalized estimating equation models, Cox proportional hazards models, and shared frailty gamma-Weibull models. Methodological details are described in Sjölander, A. (2016) <doi:10.1007/s10654-016-0157-3>. Also includes functionality for doubly robust estimation for generalized linear models in some special cases, and the ability to implement custom models.
cytofQC is a package for initial cleaning of CyTOF data. It uses a semi-supervised approach for labeling cells with their most likely data type (bead, doublet, debris, dead) and the probability that they belong to each label type. This package does not remove data from the dataset, but provides labels and information to aid the data user in cleaning their data. Our algorithm is able to distinguish between doublets and large cells.
This package provides an R interface to Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen and Guestrin (2016). The package includes an efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users can also define their own objectives easily.
Archimax copulas are a mixture of Archimedean and EV copulas. This package provides definitions of several parametric families of generators and dependence functions, computes CDF and PDF, estimates parameters, tests for goodness of fit, generates random samples and checks copula properties for custom constructs. In the 2-dimensional case explicit formulas for the density are used, whereas in higher dimensions all derivatives are linearly approximated. Several non-archimax families (normal, FGM, Plackett) are provided as well.
This package provides the basic functionality to interact with the Collatz conjecture. The parameterisation uses the same (P,a,b) notation as Conway's generalisations. Besides the function and reverse function, there is also functionality to retrieve the hailstone sequence, the "stopping time"/"total stopping time", or tree-graph. The only restriction placed on parameters is that neither P nor a can be 0. For further reading, see <https://en.wikipedia.org/wiki/Collatz_conjecture>.
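The (P,a,b) parameterisation above can be sketched in a few lines of Python. This is a minimal illustration, assuming the map sends n to n/P when P divides n and to a*n+b otherwise; the names `collatz_step` and `hailstone` and the `max_steps` guard are hypothetical and not taken from the package.

```python
from typing import List


def collatz_step(n: int, P: int = 2, a: int = 3, b: int = 1) -> int:
    """One step of the generalised map: n/P if P divides n, else a*n + b."""
    if P == 0 or a == 0:
        raise ValueError("P and a must both be non-zero")
    return n // P if n % P == 0 else a * n + b


def hailstone(n: int, P: int = 2, a: int = 3, b: int = 1,
              max_steps: int = 1000) -> List[int]:
    """Iterate from n until 1 is reached, returning every value visited.

    max_steps is a safety guard (an assumption of this sketch), since
    other parameter choices may cycle or diverge instead of reaching 1.
    """
    seq = [n]
    while seq[-1] != 1 and len(seq) <= max_steps:
        seq.append(collatz_step(seq[-1], P, a, b))
    return seq
```

With the default parameters, `hailstone(6)` visits 6, 3, 10, 5, 16, 8, 4, 2, 1, so the total stopping time of 6 is `len(hailstone(6)) - 1`, i.e. 8 steps.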
This package provides a collection of functions that have been developed to assist experimenters in modeling chemical degradation kinetic data. The selection of the appropriate degradation model and parameter estimation is carried out automatically as far as possible and is driven by a rigorous statistical interpretation of the results. The package integrates already available goodness-of-fit statistics for nonlinear models. In addition, it allows data fitting with the nonlinear first-order multi-target (FOMT) model.
Computes the Extended Chen-Poisson (ecp) distribution, survival, density, hazard, cumulative hazard and quantile functions. It can also generate pseudo-random samples from this distribution. The corresponding graphics are available. Functions to obtain measures of skewness and kurtosis, k-th raw moments, conditional k-th moments and the mean residual life function are also provided. For details about the ecp distribution, see Sousa-Ferreira, I., Abreu, A.M. & Rocha, C. (2023) <doi:10.57805/revstat.v21i2.405>.
Analyze functional data and its change points. Includes functionality to store and process data, summarize and validate assumptions, characterize and perform inference of change points, and provide visualizations. Data is stored as discretely collected observations without requiring the selection of basis functions. For more details see chapter 8 of Horvath and Rice (2024) <doi:10.1007/978-3-031-51609-2>. Additional papers are forthcoming. Focused works are also included in the documentation of corresponding functions.
Statistical tests widely utilized in biostatistics, public policy, and law. Along with the well-known tests for equality of means and variances, randomness, and measures of relative variability, the package contains new robust tests of symmetry, omnibus and directional tests of normality, and their graphical counterparts such as robust QQ plot, robust trend tests for variances, etc. All implemented tests and methods are illustrated by simulations and real-life examples from legal statistics, economics, and biostatistics.
This package provides statistical components, tables, and graphs that are useful in Quarto and RMarkdown reports and that produce Quarto elements for special formatting such as tabs and marginal notes and graphs. Some of the functions produce entire report sections with tabs, e.g., the missing data report created by missChk(). Functions for inserting variables and tables inside graphviz and mermaid diagrams are included, and so are special clinical trial graphics for adverse event reporting.
Tool for statistical simulations that have two components. One component generates the data and the other one analyzes the data. The main aims of the package are the reduction of the administrative source code (mainly loops and management code for the results) and a simple applicability of the package that allows the user to quickly learn how to work with it. Parallel computing is also supported. Finally, convenient functions are provided to summarize the simulation results.
Several statistical test functions as well as a function for exploratory data analysis to investigate classifiers allocating individuals to one of three disjoint and ordered classes. In a single classifier assessment the discriminatory power is compared to classification by chance. In a comparison of two classifiers the null hypothesis corresponds to equal discriminatory power of the two classifiers. See also "ROC Analysis for Classification and Prediction in Practice" by Nakas, Bantis and Gatsonis (2023), ISBN 9781482233704.
This package provides a tidy interface for integrating large language model (LLM) APIs such as Claude, OpenAI, Gemini, Mistral and local models via Ollama into R workflows. The package supports text and media-based interactions, interactive message history, batch request APIs, and a tidy, pipeline-oriented interface for streamlined integration into data workflows. Web services are available at <https://www.anthropic.com>, <https://openai.com>, <https://aistudio.google.com/>, <https://mistral.ai/> and <https://ollama.com>.
An R API providing easy access to a relational database with macroeconomic, financial and development-related time series data for Uganda. Overall, more than 5000 series at varying frequency (daily, monthly, quarterly, annual in fiscal or calendar years) can be accessed through the API. The data is provided by the Bank of Uganda, the Ugandan Ministry of Finance, Planning and Economic Development, the IMF and the World Bank. The database is updated once a month.
Various semiparametric and nonparametric statistical tools for immune correlates analysis of vaccine clinical trial data. This includes calculation of summary statistics and estimation of risk, vaccine efficacy, controlled effects (controlled risk and controlled vaccine efficacy), and mediation effects (natural direct effect, natural indirect effect, proportion mediated). See Gilbert P, Fong Y, Kenny A, and Carone, M (2022) <doi:10.1093/biostatistics/kxac024> and Fay MP and Follmann DA (2023) <doi:10.48550/arXiv.2208.06465>.
Infectious disease surveillance requires early outbreak detection. This package provides statistical tools for analyzing time-series monitoring data through three core methods: (a) EWMA (Exponentially Weighted Moving Average), (b) Modified-CUSUM (Modified Cumulative Sum), and (c) Adjusted-Serfling models. Methodologies are based on Wang et al. (2010) <doi:10.1016/j.jbi.2009.08.003> and Wang et al. (2015) <doi:10.1371/journal.pone.0119923>. Designed for epidemiologists and public health researchers working with disease surveillance systems.
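As a rough illustration of the first method, the textbook EWMA recursion z_t = lam * x_t + (1 - lam) * z_(t-1) can be sketched in Python as follows. This is the generic smoother only, not the package's implementation or the exact procedure of Wang et al.; the function name `ewma`, the default `lam`, and the choice to initialise with the first observation are assumptions.

```python
def ewma(values, lam=0.3):
    """Exponentially weighted moving average of a sequence of counts.

    lam in (0, 1] is the weight given to the newest observation.
    Assumption: the recursion starts from the first observation.
    """
    if not 0 < lam <= 1:
        raise ValueError("lam must be in (0, 1]")
    if not values:
        return []
    smoothed = []
    z = values[0]
    for x in values:
        # New value = weighted mix of the current count and the previous average.
        z = lam * x + (1 - lam) * z
        smoothed.append(z)
    return smoothed
```

For example, `ewma([10, 10, 20], lam=0.5)` returns `[10.0, 10.0, 15.0]`: the jump to 20 pulls the average only halfway. In surveillance use, an alarm is typically raised when the smoothed value crosses a control limit.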
This package provides tools to perform hierarchical inference for one or multiple studies / data sets based on high-dimensional multivariate (generalised) linear models. A possible application is to perform hierarchical inference for GWA studies to find significant groups or single SNPs (if the signal is strong) in a data-driven and automated procedure. The method is based on an efficient hierarchical multiple testing correction and controls the FWER. The functions can easily be run in parallel.
Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration—functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices—functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation—functions useful for generating random viral quasispecies data.
A Davidian curve defines a seminonparametric density whose shape and flexibility can be tuned by easy-to-estimate parameters. Since a special case of a Davidian curve is the standard normal density, Davidian curves can be used for relaxing the normality assumption in statistical applications (Zhang & Davidian, 2001) <doi:10.1111/j.0006-341X.2001.00795.x>. This package provides the density function, the gradient of the loglikelihood and a random generator for Davidian curves.
Create a pie-like plot to visualise whether one or several aims of a project have been achieved or are close to being achieved, i.e. an aim is achieved when its point is at the center of the pie plot. Think of it as a dartboard whose center means 100% completeness/achievement. Achievement can also be understood as 100% coverage. The default completeness levels shown in the pie plot are 50%, 80% and 100%.
Main function "decode" is used to decode coded key values to plain text. Function "code" can be used to code plain text to code if there is a 1:1 relation between the two. The concept relies on keyvalue objects used for translation. There are several keyvalue objects included in the areas of geographical regional codes, administrative health care unit codes, diagnosis codes and more. It is also easy to extend the use by arbitrary code sets.
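The decode/code idea above can be sketched in Python with a plain dictionary standing in for a keyvalue object. This is an illustration only: the function signatures and the sample codes are hypothetical and do not come from the package.

```python
def decode(keys, keyvalue):
    """Translate coded keys to plain text; unknown keys become None."""
    return [keyvalue.get(k) for k in keys]


def code(values, keyvalue):
    """Translate plain text back to keys; requires a 1:1 key/value relation."""
    inverse = {v: k for k, v in keyvalue.items()}
    if len(inverse) != len(keyvalue):
        raise ValueError("keyvalue is not 1:1, so it cannot be inverted")
    return [inverse.get(v) for v in values]


# Hypothetical code set, invented for this example.
regions = {"01": "North", "02": "South"}
```

With this table, `decode(["02", "01"], regions)` gives `["South", "North"]`, and `code(["North"], regions)` gives `["01"]`. The 1:1 check mirrors the restriction stated in the description: reverse coding is only well defined when no two keys share a value.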