Estimation of large Vector AutoRegressive (VAR), Vector AutoRegressive with Exogenous Variables X (VARX) and Vector AutoRegressive Moving Average (VARMA) Models with Structured Lasso Penalties, see Nicholson, Wilms, Bien and Matteson (2020) <https://jmlr.org/papers/v21/19-777.html> and Wilms, Basu, Bien and Matteson (2021) <doi:10.1080/01621459.2021.1942013>.
Boosting Regression Quantiles is a component-wise boosting algorithm that embeds all boosting steps in the well-established framework of quantile regression. It is initialized with the corresponding quantile, uses a quantile-specific learning rate, and uses quantile regression as its base learner. The package implements this algorithm and allows cross-validation and stability selection.
Gain access to the Spark Catalog API making use of the sparklyr API. Catalog <https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/catalog/Catalog.html> is the interface for managing a metastore (aka metadata catalog) of relational entities (e.g. database(s), tables, functions, table columns and temporary views).
Fast fitting of Stable Isotope Mixing Models in R. Allows for the inclusion of covariates. Also has built-in summary functions and plot functions which allow for the creation of isospace plots. Variational Bayes is used to fit these models, following the methods described in Tran et al. (2021) <doi:10.48550/arXiv.2103.01327>.
Phone numbers are often represented as strings because there is no obvious and suitable native representation for them. This leads to high memory use and a lack of standard representation. The package provides an integer representation of Australian phone numbers with an optional raw vector calling code. The package name is an extension of 'au' and 'ph'.
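The idea of storing a phone number as an integer can be sketched as follows. This is a minimal Python illustration, not the package's actual encoding: the helper names `au_phone_to_int` and `int_to_au_phone` are hypothetical, and the sketch assumes ten-digit national numbers with a leading 0, or the same number prefixed with the calling code 61.

```python
import re

def au_phone_to_int(text):
    """Hypothetical helper: reduce an Australian number to a 9-digit integer.

    Assumes national format (10 digits, leading 0) or international
    format (calling code 61 followed by 9 digits). Returns None otherwise.
    """
    digits = re.sub(r"\D", "", text)          # drop spaces, '+', brackets, dashes
    if digits.startswith("61") and len(digits) == 11:
        digits = digits[2:]                   # strip the calling code
    elif digits.startswith("0") and len(digits) == 10:
        digits = digits[1:]                   # strip the national trunk prefix
    if len(digits) != 9:
        return None
    return int(digits)

def int_to_au_phone(number):
    """Hypothetical helper: format the stored integer back to national form."""
    return "0" + str(number).zfill(9)

mobile = au_phone_to_int("+61 412 345 678")
landline = au_phone_to_int("(03) 9123 4567")
```

Under these assumptions, differently formatted strings for the same number map to one integer, which is what gives the standard representation and the memory saving.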
The main functions are 'emmreml' and 'emmremlMultiKernel'. 'emmreml' solves a mixed model with known covariance structure using the EMMA algorithm. 'emmremlMultiKernel' is a wrapper for 'emmreml' to handle multiple random components with known covariance structures. The function 'emmremlMultivariate' solves a multivariate Gaussian mixed model with known covariance structure using the ECM algorithm.
This package provides tools for efficient processing of large, whole genome genotype data sets in variant call format (VCF). It includes several functions to calculate commonly used population genomic metrics and a method for reference panel free genotype imputation, which is described in the preprint Gurke & Mayer (2024) <doi:10.22541/au.172515591.10119928/v1>.
Automatic open data acquisition from resources of IGN ('Institut National de Information Geographique et forestiere') (<https://www.ign.fr/>). Available datasets include various types of raster and vector data, such as digital elevation models, state borders, spatial databases, cadastral parcels, and more. happign also provides access to API Carto (<https://apicarto.ign.fr/api/doc/>).
This R package provides supplementary material in the form of data, functions, and R help pages for the edited volume Breit, S. and Schreiner, C. (Eds.). (2016). "Large-Scale Assessment mit R: Methodische Grundlagen der österreichischen Bildungsstandardüberprüfung." Vienna: facultas. (ISBN: 978-3-7089-1343-8, <https://www.iqs.gv.at/themen/bildungsforschung/publikationen/veroeffentlichte-publikationen>).
This package provides functionality to produce graphs of sampling distributions of test statistics from a variety of common statistical tests. With only a few keystrokes, the user can conduct a hypothesis test and visualize the test statistic and corresponding p-value through the shading of its sampling distribution. Initially created for statistics at Middlebury College.
This package provides methods and functions to analyze the quantitative or qualitative performance of diagnostic assays; outlier detection, reader precision, and reference ranges are also covered. Most of the methods and algorithms follow CLSI (Clinical & Laboratory Standards Institute) recommendations and NMPA (National Medical Products Administration) guidelines. In addition, relevant plots are constructed with 'ggplot2'.
Fit and compare nonlinear mixed-effects models in differential equations with flexible dosing information commonly seen in pharmacokinetics and pharmacodynamics (Almquist, Leander, and Jirstrand 2015 <doi:10.1007/s10928-015-9409-1>). Differential equation solving is by compiled C code provided in the rxode2 package (Wang, Hallow, and James 2015 <doi:10.1002/psp4.12052>).
Predicting the structure of a graph, including new nodes and edges, using a time series of graphs. Flux balance analysis, a linear and integer programming technique used in biochemistry, is combined with time series prediction methods to predict the graph structure at a future time point; see Kandanaarachchi (2025) <doi:10.48550/arXiv.2507.05806>.
The openFDA API facilitates access to U.S. Food and Drug Administration (FDA) data on drugs, devices, foodstuffs, tobacco, and more with 'httr2'. This package makes the API easily accessible, returning objects which the user can convert to JSON data and parse. Kass-Hout TA, Xu Z, Mohebbi M et al. (2016) <doi:10.1093/jamia/ocv153>.
Access a variety of PubMed data through a single, user-friendly interface, including abstracts <https://pubmed.ncbi.nlm.nih.gov/>, bibliometrics from iCite <https://icite.od.nih.gov/>, pubtations from PubTator3 <https://www.ncbi.nlm.nih.gov/research/pubtator3/>, and full-text records from PMC <https://www.ncbi.nlm.nih.gov/pmc/>.
Programs for Martinussen and Scheike (2006), "Dynamic Regression Models for Survival Data", Springer Verlag, plus more recent developments. Additive survival model, semiparametric proportional odds model, fast cumulative residuals, excess risk models and more. Flexible competing risks regression including GOF-tests. Two-stage frailty modelling. PLS for the additive risk model. Lasso in the ahaz package.
Principal Component Analysis (PCA) is a statistical technique used to reduce the dimensionality of a dataset while preserving as much variability as possible. By transforming the original variables into a new set of uncorrelated variables called principal components, PCA helps in identifying patterns and simplifying the complexity of high-dimensional data. The RankPCA package provides a streamlined workflow for performing PCA on datasets containing both categorical and continuous variables. It facilitates data preprocessing, encoding of categorical variables, and computes PCA to determine the optimal number of principal components based on a specified variance threshold. The package also computes composite indices for ranking observations, which can be useful for various analytical purposes. Garai, S., & Paul, R. K. (2023) <doi:10.1016/j.iswa.2023.200202>.
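The variance-threshold rule mentioned above can be illustrated with a short Python sketch. It is not the RankPCA implementation: the function name `n_components_for_threshold` and the example eigenvalues are assumptions made for illustration, and the sketch takes the component variances (eigenvalues of the covariance matrix) as already computed.

```python
def n_components_for_threshold(eigenvalues, threshold):
    """Hypothetical helper: smallest number of leading components whose
    cumulative share of total variance reaches `threshold` (between 0 and 1)."""
    ordered = sorted(eigenvalues, reverse=True)   # largest variance first
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Illustrative eigenvalues only; cumulative shares are 0.5, 0.75, 0.875, 0.9375, 1.0
variances = [4.0, 2.0, 1.0, 0.5, 0.5]
k80 = n_components_for_threshold(variances, 0.80)
k90 = n_components_for_threshold(variances, 0.90)
```

With these example values, an 80% threshold keeps three components and a 90% threshold keeps four.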
Generate causally-simulated data to serve as ground truth for evaluating methods in causal discovery and effect estimation. The package provides tools to assist in defining functions based on specified edges, and conversely, defining edges based on functions. It enables the generation of data according to these predefined functions and causal structures. This is particularly useful for researchers in fields such as artificial intelligence, statistics, biology, medicine, epidemiology, economics, and social sciences, who are developing general or domain-specific methods to discover causal structures and estimate causal effects. Data simulation adheres to principles of structural causal modeling. Detailed methodologies and examples are documented in our vignette, available at <https://htmlpreview.github.io/?https://github.com/herdiantrisufriyana/rcausim/blob/master/doc/causal_simulation_exemplar.html>.
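Generating data from predefined functions and a causal structure can be sketched in Python as below. This is an illustration of the general structural-causal-model idea, not the rcausim API: the function `simulate`, the node names, and the example equations are all hypothetical. Each node's value is computed from its parents' values in topological order, so causes are always generated before their effects.

```python
import random
from graphlib import TopologicalSorter

def simulate(edges, functions, n, seed=0):
    """Hypothetical sketch: draw n rows from a structural causal model.

    edges     -- list of (cause, effect) pairs
    functions -- dict mapping each node to f(parent_values, rng)
    """
    rng = random.Random(seed)
    parents = {node: [] for node in functions}
    for cause, effect in edges:
        parents[effect].append(cause)
    # static_order() yields every node after all of its parents
    order = list(TopologicalSorter(parents).static_order())
    rows = []
    for _ in range(n):
        row = {}
        for node in order:
            parent_values = {p: row[p] for p in parents[node]}
            row[node] = functions[node](parent_values, rng)
        rows.append(row)
    return rows

# Assumed example structure: X -> Y, X -> Z, Y -> Z
edges = [("X", "Y"), ("X", "Z"), ("Y", "Z")]
functions = {
    "X": lambda pa, rng: rng.gauss(0, 1),
    "Y": lambda pa, rng: 2 * pa["X"] + 1,                        # deterministic in X
    "Z": lambda pa, rng: pa["X"] - pa["Y"] + rng.gauss(0, 0.1),  # noisy in X and Y
}
data = simulate(edges, functions, n=5, seed=42)
```

Because the generating equations are known, the true effect of X on Y (a slope of 2 here) is available as ground truth when evaluating an estimation method.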
This R package is designed for QC, analysis, and exploration of single cell RNA-seq data. It easily enables widely-used analytical techniques, including the identification of highly variable genes, dimensionality reduction (PCA, ICA, t-SNE), standard unsupervised clustering algorithms (density clustering, hierarchical clustering, k-means), and the discovery of differentially expressed genes and markers.
The r-nleqslv package solves a system of nonlinear equations using a Broyden or a Newton method with a choice of global strategies such as line search and trust region. There are options for using a numerical or user-supplied Jacobian, for specifying a banded numerical Jacobian, and for allowing a singular or ill-conditioned Jacobian.
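A plain Newton iteration with a numerical Jacobian can be sketched in Python as follows. This is a simplified illustration and not the package's code: it uses a forward-difference Jacobian and takes full Newton steps with no global strategy (no line search or trust region), and the names `numerical_jacobian`, `solve_linear`, `newton_solve` and the example system are assumptions.

```python
def numerical_jacobian(f, x, h=1e-7):
    """Forward-difference approximation of the Jacobian of f at x."""
    fx = f(x)
    jac = [[0.0] * len(x) for _ in fx]
    for j in range(len(x)):
        shifted = list(x)
        shifted[j] += h
        f_shifted = f(shifted)
        for i in range(len(fx)):
            jac[i][j] = (f_shifted[i] - fx[i]) / h
    return jac

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular Jacobian")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x

def newton_solve(f, x0, tol=1e-10, max_iter=50):
    """Full-step Newton iteration: solve J(x) s = -f(x), then x <- x + s."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        step = solve_linear(numerical_jacobian(f, x), [-v for v in fx])
        x = [xi + si for xi, si in zip(x, step)]
    raise RuntimeError("no convergence within max_iter iterations")

# Assumed example system: x^2 + y^2 = 2 and x = y, which has a root at (1, 1)
def system(v):
    return [v[0] ** 2 + v[1] ** 2 - 2.0, v[0] - v[1]]

root = newton_solve(system, [2.0, 0.5])
```

Full steps like these can diverge from a poor starting point, which is the situation the global strategies named in the description are meant to handle.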
This package provides a method to refit and correct the diploid region in copy number profiles. It uses a clustering algorithm to identify pathology-specific normal (diploid) chromosomes and then uses their copy number signal to refit the whole profile. The package is composed of three functions: DRrefit (the main function), ComputeNormalChromosome and PlotCluster.
The hdxmsqc package enables us to analyse and visualise the quality of HDX-MS experiments, either as a final quality check before downstream analysis and publication or as part of an iterative procedure to determine the quality of the data. The package builds on the QFeatures and Spectra packages to integrate with other mass-spectrometry data.
Plots simulation results of clinical trials. Its main feature is allowing users to simultaneously investigate the impact of several simulation input dimensions through dynamic filtering of the simulation results. A more detailed description of the app can be found in Meyer et al. <doi:10.1016/j.softx.2023.101347> or the vignettes on 'GitHub'.
Querying, extracting, and processing large-scale network data from Neo4j databases using the Neo4j Bolt <https://neo4j.com/docs/bolt/current/bolt/> protocol. This interface supports efficient data retrieval, batch processing for large datasets, and seamless conversion of query results into R data frames, making it ideal for bioinformatics, computational biology, and other graph-based applications.