Stringr is a consistent, simple and easy to use set of wrappers around the fantastic stringi package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.
r-selectr translates a CSS3 selector into an equivalent XPath expression. This allows you to use CSS selectors when working with the XML package as it can only evaluate XPath expressions. Also provided are convenience functions useful for using CSS selectors on XML nodes. This package is a port of the Python package cssselect.
ibus-rime provides the Rime input method engine for IBus. Rime is a lightweight, extensible input method engine supporting various input schemas including glyph-based input methods, romanization-based input methods as well as those for Chinese dialects. It has the ability to compose phrases and sentences intelligently and provide very accurate traditional Chinese output.
The diffUTR package provides a uniform interface and plotting functions for limma/edgeR/DEXSeq -powered differential bin/exon usage. It includes in addition an improved version of the limma::diffSplice method. Most importantly, diffUTR further extends the application of these frameworks to differential UTR usage analysis using poly-A site databases.
metaCCA performs multivariate analysis of a single or multiple GWAS based on univariate regression coefficients. It allows multivariate representation of both phenotype and genotype. metaCCA extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness.
Estimation of large Vector AutoRegressive (VAR), Vector AutoRegressive with Exogenous Variables X (VARX) and Vector AutoRegressive Moving Average (VARMA) Models with Structured Lasso Penalties, see Nicholson, Wilms, Bien and Matteson (2020) <https://jmlr.org/papers/v21/19-777.html> and Wilms, Basu, Bien and Matteson (2021) <doi:10.1080/01621459.2021.1942013>.
Boosting Regression Quantiles is a component-wise boosting algorithm, that embeds all boosting steps in the well-established framework of quantile regression. It is initialized with the corresponding quantile, uses a quantile-specific learning rate, and uses quantile regression as its base learner. The package implements this algorithm and allows cross-validation and stability selection.
Gain access to the Spark Catalog API making use of the sparklyr API. Catalog <https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/catalog/Catalog.html> is the interface for managing a metastore (aka metadata catalog) of relational entities (e.g. database(s), tables, functions, table columns and temporary views).
Fast fitting of Stable Isotope Mixing Models in R. Allows for the inclusion of covariates. Also has built-in summary functions and plot functions which allow for the creation of isospace plots. Variational Bayes is used to fit these models, methods as described in: Tran et al., (2021) <doi:10.48550/arXiv.2103.01327>.
Phone numbers are often represented as strings because there is no obvious and suitable native representation for them. This leads to high memory use and a lack of standard representation. The package provides integer representation of Australian phone numbers with optional raw vector calling code. The package name is an extension of au and ph'.
The main functions are emmreml', and emmremlMultiKernel'. emmreml solves a mixed model with known covariance structure using the EMMA algorithm. emmremlMultiKernel is a wrapper for emmreml to handle multiple random components with known covariance structures. The function emmremlMultivariate solves a multivariate gaussian mixed model with known covariance structure using the ECM algorithm.
This package provides tools for efficient processing of large, whole genome genotype data sets in variant call format (VCF). It includes several functions to calculate commonly used population genomic metrics and a method for reference panel free genotype imputation, which is described in the preprint Gurke & Mayer (2024) <doi:10.22541/au.172515591.10119928/v1>.
Automatic open data acquisition from resources of IGN ('Institut National de Information Geographique et forestiere') (<https://www.ign.fr/>). Available datasets include various types of raster and vector data, such as digital elevation models, state borders, spatial databases, cadastral parcels, and more. happign also provide access to API Carto (<https://apicarto.ign.fr/api/doc/>).
Dieses R-Paket stellt Zusatzmaterial in Form von Daten, Funktionen und R-Hilfe-Seiten für den Herausgeberband Breit, S. und Schreiner, C. (Hrsg.). (2016). "Large-Scale Assessment mit R: Methodische Grundlagen der österreichischen Bildungsstandardüberprüfung." Wien: facultas. (ISBN: 978-3-7089-1343-8, <https://www.iqs.gv.at/themen/bildungsforschung/publikationen/veroeffentlichte-publikationen>) zur Verfügung.
This package provides functionality to produce graphs of sampling distributions of test statistics from a variety of common statistical tests. With only a few keystrokes, the user can conduct a hypothesis test and visualize the test statistic and corresponding p-value through the shading of its sampling distribution. Initially created for statistics at Middlebury College.
This package provides methods and functions to analyze the quantitative or qualitative performance for diagnostic assays, and outliers detection, reader precision and reference range are discussed. Most of the methods and algorithms refer to CLSI (Clinical & Laboratory Standards Institute) recommendations and NMPA (National Medical Products Administration) guidelines. In additional, relevant plots are constructed by ggplot2'.
Predicting the structure of a graph including new nodes and edges using a time series of graphs. Flux balance analysis, a linear and integer programming technique used in biochemistry is used with time series prediction methods to predict the graph structure at a future time point Kandanaarachchi (2025) <doi:10.48550/arXiv.2507.05806>.
Fit and compare nonlinear mixed-effects models in differential equations with flexible dosing information commonly seen in pharmacokinetics and pharmacodynamics (Almquist, Leander, and Jirstrand 2015 <doi:10.1007/s10928-015-9409-1>). Differential equation solving is by compiled C code provided in the rxode2 package (Wang, Hallow, and James 2015 <doi:10.1002/psp4.12052>).
This package implements univariate continuous probability distributions and associated model diagnostics based on the Lindley, Logistic, Half-Cauchy, Half-Logistic, and Poisson families. Provides functions for probability density, cumulative distribution, quantile, and hazard evaluation, random variate generation, and diagnostic procedures including Q-Q and P-P plots, goodness-of-fit tests, and model selection criteria.
The openFDA API facilitates access to Federal Drug Agency (FDA) data on drugs, devices, foodstuffs, tobacco, and more with httr2'. This package makes the API easily accessible, returning objects which the user can convert to JSON data and parse. Kass-Hout TA, Xu Z, Mohebbi M et al. (2016) <doi:10.1093/jamia/ocv153>.
Efficient algorithm for solving PU (Positive and Unlabeled) problem in low or high dimensional setting with lasso or group lasso penalty. The algorithm uses Maximization-Minorization and (block) coordinate descent. Sparse calculation and parallel computing are supported for the computational speed-up. See Hyebin Song, Garvesh Raskutti (2018) <doi:10.48550/arXiv.1711.08129>.
Programs for Martinussen and Scheike (2006), `Dynamic Regression Models for Survival Data', Springer Verlag. Plus more recent developments. Additive survival model, semiparametric proportional odds model, fast cumulative residuals, excess risk models and more. Flexible competing risks regression including GOF-tests. Two-stage frailty modelling. PLS for the additive risk model. Lasso in the ahaz package.
Principal Component Analysis (PCA) is a statistical technique used to reduce the dimensionality of a dataset while preserving as much variability as possible. By transforming the original variables into a new set of uncorrelated variables called principal components, PCA helps in identifying patterns and simplifying the complexity of high-dimensional data. The RankPCA package provides a streamlined workflow for performing PCA on datasets containing both categorical and continuous variables. It facilitates data preprocessing, encoding of categorical variables, and computes PCA to determine the optimal number of principal components based on a specified variance threshold. The package also computes composite indices for ranking observations, which can be useful for various analytical purposes. Garai, S., & Paul, R. K. (2023) <doi:10.1016/j.iswa.2023.200202>.
Generate causally-simulated data to serve as ground truth for evaluating methods in causal discovery and effect estimation. The package provides tools to assist in defining functions based on specified edges, and conversely, defining edges based on functions. It enables the generation of data according to these predefined functions and causal structures. This is particularly useful for researchers in fields such as artificial intelligence, statistics, biology, medicine, epidemiology, economics, and social sciences, who are developing a general or a domain-specific methods to discover causal structures and estimate causal effects. Data simulation adheres to principles of structural causal modeling. Detailed methodologies and examples are documented in our vignette, available at <https://htmlpreview.github.io/?https://github.com/herdiantrisufriyana/rcausim/blob/master/doc/causal_simulation_exemplar.html>.