The bestridge package is designed to provide a one-stop service for users to carry out best ridge regression in various complex situations via the primal dual active set algorithm proposed by Wen, C., Zhang, A., Quan, S. and Wang, X. (2020) <doi:10.18637/jss.v094.i04>. This package allows users to perform regression, classification, count regression and censored regression for (ultra) high dimensional data, and it also supports advanced usages like group variable selection and nuisance variable selection.
Every research team has its own scripts for data management, statistics and, most importantly, hemodynamic indices. The purpose of this package is to standardize the scripts used in clinical research. The hemodynamic indices can be calculated from a long-format data frame, with periods of interest added through trigger periods and artifacts removed through deleter files. Transfer function analysis (Claassen et al. (2016) <doi:10.1177/0271678X15626425>) and Mx (Czosnyka et al. (1996) <doi:10.1161/01.str.27.10.1829>) can be calculated using this package.
This package provides a set of wrapper functions that reproduce most of the sequence plots rendered with TraMineR::seqplot(). Whereas TraMineR uses base R to produce the plots, this library draws on ggplot2. The plots are produced on the basis of a sequence object defined with TraMineR::seqdef(). The package automates the reshaping and plotting of sequence data. The resulting plots are of class ggplot, i.e. components can be added and tweaked using + and regular ggplot2 functions.
Estimates the Gini index and computes variances and confidence intervals for finite and infinite populations, using different methods; also computes the Gini index for continuous probability distributions and draws samples from continuous probability distributions with Gini indices set by the user; uses Rcpp. References: Muñoz et al. (2023) <doi:10.1177/00491241231176847>. Álvarez et al. (2021) <doi:10.3390/math9243252>. Giorgi and Gigliarano (2017) <doi:10.1111/joes.12185>. Langel and Tillé (2013) <doi:10.1111/j.1467-985X.2012.01048.x>.
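As a rough illustration of the quantity being estimated, the following Python sketch computes a plain sample Gini index from sorted values. It is not the package's code: the function name gini_index is hypothetical, the data are assumed to be non-negative, and no bias correction, variance or confidence interval is attempted.

```python
def gini_index(values):
    """Plain sample Gini index via the sorted-rank formula.

    Assumes non-negative values with a positive sum. This is an
    illustrative sketch, not the estimator implemented by the package.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive sum")
    # Sum of rank-weighted values, with ranks 1..n on the sorted data.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    print(gini_index([1, 1, 1, 1]))  # perfect equality
    print(gini_index([0, 0, 0, 1]))  # one unit holds everything
```

With this uncorrected formula, perfect equality gives 0 and the most unequal sample of size n gives (n - 1) / n.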
Simulate full B-cell and T-cell receptor repertoires using an in silico recombination process that includes a wide variety of tunable parameters to introduce noise and biases. Additional post-simulation modification functions allow the user to implant motifs or codon biases as well as to remodel the sequence similarity architecture. The output repertoires contain records of all relevant repertoire dimensions and can be analyzed using the provided repertoire analysis functions. A preprint is available at bioRxiv (Weber et al., 2019 <doi:10.1101/759795>).
Solver for linear, quadratic, and rational programs with linear, quadratic, and rational constraints. A unified interface to different R packages is provided. Optimization problems are transformed into equivalent formulations and solved by the respective package. For example, quadratic programming problems with linear, quadratic and rational constraints can be solved by augmented Lagrangian minimization using package alabama, or by sequential quadratic programming using solver slsqp. Alternatively, they can be reformulated as optimization problems with second order cone constraints and solved with package cccp.
G-computation for a set of time-fixed exposures with quantile-based basis functions, possibly under linearity and homogeneity assumptions. Effect measure modification in this method is a way to assess how the effect of the mixture varies by a binary, categorical or continuous variable. Reference: Alexander P. Keil, Jessie P. Buckley, Katie M. O'Brien, Kelly K. Ferguson, Shanshan Zhao, and Alexandra J. White (2019) A quantile-based g-computation approach to addressing the effects of exposure mixtures; <doi:10.1289/EHP5838>.
This package provides a pipeline that can process single or multiple single-cell RNA-seq samples and primarily specializes in clustering and dimensionality reduction. It also uses common cell type marker genes for T cells, B cells, myeloid cells, epithelial cells, and stromal cells (fibroblasts, endothelial cells, pericytes, smooth muscle cells) to visualize the Seurat clusters and to facilitate labeling them by biological names. Once users have named each cluster, they can re-evaluate the quality of the clusters and find de novo marker genes.
Identifies a bicluster, a submatrix of the data such that the features and observations within the submatrix differ from those not contained in the submatrix, using a two-step method. In the first step, observations in the bicluster are identified to maximize the sum of weighted between-cluster feature differences. The method is described in Helgeson et al. (2020) <doi:10.1111/biom.13136>. SCBiclust can be used to identify biclusters which differ based on feature means, feature variances, or more general differences.
Estimates small area parameters for binary data without auxiliary variables using the Empirical Bayes technique, mainly following Rao and Molina (2015, ISBN:9781118735787), "Small Area Estimation, Second Edition". This package also provides the option of direct estimation using weights. It further features estimation of the alpha and beta parameters used in the small area calculation. The available estimation methods are Newton-Raphson and the method of moments, which are based on Wilcox (1979) <doi:10.1177/001316447903900302> and Kleinman (1973) <doi:10.1080/01621459.1973.10481332>.
Genomic analysis can be utilised to identify differences between RNA populations in two conditions, both in production and abundance. This includes the identification of RNAs produced by multiple genomes within a biological system, for example RNA produced by pathogens within a host or mobile RNAs in plant graft systems. The mobileRNA package provides methods to pre-process, analyse and visualise the sRNA and mRNA populations based on the premise of mapping reads to all genotypes at the same time.
scRecover is an R package for imputation of single-cell RNA-seq (scRNA-seq) data. It detects and imputes dropout values in a scRNA-seq raw read counts matrix while keeping the real zeros unchanged, since scRNA-seq data contain both dropout zeros and real zeros. In combination with scImpute, SAVER and MAGIC, scRecover not only detects dropout and real zeros at higher accuracy, but also improves the downstream clustering and visualization results.
An implementation of sensitivity and robustness methods in Bayesian networks in R. It includes methods to perform parameter variations via a variety of co-variation schemes, to compute sensitivity functions and to quantify the dissimilarity of two Bayesian networks via distances and divergences. It further includes diagnostic methods to assess the goodness of fit of a Bayesian network to data, including global, node and parent-child monitors. Reference: M. Leonelli, R. Ramanathan, R.L. Wilkerson (2022) <doi:10.1016/j.knosys.2023.110882>.
This package provides methods for powering cluster-randomized trials with two continuous co-primary outcomes using five key design techniques. Includes functions for calculating required sample size and statistical power. For more details on methodology, see Owen et al. (2025) <doi:10.1002/sim.70015>, Yang et al. (2022) <doi:10.1111/biom.13692>, Pocock et al. (1987) <doi:10.2307/2531989>, Vickerstaff et al. (2019) <doi:10.1186/s12874-019-0754-4>, and Li et al. (2020) <doi:10.1111/biom.13212>.
This package provides tools for Delphi's COVIDcast Epidata API: data access, maps and time series plotting, and basic signal processing. The API includes a collection of numerous indicators relevant to the COVID-19 pandemic in the United States, including official reports, de-identified aggregated medical claims data, large-scale surveys of symptoms and public behavior, and mobility data, typically updated daily and at the county level. All data sources are documented at <https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html>.
This package implements methods for calculating disproportionate impact: the percentage point gap, proportionality index, and the 80% index. California Community Colleges Chancellor's Office (2017). Percentage Point Gap Method. <https://www.cccco.edu/-/media/CCCCO-Website/About-Us/Divisions/Digital-Innovation-and-Infrastructure/Research/Files/PercentagePointGapMethod2017.ashx>. California Community Colleges Chancellor's Office (2014). Guidelines for Measuring Disproportionate Impact in Equity Plans. <https://www.cccco.edu/-/media/CCCCO-Website/Files/DII/guidelines-for-measuring-disproportionate-impact-in-equity-plans-tfa-ada.pdf>.
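The three measures named above can be sketched in Python as follows. This is an illustration under stated assumptions, not the package's interface: the function name di_metrics and the dictionary layout are hypothetical, the percentage point gap is taken against the overall rate, and the 80% index uses the highest-performing group as its reference. The cited guidelines discuss other reference choices and margin-of-error adjustments that are omitted here.

```python
def di_metrics(successes, cohort):
    """Return per-group disproportionate impact measures.

    successes, cohort: dicts mapping group name -> counts.
    Assumptions (hypothetical, for illustration only):
      - percentage point gap = group rate minus overall rate, in points
      - proportionality index = share of successes / share of cohort
      - 80% index = group rate / highest group rate
    """
    total_success = sum(successes.values())
    total_cohort = sum(cohort.values())
    overall_rate = total_success / total_cohort
    rates = {g: successes[g] / cohort[g] for g in cohort}
    reference_rate = max(rates.values())

    metrics = {}
    for g in cohort:
        metrics[g] = {
            "ppg": 100.0 * (rates[g] - overall_rate),
            "proportionality": (successes[g] / total_success)
            / (cohort[g] / total_cohort),
            "eighty_pct": rates[g] / reference_rate,
        }
    return metrics


if __name__ == "__main__":
    result = di_metrics({"A": 80, "B": 40}, {"A": 100, "B": 100})
    for group, values in result.items():
        print(group, values)
```

In this toy input, group B has a 40% success rate against a 60% overall rate, so its gap is negative, its proportionality index is below 1, and its 80% index falls below the 0.8 threshold that gives the measure its name.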
Analysis and visualization of plant disease progress curve data. Functions for fitting two-parameter population dynamics models (exponential, monomolecular, logistic and Gompertz) to proportion data for single or multiple epidemics using either linear or non-linear regression. Statistical and visual outputs are provided to aid in model selection. Synthetic curves can be simulated for any of the models given the parameters. See Laurence V. Madden, Gareth Hughes, and Frank van den Bosch (2007) <doi:10.1094/9780890545058> for further information on the methods.
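To make the linear-regression route concrete, here is a minimal Python sketch for the logistic model only. It assumes the two parameters are the initial proportion y0 and the rate r, linearizes the curve as logit(y) = logit(y0) + r * t, and fits a straight line by ordinary least squares. The function names are hypothetical and this is not the package's implementation.

```python
import math


def logistic_curve(t, y0, r):
    """Logistic disease progress: proportion diseased at time t."""
    return 1.0 / (1.0 + ((1.0 - y0) / y0) * math.exp(-r * t))


def fit_logistic_linearized(times, proportions):
    """Fit y0 and r by least squares on the logit scale.

    Requires every proportion to lie strictly between 0 and 1.
    """
    z = [math.log(y / (1.0 - y)) for y in proportions]
    n = len(times)
    t_mean = sum(times) / n
    z_mean = sum(z) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxz = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(times, z))
    r = sxz / sxx                          # slope = rate parameter
    intercept = z_mean - r * t_mean        # intercept = logit(y0)
    y0 = 1.0 / (1.0 + math.exp(-intercept))
    return y0, r


if __name__ == "__main__":
    # Simulate a noise-free synthetic epidemic, then recover its parameters.
    times = list(range(0, 50, 5))
    observed = [logistic_curve(t, y0=0.01, r=0.3) for t in times]
    print(fit_logistic_linearized(times, observed))
```

The other three models can be treated the same way with their own linearizing transformations; fitting on the transformed scale weights the observations differently from non-linear regression on the original scale.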
Comparisons of floating point numbers are problematic due to errors associated with the binary representation of decimal numbers. Despite being aware of these problems, people still use numerical methods that fail to account for these and other rounding errors (this pitfall is the first to be highlighted in Circle 1 of Burns (2012) The R Inferno <https://www.burns-stat.com/pages/Tutor/R_inferno.pdf>). This package provides new relational operators useful for performing floating point number comparisons with a set tolerance.
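The pitfall and the tolerance-based remedy can be sketched in Python. The helper names approx_eq, approx_lt and approx_le and the default tolerance are hypothetical choices for this illustration; they are not the operators the package defines.

```python
def approx_eq(a, b, tol=1e-8):
    """True when a and b differ by no more than tol."""
    return abs(a - b) <= tol


def approx_lt(a, b, tol=1e-8):
    """True when a is smaller than b by more than tol."""
    return a < b - tol


def approx_le(a, b, tol=1e-8):
    """True when a is smaller than b or within tol of it."""
    return a <= b + tol


if __name__ == "__main__":
    total = 0.1 + 0.2
    print(total == 0.3)            # exact comparison is misled by rounding
    print(approx_eq(total, 0.3))   # tolerance-based comparison
    print(approx_lt(0.3, total))   # not "less than" once tolerance is applied
```

An absolute tolerance like this suits values of moderate magnitude; for very large or very small numbers a relative tolerance, as in Python's math.isclose, is usually the safer design.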
Genome-wide association study (GWAS) performed with SLOPE, short for Sorted L-One Penalized Estimation, a method for estimating the vector of coefficients in a linear model. In the first step of GWAS, single nucleotide polymorphisms (SNPs) are clumped according to their correlations and distances. Then, SLOPE is performed on the data where each clump has one representative. Malgorzata Bogdan, Ewout van den Berg, Chiara Sabatti, Weijie Su and Emmanuel Candes (2014) "SLOPE - Adaptive Variable Selection via Convex Optimization" <arXiv:1407.3824>.
This package provides classes and functions to calculate various distance measures and routes in heterogeneous geographic spaces represented as grids. The package implements measures to model dispersal histories first presented by van Etten and Hijmans (2010) <doi:10.1371/journal.pone.0012060>. Least-cost distances as well as more complex distances based on (constrained) random walks can be calculated. The distances implemented in the package are used in geographical genetics, accessibility indicators, and may also have applications in other fields of geospatial analysis.
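A least-cost distance on a grid can be illustrated with a short Python sketch based on Dijkstra's algorithm. The assumptions here are not taken from the package: movement is restricted to the four neighbouring cells, the cost of a step is the mean of the two cell costs, and the function name least_cost_distance is hypothetical.

```python
import heapq


def least_cost_distance(cost, start, goal):
    """Accumulated cost of the cheapest 4-neighbour path on a grid.

    cost: list of rows of non-negative cell costs.
    start, goal: (row, col) tuples.
    Step cost is assumed to be the mean of the two cells involved.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return dist
        if dist > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + (cost[r][c] + cost[nr][nc]) / 2.0
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    return float("inf")  # goal not reachable


if __name__ == "__main__":
    grid = [
        [1, 9, 1],
        [1, 9, 1],
        [1, 1, 1],
    ]
    # The cheap route detours around the expensive middle column.
    print(least_cost_distance(grid, (0, 0), (0, 2)))
```

Random-walk based distances weigh all possible routes rather than only the cheapest one, so they need a different computation than this single-path search.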
High level functions for hyperplane fitting (hyper.fit()) and visualising (hyper.plot2d() / hyper.plot3d()). In simple terms this allows the user to produce robust 1D linear fits for 2D x vs y type data, and robust 2D plane fits to 3D x vs y vs z type data. This hyperplane fitting works generically for any (N-1)-dimensional hyperplane model being fit to an N-dimensional dataset. All fits include intrinsic scatter in the generative model orthogonal to the hyperplane.
In order to improve performance for HTTP API clients, httpcache provides simple tools for caching and invalidating cache. It includes the HTTP verb functions GET, PUT, PATCH, POST, and DELETE, which are drop-in replacements for those in the httr package. These functions are cache-aware and provide default settings for cache invalidation suitable for RESTful APIs; the package also enables custom cache-management strategies. Finally, httpcache includes a basic logging framework to facilitate the measurement of HTTP request time and cache performance.
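The idea of cache-aware verbs can be sketched in Python without any network access. Everything in this sketch is an assumption made for illustration: the class name CachingClient, the injected send callable, and the invalidation rule (a write drops every cached URL that begins with the written URL) are hypothetical and do not describe httpcache's actual defaults.

```python
class CachingClient:
    """Toy HTTP client wrapper with a GET cache and write invalidation."""

    def __init__(self, send):
        # send(method, url) performs the real request; injected so the
        # sketch runs without a network.
        self._send = send
        self._cache = {}

    def get(self, url):
        # Serve repeated GETs from the cache.
        if url not in self._cache:
            self._cache[url] = self._send("GET", url)
        return self._cache[url]

    def _write(self, method, url):
        response = self._send(method, url)
        # Assumed rule: drop the resource and anything cached beneath it.
        for cached_url in [u for u in self._cache if u.startswith(url)]:
            del self._cache[cached_url]
        return response

    def put(self, url):
        return self._write("PUT", url)

    def patch(self, url):
        return self._write("PATCH", url)

    def post(self, url):
        return self._write("POST", url)

    def delete(self, url):
        return self._write("DELETE", url)


if __name__ == "__main__":
    log = []

    def fake_send(method, url):
        log.append((method, url))
        return f"{method} {url}"

    client = CachingClient(fake_send)
    client.get("/items/1")
    client.get("/items/1")      # served from cache, no second request
    client.patch("/items/1")    # invalidates the cached entry
    client.get("/items/1")      # fetched again
    print(log)
```

Recording each call in a log, as the usage example does, is also a simple way to measure how many requests the cache avoided.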
It provides a generic set of tools for initializing a synthetic population with each individual in a specific disease state, and for making transitions between those disease states according to the rates calculated on each timestep. The new version 1.0.0 integrates C++ code to make the functions run faster. It also has a higher-level function that runs the transitions for the number of timesteps that users specify. Additional functions will follow for changing demographic, health belief and movement attributes.
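The per-timestep transition idea can be sketched in Python as follows. This is a simplified illustration, not the package's API: the function names step and run are hypothetical, the state labels are arbitrary, and the transition probabilities are held fixed here rather than recalculated on each timestep.

```python
import random


def step(states, transitions, rng):
    """Advance every individual by one timestep.

    states: list of current state labels, one per individual.
    transitions: dict mapping state -> {target_state: probability};
                 probabilities out of a state must sum to at most 1.
    """
    new_states = []
    for state in states:
        u = rng.random()
        cumulative = 0.0
        next_state = state  # stay put unless a transition fires
        for target, prob in transitions.get(state, {}).items():
            cumulative += prob
            if u < cumulative:
                next_state = target
                break
        new_states.append(next_state)
    return new_states


def run(states, transitions, n_steps, seed=0):
    """Higher-level helper: apply step() for n_steps timesteps."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        states = step(states, transitions, rng)
    return states


if __name__ == "__main__":
    population = ["S"] * 95 + ["I"] * 5
    rules = {"S": {"I": 0.1}, "I": {"R": 0.2}}
    final = run(population, rules, n_steps=30, seed=42)
    print({s: final.count(s) for s in ("S", "I", "R")})
```

Each individual makes at most one transition per timestep, based on the state held at the start of that step.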
This package provides a joint mixture model, developed by Majumdar et al. (2025) <doi:10.48550/arXiv.2412.17511>, that integrates information from gene expression data and methylation data at the modelling stage to capture their inherent dependency structure, enabling simultaneous identification of differentially methylated cytosine-guanine dinucleotide (CpG) sites and differentially expressed genes. The model leverages a joint likelihood function that accounts for the nested structure in the data, with parameter estimation performed using an expectation-maximisation algorithm.