Statistical distributions in an object-oriented programming (OOP) style. This package provides an R6 class interface to classic statistical distributions, and new distributions can easily be added through the class AbstractDist. Each class has a generic fit() method, which uses maximum likelihood estimation to find the distribution parameters for a dataset; see, e.g., Hastie, T. et al. (2009) <isbn:978-0-387-84857-0>. Furthermore, the rv_histogram class gives a non-parametric fit, with the same accessors as the classic distributions. Finally, three random generators useful for building synthetic data are provided: a multivariate normal generator, an orthogonal matrix generator, and a symmetric positive definite matrix generator; see Mezzadri, F. (2007) <arXiv:math-ph/0609050>.
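The idea of a shared class interface with a per-distribution fit() method can be sketched as follows. This is a minimal illustration in Python, not the package's R6 API: the names Distribution, Normal, logpdf and loglik are hypothetical, and only the normal distribution is shown because its maximum likelihood estimates have a closed form.

```python
import math
from abc import ABC, abstractmethod


class Distribution(ABC):
    """Hypothetical base class: each subclass supplies logpdf() and fit()."""

    @abstractmethod
    def logpdf(self, x):
        """Log-density of a single observation."""

    @classmethod
    @abstractmethod
    def fit(cls, data):
        """Return an instance whose parameters maximize the likelihood of data."""

    def loglik(self, data):
        # Log-likelihood of an i.i.d. sample under the current parameters.
        return sum(self.logpdf(x) for x in data)


class Normal(Distribution):
    def __init__(self, mean, sd):
        self.mean = mean
        self.sd = sd

    def logpdf(self, x):
        z = (x - self.mean) / self.sd
        return -0.5 * z * z - math.log(self.sd) - 0.5 * math.log(2 * math.pi)

    @classmethod
    def fit(cls, data):
        # Closed-form maximum likelihood estimates for the normal distribution:
        # the sample mean and the variance with divisor n (not n - 1).
        n = len(data)
        mean = sum(data) / n
        var = sum((x - mean) ** 2 for x in data) / n
        return cls(mean, math.sqrt(var))


sample = [1.0, 2.0, 3.0, 4.0]
fitted = Normal.fit(sample)
```

Distributions without closed-form estimates would need a numerical optimizer inside fit(); that part is left out of this sketch.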
This is an R package designed for quality control (QC), analysis, and exploration of single-cell RNA-seq data. It enables widely used analytical techniques, including the identification of highly variable genes; dimensionality reduction (PCA, ICA, t-SNE); standard unsupervised clustering algorithms (density clustering, hierarchical clustering, k-means); and the discovery of differentially expressed genes and markers.
Simplifies working with packages by installing and loading a list of packages, whether they are on CRAN, Bioconductor or GitHub. For GitHub packages, if you do not supply the full path including the maintainer name (e.g. "achateigner/topReviGO"), the package can be loaded but not installed.
Estimates hidden Markov models from the family of Cholesky-decomposed Gaussian hidden Markov models (CDGHMM) under various missingness schemes. This family improves upon estimation of traditional Gaussian HMMs by introducing parsimony, as well as by controlling for dropped-out observations and non-random missingness. See Neal, Sochaniwsky and McNicholas (2024) <DOI:10.1007/s11222-024-10462-0>.
This package implements Firth's penalized maximum likelihood bias reduction method for Cox regression, which has been shown to provide a solution in the case of monotone likelihood (nonconvergence of the likelihood function); see Heinze and Schemper (2001) and Heinze and Dunkler (2008). The program computes profile penalized likelihood confidence intervals, which were proved to outperform Wald confidence intervals.
This package provides a function for fast computation of the connected components of an undirected graph (though not faster than the components() function of the igraph package) from the edges or the adjacency matrix of the graph. Building on this function, a function to compute the connected components of a triangle rgl mesh is also provided.
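Computing connected components from an edge list, and then reusing that routine for a triangle mesh, can be sketched as below. This is an illustrative Python version using a union-find structure; it is an assumption about one reasonable approach, not the package's actual algorithm, and the function names connected_components and mesh_components are hypothetical.

```python
def connected_components(n_vertices, edges):
    """Group vertices 0..n_vertices-1 into components given undirected edges."""
    parent = list(range(n_vertices))

    def find(v):
        # Follow parent links to the root, shortening the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for v in range(n_vertices):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


def mesh_components(n_vertices, triangles):
    """Components of a triangle mesh given as triples of vertex indices."""
    edges = []
    for a, b, c in triangles:
        # Two edges per triangle are enough to connect its three vertices.
        edges.append((a, b))
        edges.append((b, c))
    return connected_components(n_vertices, edges)


graph_parts = connected_components(6, [(0, 1), (1, 2), (3, 4)])
mesh_parts = mesh_components(6, [(0, 1, 2), (3, 4, 5)])
```

Here two triangles that share no vertex form two separate mesh components, while triangles sharing at least one vertex are merged into one.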
Implements a composite likelihood approach for estimating statistical models for spatial ordinal and proportional data, based on Feng et al. (2014) <doi:10.1002/env.2306>. Parameter estimates are obtained by maximizing composite log-likelihood functions using the limited-memory BFGS optimization algorithm with bound constraints, while standard errors are obtained by estimating the Godambe information matrix.
This package provides a system for combining two diagnostic tests using various approaches that include statistical and machine-learning-based methodologies. These approaches are divided into four groups: linear combination methods, non-linear combination methods, mathematical operators, and machine learning algorithms. See the <https://biotools.erciyes.edu.tr/dtComb/> website for more information, documentation, and examples.
An implementation of Dcifer (Distance for complex infections: fast estimation of relatedness), an identity by descent (IBD) based method to calculate genetic relatedness between polyclonal infections from biallelic and multiallelic data. The package includes functions that format and preprocess the data, implement the method, and visualize the results. Gerlovina et al. (2022) <doi:10.1093/genetics/iyac126>.
This package provides an interface to the FORCIS database (Chaabane et al. (2024) <doi:10.5281/zenodo.7390791>) on global foraminifera distribution. It allows users to download and handle FORCIS data. It is part of the FRB-CESAB working group FORCIS <https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/forcis/>.
Current status data abound in epidemiology and public health, where the only observable data for a subject are the random inspection time and the event status at inspection. Motivated by such current status data from a periodontal study, where data are inherently clustered, we propose a unified methodology to analyze such complex data.
This package implements a new multiple imputation method that draws imputations from a latent joint multivariate normal model which underpins generally structured data. This model is constructed using a sequence of flexible conditional linear models that enables the resulting procedure to be efficiently implemented on high dimensional datasets in practice. See Robbins (2021) <arXiv:2008.02243>.
This package provides tools for sparse regression modelling with grouped predictors using the group subset selection penalty. Uses coordinate descent and local search algorithms to rapidly deliver near optimal estimates. The group subset penalty can be combined with a group lasso or ridge penalty for added shrinkage. Linear and logistic regression are supported, as are overlapping groups.
This package implements the Hierarchical Incremental GRAdient Descent (HiGrad) algorithm, a first-order algorithm for finding the minimizer of a function in online learning, just like stochastic gradient descent (SGD). In addition, this method attaches a confidence interval to assess the uncertainty of its predictions. See Su and Zhu (2018) <arXiv:1802.04876> for details.
The "Manual on Low-flow Estimation and Prediction" (Gustard & Demuth (2009, ISBN:978-92-63-11029-9)), published by the World Meteorological Organisation, gives a comprehensive summary on how to analyse stream flow data focusing on low-flows. This package provides functions to compute the described statistics and to produce plots similar to the ones in the manual.
Import xyz data from the NOAA (National Oceanic and Atmospheric Administration, <https://www.noaa.gov>), GEBCO (General Bathymetric Chart of the Oceans, <https://www.gebco.net>) and other sources, plot xyz data to prepare publication-ready figures, analyze xyz data to extract transects, get depth / altitude based on geographical coordinates, or calculate z-constrained least-cost paths.
Toolset that enriches mlr with a diverse set of preprocessing operators. Composable Preprocessing Operators ("CPO"s) are first-class R objects that can be applied to data.frames and mlr "Task"s to modify data, can be attached to mlr "Learner"s to add preprocessing to machine learning algorithms, and can be composed to form preprocessing pipelines.
Implements surrogate-assisted feature extraction (SAFE) and common machine learning approaches to train and validate phenotyping models. Background and details about the methods can be found in Zhang et al. (2019) <doi:10.1038/s41596-019-0227-6>, Yu et al. (2017) <doi:10.1093/jamia/ocw135>, and Liao et al. (2015) <doi:10.1136/bmj.h1885>.
Quantile regression (QR) for Nonlinear Mixed-Effects Models via the asymmetric Laplace distribution (ALD). It uses the Stochastic Approximation of the EM (SAEM) algorithm for deriving exact maximum likelihood estimates and full inference results for the fixed effects and variance components. It also provides prediction and graphical summaries for assessing the algorithm convergence and fitting results.
This package implements moving-blocks bootstrap and extended tapered-blocks bootstrap, as well as smooth versions of each, for quantile regression in time series. This package accompanies the paper: Gregory, K. B., Lahiri, S. N., & Nordman, D. J. (2018). A smooth block bootstrap for quantile regression with time series. The Annals of Statistics, 46(3), 1138-1166.
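The resampling step of a moving-blocks bootstrap can be sketched as follows. This is a simplified Python illustration under stated assumptions: it draws overlapping blocks of a fixed length uniformly at random and concatenates them, and it does not cover the tapered or smooth variants or the quantile regression fit itself. The function name moving_blocks_resample and its parameters are hypothetical.

```python
import math
import random


def moving_blocks_resample(series, block_length, rng):
    """Draw one bootstrap series by concatenating randomly chosen blocks."""
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    n_blocks = math.ceil(n / block_length)
    resampled = []
    for _ in range(n_blocks):
        # Every start position that leaves room for a full block is eligible.
        start = rng.randrange(n - block_length + 1)
        resampled.extend(series[start:start + block_length])
    # Trim so the bootstrap series has the same length as the original.
    return resampled[:n]


rng = random.Random(0)
series = list(range(20))
resample = moving_blocks_resample(series, 5, rng)
```

Keeping observations together in blocks preserves short-range dependence within each block, which is the reason for resampling blocks rather than individual observations in time series.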
This package provides tools for the simulation of data in the context of small area estimation. Combine all steps of your simulation, from data generation through drawing samples to model fitting, in one object. This enables easy modification and combination of different scenarios. You can store your results in a folder or run the simulation in parallel.
This package provides a comprehensive resource for data on Taylor Swift songs. Data are included for all officially released studio albums, extended plays (EPs), and individual singles. Data come from Genius (lyrics) and Spotify (song characteristics). Additional functions are included for easily creating data visualizations with color palettes inspired by Taylor Swift's album covers.
Greedy optimal subset selection for transformation models (Hothorn et al., 2018, <doi:10.1111/sjos.12291>) based on the abess algorithm (Zhu et al., 2020, <doi:10.1073/pnas.2014241117>). Applicable to models from packages tram and cotram. Application to shift-scale transformation models is described in Siegfried et al. (2024, <doi:10.1080/00031305.2023.2203177>).
Topological data analysis studies the structure and shape of data using topological features. We provide a variety of algorithms for learning with persistent homology of the data, based on functional summaries, for clustering, hypothesis testing, visualization, and other tasks. We refer to Wasserman (2018) <doi:10.1146/annurev-statistics-031017-100045> for a statistical perspective on the topic.