An implementation of a taxonomy of models of restricted diffusion in biological tissues parametrized by the tissue geometry (axis, diameter, density, etc.). This is primarily used in the context of diffusion magnetic resonance (MR) imaging to model the MR signal attenuation in the presence of diffusion gradients. The goal is to provide tools to simulate the MR signal attenuation predicted by these models under different experimental conditions. The package feeds a companion shiny app available at <https://midi-pastrami.apps.math.cnrs.fr> that serves as a graphical interface to the models and tools provided by the package. Models currently available are the ones in Neuman (1974) <doi:10.1063/1.1680931>, Van Gelderen et al. (1994) <doi:10.1006/jmrb.1994.1038>, Stanisz et al. (1997) <doi:10.1002/mrm.1910370115>, Soderman & Jonsson (1995) <doi:10.1006/jmra.1995.0014> and Callaghan (1995) <doi:10.1006/jmra.1995.1055>.
Count data is prevalent and informative, with widespread applications in fields such as social psychology, personality, and public health. Classical statistical methods for the analysis of count outcomes are commonly variants of the log-linear model, including Poisson regression and negative binomial regression. However, a typical problem in count data modeling is inflation: the counts visibly accumulate on certain integers. Such inflation can distort the distribution of the observed counts, bias estimation, and increase error, making the classical methods unsuitable. Traditional methods that select inflated values by histogram inspection can easily miss true inflation points and are computationally expensive. Therefore, we propose a multiple-inflated negative binomial model to handle count data with multiple inflated values, achieving data-driven selection of inflated values. The proposed approach also simultaneously identifies the important regression predictors of the target count response. More details about the proposed method are described in Li, Y., Wu, M., Wu, M., & Ma, S. (2023) <arXiv:2309.15585>.
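As an illustration of what "multiple inflated values" means, here is a minimal Python sketch of a probability mass function that adds extra point masses to a negative binomial. It assumes the common mixture form P(Y = y) = w_y + (1 - sum of w) * NB(y); the function and parameter names (`nb_pmf`, `minb_pmf`, `inflated`, `size`, `prob`) are hypothetical and are not the package's interface, and the regression and selection steps of the proposed method are not shown.

```python
import math

def nb_pmf(y, size, prob):
    """Negative binomial pmf: probability of y failures before `size` successes."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(prob) + y * math.log(1.0 - prob))
    return math.exp(log_p)

def minb_pmf(y, inflated, size, prob):
    """Mixture pmf; `inflated` maps each inflated value to its extra point mass."""
    extra = sum(inflated.values())
    if not 0.0 <= extra < 1.0:
        raise ValueError("inflation weights must sum to a value in [0, 1)")
    # Extra mass sits on the inflated integers; the rest follows the NB law.
    return inflated.get(y, 0.0) + (1.0 - extra) * nb_pmf(y, size, prob)
```

For example, `minb_pmf(5, {0: 0.2, 5: 0.1}, size=3.0, prob=0.4)` gives the probability of observing 5 when both 0 and 5 are inflated.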
Multivariate Information-based Inductive Causation, better known by its acronym MIIC, is a causal discovery method, based on information theory principles, which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables. Starting from a complete graph, the method iteratively removes dispensable edges, by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The recent, more interpretable MIIC extension (iMIIC) further distinguishes genuine causes from putative and latent causal effects, while scaling to very large datasets (hundreds of thousands of samples). Since version 2.0, MIIC also includes a temporal mode (tMIIC) to learn temporal causal graphs from stationary time series data. MIIC has been applied to a wide range of biological and biomedical data, such as single cell gene expression data, genomic alterations in tumors, live-cell time-lapse imaging data (CausalXtract), as well as medical records of patients. MIIC brings unique insights based on causal interpretation and could be used in a broad range of other data science domains (technology, climatology, economy, ...). For more information, you can refer to: Simon et al., eLife 2024, <doi:10.1101/2024.02.06.579177>, Ribeiro-Dantas et al., iScience 2024, <doi:10.1016/j.isci.2024.109736>, Cabeli et al., NeurIPS 2021, <https://why21.causalai.net/papers/WHY21_24.pdf>, Cabeli et al., PLoS Comput. Biol. 2020, <doi:10.1371/journal.pcbi.1007866>, Li et al., NeurIPS 2019, <https://papers.nips.cc/paper/9573-constraint-based-causal-structure-learning-with-consistent-separating-sets>, Verny et al., PLoS Comput. Biol. 2017, <doi:10.1371/journal.pcbi.1005662>, Affeldt et al., UAI 2015, <https://auai.org/uai2015/proceedings/papers/293.pdf>.
Changes from the previous 1.5.3 release on CRAN are available at <https://github.com/miicTeam/miic_R_package/blob/master/NEWS.md>.
This package provides tools to analyze and visualize Illumina Infinium methylation arrays.
This package implements various algorithms for inferring mutual information networks from data.
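Mutual information network inference typically starts from pairwise mutual information between variables. The following Python sketch shows a plug-in estimate for two discrete variables; it is an illustration only, the name `mutual_information` is hypothetical, and it does not reproduce any of the package's algorithms or estimators.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi
```

Two identical binary variables with balanced classes give log(2) nats, while independent ones give 0; a network method would then decide which pairwise scores correspond to edges.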
Various functions for random number generation, density estimation, classification, curve fitting, and spatial data analysis.
This package is intended to help users to efficiently analyze genomic data resulting from various experiments.
This package provides a flexible computational framework for mixture distributions with the focus on the composite models.
This GUI for the mi package walks the user through the steps of multiple imputation and the analysis of completed data.
Generates multivariate imputations using sequential regression with L2 penalty. For more details see Zahid and Heumann (2018) <doi:10.1177/0962280218755574>.
This package provides derivative-free optimization by quadratic approximation through an interface to Fortran implementations by M. J. D. Powell.
This is a port of the type guesser from the readr package, the so-called readr first edition parsing engine, now superseded by vroom.
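To illustrate the general idea of type guessing, the Python sketch below tries progressively less specific types on a column of strings and returns the first one that fits every non-missing value. The rules, the candidate order, the missing-value markers, and the name `guess_type` are assumptions made for this example; they are not the rules used by readr.

```python
import re

_LOGICAL = {"TRUE", "FALSE", "T", "F", "true", "false"}
_INTEGER = re.compile(r"[+-]?\d+")
_DOUBLE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")

def guess_type(values, na=("", "NA")):
    """Return 'logical', 'integer', 'double', or 'character' for a string column."""
    present = [v for v in values if v not in na]
    if not present:
        # Nothing to inspect; this sketch falls back to the most general type.
        return "character"
    checks = [
        ("logical", lambda s: s in _LOGICAL),
        ("integer", lambda s: _INTEGER.fullmatch(s) is not None),
        ("double", lambda s: _DOUBLE.fullmatch(s) is not None),
    ]
    for name, matches in checks:
        if all(matches(s) for s in present):
            return name
    return "character"
```

Ordering the checks from most to least specific means a column such as `["1", "2.5"]` is reported as double rather than integer.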
Implementation of methods for minimizing ill-conditioned problems. Currently only includes regularized (quasi-)Newton optimization (Kanzow and Steck (2023), <doi:10.1007/s12532-023-00238-4>).
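The idea behind regularized Newton methods can be sketched as follows: instead of solving H d = -g, solve (H + mu I) d = -g, and enlarge mu whenever the step fails to decrease the objective. The Python example below does this in two dimensions with an exact Hessian. The update rule for mu, the acceptance test, and the name `regularized_newton_2d` are simplifying assumptions for illustration, not the algorithm of the cited paper or of the package.

```python
import math

def regularized_newton_2d(f, grad, hess, x, tol=1e-10, max_iter=200):
    """Minimize f over R^2 with steps that solve (H + mu*I) d = -g."""
    mu = 1.0
    for _ in range(max_iter):
        g = grad(x)
        if math.hypot(g[0], g[1]) < tol:
            break
        (a, b), (c, d) = hess(x)
        accepted = False
        while mu <= 1e12 and not accepted:
            a_r, d_r = a + mu, d + mu
            det = a_r * d_r - b * c
            if det > 0 and a_r > 0:  # regularized matrix is positive definite
                # Cramer's rule for the 2x2 system (H + mu*I) d = -g
                step = ((-g[0] * d_r + b * g[1]) / det,
                        (c * g[0] - a_r * g[1]) / det)
                trial = (x[0] + step[0], x[1] + step[1])
                accepted = f(trial) < f(x)
            if not accepted:
                mu *= 4.0  # more regularization gives a shorter, safer step
        if not accepted:
            break  # no descent step found; return the current point
        x = trial
        mu = max(mu / 4.0, 1e-12)  # relax regularization after a success
    return x
```

On an ill-conditioned quadratic such as f(x) = 0.5 * (x0^2 + 1e6 * x1^2), the regularization shrinks after each successful step, so the iteration approaches a plain Newton step near the minimizer.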
This package provides a set of classes and methods to set up and run multi-species, trait based and community size spectrum ecological models, focused on the marine environment.
Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using a negative binomial generalized linear model.
This package contains functions for converting existing HTML/JavaScript source into equivalent shiny functions. Bootstraps the process of making new shiny functions by allowing us to turn HTML snippets directly into R functions.
Imputes missing values of an incomplete data matrix by minimizing the Mahalanobis distance of each sample from the overall mean [Labita, GJ.D. and Tubo, B.F. (2024) <doi:10.24412/1932-2321-2024-278-115-123>].
This package provides tools for multiple imputation of missing data in multilevel modeling. It includes a user-friendly interface to the packages pan and jomo, and several functions for visualization, data management and the analysis of multiply imputed data sets.
It offers random-forest-based functions to impute clustered incomplete data. The package is tailored for but not limited to imputing multitissue expression data, in which a gene's expression is measured on the collected tissues of an individual but missing on the uncollected tissues.
Model time series using mixture autoregressive (MAR) models. Implemented are frequentist (EM) and Bayesian methods for estimation, prediction and model evaluation. See Wong and Li (2002) <doi:10.1111/1467-9868.00222>, Boshnakov (2009) <doi:10.1016/j.spl.2009.04.009>, and the extensive references in the documentation.
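In a mixture autoregressive model, each observation is drawn from one of several autoregressive components, chosen at random with fixed mixing weights. The Python sketch below simulates such a series with Gaussian noise. The function name `simulate_mar`, its arguments, and the zero initial values are assumptions made for this illustration and do not reflect the package's interface.

```python
import random

def simulate_mar(n, weights, intercepts, ar_coefs, sigmas, seed=None):
    """Simulate n values from a Gaussian mixture autoregressive model.

    ar_coefs[k] lists the lag-1, lag-2, ... coefficients of component k.
    """
    rng = random.Random(seed)
    max_lag = max(len(coefs) for coefs in ar_coefs)
    y = [0.0] * max_lag  # assumed starting values, dropped from the result
    for _ in range(n):
        # Pick the component that generates this observation.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        mean = intercepts[k] + sum(
            coef * y[-lag] for lag, coef in enumerate(ar_coefs[k], start=1)
        )
        y.append(rng.gauss(mean, sigmas[k]))
    return y[max_lag:]
```

For example, `simulate_mar(500, [0.7, 0.3], [0.0, 1.0], [[0.5], [-0.3, 0.2]], [1.0, 2.0], seed=1)` mixes an AR(1) component with an AR(2) component.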
An implementation of the iterative proportional fitting procedure (IPFP), maximum likelihood, minimum chi-square and weighted least squares procedures for updating an N-dimensional array with respect to given target marginal distributions (which, in turn, can be multidimensional). The package also provides an application of the IPFP to simulate multivariate Bernoulli distributions.
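For the two-dimensional case, iterative proportional fitting alternately rescales rows and columns of a seed table until both sets of margins match their targets. The Python sketch below illustrates this under the assumption of a seed with positive entries and targets that share the same total; the name `ipfp_2d` and its arguments are hypothetical and are not the package's interface.

```python
def ipfp_2d(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale a 2-D table so its row and column sums match the targets."""
    table = [list(row) for row in seed]
    for _ in range(max_iter):
        # Row step: scale each row to its target sum.
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                table[i] = [v * row_targets[i] / total for v in row]
        # Column step: scale each column to its target sum.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns are exact after the column step, so only rows need checking.
        error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if error < tol:
            break
    return table
```

Because each step only multiplies whole rows or whole columns, the fitted table keeps the odds ratios of the seed while matching the new margins.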
Classify missing data as missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). This step is required before handling missing data (e.g. mean imputation) so that bias is not introduced. See Little (1988) <doi:10.1080/01621459.1988.10478722> for the statistical rationale for the methods used.
The main functions perform mixed models analysis by least squares or REML by adding the function r() to formulas of lm() and glm(). A collection of textbook statistics for higher education is also included, e.g. modifications of the functions lm(), glm() and associated summaries from the package stats.
Multiple imputation using XGBoost, subsampling, and predictive mean matching as described in Deng and Lumley (2023) <doi:10.1080/10618600.2023.2252501>. The package supports various types of variables, offers flexible settings, and enables saving an imputation model to impute new data. Data processing and memory usage have been optimised to speed up the imputation process.
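Predictive mean matching replaces each missing value with an observed value borrowed from a "donor" whose model prediction is close to the prediction for the missing case. The Python sketch below shows only that matching step and takes the predictions as given; the name `pmm_impute`, its arguments, and the choice of k nearest donors are assumptions for illustration, not the package's implementation.

```python
import random

def pmm_impute(pred_obs, y_obs, pred_mis, k=5, seed=None):
    """Impute by sampling an observed value from the k closest donors.

    pred_obs: model predictions for cases whose outcome is observed
    y_obs:    the observed outcomes for those same cases
    pred_mis: model predictions for cases whose outcome is missing
    """
    rng = random.Random(seed)
    imputed = []
    for p in pred_mis:
        # Rank donors by how close their prediction is to this case's prediction.
        donors = sorted(range(len(y_obs)), key=lambda i: abs(pred_obs[i] - p))[:k]
        imputed.append(y_obs[rng.choice(donors)])
    return imputed
```

Because imputed values are always drawn from observed outcomes, they stay within the range and type of the real data.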