Fits multiple variable mixtures of various parametric proportional hazards models using the EM algorithm. Proportionality restrictions can be imposed on the latent groups and/or on the variables. Several survival distributions can be specified. Missing values and censored values are allowed. The individual variables are assumed to be independent.
Box-constrained multiobjective optimization using the elitist non-dominated sorting genetic algorithm - NSGA-II. Fast non-dominated sorting, crowding distance, tournament selection, simulated binary crossover, and polynomial mutation are called in the main program. The methods are described in Deb et al. (2002) <doi:10.1109/4235.996017>.
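As an illustration of two of the building blocks named above, the following is a minimal Python sketch of fast non-dominated sorting and crowding distance in the spirit of Deb et al. (2002). It assumes all objectives are minimized; the function names (dominates, fast_non_dominated_sort, crowding_distance) are hypothetical and are not the package's API.

```python
from math import inf

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(points):
    """Split points into Pareto fronts; returns lists of indices, best front first."""
    n = len(points)
    dominated = [[] for _ in range(n)]  # dominated[i]: indices that point i dominates
    counts = [0] * n                    # counts[i]: how many points dominate point i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated[i].append(j)
            elif dominates(points[j], points[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:      # every dominator of j is already ranked
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]                  # drop the trailing empty front

def crowding_distance(points, front):
    """Crowding distance for each index in one front; boundary points get inf."""
    dist = {i: 0.0 for i in front}
    for k in range(len(points[front[0]])):
        order = sorted(front, key=lambda i: points[i][k])
        lo, hi = points[order[0]][k], points[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = inf
        if hi == lo:
            continue                    # objective is constant on this front
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (points[nxt][k] - points[prev][k]) / (hi - lo)
    return dist

# Five illustrative points with two objectives each (made-up data).
points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = fast_non_dominated_sort(points)
distances = crowding_distance(points, fronts[0])
```

In NSGA-II, tournament selection would then prefer the lower front rank and, within a front, the larger crowding distance; that step, crossover, and mutation are omitted from this sketch.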
Statistical analysis methods for environmental data are implemented. There is a particular focus on robust methods, and on methods for compositional data. In addition, larger data sets from geochemistry are provided. The statistical methods are described in Reimann, Filzmoser, Garrett, Dutter (2008, ISBN:978-0-470-98581-6).
Analysis of metacommunities based on functional traits and phylogeny of the community components. The functions that are offered here implement for the R environment methods that have been available in the SYNCSA application written in C++ (by Valerio Pillar, available at <http://ecoqua.ecologia.ufrgs.br/SYNCSA.html>).
This package provides a framework for data stream modeling and associated data mining tasks such as clustering and classification. The development of this package was supported in part by NSF IIS-0948893, NSF CMMI 1728612, and NIH R21HG005912. See Hahsler et al. (2017) <doi:10.18637/jss.v076.i14>.
This package provides a collection of simple parameter estimation and tests for the comparison of multivariate means and variation, to accompany Chapters 4 and 5 of the book Multivariate Statistical Methods. A Primer (5th edition), by Manly BFJ, Navarro Alberto JA & Gerow K (2024) <doi:10.1201/9781003453482>.
Perform biomarker evaluation and comparison in terms of specificity at a controlled sensitivity level, or sensitivity at a controlled specificity level. Point estimation and exact bootstrap of Huang, Parakati, Patil, and Sanda (2023) <doi:10.5705/ss.202021.0020> for the one- and two-biomarker problems are implemented.
The Brazilian system for diploma registration and validation for technical and higher-education courses is managed through the Sistec platform, see <https://sistec.mec.gov.br/>. This package provides tools for Brazilian institutions to update students' registrations and to analyse data on their status, retention and dropout.
This package provides drop-in replacements for common R functions (mean(), sum(), sd(), min(), etc.) that default to na.rm = TRUE and issue warnings when missing values are removed. It handles some special cases. The table() default is set to useNA = 'ifany'.
Helper functions to easily add functionality to functions. The package can make functions use lazy evaluation, allowing you to save and update the arguments before and after each function call. You can set a temporary working directory within functions and wrap console messages around other functions.
Mass cytometry enables the simultaneous measurement of dozens of protein markers at the single-cell level, producing high dimensional datasets that provide deep insights into cellular heterogeneity and function. However, these datasets often contain unwanted covariance introduced by technical variations, such as differences in cell size, staining efficiency, and instrument-specific artifacts, which can obscure biological signals and complicate downstream analysis. This package addresses this challenge by implementing a robust framework of linear models designed to identify and remove these sources of unwanted covariance. By systematically modeling and correcting for technical noise, the package enhances the quality and interpretability of mass cytometry data, enabling researchers to focus on biologically relevant signals.
The kappa statistic proposed by Fleiss is a very popular index for assessing the reliability of agreement among multiple observers. It is used in both the psychological and the psychiatric fields. Other typical fields of application are medicine, biology and engineering. Unfortunately, the kappa statistic may behave inconsistently in case of strong agreement between raters, since the index takes lower values than would be expected. We propose a modification of the Fleiss kappa for nominal and ordinal variables. Monte Carlo simulations are used both to test statistical hypotheses and to calculate percentile bootstrap confidence intervals based on the proposed statistic for nominal and ordinal data.
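For reference, the standard (unmodified) Fleiss kappa can be sketched in Python as follows. This is a minimal illustration only: it does not implement the modification proposed by the package, and the function name fleiss_kappa and its input layout are assumptions made for this example.

```python
def fleiss_kappa(counts):
    """Standard Fleiss kappa.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject is assumed to be rated by the same number of raters (>= 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    total = n_subjects * n_raters

    # Overall proportion of assignments falling in each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per subject: share of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance

    if p_e == 1.0:
        # All ratings fall in a single category: kappa is undefined (0/0).
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)

# Two subjects, two raters, two categories (made-up data).
perfect = fleiss_kappa([[2, 0], [0, 2]])   # raters always agree
split = fleiss_kappa([[1, 1], [1, 1]])     # raters always disagree
```

The guard for p_e == 1 marks the degenerate case in which every rater picks the same category for every subject, so chance agreement is already complete and the ratio is undefined.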
This package includes positive ionization mode data in NetCDF file format. The data are a centroided subset from 200-600 m/z and 2500-4500 seconds. Data originally reported in "Assignment of Endogenous Substrates to Enzymes by Global Metabolite Profiling" Biochemistry; 2004; 43(45). It also includes detected peaks in an xcmsSet.
This package provides tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity and sex, and advanced quality control routines.
This package provides primitives for visualizing distributions using ggplot2 that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized.
This package provides classes and methods for spatial objects that have a registered time column, in particular for irregular spatiotemporal data. The time column can be of any type, but needs to be ordinal. Regularly laid out spatiotemporal data (vector or raster data cubes) are handled by package 'stars'.
GNU Recutils is a set of tools and libraries for creating and manipulating text-based, human-editable databases. Despite being text-based, databases created with Recutils carry all of the expected features such as unique fields, primary keys, time stamps and more. Many different field types are supported, as is encryption.
This package is a user-friendly interface based on the cgdsr and other modeling packages to explore, compare, and analyse all available Cancer Data (Clinical data, Gene Mutation, Gene Methylation, Gene Expression, Protein Phosphorylation, Copy Number Alteration) hosted by the Computational Biology Center at Memorial-Sloan-Kettering Cancer Center (MSKCC).
This package provides functions for estimating the attributable burden of disease due to risk factors. The posterior simulation is performed using arm::sim as described in Gelman, Hill (2012) <doi:10.1017/CBO9780511790942> and the attributable burden method is based on Nielsen, Krause, Molbak <doi:10.1111/irv.12564>.
Obtain network structures from animal GPS telemetry observations and statistically analyse them to assess their adequacy for social network analysis. Methods include pre-network data permutations, bootstrapping techniques to obtain confidence intervals for global and node-level network metrics, and correlation and regression analysis of the local network metrics.
Implementation of two-dimensional (2D) correlation analysis based on the Fourier-transformation approach described by Isao Noda (I. Noda (1993) <DOI:10.1366/0003702934067694>). Additionally, there are two plot functions for the resulting correlation matrix: the first one creates colored 2D plots, while the second one generates 3D plots.
This package provides equations commonly used in clinical pharmacokinetics and clinical pharmacology, such as equations for dose individualization, compartmental pharmacokinetics, drug exposure, anthropometric calculations, clinical chemistry, and conversion of common clinical parameters. Where possible and relevant, it provides multiple published and peer-reviewed equations within the respective R function.
Feed longitudinal data into a Bayesian Latent Factor Model to obtain a low-rank representation. Parameters are estimated using a Hamiltonian Monte Carlo algorithm with STAN. See G. Weinrott, B. Fontez, N. Hilgert and S. Holmes, "Bayesian Latent Factor Model for Functional Data Analysis", Actes des JdS 2016.
Estimates RxC (R by C) vote transfer matrices (ecological contingency tables) from aggregate data by simultaneously minimizing Euclidean row-standardized unit-to-global distances. Acknowledgements: The authors wish to thank Generalitat Valenciana, Consellería de Educación, Cultura, Universidades y Empleo (grant CIAICO/2023/031) for supporting this research.