Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017 <doi:10.1016/j.envint.2017.06.004>) and propagating parameter uncertainty (Wambaugh et al., 2019 <doi:10.1093/toxsci/kfz205>). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017 <doi:10.1007/s10928-017-9548-7>). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015 <doi:10.1093/toxsci/kfv171>).
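As a brief illustration, the generic PBTK model can be solved for a chemical in the packaged data and human variability simulated with the Monte Carlo sampler; a minimal sketch (the chemical name is only an example of one present in the in vitro data, and defaults may vary by version):

    library(httk)
    # Solve the generic PBTK model for a single oral dose of a stored chemical
    out <- solve_pbtk(chem.name = "Bisphenol A", days = 10)
    head(out)
    # Monte Carlo estimate of the 95th-percentile steady-state plasma
    # concentration, reflecting human biological variability
    css <- calc_mc_css(chem.name = "Bisphenol A", which.quantile = 0.95)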
Multivariate Information-based Inductive Causation, better known by its acronym MIIC, is a causal discovery method, based on information theory principles, which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables. Starting from a complete graph, the method iteratively removes dispensable edges, by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The recent, more interpretable MIIC extension (iMIIC) further distinguishes genuine causes from putative and latent causal effects, while scaling to very large datasets (hundreds of thousands of samples). Since version 2.0, MIIC also includes a temporal mode (tMIIC) to learn temporal causal graphs from stationary time series data. MIIC has been applied to a wide range of biological and biomedical data, such as single-cell gene expression data, genomic alterations in tumors, live-cell time-lapse imaging data (CausalXtract), as well as medical records of patients. MIIC brings unique insights based on causal interpretation and could be used in a broad range of other data science domains (technology, climatology, economics, ...). For more information, you can refer to: Simon et al., eLife 2024, <doi:10.1101/2024.02.06.579177>, Ribeiro-Dantas et al., iScience 2024, <doi:10.1016/j.isci.2024.109736>, Cabeli et al., NeurIPS 2021, <https://why21.causalai.net/papers/WHY21_24.pdf>, Cabeli et al., PLoS Comput. Biol. 2020, <doi:10.1371/journal.pcbi.1007866>, Li et al., NeurIPS 2019, <https://papers.nips.cc/paper/9573-constraint-based-causal-structure-learning-with-consistent-separating-sets>, Verny et al., PLoS Comput. Biol. 2017, <doi:10.1371/journal.pcbi.1005662>, Affeldt et al., UAI 2015, <https://auai.org/uai2015/proceedings/papers/293.pdf>. Changes from the previous 1.5.3 release on CRAN are available at <https://github.com/miicTeam/miic_R_package/blob/master/NEWS.md>.
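A minimal sketch of a typical call (the example dataset and plotting method follow the package documentation, but are assumptions here):

    library(miic)
    # Binarized single-cell hematopoiesis data shipped with the package
    data(hematoData)
    # Learn the graphical model from the observational data
    res <- miic(input_data = hematoData)
    # Visualize the reconstructed network via igraph
    plot(res, method = "igraph")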
In many cases, experiments must be repeated across multiple seasons or locations to ensure applicability of findings. A single experiment conducted in one location and season may yield limited conclusions, as results can vary under different environmental conditions. In agricultural research, treatment × location and treatment × season interactions play a crucial role. Analyzing a series of experiments across diverse conditions allows for more generalized and reliable recommendations. The CANE package facilitates the pooled analysis of experiments conducted over multiple years, seasons, or locations. It is designed to assess treatment interactions with environmental factors (such as location and season) using various experimental designs. The package supports pooled analysis of variance (ANOVA) for the following designs: (1) 'PooledCRD()': completely randomized design; (2) 'PooledRBD()': randomized block design; (3) 'PooledLSD()': Latin square design; (4) 'PooledSPD()': split plot design; and (5) 'PooledStPD()': strip plot design. Each function provides the following outputs: (i) individual ANOVA tables based on independent analysis for each location or year; (ii) testing of homogeneity of error variances among distinct locations using Bartlett's Chi-Square test; (iii) if Bartlett's test is significant, Aitken's transformation, defined as the ratio of the response to the square root of the error mean square, is applied to the response variable; otherwise, the data is used as is; (iv) combined analysis to obtain a pooled ANOVA table; (v) multiple comparison tests, including Tukey's honestly significant difference (Tukey's HSD) test, Duncan's multiple range test (DMRT), and the least significant difference (LSD) test, for treatment comparisons. The statistical theory and steps of analysis of these designs are available in Dean et al. (2017) <doi:10.1007/978-3-319-52250-0> and Ruíz et al. (2024) <doi:10.1007/978-3-031-65575-3>. By broadening the scope of experimental conclusions, CANE enables researchers to derive robust, widely applicable recommendations. This package is particularly valuable in agricultural research, where accounting for treatment × location and treatment × season interactions is essential for ensuring the validity of findings across multiple settings.
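As an illustration of a pooled analysis call, a hedged sketch for a randomized block design run at several locations (only the function name is documented above; the argument names are assumptions, not the package's documented interface):

    library(CANE)
    # Hypothetical long-format data: one row per plot, with columns for
    # location, replication (block), treatment, and the measured response
    res <- PooledRBD(data = dat, response = "Yield", treatment = "Treatment",
                     block = "Replication", location = "Location")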
Understanding the current status of forest resources is essential for monitoring changes in forest ecosystems and generating related statistics. In South Korea, the National Forest Inventory (NFI) surveys over 4,500 sample plots nationwide every five years and records 70 items, including forest stand, forest resource, and forest vegetation surveys. Many researchers use NFI as the primary data for research, such as biomass estimation or analyzing the importance value of each species over time and space, depending on the research purpose. However, the large volume of accumulated forest survey data from across the country can make it challenging to manage and utilize such a vast dataset. To address this issue, we developed an R package that efficiently handles large-scale NFI data across time and space. The package offers a comprehensive workflow for NFI data analysis. It starts with data processing, where read_nfi() function reconstructs NFI data according to the researcher's needs while performing basic integrity checks for data quality.Following this, the package provides analytical tools that operate on the verified data. These include functions like summary_nfi() for summary statistics, diversity_nfi() for biodiversity analysis, iv_nfi() for calculating species importance value, and biomass_nfi() and cwd_biomass_nfi() for biomass estimation. Finally, for visualization, the tsvis_nfi() function generates graphs and maps, allowing users to visualize forest ecosystem changes across various spatial and temporal scales. This integrated approach and its specialized functions can enhance the efficiency of processing and analyzing NFI data, providing researchers with insights into forest ecosystems. The NFI Excel files (.xlsx) are not included in the R package and must be downloaded separately. Users can access these NFI Excel files by visiting the Korea Forest Service Forestry Statistics Platform <https://kfss.forest.go.kr/stat/ptl/article/articleList.do?curMenu=11694&bbsId=microdataboard> to download the annual NFI Excel files, which are bundled in .zip archives. Please note that this website is only available in Korean, and direct download links can be found in the notes section of the read_nfi() function.
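A sketch of the workflow described above (the function names are documented here, but the package name and the arguments shown are assumptions):

    library(knfi)  # package name assumed from context
    # Reconstruct the downloaded NFI Excel files; the path is a placeholder
    nfi <- read_nfi("path/to/NFI_excel_files")
    summary_nfi(nfi)    # plot-level summary statistics
    diversity_nfi(nfi)  # biodiversity indices
    iv_nfi(nfi)         # species importance values
    biomass_nfi(nfi)    # biomass estimation
    tsvis_nfi(nfi)      # time-series and map visualization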
EQ-5D is a popular health-related quality of life instrument used in the clinical and economic evaluation of health care. Developed by the EuroQol group <https://euroqol.org/>, the instrument consists of two components: health state description and evaluation. For the description component, a subject self-rates their health in terms of five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression, using either a three-level (EQ-5D-3L, <https://euroqol.org/information-and-support/euroqol-instruments/eq-5d-3l/>) or a five-level (EQ-5D-5L, <https://euroqol.org/information-and-support/euroqol-instruments/eq-5d-5l/>) scale. Frequently, the scores on these five dimensions are converted to a single utility index using country-specific value sets, which can be used in the clinical and economic evaluation of health care as well as in population health surveys. The eq5d package provides methods to calculate index scores from a subject's dimension scores. Included are 32 TTO and 11 VAS EQ-5D-3L value sets, among them those for the countries in Szende et al. (2007) <doi:10.1007/1-4020-5511-0> and Szende et al. (2014) <doi:10.1007/978-94-007-7596-1>; 48 EQ-5D-5L EQ-VT value sets; the EQ-5D-5L crosswalk value sets developed by van Hout et al. (2012) <doi:10.1016/j.jval.2012.02.008>; the crosswalk value sets for Bermuda, Jordan and Russia; and the van Hout (2021) reverse crosswalk value sets. 11 EQ-5D-Y3L value sets are also included, as are the NICE DSU age-sex based EQ-5D-3L to EQ-5D-5L and EQ-5D-5L to EQ-5D-3L mappings. Methods are also included for the analysis of EQ-5D profiles, including those from the book "Methods for Analysing and Reporting EQ-5D Data" by Devlin et al. (2020) <doi:10.1007/978-3-030-47622-9>. Additionally, a Shiny web tool is included to enable the calculation, visualisation and automated statistical analysis of EQ-5D data via a web browser, using EQ-5D dimension scores stored in CSV or Excel files.
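For example, the utility index for a single EQ-5D-3L health state can be computed with the eq5d() function; a minimal sketch using the UK TTO value set:

    library(eq5d)
    # Dimension scores: mobility, self-care, usual activities,
    # pain/discomfort, anxiety/depression
    scores <- c(MO = 1, SC = 2, UA = 3, PD = 2, AD = 1)
    eq5d(scores, country = "UK", version = "3L", type = "TTO")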
This package provides tools for estimating length-based indicators from length frequency data to assess fish stock status and manage fisheries sustainably. Implements methods from Cope and Punt (2009) <doi:10.1577/C08-025.1> for data-limited stock assessment and Froese (2004) <doi:10.1111/j.1467-2979.2004.00144.x> for detecting overfishing using simple indicators. Key functions include: FrequencyTable(): Calculates a frequency table from collected length data and also extracts the length frequency data from the table using the upper limit of each length range. The bin width for class intervals can be supplied as a numeric value; if not provided, it is calculated automatically using the Wang (2020) <doi:10.1016/j.fishres.2019.105474> formula. FreqTM(): Creates a frequency distribution table for fish length data across multiple months using a consistent length class structure. The bin width is determined by either a custom value or Wang's formula, applied uniformly across all months. The function dynamically detects and renames columns to Month and Length from the input dataframe. The maximum observed length is included in the last class, with the upper bound set to the smallest multiple of the bin width greater than or equal to the maximum length. Months can be converted to dates using a configurable day and year, with dates assigned sequentially in day.month.year format (e.g., 15.01.26). FishPar(): Calculates length-based indicators (LBIs) proposed by Froese (2004) <doi:10.1111/j.1467-2979.2004.00144.x>, such as the percentage of mature fish (Pmat), the percentage of optimal length fish (Popt), the percentage of mega spawners (Pmega), and their sum (Pobj). This function also estimates confidence intervals for different lengths, visualizes length frequency distributions, and provides data frames containing the calculated values. FishSS(): Makes decisions based on the Cope and Punt (2009) <doi:10.1577/C08-025.1> framework and the parameters calculated by FishPar() (e.g., Pobj, Pmat, Popt, LM_ratio) to determine stock status relative to target spawning biomass (TSB40) and limit spawning biomass (LSB25), as well as selectivity. LWR(): Fits and visualizes length-weight relationships using linear regression, with options for log-transformation and customizable plotting.
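A hedged sketch of the sequence implied above (only the function names are documented here; the arguments and returned components are assumptions):

    # Hypothetical numeric vector of fish lengths (cm)
    ft <- FrequencyTable(data = fish_lengths)  # bin width defaults to Wang (2020)
    # Froese (2004) indicators: Pmat, Popt, Pmega and their sum, Pobj
    pars <- FishPar(data = ft)
    # Stock status via the Cope and Punt (2009) decision tree
    FishSS(data = pars)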
Enables: (1) plotting two-dimensional confidence regions, (2) coverage analysis of confidence region simulations, (3) calculating confidence intervals and the associated actual coverage for binomial proportions, (4) calculating the support values and the probability mass function of the Kaplan-Meier product-limit estimator, and (5) plotting the actual coverage function associated with a confidence interval for the survivor function from a randomly right-censored data set. Each is given in greater detail next. (1) Plots the two-dimensional confidence region for probability distribution parameters (supported distribution suffixes: cauchy, gamma, invgauss, logis, llogis, lnorm, norm, unif, weibull) corresponding to a user-given complete or right-censored dataset and level of significance. The crplot() algorithm plots more points in areas of greater curvature to ensure a smooth appearance throughout the confidence region boundary. An alternative heuristic plots a specified number of points at roughly uniform intervals along its boundary. Both heuristics build upon the radial profile log-likelihood ratio technique for plotting confidence regions given by Jaeger (2016) <doi:10.1080/00031305.2016.1182946>, and are detailed in a publication by Weld et al. (2019) <doi:10.1080/00031305.2018.1564696>. (2) Performs confidence region coverage simulations for a random sample drawn from a user-specified parametric population distribution, or for a user-specified dataset and point of interest, with coversim(). (3) Calculates confidence interval bounds for a binomial proportion with binomTest(), calculates the actual coverage with binomTestCoverage(), and plots the actual coverage with binomTestCoveragePlot(). Calculates confidence interval bounds for the binomial proportion using an ensemble of constituent confidence intervals with binomTestEnsemble(). Calculates confidence interval bounds for the binomial proportion using a complete enumeration of all possible transitions from one actual coverage acceptance curve to another, which minimizes the root mean square error for n <= 15 and follows the transitions of well-known confidence intervals for n > 15, using binomTestMSE(). (4) The km.support() function calculates the support values of the Kaplan-Meier product-limit estimator for a given sample size n using an induction algorithm described in Qin et al. (2023) <doi:10.1080/00031305.2022.2070279>. The km.outcomes() function generates a matrix containing all possible outcomes (all possible sequences of failure times and right-censoring times) of the value of the Kaplan-Meier product-limit estimator for a particular sample size n. The km.pmf() function generates the probability mass function for the support values of the Kaplan-Meier product-limit estimator for a particular sample size n, a probability h of observing a failure, and a time of interest expressed as the cumulative probability percentile associated with X = min(T, C), where T is the failure time and C is the censoring time under a random-censoring scheme. The km.surv() function generates multiple probability mass functions of the Kaplan-Meier product-limit estimator for the same arguments as those given for km.pmf(). (5) The km.coverage() function plots the actual coverage function associated with a confidence interval for the survivor function from a randomly right-censored data set for one or more of the following confidence intervals: Greenwood, log-minus-log, Peto, arcsine, and exponential Greenwood.
The actual coverage function is plotted for a small number of items on test, stated coverage, failure rate, and censoring rate. The km.coverage() function can print an optional table containing all possible failure/censoring orderings, along with their contribution to the actual coverage function.
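For instance, a confidence region plot and a binomial actual-coverage calculation might look as follows (a minimal sketch; the ballbearing dataset and the binomTestCoverage() arguments follow the package examples but are assumptions here):

    library(conf)
    # 95% confidence region for Weibull parameters fitted to the
    # ball bearing failure times shipped with the package
    crplot(dataset = ballbearing, alpha = 0.05, distn = "weibull")
    # Actual coverage of a binomial confidence interval at n = 20, p = 0.3
    binomTestCoverage(n = 20, p = 0.3)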
An ODBC database interface.
Queries data from RDAP servers.
Color palettes from famous artists and paintings.
Communications simulation package supporting forward error correction.
The Rmisc package contains functions for data analysis and utility operations.
This package provides a common framework for calculating distance matrices.
Client for the Ocean Biodiversity Information System (<https://obis.org>).
This package provides recursive partitioning functions for classification, regression and survival trees.
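For example, a classification tree can be grown on the built-in iris data:

    library(rpart)
    # Grow a classification tree predicting species from the four measurements
    fit <- rpart(Species ~ ., data = iris, method = "class")
    printcp(fit)  # complexity-parameter table used for pruning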
This package provides functions for performing spatial microsimulation ('raking') in R.
Constrained clustering, transfer functions, and other methods for analysing Quaternary science data.
This package provides functions to convert R objects into JSON objects and vice-versa.
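A minimal round-trip sketch using the conventional toJSON()/fromJSON() pair exported by the common JSON packages (shown here with jsonlite):

    library(jsonlite)
    # Serialize an R list to a JSON string, then parse it back
    json <- toJSON(list(name = "iris", n = 150), auto_unbox = TRUE)
    fromJSON(json)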
Interactive viewing and exploration of graphs, connecting R to Cytoscape.js, using websockets.
Create production-ready Rich Text Format (RTF) tables and figures with flexible format.
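Assuming this refers to the r2rtf pipeline style, a minimal sketch writing a data frame to an .rtf file:

    library(r2rtf)
    head(iris) |>
      rtf_body() |>          # define the table body layout
      rtf_encode() |>        # encode to RTF syntax
      write_rtf("iris.rtf")  # write the RTF file to disk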
This package provides functions used in the book 'R: Einführung durch angewandte Statistik' (second edition).
This package provides functions to read flat or tabular text files from disk (or a connection).