Random simulation of fuzzy numbers is still a challenging problem. The aim of this package is to provide procedures to simulate fuzzy random variables, especially in the case of piecewise linear fuzzy numbers (PLFNs, see Coroianu et al. (2013) <doi:10.1016/j.fss.2013.02.005> for further details). Additionally, special resampling algorithms known as the epistemic bootstrap are provided (see Grzegorzewski and Romaniuk (2022) <doi:10.34768/amcs-2022-0021>, Grzegorzewski and Romaniuk (2022) <doi:10.1007/978-3-031-08974-9_39>, Romaniuk et al. (2024) <doi:10.32614/RJ-2024-016>) together with functions to apply statistical tests and estimate various characteristics based on the epistemic bootstrap. The package also includes real-life datasets of epistemic fuzzy triangular and trapezoidal numbers. The fuzzy numbers used in this package are consistent with the FuzzyNumbers package.
This package contains functions for the classification and ranking of top candidate features, reconstruction of networks from adjacency matrices and data frames, analysis of the topology of the network and calculation of centrality measures, and identification of the most influential nodes. A function is also provided for running the SIRIR model, which combines the leave-one-out cross-validation technique with the conventional SIR model, on a network to rank the true influence of vertices in an unsupervised manner. Additionally, some functions are provided for assessing the dependence and correlation of two network centrality measures, as well as the conditional probability of deviation from their corresponding means in opposite directions. Fred Viole and David Nawrocki (2013, ISBN:1490523995). Csardi G, Nepusz T (2006). "The igraph software package for complex network research." InterJournal, Complex Systems, 1695. Adopted algorithms and sources are referenced in the function documentation.
Seq2pathway is a novel tool for functional gene-set (also termed pathway) analysis of next-generation sequencing data, consisting of "seq2gene" and "gene2pathway" components. The seq2gene component links sequence-level measurements of genomic regions (including SNPs or point mutation coordinates) to gene-level scores, and the gene2pathway component summarizes gene scores into pathway scores for each sample. The seq2gene component can assign both coding and non-exon regions to a broader range of neighboring genes than only the nearest one, thus facilitating the study of functional non-coding regions. The gene2pathway component takes into account the degree of significance of gene members within a pathway compared with those outside the pathway. The output of seq2pathway is a general structure of quantitative pathway-level scores, thus allowing one to functionally interpret datasets such as RNA-seq, ChIP-seq, GWAS, and those derived from other next-generation sequencing experiments.
These Rcpp-based functions compute the efficient score statistics for grouped time-to-event data (Prentice and Gloeckler, 1978), with the optional inclusion of baseline covariates. Functions for estimating the parameter of interest and nuisance parameters, including baseline hazards, using maximum likelihood are also provided. A parallel set of functions allows for the incorporation of family structure of related individuals (e.g., trios). Note that the current implementation of the frailty model (Ripatti and Palmgren, 2000) is sensitive to departures from model assumptions, and should be considered experimental. For these data, the exact proportional-hazards-model-based likelihood is computed by evaluating a multivariable integral. The integration is accomplished using the Cuba library (Hahn, 2005), and the source files are included in this package. The maximization process is carried out using Brent's algorithm, with the C++ code file from John Burkardt and John Denker (Brent, 2002).
Fast, optimal, and reproducible clustering algorithms for circular, periodic, or framed data. The algorithms introduced here are based on a core algorithm for optimal framed clustering the authors have developed (Debnath & Song 2021) <doi:10.1109/TCBB.2021.3077573>. The runtime of these algorithms is O(K N log^2 N), where K is the number of clusters and N is the number of circular data points. On a desktop computer using a single processor core, millions of data points can be grouped into a few clusters within seconds. One can apply the algorithms to characterize events along circular DNA molecules, circular RNA molecules, and circular genomes of bacteria, chloroplast, and mitochondria. One can also cluster climate data along any given longitude or latitude. Periodic data clustering can be formulated as circular clustering. The algorithms offer a general high-performance solution to circular, periodic, or framed data clustering.
Computes the extended spring indices (SI-x) and false spring exposure indices (FSEI). The SI-x indices are standard indices used for analysis in spring phenology studies. The FSEI comes from research on the climatology of false springs and is adjusted to include an early and a late false spring exposure index. The indices include the first leaf index, first bloom index, and false spring exposure indices, and the package provides all functions needed to calculate each index. The main function returns all indices, but each function can also be run separately. Allstadt et al. (2015) <doi:10.1088/1748-9326/10/10/104008>, Ault et al. (2015) <doi:10.1016/j.cageo.2015.06.015>, Peterson and Abatzoglou (2014) <doi:10.1002/2014GL059266>, Schwarz et al. (2006) <doi:10.1111/j.1365-2486.2005.01097.x>, Schwarz et al. (2013) <doi:10.1002/joc.3625>.
The C++ header files of the Stan project are provided by this package. There is a shared object containing part of the CVODES library, but it is not accessible from R. r-stanheaders is only useful for developers who want to utilize the LinkingTo directive of their package's DESCRIPTION file to build on the Stan library without incurring unnecessary dependencies.
The Stan project develops a probabilistic programming language that implements full or approximate Bayesian statistical inference via Markov Chain Monte Carlo or variational methods and implements (optionally penalized) maximum likelihood estimation via optimization. The Stan library includes an advanced automatic differentiation scheme, templated statistical and linear algebra functions that can handle the automatically differentiable scalar types (and doubles, ints, etc.), and a parser for the Stan language. The r-rstan package provides user-facing R functions to parse, compile, test, estimate, and analyze Stan models.
Detect feedback loops (cycles, circuits) between species (nodes) in ordinary differential equation (ODE) models. Feedback loops are paths from a node to itself without visiting any other node twice, and they have important regulatory functions. Loops are reported with their order of participating nodes and their length, and whether the loop is a positive or a negative feedback loop. An upper limit of the number of feedback loops limits runtime (which scales with feedback loop count). Model parametrizations and values of the modelled variables are accounted for. Computation uses the characteristics of the Jacobian matrix as described e.g. in Thomas and Kaufman (2002) <doi:10.1016/s1631-0691(02)01452-x>. Input can be the Jacobian matrix of the ODE model or the ODE function definition; in the latter case, the Jacobian matrix is determined using numDeriv. Graph-based algorithms from igraph are employed for path detection.
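The loop detection described above can be illustrated with a minimal Python sketch. It is not the package's implementation (which relies on igraph); the function name `find_feedback_loops` and the convention that `jacobian[i][j] != 0` means "species j influences species i" are assumptions made for this example. The sign of a loop is taken as the product of the signs of the Jacobian entries along it.

```python
from typing import List, Tuple

Loop = Tuple[List[int], int, int]  # (nodes in order, length, sign)


def find_feedback_loops(jacobian: List[List[float]], max_loops: int = 1000) -> List[Loop]:
    """Enumerate simple cycles in the interaction graph of a Jacobian matrix.

    Assumed convention: jacobian[i][j] != 0 is an edge j -> i.
    The sign of a loop is +1 (positive feedback) or -1 (negative feedback).
    """
    n = len(jacobian)
    loops: List[Loop] = []

    def dfs(start: int, current: int, path: List[int], sign: int) -> None:
        # Only nodes with index >= start are visited, so each cycle is
        # reported exactly once, rooted at its smallest node.
        for nxt in range(start, n):
            entry = jacobian[nxt][current]  # edge current -> nxt
            if entry == 0:
                continue
            if len(loops) >= max_loops:  # upper limit bounds the runtime
                return
            new_sign = sign * (1 if entry > 0 else -1)
            if nxt == start:
                loops.append((path[:], len(path), new_sign))
            elif nxt not in path:
                path.append(nxt)
                dfs(start, nxt, path, new_sign)
                path.pop()

    for start in range(n):
        dfs(start, start, [start], 1)
    return loops


# Example: a negative self-loop on node 0 and a negative loop 0 -> 1 -> 2 -> 0.
J = [[-1, 0, -2],
     [1, 0, 0],
     [0, 3, 0]]
for nodes, length, sign in find_feedback_loops(J):
    print(nodes, length, "positive" if sign > 0 else "negative")
```

Because the Jacobian is evaluated at specific parameter and variable values, the same model can yield different loop signs at different points in state space.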
This package contains functions for the construction and visualization of the underlying and reflexivity graphs of three families of proximity catch digraphs (PCDs) (see Ceyhan (2005) ISBN:978-3-639-19063-2), and for computing the edge density of these PCD-based graphs, which is then used for testing the patterns of segregation and association against complete spatial randomness (CSR) or uniformity in one- and two-dimensional cases. The PCD families considered are Arc-Slice PCDs, Proportional-Edge (PE) PCDs (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>) and Central Similarity PCDs (Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). See also Ceyhan (2016) <doi:10.1016/j.stamet.2016.07.003> for the edge density of the underlying and reflexivity graphs of PE-PCDs. The package also has tools for visualization of PCD-based graphs for one-, two-, and three-dimensional data.
Org Ref is an Emacs library that provides rich support for citations, labels and cross-references in Org mode.
The basic idea of Org Ref is that it defines a convenient interface to insert citations from a reference database (e.g., from BibTeX files), and a set of functional Org links for citations, cross-references and labels that export properly to LaTeX, and that provide clickable functionality to the user. Org Ref interfaces with Helm BibTeX to facilitate citation entry, and it can also use RefTeX.
It also provides a fairly large number of utilities for finding bad citations, extracting BibTeX entries from citations in an Org file, and functions to create and modify BibTeX entries from a variety of sources, most notably from a DOI.
Org Ref is especially suitable for Org documents destined for LaTeX export and scientific publication. Org Ref is also useful for research documents and notes.
Org-Babel support for evaluating Rust code. Much of this is modeled after `ob-C'. Just like `ob-C', you can specify :flags headers when compiling with the "rust run" command. Unlike `ob-C', you can also specify :args, which can be a list of arguments to pass to the binary. If you quote the value passed into the list, it will use `ob-ref' to find the reference data. If you do not include a main function or a package name, `ob-rust' will provide it for you. This is a very limited implementation: currently only :results output is supported. Requirements: you must have rust and cargo installed, and both should be in your `exec-path'; rust-script is also required. `rust-mode' is also recommended for syntax highlighting and formatting. Not that this particularly needs it, it just assumes you have it.
This package provides easy access to essential climate change datasets for non-climate experts. Users can download the latest raw data from authoritative sources and view it via pre-defined ggplot2 charts. Datasets include atmospheric CO2, methane, emissions, instrumental and proxy temperature records, sea levels, Arctic/Antarctic sea ice, hurricanes, and paleoclimate data. Sources include: NOAA Mauna Loa Laboratory <https://gml.noaa.gov/ccgg/trends/data.html>, Global Carbon Project <https://www.globalcarbonproject.org/carbonbudget/>, NASA GISTEMP <https://data.giss.nasa.gov/gistemp/>, National Snow and Ice Data Center <https://nsidc.org/home>, CSIRO <https://research.csiro.au/slrwavescoast/sea-level/measurements-and-data/sea-level-data/>, NOAA Laboratory for Satellite Altimetry <https://www.star.nesdis.noaa.gov/socd/lsa/SeaLevelRise/>, HURDAT Atlantic Hurricane Database <https://www.aoml.noaa.gov/hrd/hurdat/Data_Storm.html>, and Vostok paleo carbon dioxide and temperature data <doi:10.3334/CDIAC/ATG.009>.
Tool for easy prior construction and visualization. It helps to formulate joint prior distributions for variance parameters in latent Gaussian models. The resulting prior is robust and can be created in an intuitive way. A graphical user interface (GUI) can be used to choose the joint prior, where the user can click through the model and select priors. An extensive guide is available in the GUI. The package allows for direct inference with the specified model and prior. Using a hierarchical variance decomposition, we formulate a joint variance prior that takes the whole model structure into account. In this way, existing knowledge can intuitively be incorporated at the level it applies to. Alternatively, one can use independent variance priors for each model component in the latent Gaussian model. Details can be found in the accompanying scientific paper: Hem, Fuglstad, Riebler (2024, Journal of Statistical Software, <doi:10.18637/jss.v110.i03>).
Simulate inventory policies with and without forecasting, and facilitate inventory analysis calculations such as stock levels and re-order points, as well as pricing and promotions calculations. The package includes calculations of inventory metrics, stock-out calculations and ABC analysis calculations. The package also includes revenue management techniques such as multi-product optimization and logit and polynomial model optimization. The functions are referenced from: 1- Harris, Ford W. (1913). "How many parts to make at once". Factory, The Magazine of Management. 2- Nahmias, S. Production and Operations Analysis. McGraw-Hill International Edition. 3- Silver, E.A., Pyke, D.F., Peterson, R. Inventory Management and Production Planning and Scheduling. 4- Ballou, R.H. Business Logistics Management. 5- MIT Micromasters Program. 6- Columbia University course for supply and demand analysis. 8- Price Elasticity of Demand, MATH 104, Mark Mac Lean (with assistance from Patrick Chan), 2011W. For further details or correspondence: <www.linkedin.com/in/haythamomar>, <www.rescaleanalytics.com>.
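To make the kind of calculation mentioned above concrete, here is a small Python sketch of two textbook formulas: the economic order quantity from Harris (1913) and a basic re-order point. The function names and parameters are hypothetical and are not the package's API; the re-order point formula (expected demand over the lead time plus safety stock) is the standard textbook form and is assumed here, not taken from the package.

```python
import math


def economic_order_quantity(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Classic EOQ: sqrt(2 * D * S / H).

    annual_demand (D): units demanded per year
    order_cost (S): fixed cost per order
    holding_cost (H): cost of holding one unit for one year
    """
    if min(annual_demand, order_cost, holding_cost) <= 0:
        raise ValueError("all inputs must be positive")
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost)


def reorder_point(daily_demand: float, lead_time_days: float, safety_stock: float = 0.0) -> float:
    """Stock level at which a new order is placed:
    expected demand during the lead time plus safety stock."""
    return daily_demand * lead_time_days + safety_stock


print(economic_order_quantity(1200, 100, 6))  # 200.0
print(reorder_point(20, 5, safety_stock=30))  # 130.0
```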
Perform inference in the secondary analysis setting with linked data potentially containing mismatch errors. Only the linked data file may be accessible and information about the record linkage process may be limited or unavailable. Implements the General Framework for Regression with Mismatched Data developed by Slawski et al. (2023) <doi:10.48550/arXiv.2306.00909>. The framework uses a mixture model for pairs of linked records whose two components reflect distributions conditional on match status, i.e., correct match or mismatch. Inference is based on composite likelihood and the Expectation-Maximization (EM) algorithm. The package currently supports Cox Proportional Hazards Regression (right-censored data only) and Generalized Linear Regression Models (Gaussian, Gamma, Poisson, and Logistic (binary models only)). Information about the underlying record linkage process can be incorporated into the method if available (e.g., assumed overall mismatch rate, safe matches, predictors of match status, or predicted probabilities of correct matches).
stJoincount facilitates the application of join count analysis to spatial transcriptomic data generated from the 10x Genomics Visium platform. This tool first converts a labeled spatial tissue map into a raster object, in which each spatial feature is represented by a pixel coded by label assignment. This process includes automatic calculation of optimal raster resolution and extent for the sample. A neighbors list is then created from the rasterized sample, in which adjacent and diagonal neighbors for each pixel are identified. After adding binary spatial weights to the neighbors list, a multi-categorical join count analysis is performed to tabulate "joins" between all possible combinations of label pairs. The function returns the observed join counts, the expected count under conditions of spatial randomness, and the variance calculated under non-free sampling. The z-score is then calculated as the difference between observed and expected counts, divided by the square root of the variance.
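The tabulation and z-score steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the stJoincount implementation: the raster is a plain list of lists, `None` marks empty pixels, and the function names are hypothetical. The expected count and the variance under non-free sampling are taken as inputs and are not derived here.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def count_joins(grid: List[List[Optional[str]]]) -> Dict[Tuple[str, str], int]:
    """Tabulate joins between label pairs on a raster.

    Neighbors are adjacent and diagonal pixels with binary weights,
    so every neighboring pair contributes exactly one join.
    """
    joins: Counter = Counter()
    # Four offsets visit each undirected neighbor pair exactly once:
    # right, down-left, down, down-right.
    offsets = [(0, 1), (1, -1), (1, 0), (1, 1)]
    for r, row in enumerate(grid):
        for c, a in enumerate(row):
            if a is None:
                continue
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < len(grid) and 0 <= cc < len(grid[rr]):
                    b = grid[rr][cc]
                    if b is not None:
                        joins[tuple(sorted((a, b)))] += 1
    return dict(joins)


def join_count_z(observed: float, expected: float, variance: float) -> float:
    """z-score: (observed - expected) / sqrt(variance)."""
    return (observed - expected) / math.sqrt(variance)


raster = [["A", "A"],
          ["B", "B"]]
print(count_joins(raster))      # {('A', 'A'): 1, ('A', 'B'): 4, ('B', 'B'): 1}
print(join_count_z(10, 6, 4))   # 2.0
```

A large positive z-score for a label pair suggests the two labels adjoin more often than expected under spatial randomness; a large negative one suggests they adjoin less often.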
Included are two main interfaces, bentcable.ar() and bentcable.dev.plot(), for fitting and diagnosing bent-cable regressions for autoregressive time-series data (Chiu and Lockhart 2010, <doi:10.1002/cjs.10070>) or independent data (time series or otherwise - Chiu, Lockhart and Routledge 2006, <doi:10.1198/016214505000001177>). Some components in the package can also be used as stand-alone functions. The bent cable (linear-quadratic-linear) generalizes the broken stick (linear-linear), which is also handled by this package. Version 0.2 corrected a glitch in the computation of confidence intervals for the CTP. References that were updated from Versions 0.2.1 and 0.2.2 appear in Version 0.2.3 and up. Version 0.3.0 improved the robustness of the error-message producing mechanism. Version 0.3.1 improved the NAMESPACE file of the package. It is the author's intention to distribute any future updates via GitHub.
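The linear-quadratic-linear shape can be written down directly. The Python sketch below evaluates a bent-cable mean function using the parametrization as recalled from Chiu, Lockhart and Routledge (2006): an incoming line, a quadratic bend of half-width gamma centred at tau, and an outgoing line. The function name and argument names are illustrative assumptions, not part of the package, and no fitting is performed.

```python
def bent_cable(t: float, b0: float, b1: float, b2: float, tau: float, gamma: float) -> float:
    """Bent-cable mean function f(t) = b0 + b1*t + b2*q(t).

    q(t) is 0 before the bend, quadratic on [tau - gamma, tau + gamma],
    and equal to (t - tau) after the bend. With gamma == 0 the bend
    vanishes and the curve reduces to the broken stick.
    """
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if t > tau + gamma:
        q = t - tau                              # outgoing linear phase
    elif t < tau - gamma or gamma == 0:
        q = 0.0                                  # incoming linear phase
    else:
        q = (t - tau + gamma) ** 2 / (4.0 * gamma)  # quadratic bend
    return b0 + b1 * t + b2 * q


# The incoming slope is b1 and the outgoing slope is b1 + b2.
for t in (0, 4, 5, 6, 7):
    print(t, bent_cable(t, b0=1, b1=2, b2=3, tau=5, gamma=1))
```

The quadratic piece is chosen so that both the function and its slope are continuous at tau - gamma and tau + gamma, which is what distinguishes the bent cable from the broken stick.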
Continuous glucose monitoring (CGM) systems provide real-time, dynamic glucose information by tracking interstitial glucose values throughout the day. Glycemic variability, also known as glucose variability, is an established risk factor for hypoglycemia (Kovatchev) and has been shown to be a risk factor in diabetes complications. Over 20 metrics of glycemic variability have been identified. Here, we provide functions to calculate glucose summary metrics and glucose variability metrics (as defined in clinical publications), and to visualize trends in CGM data. Cho P, Bent B, Wittmann A, et al. (2020) <https://diabetes.diabetesjournals.org/content/69/Supplement_1/73-LB.abstract> American Diabetes Association (2020) <https://professional.diabetes.org/diapro/glucose_calc> Kovatchev B (2019) <doi:10.1177/1932296819826111> Kovatchev BP (2017) <doi:10.1038/nrendo.2017.3> Tamborlane WV, Beck RW, Bode BW, et al. (2008) <doi:10.1056/NEJMoa0805017> Umpierrez GE, Kovatchev BP (2018) <doi:10.1016/j.amjms.2018.09.010>.
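As a rough illustration of what summary and variability metrics look like, the Python sketch below computes the mean, standard deviation, coefficient of variation and time in range for a series of glucose readings. The function name is hypothetical, and the default 70 to 180 mg/dL target range is an assumption for the example; the package's own definitions follow the cited clinical publications and may differ.

```python
import statistics
from typing import Dict, Sequence


def glucose_summary(values: Sequence[float], low: float = 70.0, high: float = 180.0) -> Dict[str, float]:
    """Basic CGM summary for readings in mg/dL.

    cv_percent is the sample standard deviation relative to the mean;
    time_in_range_percent is the share of readings within [low, high].
    """
    if len(values) < 2:
        raise ValueError("need at least two readings")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    in_range = sum(low <= v <= high for v in values)
    return {
        "mean": mean,
        "sd": sd,
        "cv_percent": 100.0 * sd / mean,
        "time_in_range_percent": 100.0 * in_range / len(values),
    }


print(glucose_summary([100, 120, 140, 200]))
```

Time in range computed this way assumes equally spaced readings; unevenly sampled data would need weighting by the time each reading covers.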
A large number of measurements generate count data. This is a statistical data type that only assumes non-negative integer values and is generated by counting. Typically, count data can be found in biomedical applications, such as the analysis of DNA double-strand breaks. The number of DNA double-strand breaks can be counted in individual cells using various bioanalytical methods. For diagnostic applications, it is relevant to record the distribution of the count data in order to determine their biomedical significance (Roediger, S. et al., 2018. Journal of Laboratory and Precision Medicine. <doi:10.21037/jlpm.2018.04.10>). The software offers functions for a comprehensive automated evaluation of distribution models of count data. In addition to programmatic interaction, a graphical user interface (web server) is included, which enables fast and interactive data-scientific analyses. The user is supported in selecting the most suitable count distribution for their own data set.
In tumor tissue, underlying genomic instability can lead to DNA copy number alterations, e.g., copy number gains or losses. Sporadic copy number alterations occur randomly throughout the genome, whereas recurrent alterations are observed in the same genomic region across multiple independent samples, perhaps because they provide a selective growth advantage. Here we use cyclic shift permutations to identify recurrent copy number alterations in a single cohort or recurrent copy number differences in two cohorts based on a common set of genomic markers. Additional functionality is provided to perform downstream analyses, including the creation of summary files and graphics. DiNAMIC.Duo builds upon the original DiNAMIC package of Walter et al. (2011) <doi:10.1093/bioinformatics/btq717> and leverages the theory developed in Walter et al. (2015) <doi:10.1093/biomet/asv046>. An article describing DiNAMIC.Duo by Walter et al. (2022) can be found at <doi:10.1093/bioinformatics/btac542>.
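The idea behind a cyclic shift permutation test can be sketched in a few lines of Python. Each sample's marker profile is rotated by an independent random offset, which preserves the within-sample structure while breaking the alignment across samples. The statistic used here (the maximum column sum) and all function names are assumptions made for illustration; the actual DiNAMIC.Duo statistics and procedure are described in the cited papers.

```python
import random
from typing import List, Tuple


def cyclic_shift(row: List[float], k: int) -> List[float]:
    """Rotate a marker profile left by k positions (wrapping around)."""
    k %= len(row)
    return row[k:] + row[:k]


def max_column_sum(matrix: List[List[float]]) -> float:
    """Largest per-marker sum across samples (rows = samples, columns = markers)."""
    return max(sum(col) for col in zip(*matrix))


def cyclic_shift_pvalue(matrix: List[List[float]], n_perm: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Permutation p-value for the observed maximum column sum."""
    rng = random.Random(seed)
    n_markers = len(matrix[0])
    observed = max_column_sum(matrix)
    exceed = 0
    for _ in range(n_perm):
        # Shift every sample by its own random offset.
        shifted = [cyclic_shift(row, rng.randrange(n_markers)) for row in matrix]
        if max_column_sum(shifted) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Five samples that all carry a gain at marker index 2 (a recurrent alteration).
cohort = [[1.0 if j == 2 else 0.0 for j in range(10)] for _ in range(5)]
observed, p = cyclic_shift_pvalue(cohort)
print(observed, p)
```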
This package provides a reproducible pipeline to conduct genome-wide association studies (GWAS) and extract single-nucleotide polymorphisms (SNPs) for a human trait or disease. Given aggregated GWAS dataset(s) and a user-defined significance threshold, the package retrieves significant SNPs from the GWAS Catalog and the Experimental Factor Ontology (EFO), annotates their gene context, and can write a harmonised metadata table in comma-separated values (CSV) format, genomic intervals in the Browser Extensible Data (BED) format, and sequences in the FASTA (text-based sequence) format with user-defined flanking regions for clustered regularly interspaced short palindromic repeats (CRISPR) guide design. For details on the resources and methods see: Buniello et al. (2019) <doi:10.1093/nar/gky1120>; Sollis et al. (2023) <doi:10.1093/nar/gkac1010>; Jinek et al. (2012) <doi:10.1126/science.1225829>; Malone et al. (2010) <doi:10.1093/bioinformatics/btq099>; Experimental Factor Ontology (EFO) <https://www.ebi.ac.uk/efo>.
Tests whether the linear hypothesis of a model is correctly specified using the Dominguez-Lobato test. Ramsey's RESET (Regression Equation Specification Error Test) is also implemented, and Wald tests can be carried out. Although the RESET test is widely used to test the linear hypothesis of a model, Dominguez and Lobato (2019) proposed a novel approach that generalizes well-known specification tests such as Ramsey's. This test relies on the wild bootstrap; this package implements this approach so that it can be used with any function that fits linear models and is compatible with the update() function, such as stats::lm(), lfe::felm() and forecast::Arima() for ARMA (autoregressive-moving-average) models. The package can also handle custom statistics such as Cramer von Mises and Kolmogorov Smirnov, described by the authors, and custom distributions such as Mammen (discrete and continuous) and Rademacher. Manuel A. Dominguez & Ignacio N. Lobato (2019) <doi:10.1080/07474938.2019.1687116>.
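The wild bootstrap mentioned above resamples by multiplying each residual by an independent random weight with mean zero and variance one. The Python sketch below shows the Rademacher weights and the discrete (two-point) Mammen weights, plus the generation of one bootstrap response vector. Function names are hypothetical, the continuous Mammen variant is not shown, and the test statistic itself is not reproduced here.

```python
import math
import random
from typing import Callable, List, Sequence

SQRT5 = math.sqrt(5.0)
# Two-point Mammen distribution: mean 0, variance 1, third moment 1.
MAMMEN_VALUES = (-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / 2.0)
MAMMEN_PROBS = ((SQRT5 + 1.0) / (2.0 * SQRT5), (SQRT5 - 1.0) / (2.0 * SQRT5))


def rademacher(rng: random.Random) -> float:
    """Draw -1 or +1 with equal probability."""
    return rng.choice((-1.0, 1.0))


def mammen(rng: random.Random) -> float:
    """Draw from the two-point Mammen distribution."""
    return MAMMEN_VALUES[0] if rng.random() < MAMMEN_PROBS[0] else MAMMEN_VALUES[1]


def wild_bootstrap_sample(
    fitted: Sequence[float],
    residuals: Sequence[float],
    draw: Callable[[random.Random], float],
    rng: random.Random,
) -> List[float]:
    """One bootstrap response vector: y*_i = fitted_i + residual_i * w_i."""
    return [f + e * draw(rng) for f, e in zip(fitted, residuals)]


rng = random.Random(42)
print(wild_bootstrap_sample([1.0, 2.0, 3.0], [0.5, -0.2, 0.1], rademacher, rng))
print(wild_bootstrap_sample([1.0, 2.0, 3.0], [0.5, -0.2, 0.1], mammen, rng))
```

Because each residual keeps its own magnitude and is only rescaled by its weight, the resampled data preserve heteroskedasticity in the original errors, which is the usual motivation for the wild bootstrap.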
This package provides methods for model selection, model averaging, and calculating metrics, such as the Gini, Theil, and Mean Log Deviation, on binned income data where the topmost bin is right-censored. We provide a non-parametric method, termed the bounded midpoint estimator (BME), which assigns cases to their bin midpoints, except for the censored bin, where cases are assigned to an income estimated by fitting a Pareto distribution. Because the usual Pareto estimate can be inaccurate or undefined, especially in small samples, we implement a bounded Pareto estimate that yields much better results. We also provide a parametric approach, which fits distributions from the generalized beta (GB) family. Because some GB distributions can have poor fit or undefined estimates, we fit 10 GB-family distributions and use multimodel inference to obtain definite estimates from the best-fitting distributions. We also provide binned income data from all United States of America school districts, counties, and states.
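The midpoint idea can be illustrated with a short Python sketch. Each closed bin is represented by its midpoint, the censored top bin by a mean supplied by the caller, and the Gini coefficient is computed from the resulting weighted points. The function names are hypothetical, and the top-bin mean is an input here: the bounded Pareto estimate that the package uses to obtain it is not reproduced.

```python
from typing import List, Optional, Tuple

Bin = Tuple[float, Optional[float], int]  # (lower, upper or None if censored, count)


def bin_values(bins: List[Bin], top_bin_mean: float) -> List[Tuple[float, int]]:
    """Assign each bin a representative income.

    Closed bins use their midpoint; a right-censored bin (upper is None)
    uses top_bin_mean, which is assumed to come from a separate estimate.
    """
    points = []
    for lower, upper, count in bins:
        value = top_bin_mean if upper is None else (lower + upper) / 2.0
        points.append((value, count))
    return points


def gini_from_points(points: List[Tuple[float, int]]) -> float:
    """Gini coefficient of weighted points: mean absolute difference / (2 * mean)."""
    n = sum(count for _, count in points)
    mean = sum(value * count for value, count in points) / n
    abs_diff = sum(
        ci * cj * abs(vi - vj) for vi, ci in points for vj, cj in points
    )
    return abs_diff / (2.0 * n * n * mean)


bins = [(0.0, 10.0, 1), (10.0, None, 1)]
print(gini_from_points(bin_values(bins, top_bin_mean=15.0)))  # 0.25
```

Treating every case in a bin as sitting at one point ignores within-bin inequality, so a midpoint-based Gini tends to understate inequality when bins are wide.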
Analyze genetic data obtained from pooled samples. This package can read in Fragment Analysis output files, process the data, and score peaks, as well as facilitate various analyses, including cluster analysis, calculation of genetic distances and diversity indices, and bootstrap resampling for statistical inference. The package is specifically tailored to handle genetic data efficiently, so researchers can explore population structure, genetic differentiation, and genetic relatedness among samples. We updated some functions from Covarrubias-Pazaran et al. (2016) <doi:10.1186/s12863-016-0365-6> to allow for the use of new file formats and referenced the following to write our genetic analysis functions: Long et al. (2022) <doi:10.1038/s41598-022-04776-0>, Jost (2008) <doi:10.1111/j.1365-294x.2008.03887.x>, Nei (1973) <doi:10.1073/pnas.70.12.3321>, Foulley et al. (2006) <doi:10.1016/j.livprodsci.2005.10.021>, Chao et al. (2008) <doi:10.1111/j.1541-0420.2008.01010.x>.