Traverses graphs using depth first search (DFS) and breadth first search (BFS) to obtain the path from a node to each leaf node. Depth first traversal (DFS) is a recursive algorithm for visiting all the vertices of a graph or tree data structure; traversal means visiting all the nodes of a graph. Breadth first traversal (BFS) searches a tree or graph data structure for a node that meets a set of criteria: it starts at the tree's root or a graph node and visits all nodes at the current depth level before moving on to the nodes at the next depth level. The package also provides the reachability matrix between nodes. The implementation follows Baruch Awerbuch (1985) <doi:10.1016/0020-0190(85)90083-3>.
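A minimal base R sketch of the two traversals on a toy adjacency list (illustrative, not this package's API):

    # Graph as an adjacency list of character vectors (toy example).
    g <- list(a = c("b", "c"), b = c("d"), c = c("d"), d = character(0))

    # Depth first traversal from a start node, recursively.
    dfs <- function(graph, node, visited = character(0)) {
      visited <- c(visited, node)
      for (nb in graph[[node]]) {
        if (!(nb %in% visited)) visited <- dfs(graph, nb, visited)
      }
      visited
    }

    # Breadth first traversal using a simple queue.
    bfs <- function(graph, start) {
      queue <- start; visited <- character(0)
      while (length(queue) > 0) {
        node <- queue[1]; queue <- queue[-1]
        if (node %in% visited) next
        visited <- c(visited, node)
        queue <- c(queue, setdiff(graph[[node]], visited))
      }
      visited
    }

    dfs(g, "a")  # "a" "b" "d" "c"
    bfs(g, "a")  # "a" "b" "c" "d"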
An easy implementation of the MABAC multi-criteria decision method, introduced by Pamučar and Ćirović in the work entitled "The selection of transport and handling resources in logistics centers using Multi-Attributive Border Approximation area Comparison (MABAC)" <doi:10.1016/j.eswa.2014.11.057>, which aimed at choosing equipment for logistics centers. This package receives data, preferably from a spreadsheet, reads it and applies the mathematical algorithms of the MABAC method to generate a ranking with the optimal solution according to the established criteria, weights and criteria types. The data are normalized and weighted, the border approximation area is determined, the distances to this border area are calculated, and finally a ranking with the optimal option is generated.
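A minimal base R sketch of these MABAC steps on a toy decision matrix (illustrative only; the package itself reads the data from a spreadsheet):

    # Toy decision matrix: 3 alternatives x 2 criteria (made-up data).
    X <- matrix(c(5, 7, 9, 200, 150, 300), nrow = 3,
                dimnames = list(paste0("alt", 1:3), c("quality", "cost")))
    w    <- c(0.6, 0.4)         # criteria weights
    type <- c("max", "min")     # benefit vs cost criteria

    # 1. Normalize each column according to its type.
    R <- sapply(seq_len(ncol(X)), function(j) {
      x <- X[, j]
      if (type[j] == "max") (x - min(x)) / (max(x) - min(x))
      else (x - max(x)) / (min(x) - max(x))
    })

    # 2. Weight: v_ij = w_j * (r_ij + 1).
    V <- sweep(R + 1, 2, w, `*`)

    # 3. Border approximation area: geometric mean of each column.
    G <- apply(V, 2, function(v) prod(v)^(1 / nrow(V)))

    # 4. Distances to the border area; rank by row sums (higher is better).
    Q <- sweep(V, 2, G, `-`)
    sort(rowSums(Q), decreasing = TRUE)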
Network structural equation modeling conducts a network statistical analysis on a data frame of coincident observations of multiple continuous variables [1]. It builds a pathway model by exploring a pool of domain-knowledge-guided candidate statistical relationships between each of the variable pairs, selecting the best fit on the basis of a specific criterion such as the adjusted R-squared value. This material is based upon work supported by the U.S. National Science Foundation Awards EEC-2052776 and EEC-2052662 for the MDS-Rely IUCRC Center, under NSF Solicitation NSF 20-570, Industry-University Cooperative Research Centers Program. [1] Bruckman, Laura S., Nicholas R. Wheeler, Junheng Ma, Ethan Wang, Carl K. Wang, Ivan Chou, Jiayang Sun, and Roger H. French. (2013) <doi:10.1109/ACCESS.2013.2267611>.
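A toy sketch of the pairwise selection idea, choosing among candidate functional forms by adjusted R-squared (names here are illustrative, not the package's API):

    # Pick the best-fitting candidate form for a variable pair.
    best_form <- function(x, y) {
      fits <- list(
        linear    = lm(y ~ x),
        quadratic = lm(y ~ x + I(x^2)),
        log       = if (all(x > 0)) lm(y ~ log(x)) else NULL
      )
      fits <- Filter(Negate(is.null), fits)
      adj  <- sapply(fits, function(f) summary(f)$adj.r.squared)
      list(form = names(which.max(adj)), fit = fits[[which.max(adj)]])
    }

    set.seed(1)
    x <- runif(100, 1, 10)
    y <- 2 * log(x) + rnorm(100, sd = 0.1)
    best_form(x, y)$form  # "log"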
This package provides a set of basic tools for generating, analyzing, summarizing and visualizing finite partially ordered sets. In particular, it implements flexible and very efficient algorithms for the extraction of linear extensions and for the computation of mutual ranking probabilities and other user-defined functionals over them. The package is meant as a computationally efficient "engine" for the implementation of data analysis procedures on systems of multidimensional ordinal indicators and partially ordered data, in the spirit of Fattore, M. (2016) "Partially ordered sets and the measurement of multidimensional ordinal deprivation", Social Indicators Research <doi:10.1007/s11205-015-1059-6>, and Fattore, M. and Arcagni, A. (2018) "A reduced posetic approach to the measurement of multidimensional ordinal deprivation", Social Indicators Research <doi:10.1007/s11205-016-1501-4>.
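For a small poset, linear extensions and mutual ranking probabilities can be illustrated by brute force (a toy sketch; the package's algorithms are far more efficient):

    # All permutations of a vector, recursively.
    perms <- function(v) {
      if (length(v) == 1) return(list(v))
      out <- list()
      for (i in seq_along(v))
        for (p in perms(v[-i])) out <- c(out, list(c(v[i], p)))
      out
    }

    # Poset on {a, b, c} with the single relation a < b (c incomparable):
    # linear extensions are the permutations that respect a before b.
    le <- Filter(function(p) match("a", p) < match("b", p),
                 perms(c("a", "b", "c")))
    length(le)  # 3 linear extensions

    # Mutual ranking probability P(c ranked below b) over linear extensions.
    mean(sapply(le, function(p) match("c", p) < match("b", p)))  # 2/3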
Single-index mixture cure models allow estimating the probability of cure and the latency depending on a vector (or functional) covariate, avoiding the curse of dimensionality. The vector of parameters that defines the model can be estimated by maximum likelihood. A nonparametric estimator for the conditional density of the susceptible population is provided. For more details, see Piñeiro-Lamas (2024) <https://ruc.udc.es/dspace/handle/2183/37035>. Funding: this work, integrated into the framework of PERTE for Vanguard Health, has been co-financed by the Spanish Ministry of Science, Innovation and Universities with funds from the European Union NextGenerationEU, from the Recovery, Transformation and Resilience Plan (PRTR-C17.I1) and from the Autonomous Community of Galicia within the framework of the Biotechnology Plan Applied to Health.
Implementation of evolutionary fuzzy systems for the data mining task called "subgroup discovery". In particular, the algorithms presented in this package are: M. J. del Jesus, P. Gonzalez, F. Herrera, M. Mesonero (2007) <doi:10.1109/TFUZZ.2006.890662>; M. J. del Jesus, P. Gonzalez, F. Herrera (2007) <doi:10.1109/MCDM.2007.369416>; C. J. Carmona, P. Gonzalez, M. J. del Jesus, F. Herrera (2010) <doi:10.1109/TFUZZ.2010.2060200>; C. J. Carmona, V. Ruiz-Rodado, M. J. del Jesus, A. Weber, M. Grootveld, P. González, D. Elizondo (2015) <doi:10.1016/j.ins.2014.11.030>. It also provides a Shiny app to ease the analysis. The algorithms work with data sets provided in KEEL, ARFF and CSV formats, as well as with data.frame objects.
How can we measure how the usage or frequency of some feature, such as words, differs across some group or set, such as documents? One option is to use the log odds ratio, but the log odds ratio alone does not account for sampling variability; we haven't counted every feature the same number of times, so how do we know which differences are meaningful? Enter the weighted log odds, which tidylo provides an implementation for, using tidy data principles. In particular, it uses the method outlined in Monroe, Colaresi, and Quinn (2008) <doi:10.1093/pan/mpn018> to weight the log odds ratio by a prior. By default, the prior is estimated from the data itself, an empirical Bayes approach, but an uninformative prior is also available.
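For example, starting from a tidy data frame of counts per group (the data here are made up):

    library(dplyr)
    library(tidylo)

    # One row per (document, word) pair with its count.
    word_counts <- tibble::tribble(
      ~document, ~word,   ~n,
      "doc1",    "apple", 10,
      "doc1",    "pear",   2,
      "doc2",    "apple",  1,
      "doc2",    "pear",  12
    )

    # Adds a log_odds_weighted column, with the prior estimated
    # from the data itself (the empirical Bayes default).
    word_counts %>%
      bind_log_odds(set = document, feature = word, n = n) %>%
      arrange(desc(log_odds_weighted))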
EpiMix is a comprehensive tool for the integrative analysis of high-throughput DNA methylation data and gene expression data. EpiMix enables automated data downloading (from TCGA or GEO), preprocessing, methylation modeling, interactive visualization and functional annotation. To identify hypo- or hypermethylated CpG sites across physiological or pathological conditions, EpiMix uses beta mixture modeling to identify the methylation states of each CpG probe and compares the methylation of the experimental group to that of the control group. The output from EpiMix is the functional DNA methylation that is predictive of gene expression. EpiMix incorporates specialized algorithms to identify functional DNA methylation at various genetic elements, including proximal cis-regulatory elements of protein-coding genes, distal enhancers, and genes encoding microRNAs and lncRNAs.
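As a rough illustration of the mixture idea only (not EpiMix's actual routine), methylation states for one probe can be explored by fitting a mixture on the logit (M-value) scale, a common approximation for beta-distributed methylation data:

    library(mclust)  # general-purpose mixture modeling

    set.seed(42)
    # Simulated beta values for one CpG probe: a hypomethylated
    # and a hypermethylated state.
    beta_vals <- c(rbeta(100, 2, 20), rbeta(100, 20, 2))

    # Fit a 1-3 component Gaussian mixture on the logit scale.
    fit <- Mclust(qlogis(beta_vals), G = 1:3)
    fit$G                      # selected number of methylation states
    table(fit$classification)  # probe-level state assignments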
This is the human disease ontology R package HDO.db, which provides the semantic relationships between human diseases. Relying on the DOSE and GOSemSim packages, this package can carry out disease enrichment and semantic similarity analyses. Many biological studies are carried out with mouse models, and a large body of data indicates associations between genotypes and phenotypes or diseases. The study of model organisms can be translated into useful knowledge about normal human biology and disease to facilitate treatment and early screening for diseases. Organism-specific genotype-phenotype associations can be applied to cross-species phenotypic studies to clarify previously unknown phenotypic connections in other species. Applying the same principle to diseases can identify genetic associations and even help to identify disease associations that are not obvious.
This package implements several string comparison algorithms, including calACS (count all common subsequences), lenACS (calculate the lengths of all common subsequences), and lenLCS (calculate the length of the longest common subsequence). Some algorithms differentiate between the stricter definition of subsequence, where a common subsequence cannot be separated by any other items, and its looser counterpart, where a common subsequence can be interrupted by other items. This difference is shown in the suffix of the algorithm (-Strict vs -Loose). For example, q-w is a common subsequence of q-w-e-r and q-e-w-r under the looser definition, but not under the stricter one. The calACSLoose algorithm is from Wang, H., "All common subsequences" (2007), IJCAI International Joint Conference on Artificial Intelligence, pp. 635-640.
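Under the looser definition, the longest common subsequence length follows the classic dynamic programming recurrence; a generic base R sketch (not the package's lenLCS() implementation):

    # Length of the longest common subsequence (interruptions allowed).
    lcs_len <- function(a, b) {
      a <- strsplit(a, "-")[[1]]; b <- strsplit(b, "-")[[1]]
      m <- matrix(0L, length(a) + 1, length(b) + 1)
      for (i in seq_along(a)) {
        for (j in seq_along(b)) {
          m[i + 1, j + 1] <- if (a[i] == b[j]) m[i, j] + 1L
                             else max(m[i, j + 1], m[i + 1, j])
        }
      }
      m[length(a) + 1, length(b) + 1]
    }

    lcs_len("q-w-e-r", "q-e-w-r")  # 3, e.g. "q-w-r" or "q-e-r"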
Implementation of adaptive assessment procedures based on Knowledge Space Theory (KST, Doignon & Falmagne, 1999 <ISBN:9783540645016>) and Formal Psychological Assessment (FPA, Spoto, Stefanutti & Vidotto, 2010 <doi:10.3758/BRM.42.1.342>) frameworks. An adaptive assessment is a type of evaluation that adjusts the difficulty and nature of subsequent questions based on the test taker's responses to previous ones. The package contains functions to perform and simulate an adaptive assessment. Moreover, it is integrated with two Shiny interfaces, making it both accessible and user-friendly. The package has been partially funded by the European Union - NextGenerationEU and by the Ministry of University and Research (MUR), National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 1.5, project "RAISE - Robotics and AI for Socio-economic Empowerment" (ECS00000035).
It is often useful to produce short, quasi-unique identifiers (SQUIDs) without the benefit of a central authority to prevent duplication. Although Universally Unique Identifiers (UUIDs) provide for this, they are also unwieldy; for example, the most used UUID, version 4, is 36 characters long. SQUIDs are short (8 characters) at the expense of more collisions, which can be mitigated by combining them with human-produced suffixes, yielding relatively brief, half human-readable, almost-unique identifiers (see for example the identifiers used for Decentralized Construct Taxonomies; Peters & Crutzen, 2024 <doi:10.15626/MP.2022.3638>). A SQUID is the number of centiseconds elapsed since the beginning of 1970, converted to a base 30 system. This package contains functions to produce SQUIDs as well as convert them back into dates and times.
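A minimal sketch of that conversion (the base 30 alphabet below is an assumption for illustration, not necessarily the package's):

    # Encode centiseconds since 1970-01-01 in base 30.
    alphabet <- strsplit("0123456789bcdfghjklmnpqrstwxyz", "")[[1]]

    squid <- function(time = Sys.time()) {
      n <- floor(as.numeric(time) * 100)  # centiseconds since the epoch
      digits <- integer(0)
      while (n > 0) {
        digits <- c(n %% 30, digits)      # least significant digit last
        n <- n %/% 30
      }
      paste(alphabet[digits + 1], collapse = "")
    }

    squid()  # an 8-character identifier for current timestamps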
This package implements a spatiotemporal boundary detection model with a dissimilarity metric for areal data, with inference in a Bayesian setting using Markov chain Monte Carlo (MCMC). The response variable can be modeled with a Gaussian (no nugget), probit or Tobit link, and spatial correlation is introduced at each time point through a conditional autoregressive (CAR) prior. Temporal correlation is introduced through a hierarchical structure and can be specified as exponential or first-order autoregressive. Full details of the package can be found in the accompanying vignette, and in "Diagnosing Glaucoma Progression with Visual Field Data Using a Spatiotemporal Boundary Detection Method" by Berchuck et al. (2018) <arXiv:1805.11636>, in press at the Journal of the American Statistical Association.
This package provides a collection of methods for both rank-based and least-squares estimation of the Accelerated Failure Time (AFT) model. For rank-based estimation, it provides approaches that include the computationally efficient Gehan weight and general weights such as the log-rank weight. Details of the rank-based estimation can be found in Chiou et al. (2014) <doi:10.1007/s11222-013-9388-2> and Chiou et al. (2015) <doi:10.1002/sim.6415>. For least-squares estimation, the estimating equation is solved with generalized estimating equations (GEE). Moreover, in multivariate cases, the dependence working correlation structure can be specified in the GEE setting. Details on the least-squares estimation can be found in Chiou et al. (2014) <doi:10.1007/s10985-014-9292-x>.
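A hedged usage sketch, assuming the rank-based fit is exposed as aftsrr() with a survival::Surv() response and a rankWeights argument:

    library(survival)
    library(aftgee)

    # Simulated AFT data: log failure time linear in one covariate.
    set.seed(11)
    n <- 200
    x <- rnorm(n)
    time <- exp(1 + 0.5 * x + rnorm(n, sd = 0.3))
    status <- rbinom(n, 1, 0.8)  # some observations censored
    dat <- data.frame(time, status, x)

    # Rank-based estimation with the computationally efficient Gehan weight.
    fit <- aftsrr(Surv(time, status) ~ x, data = dat, rankWeights = "gehan")
    summary(fit)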
This package performs causal mediation analysis for count and zero-inflated count data, without or with a post-treatment confounder; calculates power to detect prespecified causal mediation effects, direct effects, and total effects; and performs sensitivity analysis when there is a treatment-induced mediator-outcome confounder, as described by Cheng, J., Cheng, N.F., Guo, Z., Gregorich, S., Ismail, A.I., Gansky, S.A. (2018) <doi:10.1177/0962280216686131>. It implements the Instrumental Variable (IV) method to estimate the controlled (natural) direct and mediation effects and compute bootstrap confidence intervals, as described by Guo, Z., Small, D.S., Gansky, S.A., Cheng, J. (2018) <doi:10.1111/rssc.12233>. This software was made possible by Grant R03DE028410 from the National Institute of Dental and Craniofacial Research, a component of the National Institutes of Health.
This is a cross-platform linear model to SQL compiler. It generates SQL from linear and generalized linear models. Its interface consists of a single function, modelc(), which takes the output of the lm() or glm() functions (or any object with the same signature) and outputs a SQL character vector representing the predictions on the scale of the response variable, as described in Dunn & Smyth (2018) <doi:10.1007/978-1-4419-0118-7> and originating in Nelder & Wedderburn (1972) <doi:10.2307/2344614>. The resultant SQL can be included in a SELECT statement and returns output similar to that of the predict.glm() or predict.lm() predictions, assuming numeric types are represented in the database with sufficient precision. Currently the log and identity link functions are supported.
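A usage sketch based on the interface described above:

    library(modelc)

    # Fit an ordinary linear model, then compile it to SQL.
    fit <- lm(mpg ~ wt + hp, data = mtcars)
    sql <- modelc(fit)

    # Embed the generated expression in a SELECT statement
    # (table and column names are assumed to exist in the database).
    cat(paste("SELECT", sql, "AS prediction FROM cars;"))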
Stop signal task data of go and stop trials is generated per participant. The simulation process is based on the generally non-independent horse race model, with a fixed stop signal delay or the tracking method. The go and stop processes are each assumed to follow an exponentially modified Gaussian (ExG) or shifted Wald (SW) distribution. The output data can be converted to BEESTS software input data, enabling researchers to test and evaluate various brain stopping processes manifested by the ExG or SW distributional parameters of interest. Methods are described in: Soltanifar M (2020) <https://hdl.handle.net/1807/101208>; Matzke D, Love J, Wiecki TV, Brown SD, Logan GD and Wagenmakers E-J (2013) <doi:10.3389/fpsyg.2013.00918>; Logan GD, Van Zandt T, Verbruggen F, Wagenmakers EJ (2014) <doi:10.1037/a0035230>.
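For illustration, finishing times under the ExG assumption can be simulated directly; a minimal sketch of the independent special case of the race (parameter values are arbitrary):

    # Exponentially modified Gaussian: a Gaussian (mu, sigma) plus an
    # independent exponential component with mean tau.
    rexgauss <- function(n, mu, sigma, tau) {
      rnorm(n, mu, sigma) + rexp(n, rate = 1 / tau)
    }

    set.seed(1)
    go_rt   <- rexgauss(1000, mu = 440, sigma = 50, tau = 80)  # go times (ms)
    stop_rt <- rexgauss(1000, mu = 190, sigma = 40, tau = 60)  # stop times (ms)
    ssd     <- 200                                             # fixed stop signal delay

    # Horse race: a response occurs on a stop trial when the go process
    # finishes before the stop process, which starts after the delay.
    mean(go_rt < stop_rt + ssd)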
Enables the user to calculate Value at Risk (VaR) and Expected Shortfall (ES) by means of various parametric and semiparametric GARCH-type models. For the latter, the estimation of the nonparametric scale function is carried out by means of a data-driven smoothing approach. Model quality, in terms of forecasting VaR and ES, can be assessed by means of various backtesting methods, such as the traffic light test for VaR and a newly developed traffic light test for ES. The approaches implemented in this package are described in, e.g., Feng Y., Beran J., Letmathe S. and Ghosh S. (2020) <https://ideas.repec.org/p/pdn/ciepap/137.html> as well as Letmathe S., Feng Y. and Uhde A. (2021) <https://ideas.repec.org/p/pdn/ciepap/141.html>.
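For intuition, under a purely parametric normal model the one-step VaR and ES have closed forms; a generic sketch (not the package's semiparametric estimators):

    # One-step-ahead VaR and ES for losses L ~ N(mu, sigma^2).
    var_es_normal <- function(mu, sigma, alpha = 0.975) {
      z   <- qnorm(alpha)
      VaR <- mu + sigma * z                       # loss quantile
      ES  <- mu + sigma * dnorm(z) / (1 - alpha)  # mean loss beyond VaR
      c(VaR = VaR, ES = ES)
    }

    var_es_normal(mu = 0, sigma = 0.02)  # e.g. daily loss volatility of 2%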
Reconstructing gene regulatory networks and transcription factor activity is crucial to understanding biological processes and holds potential for developing personalized treatment. Yet it is still an open problem, as state-of-the-art algorithms are often unable to handle large amounts of data. Furthermore, many of the present methods predict numerous false positives and are unable to integrate other sources of information, such as previously known interactions. Here we introduce KBoost, an algorithm that uses kernel PCA regression, boosting and Bayesian model averaging for fast and accurate reconstruction of gene regulatory networks. KBoost can also use a prior network built on previously known transcription factor targets. We have benchmarked KBoost against other high-performing algorithms using three different datasets. The results show that our method compares favourably to other methods across datasets.
An implementation of the ALFAM2 dynamic emission model for ammonia volatilization from field-applied animal slurry (manure with dry matter below about 15%). The model can be used to predict cumulative emission and emission rate of ammonia following field application of slurry. Predictions may be useful for emission inventory calculations, fertilizer management, assessment of mitigation strategies, or research aimed at understanding ammonia emission. Default parameter sets include effects of application method, slurry composition, and weather. The model structure is based on a simplified representation of the physical-chemical slurry-soil-atmosphere system. See Hafner et al. (2018) <doi:10.1016/j.atmosenv.2018.11.034> for information on the model and Hafner et al. (2019) <doi:10.1016/j.agrformet.2017.11.027> for more on the measurement data used for parameter development.
This package introduces a novel PRS model that enhances prediction accuracy by utilising GxE effects. The package performs Genome Wide Association Studies (GWAS) and Genome Wide Environment Interaction Studies (GWEIS) using a discovery dataset, and can obtain polygenic risk scores (PRSs) for a target sample. Finally, it predicts the risk values of each individual in the target sample. Users have the choice of using existing models (Li et al., 2015) <doi:10.1093/annonc/mdu565>, (Pandis et al., 2013) <doi:10.1093/ejo/cjt054>, (Peyrot et al., 2018) <doi:10.1016/j.biopsych.2017.09.009> and (Song et al., 2022) <doi:10.1038/s41467-022-32407-9>, as well as newly proposed models for genomic risk prediction (refer to the URL for more details).
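At its core, a PRS is a weighted allele count; a generic base R sketch (illustrative, not this package's interface):

    set.seed(7)
    # Genotype matrix: individuals x SNPs, coded 0/1/2 (simulated here).
    G <- matrix(rbinom(5 * 4, size = 2, prob = 0.3), nrow = 5,
                dimnames = list(paste0("id", 1:5), paste0("snp", 1:4)))
    beta <- c(0.12, -0.05, 0.30, 0.08)  # per-allele effect sizes from a GWAS

    # Polygenic risk score per individual: sum of effect-weighted allele counts.
    prs <- as.vector(G %*% beta)
    names(prs) <- rownames(G)
    prs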
This resource provides tools to create, compare, and post-process spatial isotope assignment models of animal origin. It generates probability-of-origin maps for individuals based on user-provided tissue and environment isotope values (e.g., as generated by IsoMAP, Bowen et al. [2013] <doi:10.1111/2041-210X.12147>), using the framework established in Bowen et al. (2010) <doi:10.1146/annurev-earth-040809-152429>. The package isocat can then quantitatively compare and cluster these maps to group individuals by similar origin. It also includes four approaches (cumulative sum, odds ratio, quantile only, and quantile simulation) with which users can summarize geographic origins and the probable distance traveled by individuals. Campbell et al. [2020] <doi:10.1515/ami-2020-0004> establishes several of the functions included in this package.
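The core assignment step of this framework can be sketched as evaluating a normal density cell by cell (a toy illustration, not isocat's function signatures):

    # Probability-of-origin surface for one individual: compare its tissue
    # isotope value with the isoscape prediction in each cell (toy values).
    isoscape_mean <- c(-120, -100, -80, -60)  # predicted tissue value per cell
    isoscape_sd   <- rep(6, 4)                # prediction error per cell
    tissue_value  <- -85                      # measured tissue isotope value

    dens <- dnorm(tissue_value, mean = isoscape_mean, sd = isoscape_sd)
    p_origin <- dens / sum(dens)  # normalize so the surface sums to one
    round(p_origin, 3)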
Runs a Shiny web application that merges raw qPCR fluorescence data with related metadata to visualize species presence/absence detection patterns and assess data quality. The application calculates threshold values from raw fluorescence data using a method based on the second derivative, Luu-The et al. (2005) <doi:10.2144/05382RR05>, and utilizes the chipPCR package by Rödiger, Burdukiewicz, & Schierack (2015) <doi:10.1093/bioinformatics/btv205> to calculate Cq values. The application can connect to a custom-developed MySQL database to populate the application's interface. The application allows users to interact with visualizations such as a dynamic map, amplification curves and standard curves that allow zooming and/or filtering. It also enables the generation of customized exportable reports based on filtered mapping data.
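The second-derivative idea can be sketched in a few lines (a generic illustration; the application itself delegates Cq calculation to chipPCR):

    # Toy sigmoidal amplification curve over 40 cycles.
    cycles <- 1:40
    fluor  <- 1 / (1 + exp(-(cycles - 25) / 2))

    # Approximate the second derivative numerically; the Cq is taken near
    # its maximum, i.e. where the curvature of the curve peaks.
    d2 <- diff(fluor, differences = 2)
    cq <- cycles[which.max(d2) + 1]  # +1 recenters the difference
    cq  # near the onset of the exponential phase (~22 here)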
Intended to create standard human-in-the-loop validity tests for typical automated content analysis methods such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. It provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021 <doi:10.1017/pan.2021.33>) tests. It also provides functions for generating gold-standard data, which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.
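For instance, a word intrusion item presents a topic's top words together with an intruder drawn from another topic; a generic sketch of the test design (hypothetical words, not this package's API):

    set.seed(3)
    # Hypothetical top words for two topics from some topic model.
    topics <- list(
      economy = c("market", "trade", "growth", "inflation", "tax"),
      sports  = c("match", "team", "league", "coach", "goal")
    )

    # Build one intrusion item: four words from the target topic plus
    # one intruder from a different topic, in shuffled order.
    make_item <- function(target, topics) {
      words    <- sample(topics[[target]], 4)
      intruder <- sample(topics[[setdiff(names(topics), target)[1]]], 1)
      sample(c(words, intruder))  # the human coder should spot the intruder
    }

    make_item("economy", topics)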