This package allows biologists to judge in the first place whether the sequence surrounding the polymorphism is a good match, and in the second place how much information is gained or lost in one allele of the polymorphism relative to another. This package gives a choice of algorithms for interrogation of genomes with motifs from public sources:
a weighted-sum probability matrix;
log-probabilities;
weighted by relative entropy.
This package can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor.
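As a rough illustration of the log-probability option, the sketch below scores a reference and an alternate allele against a small position probability matrix and reports the difference. The motif values, the uniform background and the function names are all assumptions made for this example; they are not taken from the package or from any public motif database.

```python
import math

# Hypothetical 4-position motif: each column gives P(A), P(C), P(G), P(T).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base frequencies


def log_odds_score(seq, pwm=PWM, background=BACKGROUND):
    """Sum of log2(p / background) over the motif positions."""
    if len(seq) != len(pwm):
        raise ValueError("sequence and motif must have the same length")
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, seq))


def allele_effect(ref_seq, alt_seq):
    """Score of alt minus score of ref; negative means alt weakens the match."""
    return log_odds_score(alt_seq) - log_odds_score(ref_seq)


if __name__ == "__main__":
    # The variant changes the second base from G (preferred) to T.
    print(allele_effect("AGCT", "ATCT"))
```

A negative difference suggests the alternate allele disrupts the motif match, a positive one that it strengthens it.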
Computes the ATM (Attractor Transition Matrix) structure and the tree-like structure describing the cell differentiation process (based on the Threshold Ergodic Set concept introduced by Serra and Villani), starting from the Boolean networks with synchronous updating scheme of the BoolNet R package. TESs (Threshold Ergodic Sets) are the mathematical abstractions that represent the different cell types arising during ontogenesis. TESs, and the model of biological differentiation based on Boolean networks to which they belong, were first described in Villani M, Barbieri A, Serra R (2011) "A Dynamical Model of Genetic Networks for Cell Differentiation". PLOS ONE 6(3): e17703.
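The starting point of such an analysis is the set of attractors of a synchronously updated Boolean network. The sketch below enumerates them by brute force for a tiny network. The three update rules are hypothetical, and the code covers only attractor detection, not the ATM or TES computation performed by the package.

```python
from itertools import product

# Hypothetical update rules for a 3-gene network (not taken from any real model).
RULES = (
    lambda s: s[1],  # gene 0 copies gene 1
    lambda s: s[0],  # gene 1 copies gene 0
    lambda s: s[2],  # gene 2 keeps its own value
)


def step(state):
    """Synchronous update: every gene is recomputed from the same current state."""
    return tuple(int(rule(state)) for rule in RULES)


def find_attractors(n_genes=len(RULES)):
    """Follow every initial state until it repeats and collect the cycles reached."""
    attractors = set()
    for start in product((0, 1), repeat=n_genes):
        seen = {}  # state -> position along the trajectory
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = step(state)
        first = seen[state]  # position where the cycle begins
        cycle = [s for s, idx in seen.items() if idx >= first]
        k = cycle.index(min(cycle))  # rotate so each cycle has one canonical form
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


if __name__ == "__main__":
    for attractor in sorted(find_attractors()):
        print(attractor)
```

Exhaustive enumeration visits all 2^n states, so it is only practical for small networks.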
Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the jagstargets R package leverages targets and R2jags to ease this burden. jagstargets makes it easy to set up scalable JAGS pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than targets alone. For the underlying methodology, please refer to the documentation of targets <doi:10.21105/joss.02959> and JAGS (Plummer 2003) <https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf>.
This package provides a framework for deconvolution, alignment and postprocessing of 1-dimensional (1d) nuclear magnetic resonance (NMR) spectra, resulting in a data matrix of aligned signal integrals. The deconvolution part uses the algorithm described in Koh et al. (2009) <doi:10.1016/j.jmr.2009.09.003>. The alignment part is based on functions from the speaq package, described in Beirnaert et al. (2018) <doi:10.1371/journal.pcbi.1006018> and Vu et al. (2011) <doi:10.1186/1471-2105-12-405>. A detailed description and evaluation of an early version of the package, 'MetaboDecon1D v0.2.2', can be found in Haeckl et al. (2021) <doi:10.3390/metabo11070452>.
This is a collection of tools for assessment of feature importance and feature effects. Key functions are:
feature_importance() for assessment of global level feature importance;
ceteris_paribus() for calculation of the what-if plots;
partial_dependence() for partial dependence plots;
conditional_dependence() for conditional dependence plots;
accumulated_dependence() for accumulated local effects plots;
aggregate_profiles() and cluster_profiles() for aggregation of ceteris paribus profiles;
generic print() and plot() for better usability of selected explainers;
generic plotD3() for interactive, D3 based explanations;
generic describe() for explanations in natural language.
This package provides a highly efficient R tool suite for credit modeling, analysis and visualization. It contains infrastructure functionality such as data exploration and preparation, missing values treatment, outliers treatment, variable derivation, variable selection, dimensionality reduction, grid search for hyperparameters, data mining and visualization, model evaluation and strategy analysis. This package is designed to make the development of binary classification models (machine learning based models as well as credit scorecards) simpler and faster. References: (1) Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS; (2) Bezdek, James C. FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences (0098-3004), <DOI:10.1016/0098-3004(84)90020-7>.
This package aims to support the teaching-learning process of descriptive and inferential statistics. The functions contained in the package estadistica cover the basic concepts studied in an introductory statistics course. Many concepts are illustrated with dynamic graphs or web apps to make them easier to understand. See: Esteban et al. (2005, ISBN: 9788497323741), Newbold et al. (2019, ISBN: 9781292315034), Murgui et al. (2002, ISBN: 9788484424673).
This package performs pathway enrichment analysis using a voting-based framework that integrates CpG-gene regulatory information from expression quantitative trait methylation (eQTM) data. For a grid of top-ranked CpGs and filtering thresholds, gene sets are generated and refined using an entropy-based pruning strategy that balances information richness, stability, and probe bias correction. In particular, gene lists dominated by genes with disproportionately high numbers of CpG mappings are penalized to mitigate active probe bias, a common artifact in methylation data analysis. Enrichment results across parameter combinations are then aggregated using a voting scheme, prioritizing pathways that are consistently recovered under diverse settings and robust to parameter perturbations.
This code provides several different functions for cleaning and analyzing continuous glucose monitor data. Currently it works with 'Dexcom', 'iPro 2', 'Diasend', 'Libre', or 'Carelink' data. The cleandata() function takes a directory of CGM data files and prepares them for analysis. cgmvariables() iterates through a directory of cleaned CGM data files and produces a single spreadsheet with data for each file in either rows or columns. The column format of this spreadsheet is compatible with REDCap data upload. cgmreport() also iterates through a directory of cleaned data, and produces PDFs of individual and aggregate AGP plots. Please visit <https://github.com/childhealthbiostatscore/R-Packages/> to download the new-user guide.
This package provides methods to perform Joint graph Regularized Single-Cell Kullback-Leibler Sparse Non-negative Matrix Factorization ('jrSiCKLSNMF', pronounced "junior sickles NMF") on quality controlled single-cell multimodal omics count data. jrSiCKLSNMF specifically deals with dual-assay scRNA-seq and scATAC-seq data. This package contains functions to extract meaningful latent factors that are shared across omics modalities. These factors enable accurate cell-type clustering and facilitate visualizations. Methods for pre-processing, clustering, and mini-batch updates and other adaptations for larger datasets are also included. For further details on the methods used in this package please see Ellis, Roy, and Datta (2023) <doi:10.3389/fgene.2023.1179439>.
This package provides functions for simulating and estimating kinship-related dispersal. Based on the methods described in M. Jasper, T.L. Schmidt, N.W. Ahmad, S.P. Sinkins & A.A. Hoffmann (2019) <doi:10.1111/1755-0998.13043> "A genomic approach to inferring kinship reveals limited intergenerational dispersal in the yellow fever mosquito". Assumes an additive variance model of dispersal in two dimensions, compatible with Wright's neighbourhood area. Simple and composite dispersal simulations are supplied, as well as the functions needed to estimate parent-offspring dispersal for simulated or empirical data, and to undertake sampling design for future field studies of dispersal. For ease of use an integrated Shiny app is also included.
Labels are a common construct in statistical software providing a human readable description of a variable. While variable names are succinct, quick to type, and follow a language's naming conventions, labels may be more illustrative and may use plain text and spaces. R does not provide native support for labels. Some packages, however, have made this feature available. Most notably, the Hmisc package provides labelling methods for a number of different objects. Due to design decisions, these methods are not all exported, and so are unavailable for use in package development. The labelVector package supports labels for atomic vectors in a light-weight design that is suitable for use in other packages.
R6 classes to model traditional life insurance contracts like annuities, whole life insurances or endowments. Such life insurance contracts provide a guaranteed interest and are not directly linked to the performance of a particular investment vehicle, but they typically provide (discretionary) profit participation. This package provides a framework to model such contracts in a very generic (cash-flow-based) way and includes modelling profit participation schemes, dynamic increases or more general contract layers, as well as contract changes (like sum increases or premium waivers). All relevant quantities like premium decomposition, reserves and benefits over the whole contract period are calculated and potentially exported to 'Excel'. Mortality rates are given using the MortalityTables package.
An HTML widget that randomly tours 2D projections of numerical data. A random walk through projections of the data is shown. The user can manipulate the plot to use specified axes, or turn on Guided Tour mode to find an informative projection of the data. Groups within the data can be hidden or shown, as can particular axes. Points can be brushed, and the selection can be linked to other widgets using crosstalk. The underlying method to produce the random walk and projection pursuit uses Langevin dynamics. The widget can be used from within R, or included in a self-contained R Markdown or Quarto document or presentation, or used in a Shiny app.
This package provides a test of multivariate normality of an unknown sample that does not require estimation of the nuisance parameters, the mean and covariance matrix. Rather, a sequence of transformations removes these nuisance parameters and results in a set of sample matrices that are positive definite. These matrices are uniformly distributed on the space of positive definite matrices in the unit hyper-rectangle if and only if the original data is multivariate normal (Fairweather, 1973, Doctoral dissertation, University of Washington). The package performs a goodness of fit test of this hypothesis. In addition to the test, functions in the package give visualizations of the support region of positive definite matrices for bivariate samples.
Conducts a goodness-of-fit test for the Weibull distribution (referred to as the weibullness test) and furnishes parameter estimations for both the two-parameter and three-parameter Weibull distributions. Notably, the threshold parameter is derived through correlation from the Weibull plot. Additionally, this package conducts goodness-of-fit assessments for the exponential, Gumbel, and inverse Weibull distributions, accompanied by parameter estimations. For more details, see Park (2017) <doi:10.23055/ijietap.2017.24.4.2848>, Park (2018) <doi:10.1155/2018/6056975>, and Park (2023) <doi:10.3390/math11143156>. This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIT) (No. 2022R1A2C1091319, RS-2023-00242528).
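The idea of using the correlation on the Weibull plot can be sketched as follows. The plotting positions, the grid search for the threshold and all function names are assumptions chosen for illustration; the package's own estimators and critical values are described in the cited papers and are not reproduced here.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def plotting_positions(n):
    # Assumed plotting positions; other conventions exist.
    return [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]


def weibull_plot_correlation(data):
    """Correlation of log(sorted data) against log(-log(1 - p)); data must be positive."""
    xs = [math.log(t) for t in sorted(data)]
    ys = [math.log(-math.log(1.0 - p)) for p in plotting_positions(len(data))]
    return pearson(xs, ys)


def estimate_threshold(data, n_grid=200):
    """Grid search over [0, min(data)) for the shift that maximizes the correlation."""
    lowest = min(data)
    best_thr, best_r = 0.0, -1.0
    for k in range(n_grid):
        thr = lowest * k / n_grid
        r = weibull_plot_correlation([t - thr for t in data])
        if r > best_r:
            best_thr, best_r = thr, r
    return best_thr, best_r
```

A correlation close to 1 indicates that the points fall near a straight line on the Weibull plot, which is what a Weibull sample is expected to produce.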
The package provides functionality that can be useful for the analysis of the high-density tiling microarray data (such as from Affymetrix genechips) or for measuring the transcript abundance and the architecture. The main functionalities of the package are:
the class segmentation for representing partitionings of a linear series of data;
the function segment for fitting piecewise constant models using a dynamic programming algorithm that is both fast and exact;
the function confint for calculating confidence intervals using the strucchange package;
the function plotAlongChrom for generating pretty plots;
the function normalizeByReference for probe-sequence dependent response adjustment from a (set of) reference hybridizations.
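Fitting a piecewise constant model exactly by dynamic programming can be sketched as below. This is a generic least-squares formulation written for illustration, with a hypothetical function name segment_dp; it is not the package's implementation and makes no claim about its interface.

```python
def segment_dp(y, n_segments):
    """Split y into n_segments contiguous pieces minimizing the residual sum of squares."""
    n = len(y)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(y)")

    # Prefix sums make the cost of any segment an O(1) lookup.
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        """Sum of squared deviations of y[i:j] from its mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of covering y[:j] with exactly k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + cost(i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    back[k][j] = i

    # Trace the optimal change points back from the end of the series.
    bounds, j = [], n
    for k in range(n_segments, 0, -1):
        i = back[k][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1], best[n_segments][n]


if __name__ == "__main__":
    print(segment_dp([1, 1, 1, 5, 5, 5, 5, 2, 2], 3))
```

The triple loop costs on the order of K times n squared operations for K segments and n data points, and the result is the exact optimum for this cost function rather than a heuristic.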
Biologically relevant, yet mathematically sound constraints are used to compute the propensity and thence infer the dominant direction of reactions of a generic biochemical network. The reactions must be unique and their number must exceed that of the reactants, i.e., reactions >= reactants + 2. 'ReDirection' computes the null space of a user-defined stoichiometry matrix. The spanning non-zero and unique reaction vectors (RVs) are combinatorially summed to generate one or more subspaces recursively. Every reaction is represented as a sequence of identical components across all RVs of a particular subspace. The terms are evaluated (with biologically relevant bounds, linear maps, tests of convergence, descriptive statistics and vector norms) and classified into forward, reverse and equivalent subsets. Since these are mutually exclusive, the probability of occurrence is binary (all, 1; none, 0). The combined propensity of a reaction is the p1-norm of the sub-propensities, i.e., the sum of the products of the probability and maximum numeric value of a subset (least upper bound, greatest lower bound). This, if strictly positive, is the probable rate constant and is used to infer the dominant direction and annotate a reaction as "Forward (f)", "Reverse (b)" or "Equivalent (e)". The inherent computational complexity (NP-hard) per iteration suggests that a suitable value for the number of reactions is around 20. Three functions comprise ReDirection: check_matrix() and reaction_vector(), which are internal, and calculate_reaction_vector(), which is external.
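The first step, computing the null space of a stoichiometry matrix, can be illustrated with exact rational arithmetic. The matrix below (rows as reactants, columns as reactions) is a made-up example, and that orientation is an assumption; the sketch does not reproduce the subspace generation or the propensity calculation described above.

```python
from fractions import Fraction


def null_space(matrix):
    """Basis of all v with matrix * v = 0, found by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here, so column c is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1

    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)  # set one free variable to 1, the others to 0
        for row_idx, p in enumerate(pivots):
            v[p] = -rows[row_idx][f]
        basis.append(v)
    return basis


# Hypothetical network with 2 reactants (A, B) and 4 reactions:
# R1: -> A, R2: A -> B, R3: B ->, R4: A ->
S = [
    [1, -1, 0, -1],
    [0, 1, -1, 0],
]

if __name__ == "__main__":
    for vec in null_space(S):
        print([int(x) if x.denominator == 1 else x for x in vec])
```

Each basis vector is a combination of reaction fluxes that leaves every reactant's amount unchanged, which is why the number of reactions has to exceed the number of reactants for the null space to be non-trivial.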
Computation of key characteristics and plots for blinded sample size recalculation. Continuous as well as binary endpoints are supported in superiority and non-inferiority trials. See Baumann, Pilz, Kieser (2022) <doi:10.32614/RJ-2022-001> for a detailed description. The implemented methods include the approaches by Lu, K. (2019) <doi:10.1002/pst.1737>, Kieser, M. and Friede, T. (2000) <doi:10.1002/(SICI)1097-0258(20000415)19:7%3C901::AID-SIM405%3E3.0.CO;2-L>, Friede, T. and Kieser, M. (2004) <doi:10.1002/pst.140>, Friede, T., Mitchell, C., Mueller-Veltern, G. (2007) <doi:10.1002/bimj.200610373>, and Friede, T. and Kieser, M. (2011) <doi:10.3414/ME09-01-0063>.
Most existing approaches for network reconstruction can only infer an overall network and, also, fail to capture a complete set of network properties. To address these issues, a new model has been developed, which converts static data into their dynamic form. idopNetwork is an R interface to this model and can infer informative, dynamic, omnidirectional and personalized networks. For more information on the functional clustering part, see Kim et al. (2008) <doi:10.1534/genetics.108.093690> and Wang et al. (2011) <doi:10.1093/bib/bbr032>. For more information on our model, see Chen et al. (2019) <doi:10.1038/s41540-019-0116-1>, and Cao et al. (2022) <doi:10.1080/19490976.2022.2106103>.
This package performs Levins loop analysis of qualitatively-specified complex causal systems. Loop analysis makes qualitative predictions of variable change in a system of causally interdependent variables, where "qualitative" means direct causal relationships and indirect causal effects are coded as sign only (i.e. increases, decreases, no change, and ambiguous). This implementation includes output support for graphs in .dot file format for use with visualization software such as graphviz (<https://graphviz.org>). LoopAnalyst provides tools for the construction and output of community matrices, computation and output of community effect matrices, tables of correlations, adjoint, absolute feedback, weighted feedback and weighted prediction matrices, change in life expectancy matrices, and feedback, path and loop enumeration tools.
This package provides a method for factor retention using a pre-trained Long Short Term Memory (LSTM) network, which was originally developed by Hochreiter and Schmidhuber (1997) <doi:10.1162/neco.1997.9.8.1735>. The sample size of the dataset used to train the LSTM model is 1,000,000. Each sample is a batch of simulated response data with a specific latent factor structure. The eigenvalues of these response data are used as sequential data to train the LSTM. The pre-trained LSTM is capable of factor retention, that is, determining the number of factors, for real response data with a true latent factor number ranging from 1 to 10.
This package provides details such as Morphine Equivalent Dose (MED), brand name and opioid content, calculated for all oral opioids authorized for sale by Health Canada and the FDA based on their Drug Identification Number (DIN) or National Drug Code (NDC). MEDs are calculated based on recommendations by the Canadian Institute for Health Information (CIHI) and Von Korff et al (2008), and on information obtained from Health Canada's Drug Product Database's monthly data dump or the FDA Daily database for the Canadian and US databases respectively. Please note that in no way should output from this package be a substitute for medical advice. All medications should only be consumed on prescription from a licensed healthcare provider.
This package provides a lightweight tool offering a reproducible workflow for selecting and executing appropriate statistical analyses in one-way or two-way experimental designs. The package automatically checks for data normality, conducts parametric (ANOVA) or non-parametric (Kruskal-Wallis) tests, performs post-hoc comparisons with Compact Letter Displays (CLD), and generates publication-ready boxplots, faceted plots, and heatmaps. It is designed for researchers seeking fast, automated statistical summaries and visualization. Based on established statistical methods including Shapiro and Wilk (1965) <doi:10.2307/2333709>, Kruskal and Wallis (1952) <doi:10.1080/01621459.1952.10483441>, Tukey (1949) <doi:10.2307/3001913>, Fisher (1925) <ISBN:0050021702>, and Wickham (2016) <ISBN:978-3-319-24277-4>.
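For the non-parametric branch of such a workflow, the Kruskal-Wallis H statistic can be computed as in the sketch below. It is a plain textbook formulation with a tie correction and hypothetical function names; the normality check, the choice between tests, the p-value and the post-hoc steps are left out, and nothing here reflects the package's own code.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected H statistic; assumes the pooled values are not all identical."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied groups.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction


if __name__ == "__main__":
    print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Under the null hypothesis H is approximately chi-squared distributed with one fewer degrees of freedom than the number of groups, which is how a p-value would usually be obtained.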