Multiple moderation analysis for two-instance repeated measures designs, with up to three simultaneous moderators (dichotomous and/or continuous) with an additive or multiplicative relationship. Includes analyses of simple slopes and conditional effects at (automatically determined or manually set) values of the moderator(s), as well as an implementation of the Johnson-Neyman procedure for determining regions of significance in single moderator models. Based on Montoya, A. K. (2018) "Moderation analysis in two-instance repeated measures designs: Probing methods and multiple moderator models" <doi:10.3758/s13428-018-1088-6>.
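
A minimal sketch of what a conditional effect means in a single-moderator model, assuming the common formulation in which the within-person difference between the two instances is regressed on the moderator W, so that the effect of instance at W = w is b0 + b1 * w. This is not the package's interface: the function names and the data are hypothetical, and no standard errors or Johnson-Neyman bounds are computed.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def conditional_effect(b0, b1, w):
    """Estimated difference between the two instances at moderator value w."""
    return b0 + b1 * w


# Hypothetical data: outcome at instance 1 and 2, plus a continuous moderator.
y1 = [3.0, 4.0, 5.0, 6.0]
y2 = [4.0, 6.0, 8.0, 10.0]
w = [0.0, 1.0, 2.0, 3.0]

# Moderation shows up as a non-zero slope of the difference score on W.
diff = [b - a for a, b in zip(y1, y2)]
b0, b1 = fit_line(w, diff)
effect_at_2 = conditional_effect(b0, b1, 2.0)
```

Probing then amounts to evaluating `conditional_effect` at chosen values of the moderator and testing each estimate against zero.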
An R interface to pikchr (<https://pikchr.org>, pronounced "picture"), a PIC-like markup language for creating diagrams within technical documentation. Originally developed by Brian Kernighan, PIC has been adapted into pikchr by D. Richard Hipp, the creator of SQLite. pikchr is designed to be embedded in fenced code blocks of Markdown or other documentation markup languages, making it ideal for generating diagrams in text-based formats. This package allows R users to seamlessly integrate the descriptive syntax of pikchr for diagram creation directly within the R environment.
A central decision in a parametric regression is how to specify the relation between a dependent variable and each explanatory variable. This package provides a semi-parametric tool for comparing different transformations of an explanatory variable in a parametric regression. The functions are relevant in situations where you would use a Box-Cox or Box-Tidwell transformation. In contrast to the classic power transformations, the methods in this package allow for theory-driven user input and the possibility to compare with a non-parametric transformation.
Computes the effective range of a smoothing matrix, which is a measure of the distance to which smoothing occurs. This is motivated by the application of spatial splines for adjusting for unmeasured spatial confounding in regression models, but the calculation of effective range can be applied to smoothing matrices in other contexts. For algorithmic details, see Rainey and Keller (2024) "spconfShiny: an R Shiny application..." <doi:10.1371/journal.pone.0311440> and Keller and Szpiro (2020) "Selecting a Scale for Spatial Confounding Adjustment" <doi:10.1111/rssa.12556>.
Smoothing signals and computing their derivatives is a common requirement in signal processing workflows. Savitzky-Golay filters are an established method able to do both (Savitzky and Golay, 1964 <doi:10.1021/ac60214a047>). This package implements one-dimensional Savitzky-Golay filters that can be applied to vectors and matrices (either row-wise or column-wise). Vectorization and memory allocations have been profiled to reduce the computational footprint. Short filter lengths are implemented in the direct space, while longer filters are implemented in frequency space, using a Fast Fourier Transform (FFT).
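
To illustrate how one filter can both smooth and differentiate, the sketch below derives Savitzky-Golay weights from a local least-squares polynomial fit and applies them in direct space. It is an illustration in plain Python, not the package's code: `solve`, `savgol_weights` and `apply_filter` are hypothetical names, unit sample spacing is assumed, edge points are simply dropped, and the FFT path for long filters is not shown.

```python
from math import factorial


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def savgol_weights(window, order, deriv=0):
    """Weights that return the deriv-th derivative of the fitted polynomial at the window centre."""
    half = window // 2
    offsets = range(-half, half + 1)
    # Normal-equation matrix A^T A, where A[k][j] = offset_k ** j.
    ata = [[sum(k ** (i + j) for k in offsets) for j in range(order + 1)]
           for i in range(order + 1)]
    unit = [1.0 if i == deriv else 0.0 for i in range(order + 1)]
    c = solve(ata, unit)
    # Row `deriv` of (A^T A)^-1 A^T, scaled by deriv! to turn a coefficient into a derivative.
    return [factorial(deriv) * sum(c[j] * k ** j for j in range(order + 1))
            for k in offsets]


def apply_filter(signal, weights):
    """Direct-space filtering; only points with a full window are returned."""
    half = len(weights) // 2
    return [sum(w * signal[i + j - half] for j, w in enumerate(weights))
            for i in range(half, len(signal) - half)]


signal = [float(x * x) for x in range(7)]
smoothed = apply_filter(signal, savgol_weights(5, 2))
slope = apply_filter(signal, savgol_weights(5, 2, deriv=1))
```

Because the example signal is itself quadratic, a second-order fit reproduces it exactly, which makes the sketch easy to check by hand.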
Visualize Variance is an intuitive Shiny application tailored for agricultural research data analysis, including one-way and two-way analysis of variance, correlation, and other essential statistical tools. Users can easily upload their datasets, perform analyses, and download the results as a well-formatted document, streamlining the process of data analysis and reporting in agricultural research. The experimental design methods are based on classical work by Fisher (1925) and Scheffe (1959). The correlation visualization approaches follow methods developed by Wei & Simko (2021) and Friendly (2002) <doi:10.1198/000313002533>.
Efficient Bayesian generalized linear models with time-varying coefficients as in Helske (2022, <doi:10.1016/j.softx.2022.101016>). Gaussian, Poisson, and binomial observations are supported. The Markov chain Monte Carlo (MCMC) computations are done using Hamiltonian Monte Carlo provided by Stan, using a state space representation of the model in order to marginalise over the coefficients for efficient sampling. For non-Gaussian models, the package uses the importance sampling type estimators based on approximate marginal MCMC as in Vihola, Helske, Franks (2020, <doi:10.1111/sjos.12492>).
LegATo is a suite of open-source software tools for longitudinal microbiome analysis. It is extendable to several different study forms with optimal ease-of-use for researchers. Microbiome time-series data presents distinct challenges, including complex covariate dependencies and a variety of longitudinal study designs. This toolkit will allow researchers to determine which microbial taxa are affected over time by perturbations such as onset of disease or lifestyle choices, and to predict the effects of these perturbations over time, including changes in composition or stability of commensal bacteria.
Facilitates the use of data mining algorithms in classification and regression (including time series forecasting) tasks by presenting a short and coherent set of functions. Versions: 1.4.8 improved help, several warning and error code fixes (more stable version, all examples run correctly); 1.4.7 improved Importance function and examples, minor error fixes; 1.4.6 / 1.4.5 / 1.4.4 new automated machine learning (AutoML) and ensembles, via improved fit(), mining() and mparheuristic() functions, and new categorical preprocessing, via improved delevels() function; 1.4.3 new metrics (e.g., macro precision, explained variance), new "lssvm" model and improved mparheuristic() function; 1.4.2 new "NMAE" metric, "xgboost" and "cv.glmnet" models (16 classification and 18 regression models); 1.4.1 new tutorial and more robust version; 1.4 - new classification and regression models, with a total of 14 classification and 15 regression methods, including: Decision Trees, Neural Networks, Support Vector Machines, Random Forests, Bagging and Boosting; 1.3 and 1.3.1 - new classification and regression metrics; 1.2 - new input importance methods via improved Importance() function; 1.0 - first version.
Performance analysis workflow that combines the power of the R language (and the tidyverse realm) and many auxiliary tools to provide a consistent, flexible, extensible, fast, and versatile framework for the performance analysis of task-based applications that run on top of the StarPU runtime (with its MPI (Message Passing Interface) layer for multi-node support). Its goal is to provide a fruitful prototypical environment to conduct performance analysis hypothesis-checking for task-based applications that run on heterogeneous (multi-GPU, multi-core) multi-node HPC (High-performance computing) platforms.
Experience studies are used by actuaries to explore historical experience across blocks of business and to inform assumption setting activities. This package provides functions for preparing data, creating studies, visualizing results, and beginning assumption development. Experience study methods, including exposure calculations, are described in: Atkinson & McGarry (2016) "Experience Study Calculations" <https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf>. The limited fluctuation credibility method used by the exp_stats() function is described in: Herzog (1999, ISBN:1-56698-374-6) "Introduction to Credibility Theory".
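
As a rough illustration of limited fluctuation credibility, the sketch below uses the textbook claim-count version: the full-credibility standard is (z / r)^2 expected claims, where z is the normal quantile for the chosen confidence level and r the tolerated relative error, and partial credibility follows the square-root rule. The function names and default values are assumptions made for this example; they do not describe the arguments or defaults of exp_stats().

```python
from math import sqrt
from statistics import NormalDist


def full_credibility_claims(conf_level=0.95, tolerance=0.05):
    """Expected claim count needed for full credibility (hypothetical defaults)."""
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    return (z / tolerance) ** 2


def partial_credibility(n_claims, conf_level=0.95, tolerance=0.05):
    """Square-root rule: credibility grows with sqrt(claims) and is capped at 1."""
    standard = full_credibility_claims(conf_level, tolerance)
    return min(1.0, sqrt(n_claims / standard))


# With 90% confidence and a 5% tolerance the standard is roughly 1,082 claims.
standard_90 = full_credibility_claims(conf_level=0.90, tolerance=0.05)
z_quarter = partial_credibility(standard_90 / 4, conf_level=0.90, tolerance=0.05)
```

A block of business with a quarter of the full-credibility claim count therefore receives a credibility weight of one half.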
Count transformation models featuring parameters interpretable as discrete hazard ratios, odds ratios, reverse-time discrete hazard ratios, or transformed expectations. An appropriate data transformation for a count outcome and regression coefficients are simultaneously estimated by maximising the exact discrete log-likelihood using the computational framework provided in package 'mlt'; technical details are given in Siegfried & Hothorn (2020) <DOI:10.1111/2041-210X.13383>. The package also contains an experimental implementation of multivariate count transformation models with an application to multi-species distribution models <DOI:10.48550/arXiv.2201.13095>.
Collection of ancillary functions and utilities for Partial Linear Single Index Models for environmental mixture analyses, which currently provides functions for scalar outcomes. The outputs of these functions include the single index function, single index coefficients, partial linear coefficients, mixture overall effect, exposure main and interaction effects, and differences of quartile effects. In the future, we will add functions for binary, ordinal, Poisson, survival, and longitudinal outcomes, as well as models for time-dependent exposures. See Wang et al. (2020) <doi:10.1186/s12940-020-00644-4> for an overview.
This package provides an implementation of two-dimensional functional principal component analysis (FPCA), Marginal FPCA, and Product FPCA for repeated functional data. Marginal and Product FPCA implementations are done for both dense and sparsely observed functional data. References: Chen, K., Delicado, P., & Müller, H. G. (2017) <doi:10.1111/rssb.12160>. Chen, K., & Müller, H. G. (2012) <doi:10.1080/01621459.2012.734196>. Hall, P., Müller, H.G. and Wang, J.L. (2006) <doi:10.1214/009053606000000272>. Yao, F., Müller, H. G., & Wang, J. L. (2005) <doi:10.1198/016214504000001745>.
This package implements the generalized order-restricted information criterion approximation (GORICA), an AIC-like information criterion that can be utilized to evaluate informative hypotheses specifying directional relationships between model parameters in terms of (in)equality constraints (see Altinisik, Van Lissa, Hoijtink, Oldehinkel, & Kuiper, 2021, <doi:10.31234/osf.io/t3c8g>). The GORICA is applicable not only to normal linear models, but also to generalized linear models (GLMs), generalized linear mixed models (GLMMs), structural equation models (SEMs), and contingency tables. For contingency tables, restrictions on cell probabilities can be non-linear.
Fast, model-agnostic implementation of different H-statistics introduced by Jerome H. Friedman and Bogdan E. Popescu (2008) <doi:10.1214/07-AOAS148>. These statistics quantify interaction strength per feature, feature pair, and feature triple. The package supports multi-output predictions and can account for case weights. In addition, several variants of the original statistics are provided. The shape of the interactions can be explored through partial dependence plots or individual conditional expectation plots. DALEX explainers, meta learners ('mlr3', 'tidymodels', 'caret') and most other models work out-of-the-box.
Rank-based tests for enrichment of KOG (euKaryotic Orthologous Groups) classes with up- or down-regulated genes based on a continuous measure. The meta-analysis is based on correlation of KOG delta-ranks across datasets (delta-rank is the difference between mean rank of genes belonging to a KOG class and mean rank of all other genes). With a binary measure (1 or 0 to indicate significant and non-significant genes), a one-tailed Fisher's exact test for over-representation of each KOG class among significant genes will be performed.
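
The delta-rank defined above can be written out in a few lines. The sketch below is only an illustration of that definition: the function names and the toy data are hypothetical, ties receive average ranks (an assumption), and no significance test is attached.

```python
def average_ranks(values):
    """Ranks from 1 to n in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def delta_rank(measure, in_class):
    """Mean rank of genes in the class minus mean rank of all other genes."""
    ranks = average_ranks(measure)
    inside = [r for r, flag in zip(ranks, in_class) if flag]
    outside = [r for r, flag in zip(ranks, in_class) if not flag]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


# Hypothetical continuous measure for six genes, two of which belong to the class.
measure = [0.1, 2.5, -1.0, 3.0, 0.4, -0.7]
in_class = [False, True, False, True, False, False]
dr = delta_rank(measure, in_class)
```

A positive delta-rank indicates that genes in the class tend to sit higher in the ranking than the rest, a negative one that they sit lower.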
This package contains sixteen moisture sorption isotherm models, which evaluate the fitness of adsorption and desorption curves for further understanding of the relationship between moisture content and water activity. Fitness evaluation is conducted through parameter estimation and error analysis. Graphical representation, hysteresis area estimation, and isotherm classification are also included for the visualization of models and hysteresis. Classification uses the equation of Blahovec & Yanniotis (2009) <doi:10.1016/j.jfoodeng.2008.08.007>, which is based on the classification system introduced by Brunauer et al. (1940) <doi:10.1021/ja01864a025>.
This package provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.
This package provides a mixture model for clustering individuals (or sampling groups) into stocks based on their genetic profile. Here, sampling groups are individuals that are sure to come from the same stock (e.g. breeding adults or larvae). The mixture (log-)likelihood is maximised using the EM-algorithm after finding good starting values via a K-means clustering of the genetic data. Details can be found in: Foster, S. D.; Feutry, P.; Grewe, P. M.; Berry, O.; Hui, F. K. C. & Davies (2020) <doi:10.1111/1755-0998.12920>.
The LSTM (Long Short-Term Memory) model is a Recurrent Neural Network (RNN) based architecture that is widely used for time series forecasting. Min-Max transformation is used for data preparation. The model uses one LSTM layer as a simple LSTM model and a Dense layer as the output layer, and is then compiled with the loss function, optimizer and metrics. This package is based on Keras and TensorFlow modules and the algorithm of Paul and Garai (2021) <doi:10.1007/s00500-021-06087-4>.
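
The Min-Max step can be sketched on its own, without Keras or TensorFlow. The example below rescales a series to the interval [0, 1] and maps values back to the original scale, as one would do with forecasts. The function names are hypothetical and the series is assumed not to be constant.

```python
def min_max_scale(series):
    """Rescale a non-constant series to [0, 1]; also return the bounds for inversion."""
    lo, hi = min(series), max(series)
    scaled = [(x - lo) / (hi - lo) for x in series]
    return scaled, lo, hi


def min_max_invert(scaled, lo, hi):
    """Map scaled values (for example model forecasts) back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical series; in practice the bounds would come from the training data only.
series = [10.0, 20.0, 30.0]
scaled, lo, hi = min_max_scale(series)
restored = min_max_invert(scaled, lo, hi)
```

Keeping `lo` and `hi` from the training window is what allows forecasts produced on the scaled series to be reported in the original units.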
New tools for the imputation of missing values in high-dimensional data are introduced using the non-parametric nearest neighbor methods. It includes weighted nearest neighbor imputation methods that use specific distances for selected variables. It includes an automatic procedure of cross validation and does not require prespecified values of the tuning parameters. It can be used to impute missing values in high-dimensional data when the sample size is smaller than the number of predictors. For more information see Faisal and Tutz (2017) <doi:10.1515/sagmb-2015-0098>.
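
The general idea of weighted nearest neighbor imputation can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of Faisal and Tutz (2017): distances are root-mean-square differences over the columns observed in both rows, weights are inverse distances, and k is fixed by hand rather than chosen by cross validation. The function name `impute_knn` is hypothetical.

```python
from math import sqrt


def impute_knn(rows, k=2):
    """Fill None entries with a distance-weighted mean of the k nearest donor rows."""
    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for other_index, other in enumerate(rows):
                if other_index == i or other[j] is None:
                    continue
                # Compare only on columns that both rows have observed.
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = sqrt(sum((row[c] - other[c]) ** 2 for c in shared) / len(shared))
                donors.append((dist, other[j]))
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if not nearest:
                continue  # no usable donor: leave the gap in place
            # Inverse-distance weights (an assumption); the small constant avoids division by zero.
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            completed[i][j] = (sum(w * v for w, (_, v) in zip(weights, nearest))
                               / sum(weights))
    return completed


# Hypothetical data: the second row is missing its third value.
rows = [[1.0, 2.0, 10.0], [1.0, 2.0, None], [9.0, 9.0, 50.0]]
filled = impute_knn(rows, k=1)
```

Distances are computed from the original rows, so values imputed earlier in the loop never act as donors for later ones.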
Genetically modified organisms (GMOs) and cell lines are widely used models in all kinds of biological research. As part of characterising these models, DNA sequencing technology and bioinformatics analyses are used systematically to study their genomes. As a result, large volumes of data are generated and various algorithms are applied to analyse this data, which creates the challenge of representing all findings in an informative and concise manner. `gmoviz` provides users with an easy way to visualise and facilitate the explanation of complex genomic editing events on a larger, biologically-relevant scale.
VarCon is an R package which converts the positional information from the annotation of a single nucleotide variation (SNV) (either referring to the coding sequence or the reference genomic sequence). It retrieves the genomic reference sequence around the position of the single nucleotide variation. To assess whether the SNV could potentially influence binding of splicing regulatory proteins, VarCon calculates the HEXplorer score as an estimate. In addition, VarCon reports splice site strengths of splice sites within the retrieved genomic sequence and any changes due to the SNV.