bnem combines the use of indirect measurements of Nested Effects Models (package mnem) with the Boolean networks of CellNOptR. Perturbation experiments of signalling nodes in cells are analysed for their effect on the global gene expression profile. Those profiles give evidence for the Boolean regulation of down-stream nodes in the network, e.g., whether two parents activate their child independently (OR-gate) or jointly (AND-gate).
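The OR-gate versus AND-gate distinction can be illustrated with a minimal Python sketch. It is not bnem's API; the function names are hypothetical and the two-parent network is an assumption made for illustration. It lists the parent states under which the two gates predict different child activity, which are the states a perturbation experiment would need to reach to tell them apart.

```python
from itertools import product

def or_gate(a, b):
    # Child is active if either parent is active (independent activation).
    return a or b

def and_gate(a, b):
    # Child is active only if both parents are active (joint activation).
    return a and b

def truth_table(gate):
    # Map every combination of parent states to the child's state.
    return {(a, b): gate(a, b) for a, b in product([False, True], repeat=2)}

or_table = truth_table(or_gate)
and_table = truth_table(and_gate)

# Parent states where the two hypotheses disagree: exactly one parent active.
distinguishing = [state for state in or_table if or_table[state] != and_table[state]]
```

Under these assumptions, only states with a single active parent separate the two hypotheses, so knocking down one parent at a time is informative.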
Plyr is a set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics.
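The split-apply-combine pattern described above can be sketched in a few lines of Python. This is an illustration of the idea only, not plyr's interface; the helper name `split_apply_combine` and the sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def split_apply_combine(records, key, func):
    # Split: group the records by the value of `key`.
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    # Apply `func` to each piece, then combine the results into one dict.
    return {group: func(rows) for group, rows in groups.items()}

records = [
    {"site": "A", "value": 1.0},
    {"site": "B", "value": 4.0},
    {"site": "A", "value": 3.0},
]
summary = split_apply_combine(
    records, "site", lambda rows: mean(row["value"] for row in rows)
)
```

Here `summary` holds one mean per site; swapping the lambda for a model-fitting function would give one fitted model per group.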
We utilize approximate Bayesian machinery to fit two-level conjugate hierarchical models on overdispersed Gaussian, Poisson, and Binomial data and evaluate whether the resulting approximate Bayesian interval estimates for random effects meet the nominal confidence levels via frequency coverage evaluation. The data that Rgbp assumes comprise an observed sufficient statistic for each random effect, such as an average or a proportion of each group, without population-level data. The approximate Bayesian tool equipped with the adjustment for density maximization produces approximate point and interval estimates for model parameters including the second-level variance component, regression coefficients, and random effects. For the Binomial data, the package provides an option to produce posterior samples of all the model parameters via the acceptance-rejection method. The package provides a quick way to evaluate coverage rates of the resultant Bayesian interval estimates for random effects via parametric bootstrapping, which we call frequency method checking.
An implementation of best subset selection for generalized linear models and Cox proportional hazards models via the primal dual active set algorithm proposed by Wen, C., Zhang, A., Quan, S. and Wang, X. (2020) <doi:10.18637/jss.v094.i04>. The algorithm formulates coefficient parameters and residuals as primal and dual variables and utilizes efficient active set selection strategies based on the complementarity of the primal and dual variables.
This package provides routines for the generation of response patterns under a unidimensional dichotomous and polytomous computerized adaptive testing (CAT) framework. It holds many standard functions to estimate ability, select the first item(s) to administer and optimally select the next item, as well as several stopping rules. Options to control for item exposure and content balancing are also available (Magis and Barrada (2017) <doi:10.18637/jss.v076.c01>).
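One common way to select the next item is to maximise Fisher information at the current ability estimate. The Python sketch below assumes a two-parameter logistic (2PL) model and a small hypothetical item bank. It is not the package's interface, and the package's own selection rules may differ.

```python
import math

def prob_correct(theta, a, b):
    # 2PL response probability with discrimination a and difficulty b.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    # Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P).
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, items, administered):
    # Pick the not-yet-administered item with maximum information at theta.
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))

# Hypothetical item bank: (discrimination, difficulty) pairs.
items = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
first_item = select_next_item(0.0, items, administered=set())
second_item = select_next_item(0.5, items, administered={first_item})
```

With equal discriminations, the rule picks the item whose difficulty lies closest to the current ability estimate.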
Algorithms for classical symmetric and deflation-based FastICA, reloaded deflation-based FastICA algorithm and an algorithm for adaptive deflation-based FastICA using multiple nonlinearities. For details, see Miettinen et al. (2014) <doi:10.1109/TSP.2014.2356442> and Miettinen et al. (2017) <doi:10.1016/j.sigpro.2016.08.028>. The package is described in Miettinen, Nordhausen and Taskinen (2018) <doi:10.32614/RJ-2018-046>.
An implementation of the maximum simulated likelihood method for the estimation of multinomial logit models with random coefficients as presented by Sarrias and Daziano (2017) <doi:10.18637/jss.v079.i02>. Specifically, it allows estimating models with continuous heterogeneity such as the mixed multinomial logit and the generalized multinomial logit. It also allows estimating models with discrete heterogeneity such as the latent class and the mixed-mixed multinomial logit model.
Fits multiple-group latent class analysis (LCA) for exploring differences between populations in data with a multilevel structure. There are two approaches to reflect group differences in glca: fixed-effect LCA (Bandeen-Roche et al. (1997) <doi:10.1080/01621459.1997.10473658>; Clogg and Goodman (1985) <doi:10.2307/270847>) and nonparametric random-effect LCA (Vermunt (2003) <doi:10.1111/j.0081-1750.2003.t01-1-00131.x>).
Computes the probability density, survival and hazard rate functions of the GTDL distribution given by Mackenzie, G. (1996) <doi:10.2307/2348408>, and generates random samples from it. The likelihood estimates, the randomized quantile (Louzada, F., et al. (2020) <doi:10.1109/ACCESS.2020.3040525>) residuals and the normally transformed randomized survival probability (Li, L., et al. (2021) <doi:10.1002/sim.8852>) residuals are obtained for the GTDL model.
Simulation of, and fitting models for, Generalised Network Autoregressive (GNAR) time series models which take account of network structure, potentially with exogenous variables. Such models are described in Knight et al. (2020) <doi:10.18637/jss.v096.i05> and Nason and Wei (2021) <doi:10.1111/rssa.12875>. Diagnostic tools for GNAR(X) models can be found in Nason et al. (2023) <doi:10.48550/arXiv.2312.00530>.
Implementation of a class of hierarchical item response theory (IRT) models where both the mean and the variance of latent preferences (ability parameters) may depend on observed covariates. The current implementation includes both the two-parameter latent trait model for binary data and the graded response model for ordinal data. Both are fitted via the Expectation-Maximization (EM) algorithm. Asymptotic standard errors are derived from the observed information matrix.
This package provides a novel machine learning method for plant virus diagnostics using genome sequencing data. This package includes three different machine learning models, random forest, XGBoost, and elastic net, to train and predict mapped genome samples. Mappability profile and unreliable regions are introduced to the algorithm, and users can build a mappability profile from scratch with functions included in the package. Plotting of mapped sample coverage information is also provided.
Computes the Lomb-Scargle Periodogram and actogram for evenly or unevenly sampled time series. Includes a randomization procedure to obtain exact p-values. Partially based on the C original by Press et al. (Numerical Recipes) and the Python module Astropy. For more information see Ruf, T. (1999). The Lomb-Scargle periodogram in biological rhythm research: analysis of incomplete and unequally spaced time-series. Biological Rhythm Research, 30(2), 178-201.
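As an illustration of the underlying computation, the following Python sketch evaluates the classical normalized Lomb-Scargle periodogram on an unevenly sampled sine wave. It is a plain textbook formulation, not the package's code. The function name, the synthetic data and the frequency grid are assumptions, and the randomization procedure for p-values is not covered.

```python
import math
import random

def lomb_scargle(times, values, frequencies):
    # Normalized Lomb-Scargle power at each frequency (cycles per time unit).
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    powers = []
    for freq in frequencies:
        omega = 2.0 * math.pi * freq
        # The time offset tau makes the sine and cosine terms orthogonal.
        sin2 = sum(math.sin(2.0 * omega * t) for t in times)
        cos2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(sin2, cos2) / (2.0 * omega)
        yc = ys = cc = ss = 0.0
        for t, v in zip(times, values):
            c = math.cos(omega * (t - tau))
            s = math.sin(omega * (t - tau))
            yc += (v - mean) * c
            ys += (v - mean) * s
            cc += c * c
            ss += s * s
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return powers

# Synthetic example (assumed data): a sine of frequency 0.1 at uneven times.
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 50.0) for _ in range(60))
values = [math.sin(2.0 * math.pi * 0.1 * t) for t in times]
frequencies = [k / 100 for k in range(1, 21)]
powers = lomb_scargle(times, values, frequencies)
peak_frequency = frequencies[powers.index(max(powers))]
```

The periodogram needs no resampling onto a regular grid, which is why it suits incomplete or unequally spaced series.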
Estimates the multivariate skew-t and nested models, as described in the articles Liseo, B., Parisi, A. (2013). Bayesian inference for the multivariate skew-normal model: a population Monte Carlo approach. Comput. Statist. Data Anal. <doi:10.1016/j.csda.2013.02.007> and in Parisi, A., Liseo, B. (2017). Objective Bayesian analysis for the multivariate skew-t model. Statistical Methods & Applications <doi:10.1007/s10260-017-0404-0>.
Basic Setup for Projects in R for Monterey County Office of Education. It contains functions often used in the analysis of education data in the county office including seeing if an item is not in a list, rounding in the manner the general public expects, including logos for districts, switching between district names and their county-district-school codes, accessing the local SQL table and making thematically consistent graphs.
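"Rounding in the manner the general public expects" is read here as rounding halves away from zero rather than to the nearest even digit. That reading is an assumption, and the Python sketch below is not the package's function. The helper name `round_half_up` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(x, digits=0):
    # Convert via str() so 2.675 is treated as the decimal the user typed,
    # then round ties away from zero at the requested number of digits.
    quantum = Decimal(10) ** -digits
    return float(Decimal(str(x)).quantize(quantum, rounding=ROUND_HALF_UP))

everyday = round_half_up(2.5)   # ties go up
bankers = round(2.5)            # Python's built-in rounds ties to even
```

R's built-in `round()` resolves ties towards the even digit, as Python's does, which is presumably why such a helper is useful.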
Motivated by changing administrative boundaries over time, the nuts package can convert European regional data with NUTS codes between versions (2006, 2010, 2013, 2016 and 2021) and levels (NUTS 1, NUTS 2 and NUTS 3). The package uses spatial interpolation as in Lam (1983) <doi:10.1559/152304083783914958> based on granular (100m x 100m) area, population and land use data provided by the European Commission's Joint Research Center.
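The idea of converting values between region versions with interpolation weights can be sketched as follows in Python. This is not the nuts package's interface. The region codes and shares are made-up illustration values, and the shares stand in for whichever weight (area, population or land use) is chosen.

```python
def convert_regions(values, crosswalk):
    # crosswalk maps (source_code, target_code) to the share of the source
    # region's weight that falls inside the target region.
    converted = {}
    for (source, target), share in crosswalk.items():
        converted[target] = converted.get(target, 0.0) + values[source] * share
    return converted

# Hypothetical example: OLD1 is split between NEW1 and NEW2, OLD2 moves whole.
values = {"OLD1": 100.0, "OLD2": 50.0}
crosswalk = {
    ("OLD1", "NEW1"): 0.75,
    ("OLD1", "NEW2"): 0.25,
    ("OLD2", "NEW2"): 1.0,
}
converted = convert_regions(values, crosswalk)
```

Because the shares of each source region sum to one, the total across regions is preserved. That suits absolute quantities such as counts; ratios would need a weighted average instead.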
Estimation of panel models for glm-like models: this includes binomial models (logit and probit), count models (poisson and negbin) and ordered models (logit and probit), as described in: Baltagi (2013) Econometric Analysis of Panel Data, ISBN-13:978-1-118-67232-7, Hsiao (2014) Analysis of Panel Data <doi:10.1017/CBO9781139839327> and Croissant and Millo (2018), Panel Data Econometrics with R, ISBN:978-1-118-94918-4.
Simulate and run the Gaussian puff forward atmospheric model in sensor (specific sensor coordinates) or grid (across the grid of a full oil and gas operations site) modes, following Jia, M., Fish, R., Daniels, W., Sprinkle, B. and Hammerling, D. (2024) <doi:10.26434/chemrxiv-2023-hc95q-v3>. Offers numerous visualization options, including static and animated, 2D and 3D, as well as a site map generator based on sensor and source coordinates.
An efficient implementation of the Scalable Bayesian Rule Lists algorithm, a competitor algorithm for decision tree algorithms; see Hongyu Yang, Cynthia Rudin, Margo Seltzer (2017) <https://proceedings.mlr.press/v70/yang17h.html>. It builds from pre-mined association rules and has a logical structure identical to a decision list or one-sided decision tree. Fully optimized over rule lists, this algorithm strikes a practical balance between accuracy, interpretability, and computational speed.
An R implementation of the Self-Organising Migrating Algorithm, a general-purpose, stochastic optimisation algorithm. The approach is similar to that of genetic algorithms, although it is based on the idea of a series of "migrations" by a fixed set of individuals, rather than the development of successive generations. It can be applied to any cost-minimisation problem with a bounded parameter space, and is robust to local minima.
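The migration idea can be sketched in Python as a simplified "all-to-one" variant, in which every individual jumps in steps towards (and past) the current leader and keeps the best position it visits. This is an illustrative sketch under stated assumptions, not the package's code. The parameter names and defaults are hypothetical, and out-of-bounds positions are simply clamped, which the package may handle differently.

```python
import random

def soma(cost, bounds, pop_size=10, migrations=20, path_length=3.0,
         step=0.11, prt=0.5, rng=None):
    rng = rng or random.Random()
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    costs = [cost(ind) for ind in population]
    n_steps = int(path_length / step)
    for _ in range(migrations):
        # The leader is the individual with the lowest cost.
        leader_idx = min(range(pop_size), key=costs.__getitem__)
        leader = list(population[leader_idx])
        for i in range(pop_size):
            if i == leader_idx:
                continue
            start = list(population[i])
            best_pos, best_cost = start, costs[i]
            for k in range(1, n_steps + 1):
                t = k * step
                # Perturbation mask: only some dimensions move on this step.
                mask = [1.0 if rng.random() < prt else 0.0 for _ in range(dim)]
                candidate = [min(max(s + (l - s) * t * m, lo), hi)
                             for s, l, m, (lo, hi)
                             in zip(start, leader, mask, bounds)]
                c = cost(candidate)
                if c < best_cost:
                    best_pos, best_cost = candidate, c
            # The individual settles at the best point found on its path.
            population[i], costs[i] = best_pos, best_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return population[best], costs[best]

def sphere(x):
    # Simple test cost function with its minimum at the origin.
    return sum(v * v for v in x)

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_position, best_cost = soma(sphere, bounds, rng=random.Random(0))
```

The same set of individuals persists across migrations; no offspring are created, which is the contrast with genetic algorithms drawn above.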
This package implements an extension of the Generalized Berk-Jones (GBJ) statistic for survival data, sGBJ. It computes the sGBJ statistic and its p-value for testing the association between a gene set and a time-to-event outcome with possible adjustment for additional covariates. The detailed method is available in Villain L, Ferte T, Thiebaut R and Hejblum BP (2021) <doi:10.1101/2021.09.07.459329>.
Implementation of the SRCS method for a color-based visualization of the results of multiple pairwise tests on a large number of problem configurations, proposed in: I.G. del Amo, D.A. Pelta. SRCS: a technique for comparing multiple algorithms under several factors in dynamic optimization problems. In: E. Alba, A. Nakib, P. Siarry (Eds.), Metaheuristics for Dynamic Optimization. Series: Studies in Computational Intelligence 433, Springer, Berlin/Heidelberg, 2012.
This package provides classes for storing and manipulating taxonomic data. Most of the classes can be treated like base R vectors (e.g. can be used in tables as columns and can be named). Vectorized classes can store taxon names and authorities, taxon IDs from databases, taxon ranks, and other types of information. More complex classes are provided to store taxonomic trees and user-defined data associated with them.
Generates stochastic time series and genealogies associated with a population dynamics model. Time series are simulated using the Gillespie exact and approximate algorithms and a new algorithm we introduce that uses both approaches to optimize the execution time of the simulations. Genealogies are simulated from a trajectory using a backwards-in-time based approach. Methods are described in Danesh G et al. (2022) <doi:10.1111/2041-210X.14038>.
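For orientation, the Gillespie exact algorithm can be sketched in Python for a simple birth-death process. The model, rates and function name are assumptions chosen for illustration. The package's own models, its approximate algorithm and its hybrid algorithm are not reproduced here.

```python
import random

def gillespie_birth_death(n0, birth, death, t_max, rng):
    # Exact stochastic simulation of a linear birth-death process.
    t, n = 0.0, n0
    trajectory = [(t, n)]
    while n > 0:
        birth_rate = birth * n
        death_rate = death * n
        total_rate = birth_rate + death_rate
        if total_rate <= 0.0:
            break
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Choose which event fires, in proportion to its rate.
        if rng.random() < birth_rate / total_rate:
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory

trajectory = gillespie_birth_death(
    n0=10, birth=1.0, death=0.8, t_max=5.0, rng=random.Random(42)
)
```

Every event is simulated individually, which is exact but slow for large populations. That cost is the usual motivation for approximate schemes that bundle many events into one time step.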