Simulation of, and fitting models for, Generalised Network Autoregressive (GNAR) time series models which take account of network structure, potentially with exogenous variables. Such models are described in Knight et al. (2020) <doi:10.18637/jss.v096.i05> and Nason and Wei (2021) <doi:10.1111/rssa.12875>. Diagnostic tools for GNAR(X) models can be found in Nason et al. (2023) <doi:10.48550/arXiv.2312.00530>.
Fits multiple-group latent class analysis (LCA) models for exploring differences between populations in data with a multilevel structure. There are two approaches to reflect group differences in glca: fixed-effect LCA (Bandeen-Roche et al. (1997) <doi:10.1080/01621459.1997.10473658>; Clogg and Goodman (1985) <doi:10.2307/270847>) and nonparametric random-effect LCA (Vermunt (2003) <doi:10.1111/j.0081-1750.2003.t01-1-00131.x>).
Computes the probability density, survival, and hazard rate functions of the GTDL distribution given by Mackenzie, G. (1996) <doi:10.2307/2348408>, and generates random samples from it. The likelihood estimates, the randomized quantile residuals (Louzada, F., et al. (2020) <doi:10.1109/ACCESS.2020.3040525>) and the normally transformed randomized survival probability residuals (Li, L., et al. (2021) <doi:10.1002/sim.8852>) are obtained for the GTDL model.
This package implements an efficient algorithm for fitting the entire regularization path of quantile regression models with elastic-net penalties using a generalized coordinate descent scheme. The framework also supports SCAD and MCP penalties. It is designed for high-dimensional datasets and emphasizes numerical accuracy and computational efficiency. This package implements the algorithms proposed in Tang, Q., Zhang, Y., & Wang, B. (2022) <https://openreview.net/pdf?id=RvwMTDYTOb>.
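As a rough illustration of the loss that quantile regression minimizes, the sketch below evaluates the check (pinball) loss in plain Python. The function names `check_loss` and `mean_check_loss` are hypothetical and are not part of the package; the penalty terms and the coordinate descent scheme are not shown.

```python
def check_loss(u, tau):
    """Check (pinball) loss for a residual u at quantile level tau in (0, 1)."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return u * (tau - (1.0 if u < 0 else 0.0))


def mean_check_loss(y, y_hat, tau):
    """Average check loss of predictions y_hat against observations y."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    return sum(check_loss(a - b, tau) for a, b in zip(y, y_hat)) / len(y)
```

With `tau = 0.5` the loss is half the absolute error, so minimizing it targets the conditional median.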
This package provides a novel machine learning method for plant virus diagnostics using genome sequencing data. It includes three machine learning models (random forest, XGBoost, and elastic net) to train on and predict from mapped genome samples. Mappability profiles and unreliable regions are incorporated into the algorithm, and users can build a mappability profile from scratch with functions included in the package. Plotting of mapped sample coverage information is also provided.
Computes the Lomb-Scargle Periodogram and actogram for evenly or unevenly sampled time series. Includes a randomization procedure to obtain exact p-values. Partially based on C original by Press et al. (Numerical Recipes) and the Python module Astropy. For more information see Ruf, T. (1999). The Lomb-Scargle periodogram in biological rhythm research: analysis of incomplete and unequally spaced time-series. Biological Rhythm Research, 30(2), 178-201.
Estimates the multivariate skew-t and nested models, as described in the articles Liseo, B., Parisi, A. (2013). Bayesian inference for the multivariate skew-normal model: a population Monte Carlo approach. Comput. Statist. Data Anal. <doi:10.1016/j.csda.2013.02.007> and in Parisi, A., Liseo, B. (2017). Objective Bayesian analysis for the multivariate skew-t model. Statistical Methods & Applications <doi:10.1007/s10260-017-0404-0>.
Basic Setup for Projects in R for Monterey County Office of Education. It contains functions often used in the analysis of education data in the county office including seeing if an item is not in a list, rounding in the manner the general public expects, including logos for districts, switching between district names and their county-district-school codes, accessing the local SQL table and making thematically consistent graphs.
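One plausible reading of "rounding in the manner the general public expects" is round-half-up, as opposed to the round-half-to-even rule used by default in R's round() and Python's round(). The sketch below illustrates that reading in Python; `round_half_up` is a hypothetical name, not the package's function.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(x, digits=0):
    """Round x to `digits` decimal places, sending ties away from zero."""
    # Decimal(1).scaleb(-2) is Decimal('0.01'), the target precision.
    quantum = Decimal(1).scaleb(-digits)
    # Going through str(x) avoids binary floating-point artifacts such as
    # 1.005 being stored as 1.00499999...
    return float(Decimal(str(x)).quantize(quantum, rounding=ROUND_HALF_UP))
```

For example, `round_half_up(2.5)` gives 3.0, whereas Python's built-in `round(2.5)` gives 2.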
Motivated by changing administrative boundaries over time, the nuts package can convert European regional data with NUTS codes between versions (2006, 2010, 2013, 2016 and 2021) and levels (NUTS 1, NUTS 2 and NUTS 3). The package uses spatial interpolation as in Lam (1983) <doi:10.1559/152304083783914958> based on granular (100m x 100m) area, population and land use data provided by the European Commission's Joint Research Centre.
Estimation of panel models for glm-like models: this includes binomial models (logit and probit), count models (poisson and negbin) and ordered models (logit and probit), as described in: Baltagi (2013) Econometric Analysis of Panel Data, ISBN-13:978-1-118-67232-7, Hsiao (2014) Analysis of Panel Data <doi:10.1017/CBO9781139839327> and Croissant and Millo (2018), Panel Data Econometrics with R, ISBN:978-1-118-94918-4.
Simulate and run the Gaussian puff forward atmospheric model in sensor (specific sensor coordinates) or grid (across the grid of a full oil and gas operations site) modes, following Jia, M., Fish, R., Daniels, W., Sprinkle, B. and Hammerling, D. (2024) <doi:10.26434/chemrxiv-2023-hc95q-v3>. Offers numerous visualization options, including static and animated, 2D and 3D, and a site map generator based on sensor and source coordinates.
An R implementation of the Self-Organising Migrating Algorithm, a general-purpose, stochastic optimisation algorithm. The approach is similar to that of genetic algorithms, although it is based on the idea of a series of "migrations" by a fixed set of individuals, rather than the development of successive generations. It can be applied to any cost-minimisation problem with a bounded parameter space, and is robust to local minima.
An efficient implementation of the Scalable Bayesian Rule Lists algorithm, a competitor to decision tree algorithms; see Hongyu Yang, Cynthia Rudin, Margo Seltzer (2017) <https://proceedings.mlr.press/v70/yang17h.html>. It builds from pre-mined association rules and has a logical structure identical to a decision list or one-sided decision tree. Fully optimized over rule lists, this algorithm strikes a practical balance between accuracy, interpretability, and computational speed.
This package implements an extension of the Generalized Berk-Jones (GBJ) statistic for survival data, sGBJ. It computes the sGBJ statistic and its p-value for testing the association between a gene set and a time-to-event outcome, with possible adjustment for additional covariates. The method is detailed in Villain L, Ferte T, Thiebaut R and Hejblum BP (2021) <doi:10.1101/2021.09.07.459329>.
Generates stochastic time series and genealogies associated with a population dynamics model. Time series are simulated using the Gillespie exact and approximate algorithms and a new algorithm we introduce that uses both approaches to optimize the execution time of the simulations. Genealogies are simulated from a trajectory using a backwards-in-time approach. Methods are described in Danesh G et al (2022) <doi:10.1111/2041-210X.14038>.
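For readers unfamiliar with the Gillespie exact algorithm mentioned above, here is a minimal Python sketch of its direct method applied to a linear birth-death process. The model, the function name `gillespie_birth_death` and its parameters are assumptions chosen for illustration; this is not the package's interface, and the approximate and hybrid algorithms are not shown.

```python
import random


def gillespie_birth_death(n0, birth, death, t_max, seed=None):
    """Simulate one trajectory of a linear birth-death process.

    n0: initial population size; birth, death: per-capita rates;
    t_max: time horizon. Returns (event times, population sizes).
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, sizes = [t], [n]
    while n > 0:
        birth_rate = birth * n
        death_rate = death * n
        total = birth_rate + death_rate
        if total <= 0.0:
            break  # no event can occur
        # The waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Choose which event fires, with probability proportional to its rate.
        if rng.random() * total < birth_rate:
            n += 1
        else:
            n -= 1
        times.append(t)
        sizes.append(n)
    return times, sizes
```

A call such as `gillespie_birth_death(10, 1.0, 0.5, t_max=5.0, seed=1)` returns one stochastic trajectory; repeated calls with different seeds give independent realisations.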
This package provides classes for storing and manipulating taxonomic data. Most of the classes can be treated like base R vectors (e.g. can be used in tables as columns and can be named). Vectorized classes can store taxon names and authorities, taxon IDs from databases, taxon ranks, and other types of information. More complex classes are provided to store taxonomic trees and user-defined data associated with them.
The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define machine learning, statistical and data mining models and to share models between PMML-compliant applications. More information about the PMML industry standard and the Data Mining Group can be found at http://dmg.org/. The generated PMML can be imported into any PMML-consuming application, such as Zementis Predictive Analytics products.
This package provides a toolkit for archaeological time series and time intervals, built on a system of classes and methods to represent and work with them. Dates are represented as "rata die" and can be converted to (virtually) any calendar defined by Reingold and Dershowitz (2018) <doi:10.1017/9781107415058>. This package offers a simple API that can be used by other specialized packages.
Tests for no adverse shift in a two-sample comparison when we have a training set, the reference distribution, and a test set. The approach is flexible and relies on a robust and powerful test statistic, the weighted AUC. Technical details are in Kamulete, V. M. (2021) <arXiv:1908.04000>. Modern notions of outlyingness, such as trust scores and prediction uncertainty, can be used as the underlying scores.
This package provides statistical tools for analyzing net and relative survival, with a key feature of relaxing the assumption of independent censoring and incorporating the effect of dependent competing risks. It employs a copula-based methodology, specifically the Archimedean copula, to simulate data, conduct survival analysis, and offer comparisons with other methods. This approach is detailed in the work of Adatorwovor et al. (2022) <doi:10.1515/ijb-2021-0016>.
Joint DNA-based disaster victim identification (DVI), as described in Vigeland and Egeland (2021) <doi:10.21203/rs.3.rs-296414/v1>. Identification is performed by optimising the joint likelihood of all victim samples and reference individuals. Individual identification probabilities, conditional on all available information, are derived from the joint solution in the form of posterior pairing probabilities. dvir is part of the pedsuite collection of packages for pedigree analysis.
Elastic net regression models are controlled by two parameters, lambda, a measure of shrinkage, and alpha, a metric defining the model's location on the spectrum between ridge and lasso regression. glmnet provides tools for selecting lambda via cross validation but no automated methods for selection of alpha. Elastic Net SearcheR automates the simultaneous selection of both lambda and alpha. Developed, in part, with support by NICHD R03 HD094912.
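To make the roles of lambda and alpha concrete, the sketch below evaluates the elastic net penalty in the parameterisation used by glmnet, lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2). The function name `elastic_net_penalty` is hypothetical; the sketch shows only the penalty term, not the search over lambda and alpha.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic net penalty for coefficients beta.

    lam >= 0 sets the overall amount of shrinkage; alpha in [0, 1] mixes
    the lasso (alpha = 1) and ridge (alpha = 0) penalties.
    """
    if lam < 0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("need lam >= 0 and 0 <= alpha <= 1")
    l1 = sum(abs(b) for b in beta)      # lasso part: sum of |beta_j|
    l2_sq = sum(b * b for b in beta)    # ridge part: sum of beta_j^2
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)
```

Because the two parameters interact, a value of lambda tuned at one alpha is generally not optimal at another, which is why selecting them jointly can matter.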
Analysis of dichotomous and polytomous response data using the explanatory item response modeling framework, as described in Bulut, Gorgun, & Yildirim-Erbasli (2021) <doi:10.3390/psych3030023>, Stanke & Bulut (2019) <doi:10.21449/ijate.515085>, and De Boeck & Wilson (2004) <doi:10.1007/978-1-4757-3990-9>. Generalized linear mixed modeling is used for estimating the effects of item-related and person-related variables on dichotomous and polytomous item responses.
Simulate and analyze multistate models with general hazard functions. gems provides functionality for the preparation of hazard functions and parameters, simulation from a general multistate model and prediction of future events. The multistate model is not required to be a Markov model and may take the history of previous events into account. In the basic version, it allows simulation from transition-specific hazard functions whose parameters follow a multivariate normal distribution.