Useful tools for fitting, validating, and forecasting practical convolution-closed time series models for low counts are provided. Marginal distributions of the data can be modelled via Poisson and Generalized Poisson innovations. Regression effects can be incorporated through time-varying innovation rates. The models are described in Jung and Tremayne (2011) <doi:10.1111/j.1467-9892.2010.00697.x>, and the model assessment tools are presented in Czado et al. (2009) <doi:10.1111/j.1541-0420.2009.01191.x> and Tsay (1992) <doi:10.2307/2347612>.
Easy-to-use and efficient interface for Bayesian inference of complex panel (time series) data using dynamic multivariate panel models by Helske and Tikka (2024) <doi:10.1016/j.alcr.2024.100617>. The package supports joint modeling of multiple measurements per individual, time-varying and time-invariant effects, and a wide range of discrete and continuous distributions. Estimation of these dynamic multivariate panel models is carried out via Stan. For an in-depth tutorial of the package, see Tikka and Helske (2024) <doi:10.48550/arXiv.2302.01607>.
Gene information from Ensembl genome builds GRCh38.p14 and GRCh37.p13 to use with the topr package. The datasets were originally downloaded from <https://ftp.ensembl.org/pub/current/gtf/homo_sapiens/Homo_sapiens.GRCh38.111.gtf.gz> and <https://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.gtf.gz> and converted into the format required by the topr package. See <https://github.com/totajuliusd/topr?tab=readme-ov-file#how-to-use-topr-with-other-species-than-human> for the required format.
Analyzing censored variables usually requires the use of optimization algorithms. This package provides an alternative algebraic approach to the task of determining the expected value of a censored random variable with a known censoring point. Likewise, this approach allows the censoring point to be determined if the expected value is known. These results are derived under the assumption that the variable follows an Epanechnikov kernel distribution with known mean and range prior to censoring. Statistical functions related to the uncensored Epanechnikov distribution are also provided by this package.
The algorithm of semi-supervised learning based on finite Gaussian mixture models with a missing-data mechanism is designed for fitting a g-class Gaussian mixture model via maximum likelihood (ML). It is proposed to treat the labels of the unclassified features as missing data and to introduce a framework for their missingness, as in the pioneering work of Rubin (1976) on missingness in incomplete-data analysis. This dependency in the missingness pattern can be leveraged to provide additional information about the optimal classifier as specified by Bayes' rule.
Routines for the estimation or simultaneous estimation and variable selection in several functional semiparametric models with scalar responses are provided. These models include the functional single-index model, the semi-functional partial linear model, and the semi-functional partial linear single-index model. Additionally, the package offers algorithms for handling scalar covariates with linear effects that originate from the discretization of a curve. This functionality is applicable in the context of the linear model, the multi-functional partial linear model, and the multi-functional partial linear single-index model.
Fits the (randomized drift) inverse Gaussian distribution to survival data. The model is described in Aalen OO, Borgan O, Gjessing HK. Survival and Event History Analysis. A Process Point of View. Springer, 2008. It is based on describing time to event as the barrier hitting time of a Wiener process, where drift towards the barrier has been randomized with a Gaussian distribution. The model allows covariates to influence starting values of the Wiener process and/or average drift towards a barrier, with a user-defined choice of link functions.
This package provides a collection of helper functions for multiple regression models fitted by lm(). Most of them are simple functions for simple tasks which can be done with coding, but may not be easy for occasional users of R. Most of the tasks addressed are those sometimes needed when using the manymome package (Cheung and Cheung, 2023, <doi:10.3758/s13428-023-02224-z>) and stdmod package (Cheung, Cheung, Lau, Hui, and Vong, 2022, <doi:10.1037/hea0001188>). However, they can also be used in other scenarios.
Applies phylogenetic comparative methods (PCM) and phylogenetic trait imputation using structural equation models (SEM), extending methods from Thorson et al. (2023) <doi:10.1111/2041-210X.14076>. This implementation includes a minimal set of features, to allow users to easily read all of the documentation and source code. PCM using SEM includes phylogenetic linear models and structural equation models as nested submodels, but also allows imputation of missing values. Features and comparison with other packages are described in Thorson and van der Bijl (2023) <doi:10.1111/jeb.14234>.
This package contains three simulation functions for implementing the entire Phase 123 trial and the separate Eff-Tox and Phase 3 portions of the trial, which may be beneficial for use on clusters. The functions AssignEffTox() and RandomizeEffTox() assign doses to patient cohorts during Phase 12, and Reoptimize() determines the optimal dose to continue with during Phase 3. The functions ReturnMeansAgent() and ReturnMeanControl() give the true mean survival for the agent doses and control, and ReturnOCS() gives the operating characteristics of the design.
Supports propensity score weighting analysis of observational studies and randomized trials. Enables the estimation and inference of average causal effects with binary and multiple treatments using overlap weights (ATO), inverse probability of treatment weights (ATE), average treatment effect among the treated weights (ATT), matching weights (ATM) and entropy weights (ATEN), with and without propensity score trimming. These weights are members of the family of balancing weights introduced in Li, Morgan and Zaslavsky (2018) <doi:10.1080/01621459.2016.1260466> and Li and Li (2019) <doi:10.1214/19-AOAS1282>.
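The balancing-weight family named above can be illustrated with a short sketch. It assumes a binary treatment and an already estimated propensity score e, and follows the general form in Li, Morgan and Zaslavsky (2018): a treated unit gets weight h(e)/e and a control unit gets h(e)/(1 - e), where the tilting function h picks the target population. The function name `balancing_weight` is hypothetical and is not part of the package's interface.

```python
import math


def balancing_weight(e, treated, scheme):
    """Balancing weight for one unit with propensity score e (0 < e < 1).

    Hypothetical helper for illustration; binary treatment only.
    """
    # The tilting function h(e) determines the target population.
    if scheme == "ATE":      # inverse probability of treatment weights
        h = 1.0
    elif scheme == "ATT":    # average treatment effect among the treated
        h = e
    elif scheme == "ATO":    # overlap weights
        h = e * (1.0 - e)
    elif scheme == "ATM":    # matching weights
        h = min(e, 1.0 - e)
    elif scheme == "ATEN":   # entropy weights
        h = -(e * math.log(e) + (1.0 - e) * math.log(1.0 - e))
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Treated units are weighted by h/e, controls by h/(1 - e).
    return h / e if treated else h / (1.0 - e)


if __name__ == "__main__":
    for scheme in ("ATE", "ATT", "ATO", "ATM", "ATEN"):
        print(scheme, balancing_weight(0.25, True, scheme),
              balancing_weight(0.25, False, scheme))
```

Under overlap weighting, for example, a treated unit is weighted by 1 - e and a control unit by e, so units with propensity scores near 0 or 1 contribute little and no trimming threshold is needed.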
This package provides tools for the automated identification of diagnostic molecular characters, i.e., those columns in a given nucleotide or amino acid alignment that allow taxa to be distinguished from each other. These characters can then be used to complement the formal descriptions of the taxa, which are often based on morphological and anatomical features. This is especially helpful for morphologically cryptic species. QUIDDICH distinguishes between four different types of diagnostic characters. For more information, see Kuehn, A.L., Haase, M. (2019), "QUIDDICH: QUick IDentification of DIagnostic CHaracters".
This package provides user-friendly methods for the identification of sequence patterns that are statistically significantly associated with a property of the sequence. For instance, SeqFeatR can identify viral immune escape mutations for hosts of given HLA types. The underlying statistical method is Fisher's exact test, with appropriate corrections for multiple testing, or a Bayesian approach. Patterns may be point mutations or n-tuples of mutations. SeqFeatR offers several ways to visualize the results of the statistical analyses; see Budeus (2016) <doi:10.1371/journal.pone.0146409>.
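The statistical core described above, Fisher's exact test followed by a multiple-testing correction, can be sketched in a few lines. This is a minimal illustration rather than the package's own code: the function names are hypothetical, and Bonferroni is used only as a stand-in, since the description does not say which corrections are applied. Each 2x2 table is assumed to count sequences with and without a given mutation (rows) against presence or absence of the property (columns).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_values):
    """Bonferroni adjustment (an assumed choice of correction)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # One hypothetical table per alignment position.
    tables = [(3, 1, 1, 3), (3, 0, 0, 3)]
    raw = [fisher_exact_two_sided(*t) for t in tables]
    print(raw, bonferroni(raw))
```

Testing one table per alignment position produces many p-values at once, which is why a correction step is needed before calling any position significant.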
Treatment and visualization of membrane (selective) transport data. Transport profiles involving up to three species are produced as publication-ready plots and several membrane performance parameters (e.g. separation factors as defined in Koros et al. (1996) <doi:10.1351/pac199668071479> and non-linear regression parameters for the equations described in Rodriguez de San Miguel et al. (2014) <doi:10.1016/j.jhazmat.2014.03.052>) can be obtained. Many widely used experimental setups (e.g. membrane physical aging) can be easily studied through the package's graphical representations.
Mainly data sets to accompany the VGAM package and the book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7>. These are used to illustrate vector generalized linear and additive models (VGLMs/VGAMs), and associated models (Reduced-Rank VGLMs, Quadratic RR-VGLMs, Row-Column Interaction Models, and constrained and unconstrained ordination models in ecology). This package now contains some old VGAM family functions which have been replaced by newer ones (often because they are now special cases).
r-pathview is a tool set for pathway-based data integration and visualization. It maps and renders a wide variety of biological data on relevant pathway graphs. All users need to do is supply their data and specify the target pathway. This package automatically downloads the pathway graph data, parses the data file, maps user data to the pathway, and renders the pathway graph with the mapped data. In addition, r-pathview integrates seamlessly with pathway and gene set (enrichment) analysis tools for large-scale and fully automated analysis.
The use of structured elicitation to inform decision making has grown dramatically in recent decades; however, judgements from multiple experts must be aggregated into a single estimate. Empirical evidence suggests that mathematical aggregation provides more reliable estimates than enforcing behavioural consensus on group estimates. aggreCAT provides state-of-the-art mathematical aggregation methods for elicitation data, including those defined in Hanea, A. et al. (2021) <doi:10.1371/journal.pone.0256919>. The package also provides functions to visualise and evaluate the performance of your aggregated estimates on validation data.
Automated assessment of accuracy and geographical status of georeferenced biological data. The methods rely on reference regions, namely checklists and range maps. Includes functions to obtain data from the Global Biodiversity Information Facility <https://www.gbif.org/> and from the Global Inventory of Floras and Traits <https://gift.uni-goettingen.de/home>. Alternatively, the user can input their own data. Furthermore, provides easy visualisation of the data and the results through the plotting functions. Especially suited for large datasets. The reference for the methodology is: Arlé et al. (under review).
In practical applications, the assumptions underlying generalized linear models frequently face violations, including incorrect specifications of the outcome variable's distribution or omitted predictors. These deviations can render the results of standard generalized linear models unreliable. As the sample size increases, what might initially appear as minor issues can escalate to critical concerns. To address these challenges, we adopt a permutation-based inference method tailored for generalized linear models. This approach offers robust estimations that effectively counteract the mentioned problems, and its effectiveness remains consistent regardless of the sample size.
This package provides functions to calculate the out-of-bag learning curve for random forests for any measure that is available in the mlr package. Supported random forest packages are randomForest and ranger, as well as models of these packages trained with the train function of mlr. The main function is OOBCurve(), which calculates the out-of-bag curve depending on the number of trees. With the OOBCurvePars() function, out-of-bag curves can also be calculated for mtry, sample.fraction and min.node.size for the ranger package.
In the past decade, genome-scale metabolic reconstructions (GSMs) have been widely used to comprehend the systems biology of metabolic pathways within an organism. Different GSMs are constructed using various techniques that require distinct steps, but the input data, information conversion and software tools are neither concisely defined nor mathematically or programmatically formulated in a context-specific manner. This tool quantitatively and qualitatively specifies each reconstruction step and can generate a template list of reconstruction steps dynamically selected from a reconstruction step reservoir, which is constructed from all available published papers.
TensorFlow SIG Addons <https://www.tensorflow.org/addons> is a repository of community contributions that conform to well-established API patterns, but implement new functionality not available in core TensorFlow. TensorFlow natively supports a large number of operators, layers, metrics, losses, optimizers, and more. However, in a fast-moving field like machine learning, there are many interesting new developments that cannot be integrated into core TensorFlow (because their broad applicability is not yet clear, or they are mostly used by a smaller subset of the community).
This package provides R bindings for NNG (Nanomsg Next Gen), a successor to ZeroMQ. NNG is a socket library for reliable, high-performance messaging over in-process, IPC, TCP, WebSocket and secure TLS transports. It implements Scalability Protocols, a standard for common communications patterns including publish/subscribe, request/reply and service discovery. As its own threaded concurrency framework, it provides a toolkit for asynchronous programming and distributed computing. Intuitive aio objects resolve automatically when asynchronous operations complete, and synchronisation primitives allow R to wait upon events signalled by concurrent threads.
RubyMoney provides a library for dealing with money and currency conversion. Its features are:
Provides a Money class which encapsulates all information about a certain amount of money, such as its value and its currency.
Provides a Money::Currency class which encapsulates all information about a monetary unit.
Represents monetary values as integers, in cents, which avoids floating-point rounding errors.
Represents currency as Money::Currency instances providing a high level of flexibility.
Provides APIs for exchanging money from one currency to another.
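The integer-cents design in the feature list above can be illustrated with a small sketch. It is written in Python for illustration only and does not reproduce the RubyMoney API: the `Money` class below, its fields, and its methods are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical value object: an integer amount of cents plus a currency code."""
    cents: int
    currency: str

    def __add__(self, other):
        # Amounts in different currencies must be exchanged before adding.
        if self.currency != other.currency:
            raise ValueError("currency mismatch")
        return Money(self.cents + other.cents, self.currency)

    def __str__(self):
        # Formatting assumes a non-negative amount and 100 subunits per unit.
        units, cents = divmod(self.cents, 100)
        return f"{units}.{cents:02d} {self.currency}"


if __name__ == "__main__":
    # Binary floating point cannot represent 0.10 or 0.20 exactly.
    print(0.10 + 0.20 == 0.30)                      # False
    # Integer cents add exactly.
    print(Money(10, "USD") + Money(20, "USD"))      # 0.30 USD
```

Storing the amount as an integer keeps addition and comparison exact, and rounding then only has to be decided explicitly at points such as currency exchange or division.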