Comprehensive set of tools for performing system identification of both linear and nonlinear dynamical systems directly from data. The Automatic Regression for Governing Equations (ARGOS) simplifies the complex task of constructing mathematical models of dynamical systems from observed input and output data, supporting various types of systems, including those described by ordinary differential equations. It employs optimal numerical derivatives for enhanced accuracy and formal variable selection techniques to identify the most relevant variables, thereby enabling the development of predictive models for system behavior analysis.
Builds on 'gpuR' and utilizes the 'clRNG' ('OpenCL') library to provide efficient tools to generate independent random numbers in parallel on a GPU and save the results as R objects, ensuring high-quality random numbers even when R is used interactively or in an ad-hoc manner. Includes Fisher's simulation method adapted from Patefield, William M (1981) <doi:10.2307/2346669> and the MRG31k3p random number generator from the 'clRNG' library by Advanced Micro Devices, Inc. (2015) <https://github.com/clMathLibraries/clRNG>.
This package provides coefficients of interrater reliability that are generalized to cope with randomly incomplete (i.e. unbalanced) datasets without any imputation of missing values or any (row-wise or column-wise) omission of actually available data. Applied to complete (balanced) datasets, these generalizations yield the same results as the common procedures, namely the Intraclass Correlation according to McGraw & Wong (1996) <doi:10.1037/1082-989X.1.1.30> and the Coefficient of Concordance according to Kendall & Babington Smith (1939) <doi:10.1214/aoms/1177732186>.
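A minimal usage sketch, assuming this is the 'irrNA' package and that iccNA() and kendallNA() are its entry points (names to verify against the documentation):

```r
library(irrNA)

# Subjects in rows, raters in columns; missing ratings are allowed.
ratings <- matrix(c(3, 4, NA,
                    2, 2,  3,
                    5, NA, 4,
                    1, 2,  2),
                  nrow = 4, byrow = TRUE)

iccNA(ratings)      # intraclass correlations despite the missing cells
kendallNA(ratings)  # Kendall's W generalized to incomplete data
```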
Designed for the curation and analysis of data generated from real-time quaking-induced conversion (RT-QuIC) assays first described by Atarashi et al. (2011) <doi:10.1038/nm.2294>. 'quicR' calculates useful metrics such as maxpoint ratio (Rowden et al. (2023) <doi:10.1099/vir.0.069906-0>), time-to-threshold (Shi et al. (2013) <doi:10.1186/2051-5960-1-44>), and maximum slope. Integration with the output from plate readers allows for seamless input of raw data into the R environment.
The 'tdROC' package facilitates the estimation of time-dependent ROC (Receiver Operating Characteristic) curves and the Area Under the time-dependent ROC Curve (AUC) in the context of survival data, accommodating right-censored data and offering the option to account for competing risks. In addition to ROC/AUC estimation, the package also estimates the time-dependent Brier score and survival difference. Confidence intervals for the various estimated quantities can be obtained via bootstrap. The package also offers plotting functions for visualizing time-dependent ROC curves.
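A hedged sketch of a typical call, assuming the main function is tdROC() with a marker X, survival time Y, event indicator delta, and evaluation time tau (argument and output names should be checked against the package documentation):

```r
library(tdROC)
library(survival)

data(ovarian, package = "survival")  # small survival data set for illustration
fit <- tdROC(X = ovarian$age, Y = ovarian$futime,
             delta = ovarian$fustat, tau = 365)
fit$AUC  # time-dependent AUC at day 365 (assumed component name)
```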
This package implements nonlinear autoregressive (AR) time series models. For univariate series, a non-parametric approach is available through additive nonlinear AR. Parametric modeling and testing for regime-switching dynamics are available when the transition is either direct (TAR: threshold AR) or smooth (STAR: smooth transition AR, LSTAR). For multivariate series, one can estimate a range of TVAR or threshold cointegration TVECM models with two or three regimes. Tests can be conducted for TVAR as well as for TVECM (Hansen and Seo 2002 and Seo 2006).
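For illustration, a short sketch using the package's setar() and TVAR() fitting functions (see ?setar and ?TVAR for the full argument lists):

```r
library(tsDyn)

# Univariate threshold AR on the classic lynx series
mod_setar <- setar(log10(lynx), m = 2, thDelay = 1)
summary(mod_setar)

# Bivariate threshold VAR with one threshold (two regimes)
data(zeroyld)
mod_tvar <- TVAR(zeroyld, lag = 2, nthresh = 1, trim = 0.1)
summary(mod_tvar)
```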
This package provides a comprehensive suite of statistical tools for analyzing, simulating, and computing properties of the Topp-Leone Cauchy Rayleigh (TLCAR) distribution, a versatile distribution amalgamating features of the Topp-Leone, Cauchy, and Rayleigh distributions, ideal for modeling intricate, heterogeneous data across scientific domains. See Atchadé, M.N., Bogninou, M.J., and Djibril, A.M. (2023) <doi:10.1007/s44199-023-00066-4> and Atchadé, M.N., Bogninou, M.J., and Djibril, A.M. (2024) <doi:10.1007/s44199-023-00069-1> for further insights.
This tool provides functions to load, segment and classify zooplankton images. The image processing algorithms and the machine learning classifiers in this package are intended to be direct ports (not yet added) of an early Python implementation that can be found at <https://github.com/arickGrootveld/ZooID>. The model weights and datasets (also not yet added) that are a part of this package can also be found at Arick Grootveld, Eva R. Kozak, Carmen Franco-Gordo (2023) <doi:10.5281/zenodo.7979996>.
This package provides tools to convert statistical analysis objects from R into tidy data frames, so that they can more easily be combined, reshaped and otherwise processed with tools like 'dplyr', 'tidyr' and 'ggplot2'. The package provides three S3 generics: tidy(), which summarizes a model's statistical findings such as coefficients of a regression; augment(), which adds columns to the original data such as predictions, residuals and cluster assignments; and glance(), which provides a one-row summary of model-level statistics.
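The three generics in action on a simple linear model:

```r
library(broom)

fit <- lm(mpg ~ wt + cyl, data = mtcars)

tidy(fit)     # one row per coefficient: estimate, std.error, statistic, p.value
augment(fit)  # original data plus .fitted, .resid and related columns
glance(fit)   # one-row model summary: r.squared, AIC, BIC, ...
```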
Decomposition of (income) inequality by population subgroups. For a decomposition on a single variable the mean log deviation can be used (see Mookherjee and Shorrocks (1982) <doi:10.2307/2232673>). For a decomposition on multiple variables a regression-based technique can be used (see Fields (2003) <doi:10.1016/s0147-9121(03)22001-x>). Recentered influence function regression is provided for marginal effects on the (income or wealth) distribution (see Firpo et al. (2009) <doi:10.3982/ECTA6822>). Some extensions to inequality functions to handle weights and/or missing values.
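A hedged sketch, assuming this is the 'dineq' package and that the mean log deviation decomposition is exposed as mld_decomp(x, z) (function and argument names are assumptions):

```r
library(dineq)

set.seed(1)
income <- rlnorm(500, meanlog = 9, sdlog = 0.7)             # toy income vector
region <- sample(c("north", "south"), 500, replace = TRUE)  # grouping variable

# Within-group / between-group decomposition of the mean log deviation
mld_decomp(x = income, z = region)
```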
This package implements an efficient algorithm for solving sparse-penalized support vector machines with kernel density convolution. It is designed for high-dimensional classification tasks, supporting lasso (L1) and elastic-net penalties for sparse feature selection and providing options for tuning kernel bandwidth and penalty weights. The 'dcsvm' package is applicable to fields such as bioinformatics, image analysis, and text classification, where high-dimensional data commonly arise. Learn more about the methodology and algorithm in Wang, Zhou, Gu, and Zou (2023) <doi:10.1109/TIT.2022.3222767>.
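A purely illustrative sketch; the function name dcsvm() and its arguments are assumptions rather than confirmed API, so consult the package documentation before use:

```r
library(dcsvm)

set.seed(1)
x <- matrix(rnorm(100 * 50), nrow = 100)     # 100 samples, 50 features
y <- ifelse(x[, 1] + rnorm(100) > 0, 1, -1)  # binary labels

# Lasso-penalized fit (argument names assumed)
fit <- dcsvm(x, y, lambda = 0.05)
```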
This package provides methods for fitting various extreme value distributions with parameters of generalised additive model (GAM) form. For details of the distributions see Coles, S.G. (2001) <doi:10.1007/978-1-4471-3675-0>; for GAMs see Wood, S.N. (2017) <doi:10.1201/9781315370279>; and for the fitting approach see Wood, S.N., Pya, N. & Safken, B. (2016) <doi:10.1080/01621459.2016.1180986>. Details of how 'evgam' works, together with various examples, are given in Youngman, B.D. (2022) <doi:10.18637/jss.v103.i03>.
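A minimal sketch of GAM-form GEV fitting with evgam(); the list-of-formulas interface for location, scale and shape follows the package's documented style, but check ?evgam for specifics:

```r
library(evgam)

set.seed(1)
df <- data.frame(x = runif(300))
df$y <- 10 + 2 * sin(2 * pi * df$x) + rexp(300)  # toy block-maxima-like response

fmla <- list(y ~ s(x),  # location varies smoothly with x
             ~ 1,       # constant log-scale
             ~ 1)       # constant shape
fit <- evgam(fmla, data = df, family = "gev")
summary(fit)
```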
This package provides functions for simulating and estimating parameters of various growth models, including Logistic, Exponential, Theta-logistic, von Bertalanffy, and Gompertz models. The package supports both simulated and real data analysis, including parameter estimation, visualization, and calculation of global and local estimates. The methods are based on research described by Md Aktar Ul Karim and Amiya Ranjan Bhowmick (2022) <https://www.researchsquare.com/article/rs-2363586/v1>. An interactive web application (GPEMR Web App) is also available at <https://gpem-r.shinyapps.io/GPEM-R/>.
Helper functions provide an accurate imputation algorithm for reconstructing a missing segment in multivariate data streams. Inspired by single-shot learning, the algorithm reconstructs the missing segment by identifying the first similar segment in the stream. One column of data must be fully available, i.e. a constraint column; the values of the columns can be characters (A, B, C, etc.). The imputed dataset is returned as a .csv file. For more details see Reza Rawassizadeh (2019) <doi:10.1109/TKDE.2019.2914653>.
An S4 class and several functions which utilize internally stored datasets and gauging data enable 1d water level interpolation. The S4 class (WaterLevelDataFrame) structures the computation and visualisation of 1d water level information along the German federal waterways Elbe and Rhine. 'hyd1d' delivers 1d water level data - extracted from the FLYS database - and validated gauging data - extracted from the hydrological database WISKI7 - package-internally. For computations near real time, gauging data are queried externally from the PEGELONLINE REST API <https://pegelonline.wsv.de/webservice/dokuRestapi>.
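A hedged sketch of this workflow: build a WaterLevelDataFrame and interpolate with waterLevel(); the station range and argument names are assumptions to verify against ?WaterLevelDataFrame:

```r
library(hyd1d)

wldf <- WaterLevelDataFrame(river   = "Elbe",
                            time    = as.POSIXct("2016-12-21"),
                            station = seq(257, 262, by = 0.1))
wldf <- waterLevel(wldf)  # fill the water level column from FLYS and gauging data
summary(wldf)
```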
This package performs Gaussian process regression with heteroskedastic noise following the model by Binois, M., Gramacy, R., Ludkovski, M. (2016) <doi:10.48550/arXiv.1611.05902>, with implementation details in Binois, M. & Gramacy, R. B. (2021) <doi:10.18637/jss.v098.i13>. The input-dependent noise is modeled as another Gaussian process. Replicated observations are encouraged as they yield computational savings. Sequential design procedures based on the integrated mean square prediction error and lookahead heuristics are provided, along with notably fast update functions when adding new observations.
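A short sketch of a heteroskedastic GP fit with mleHetGP() followed by prediction; see the JSS paper (Binois & Gramacy, 2021) for complete, vetted examples:

```r
library(hetGP)

set.seed(1)
X <- matrix(runif(200), ncol = 1)
Z <- sin(2 * pi * X[, 1]) + rnorm(200, sd = 0.1 + 0.4 * X[, 1])  # input-dependent noise

fit  <- mleHetGP(X = X, Z = Z, covtype = "Matern5_2")
xnew <- matrix(seq(0, 1, length.out = 101), ncol = 1)
pred <- predict(fit, x = xnew)  # predictive mean plus latent and noise variances
```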
Estimate diagnostic classification models (also called cognitive diagnostic models) with 'Stan'. Diagnostic classification models are confirmatory latent class models, as described by Rupp et al. (2010, ISBN: 978-1-60623-527-0). Automatically generate 'Stan' code for the general loglinear cognitive diagnosis model proposed by Henson et al. (2009) <doi:10.1007/s11336-008-9089-5> and other subtypes that introduce additional model constraints. Using the generated 'Stan' code, estimate the model and evaluate its performance using model fit indices, information criteria, and reliability metrics.
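A hedged sketch of such a workflow: the fitting function measr_dcm(), the small 'mdm' example data set, and the add_fit()/add_reliability() helpers are all assumptions to verify against the package documentation:

```r
library(measr)

fit <- measr_dcm(data = mdm_data, qmatrix = mdm_qmatrix,
                 resp_id = "respondent", item_id = "item",
                 type = "lcdm", method = "mcmc", backend = "rstan")

fit <- add_fit(fit, method = "m2")  # absolute model fit (assumed helper)
fit <- add_reliability(fit)         # classification reliability (assumed helper)
```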
This package provides a set of evolutionary algorithms to solve many-objective optimization problems. Hybridization between the algorithms is also facilitated. Available algorithms are: SMS-EMOA <doi:10.1016/j.ejor.2006.08.008>, NSGA-III <doi:10.1109/TEVC.2013.2281535>, and MO-CMA-ES <doi:10.1145/1830483.1830573>. The following many-objective benchmark problems are also provided: DTLZ1-DTLZ4 from Deb et al. (2001) <doi:10.1007/1-84628-137-7_6> and WFG4-WFG9 from Huband et al. (2005) <doi:10.1109/TEVC.2005.861417>.
Generate and analyze Optimal Channel Networks (OCNs): oriented spanning trees reproducing all scaling features characteristic of real, natural river networks. As such, they can be used in a variety of numerical experiments in the fields of hydrology, ecology and epidemiology. See Carraro et al. (2020) <doi:10.1002/ece3.6479> for a presentation of the package; Rinaldo et al. (2014) <doi:10.1073/pnas.1322700111> for a theoretical overview on the OCN concept; Furrer and Sain (2010) <doi:10.18637/jss.v036.i10> for the construct used.
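A sketch of the typical generation pipeline, creating, elevating, and aggregating a small network (see ?create_OCN for details and larger examples):

```r
library(OCNet)

set.seed(1)
ocn <- create_OCN(30, 30)            # generate an OCN on a 30 x 30 lattice
ocn <- landscape_OCN(ocn)            # derive the elevation field
ocn <- aggregate_OCN(ocn, thrA = 5)  # aggregate cells into a river network
draw_simple_OCN(ocn)                 # plot the resulting network
```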
This package provides a simple to use summary function that can be used with pipes and displays nicely in the console. The default summary statistics may be modified by the user as can the default formatting. Support for data frames and vectors is included, and users can implement their own skim methods for specific object types as described in a vignette. Default summaries include support for inline spark graphs. Instructions for managing these on specific operating systems are given in the "Using skimr" vignette and the README.
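For example, skim() works on whole data frames as well as on grouped data:

```r
library(skimr)
library(dplyr)

skim(iris)     # full-data-frame summary with inline spark histograms

iris %>%
  group_by(Species) %>%
  skim()       # one summary block per group
```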
Offers classes and functions to contact web servers while enforcing scheduling rules required by the sites. The URL class makes it easy to construct a URL by providing parameters as a vector. The Request class describes SOAP (Simple Object Access Protocol) or standard requests: URL, method (POST or GET), header, body. The Scheduler class controls the request frequency for each server address by means of rules (Rule class). The RequestResult class gives access to the request status, for handling error cases, and to the content.
This package provides flexible Bayesian estimation of Infinite Mixtures of Infinite Factor Analysers (IMIFA) and related models, for nonparametrically clustering high-dimensional data. The IMIFA model conducts Bayesian nonparametric model-based clustering with factor analytic covariance structures without recourse to model selection criteria to choose the number of clusters or cluster-specific latent factors, mostly via efficient Gibbs updates. Model-specific diagnostic tools are also provided, as well as many options for plotting results, conducting posterior inference on parameters of interest, posterior predictive checking, and quantifying uncertainty.
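A hedged sketch of the two-step workflow: run the sampler with mcmc_IMIFA() and post-process with get_IMIFA_results(); the 'olive' example data set, argument names, and plotting option are assumptions to check against the documentation:

```r
library(IMIFA)

data(olive)                                                # Italian olive oil data
sim <- mcmc_IMIFA(olive, method = "IMIFA", n.iters = 5000)
res <- get_IMIFA_results(sim)
plot(res, plot.meth = "zlabels")                           # inspect the inferred clustering
```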
This package performs QTL mapping in a mixed model framework with separate detection and localization stages. The first stage detects the number of QTL on each chromosome based on the genetic variation due to grouped markers on the chromosome; the second stage uses this information to determine the most likely QTL positions. The mixed model can accommodate general fixed and random effects, including spatial effects in field trials and pedigree effects. It is applicable to backcrosses, doubled haploids, recombinant inbred lines, F2 intercrosses, and association mapping populations.
Real-time quantitative polymerase chain reaction (qPCR) data sets by Boggy et al. (2008) <doi:10.1371/journal.pone.0012355>. This package provides a dilution series for one PCR target: a random sequence that minimizes secondary structure and off-target primer binding. The data set is a six-point, ten-fold dilution series. For each concentration there are two replicates. Each amplification curve is 40 cycles long. Original raw data file: <https://journals.plos.org/plosone/article/file?type=supplementary&id=10.1371/journal.pone.0012355.s004>.