Integrated Functional Depth for Partially Observed Functional Data and applications to visualization, outlier detection and classification. It implements the methods proposed in: Elías, A., Jiménez, R., Paganoni, A. M. and Sangalli, L. M., (2023), "Integrated Depth for Partially Observed Functional Data", Journal of Computational and Graphical Statistics, <doi:10.1080/10618600.2022.2070171>. Elías, A., Jiménez, R., & Shang, H. L. (2023), "Depth-based reconstruction method for incomplete functional data", Computational Statistics, <doi:10.1007/s00180-022-01282-9>. Elías, A., Nagy, S. (2024), "Statistical properties of partially observed integrated functional depths", TEST, <doi:10.1007/s11749-024-00954-6>.
In streaming data analysis, it is crucial to detect significant shifts in the data distribution or in the accuracy of predictive models over time, a phenomenon known as concept drift. The package aims to identify when concept drift occurs and offers a range of state-of-the-art techniques for detecting it. It also provides tools for adapting models in non-stationary environments, so that predictions remain accurate in dynamic contexts. Methods for concept drift detection are described in Tavares (2022) <doi:10.1007/s12530-021-09415-z>.
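As an illustration of what a drift detector does, the sketch below monitors a stream of 0/1 prediction errors in the style of the well-known Drift Detection Method (DDM). It is not taken from the package, whose own detectors may work differently. The class name, the thresholds (2 and 3 standard deviations) and the 30-sample warm-up are assumptions chosen for the example.

```python
import math


class DriftDetector:
    """DDM-style monitor over a stream of 0/1 prediction errors (illustrative)."""

    def __init__(self, min_samples=30, warn_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warn_level = warn_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        # Forget everything learned about the previous concept.
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """Feed one outcome (1 = wrong prediction, 0 = correct); return the state."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n                  # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)     # its binomial standard deviation
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s         # best behaviour seen so far
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warn_level * self.s_min:
            return "warning"
        return "stable"


# Synthetic stream: 10% errors for 500 samples, then 50% errors.
stream = [1 if i % 10 == 0 else 0 for i in range(500)]
stream += [1 if i % 2 == 0 else 0 for i in range(500)]
detector = DriftDetector()
states = [detector.update(e) for e in stream]
print("first drift signalled at sample", states.index("drift"))
```

After a "drift" signal the detector resets itself; in practice this is the point at which a model would be retrained or replaced.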
Provides functions to calculate two indexes, IBR (Integrated Biomarker Response) and IBRv2 (Integrated Biological Response version 2), together with the standardized enzyme-activity values used by each index, and a plotting function that draws radar plots suited to this type of data. Beliaeff, B., & Burgeot, T. (2002). <https://pubmed.ncbi.nlm.nih.gov/12069320/>. Sanchez, W., Burgeot, T., & Porcher, J.-M. (2013). <doi:10.1007/s11356-012-1359-1>. Devin, S., Burgeot, T., Giambérini, L., Minguez, L., & Pain-Devin, S. (2014). <doi:10.1007/s11356-013-2169-9>. Minato N. (2022). <https://minato.sip21c.org/msb/>.
Implementation of popular mortality models using the rstan package, which provides the R interface to the Stan C++ library for Bayesian estimation. The package supports well-known models proposed in the actuarial and demographic literature including the Lee-Carter (1992) <doi:10.1080/01621459.1992.10475265> and the Cairns-Blake-Dowd (2006) <doi:10.1111/j.1539-6975.2006.00195.x> models. With a single call, the user inputs deaths and exposures and the package outputs the MCMC simulations for each parameter, the log-likelihoods and predictions. Moreover, the package includes tools for model selection and Bayesian model averaging by leave-future-out validation.
This program implements a universal estimation approach that accommodates multi-category variables and multiple effect scales, addressing the shortcomings of existing approaches when dealing with non-binary exposures and complex models. The bootstrap-based estimation simultaneously provides causal mediation results on the risk difference (RD), odds ratio (OR) and risk ratio (RR) scales, with tests of the differences between effects. The estimation is also applicable to many other settings, e.g., moderated mediation, inconsistent covariates, panel data, etc. Its flexibility and compatibility make it applicable to any type of model, meeting the needs of current empirical research.
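The three effect scales named above are different summaries of the same contrast between two risks. The sketch below only shows how RD, RR and OR relate to each other for a pair of outcome probabilities. It does not reproduce the package's mediation estimator or its bootstrap, and the function name and the example risks are hypothetical.

```python
def effect_scales(risk_exposed, risk_unexposed):
    """Express one contrast between two risks on the RD, RR and OR scales."""
    if not (0 < risk_exposed < 1 and 0 < risk_unexposed < 1):
        raise ValueError("risks must lie strictly between 0 and 1")

    def odds(p):
        return p / (1.0 - p)

    return {
        "RD": risk_exposed - risk_unexposed,              # risk difference
        "RR": risk_exposed / risk_unexposed,              # risk ratio
        "OR": odds(risk_exposed) / odds(risk_unexposed),  # odds ratio
    }


# Hypothetical outcome risks of 30% (exposed) and 20% (unexposed).
for scale, value in effect_scales(0.30, 0.20).items():
    print(scale, round(value, 3))
```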
Rclone is a command line program to sync files and directories to and from different cloud storage providers.
Features include:
MD5/SHA1 hashes checked at all times for file integrity
Timestamps preserved on files
Partial syncs supported on a whole file basis
Copy mode to just copy new/changed files
Sync (one way) mode to make a directory identical
Check mode to check for file hash equality
Can sync to and from network, e.g., two different cloud accounts
Optional encryption (Crypt)
Optional cache (Cache)
Optional FUSE mount (rclone mount)
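To make the idea behind the check mode concrete, the sketch below compares two local directory trees by relative path and file hash. It illustrates the concept only and is not how rclone is implemented; the function names and the report layout are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def file_hash(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_files(root):
    """Map each file's path relative to root (POSIX style) to its full path."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): p for p in root.rglob("*") if p.is_file()}


def check_trees(source, dest, algorithm="md5"):
    """Report which files match, differ, or exist on one side only."""
    src, dst = list_files(source), list_files(dest)
    report = {"match": [], "differ": [], "missing_in_dest": [], "missing_in_source": []}
    for name in sorted(src):
        if name not in dst:
            report["missing_in_dest"].append(name)
        elif file_hash(src[name], algorithm) == file_hash(dst[name], algorithm):
            report["match"].append(name)
        else:
            report["differ"].append(name)
    report["missing_in_source"] = sorted(set(dst) - set(src))
    return report
```

Calling `check_trees("backup_a", "backup_b", algorithm="sha1")` on two hypothetical directories would return the four lists without modifying either side.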
This package is Cytometry dATa anALYSis Tools (CATALYST). Mass cytometry, such as cytometry by time of flight (CyTOF), uses heavy metal isotopes rather than fluorescent tags as reporters to label antibodies, thereby substantially decreasing spectral overlap and allowing examination of over 50 parameters at the single-cell level. While spectral overlap is significantly less pronounced in CyTOF than in flow cytometry, spillover due to detection sensitivity, isotopic impurities, and oxide formation can impede data interpretability. CATALYST was designed to provide a pipeline for preprocessing of cytometry data, including:
normalization using bead standards;
single-cell deconvolution;
bead-based compensation.
Annotates data from liquid chromatography coupled to mass spectrometry (LC/MS) metabolomics experiments. Based on a network algorithm (O. Senan, A. Aguilar-Mogas, M. Navarro, O. Yanes, R. Guimerà and M. Sales-Pardo, Bioinformatics, 35(20), 2019), CliqueMS builds a weighted similarity network where nodes are features and edges are weighted according to the similarity of these features. Then it searches for the most plausible division of the similarity network into cliques (fully connected components). Finally it annotates metabolites within each clique, obtaining for each annotated metabolite its neutral mass and its features, which correspond to isotopes, ionization adducts and fragmentation adducts of that metabolite.
This package provides a reproducible and modular workflow for absolute microbial quantification using spike-in controls. Supports both single spike-in taxa and synthetic microbial communities with user-defined spike-in volumes and genome copy numbers. Compatible with phyloseq and TreeSummarizedExperiment (TSE) data structures. The package implements methods for spike-in validation, preprocessing, scaling factor estimation, absolute abundance conversion, bias correction, and normalization. Facilitates downstream statistical analyses with DESeq2, edgeR, and other Bioconductor-compatible methods. Visualization tools are provided via ggplot2, ggtree, and related packages. Includes detailed vignettes, case studies, and function-level documentation to guide users through experimental design, quantification, and interpretation.
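The core arithmetic behind spike-in quantification can be sketched in a few lines. The sketch assumes a single spike-in taxon added in a known quantity, and that the scaling factor is the ratio of that quantity to the spike-in's observed reads. This is a simplification for illustration and not the package's implementation. The function name, the taxon labels and all numbers are hypothetical.

```python
def absolute_abundance(read_counts, spike_taxon, spike_cells_added):
    """Convert per-taxon read counts to absolute abundances for one sample.

    Assumption: scaling factor = spike-in cells added / spike-in reads observed.
    """
    observed = read_counts.get(spike_taxon, 0)
    if observed <= 0:
        raise ValueError("spike-in taxon has no reads; cannot estimate a scaling factor")
    factor = spike_cells_added / observed
    absolute = {
        taxon: reads * factor
        for taxon, reads in read_counts.items()
        if taxon != spike_taxon          # the spike-in itself is not part of the community
    }
    return absolute, factor


# Hypothetical sample: 10,000 spike-in cells were added and 500 spike-in reads recovered.
sample = {"spike_in": 500, "taxon_A": 1500, "taxon_B": 3000}
abundances, factor = absolute_abundance(sample, "spike_in", 10_000)
print(factor, abundances)
```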
This package implements optimal matching with near-fine balance in large observational studies with the use of optimal calipers to get a sparse network. The caliper is optimal in the sense that it is as small as possible such that a matching exists. The main functions in the bigmatch package are optcal() to find the optimal caliper, optconstant() to find the optimal number of nearest neighbors, and nfmatch() to find a near-fine balance match with a caliper and a restriction on the number of nearest neighbors. Yu, R., Silber, J. H., and Rosenbaum, P. R. (2020). <doi:10.1214/19-sts699>.
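The notion of an optimal caliper, the smallest caliper for which a matching still exists, can be illustrated on a toy problem. The sketch below assumes one score per unit and pair matching of each treated unit to a distinct control. It tests feasibility with a simple augmenting-path bipartite matching and binary-searches the candidate distances. It is not the algorithm used by optcal(), and the function names are hypothetical.

```python
def has_full_matching(treated, controls, caliper):
    """True if every treated unit can get its own control within the caliper."""
    owner = {}  # control index -> treated index currently matched to it

    def try_assign(t, seen):
        for c, score in enumerate(controls):
            if c in seen or abs(treated[t] - score) > caliper:
                continue
            seen.add(c)
            # Take a free control, or re-route the unit that currently holds it.
            if c not in owner or try_assign(owner[c], seen):
                owner[c] = t
                return True
        return False

    return all(try_assign(t, set()) for t in range(len(treated)))


def optimal_caliper(treated, controls):
    """Smallest caliper admitting a full matching, or None if none exists."""
    candidates = sorted({abs(t - c) for t in treated for c in controls})
    if not candidates or not has_full_matching(treated, controls, candidates[-1]):
        return None
    lo, hi = 0, len(candidates) - 1
    while lo < hi:                       # feasibility is monotone in the caliper
        mid = (lo + hi) // 2
        if has_full_matching(treated, controls, candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


print(optimal_caliper([0.2, 0.5], [0.1, 0.25, 0.9]))
```

Only the pairwise distances need to be tried as candidate calipers, because feasibility can change only when the caliper crosses one of them.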
Analyze data from next-generation sequencing experiments on genomic samples. CLONETv2 offers a set of functions to compute allele-specific copy number and clonality from segmented data and SNP position pileup. The package also calculates the clonality of single nucleotide variants given read counts at mutated positions. The package has been developed at the laboratory of Computational and Functional Oncology, Department of CIBIO, University of Trento (Italy), under the supervision of Prof. Francesca Demichelis. References: Prandi et al. (2014) <doi:10.1186/s13059-014-0439-6>; Carreira et al. (2014) <doi:10.1126/scitranslmed.3009448>; Romanel et al. (2015) <doi:10.1126/scitranslmed.aac9511>.
This package implements choice models based on economic theory, including estimation using Markov chain Monte Carlo (MCMC), prediction, and more. Its usability is inspired by ideas from tidyverse. Models include versions of the Hierarchical Multinomial Logit and Multiple Discrete-Continuous (Volumetric) models with and without screening. The foundations of these models are described in Allenby, Hardt and Rossi (2019) <doi:10.1016/bs.hem.2019.04.002>. Models with conjunctive screening are described in Kim, Hardt, Kim and Allenby (2022) <doi:10.1016/j.ijresmar.2022.04.001>. Models with set-size variation are described in Hardt and Kurz (2020) <doi:10.2139/ssrn.3418383>.
This package provides a toolkit for the analysis and management of data for genes in the so-called "Human Leukocyte Antigen" (HLA) region. Functions extract reference data from the Anthony Nolan HLA Informatics Group/ImmunoGeneTics HLA GitHub repository (ANHIG/IMGTHLA) <https://github.com/ANHIG/IMGTHLA>, validate Genotype List (GL) Strings, convert between UNIFORMAT and GL String Code (GLSC) formats, translate HLA alleles and GLSCs across ImmunoPolymorphism Database (IPD) IMGT/HLA Database release versions, identify differences between pairs of alleles at a locus, generate customized, multi-position sequence alignments, trim and convert allele-names across nomenclature epochs, and extend existing data-analysis methods.
This package provides functionality for working with raster-like quadtrees (also called "region quadtrees"), which allow for variable-sized cells. The package allows for flexibility in the quadtree creation process. Several functions defining how to split and aggregate cells are provided, and custom functions can be written for both of these processes. In addition, quadtrees can be created using other quadtrees as "templates", so that the new quadtree's structure is identical to the template quadtree. The package also includes functionality for modifying quadtrees, querying values, saving quadtrees to a file, and calculating least-cost paths using the quadtree as a resistance surface.
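A minimal region quadtree shows how variable-sized cells arise. The sketch assumes a square grid whose side is a power of two, splits a cell when the range of its values exceeds a threshold, and aggregates each leaf by its mean. These split and aggregate rules, and all function names, are choices made for this example and not the package's API.

```python
def build_quadtree(grid, row=0, col=0, size=None, max_range=0.0):
    """Recursively split a square block until its value range is small enough."""
    if size is None:
        size = len(grid)
    values = [grid[r][c] for r in range(row, row + size) for c in range(col, col + size)]
    node = {"row": row, "col": col, "size": size}
    if size == 1 or max(values) - min(values) <= max_range:
        node["value"] = sum(values) / len(values)   # aggregate rule: mean of the cell
        return node
    half = size // 2
    # Child order: top-left, top-right, bottom-left, bottom-right.
    node["children"] = [
        build_quadtree(grid, row + dr, col + dc, half, max_range)
        for dr in (0, half)
        for dc in (0, half)
    ]
    return node


def query(node, r, c):
    """Return the value of the leaf cell that contains grid position (r, c)."""
    while "children" in node:
        half = node["size"] // 2
        index = 2 * (r >= node["row"] + half) + (c >= node["col"] + half)
        node = node["children"][index]
    return node["value"]


def count_leaves(node):
    if "children" not in node:
        return 1
    return sum(count_leaves(child) for child in node["children"])


grid = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [2, 3, 7, 7],
    [4, 9, 7, 7],
]
tree = build_quadtree(grid)
print(count_leaves(tree), query(tree, 3, 1))
```

Three of the four quadrants are uniform and stay as single large cells, while the heterogeneous bottom-left quadrant is split down to individual grid cells.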
RcppArmadillo implementation for the Matlab code of the Variational Mode Decomposition and Two-Dimensional Variational Mode Decomposition. For more information, see (i) Variational Mode Decomposition by K. Dragomiretskiy and D. Zosso in IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 531-544, Feb. 1, 2014, <doi:10.1109/TSP.2013.2288675>; (ii) Two-Dimensional Variational Mode Decomposition by Dragomiretskiy, K., Zosso, D. (2015), In: Tai, XC., Bae, E., Chan, T.F., Lysaker, M. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2015. Lecture Notes in Computer Science, vol 8932. Springer, <doi:10.1007/978-3-319-14612-6_15>.
DEsingle is an R package for differential expression (DE) analysis of single-cell RNA-seq (scRNA-seq) data. It defines and detects 3 types of differentially expressed genes between two groups of single cells, with regard to different expression status (DEs), differential expression abundance (DEa), and general differential expression (DEg). DEsingle employs a zero-inflated negative binomial model to estimate the proportion of real and dropout zeros and to define and detect the 3 types of DE genes. Results showed that DEsingle outperforms existing methods for scRNA-seq DE analysis, and can reveal different types of DE genes that are enriched in different biological functions.
Allows access to selected services that are part of the Google Adwords API <https://developers.google.com/adwords/api/docs/guides/start>. Google Adwords is an online advertising service by Google that delivers ads to users. This package offers an authentication process using OAUTH2. Currently, there are two methods of accessing the API, depending on the type of request. One method uses SOAP requests, which require building an XML structure that is then sent to the API. These are used for the ManagedCustomerService and the TargetingIdeaService. The second method builds AWQL queries for the reporting side of the Google Adwords API.
Simplify bivariate and regression analyses by automating result generation, including summary tables, statistical tests, and customizable graphs. It supports tests for continuous and dichotomous data, as well as stepwise regression for linear, logistic, and Firth penalized logistic models. While not a substitute for tailored analysis, BiVariAn accelerates workflows and is expanding features like multilingual interpretations of results. The methods for selecting significant statistical tests, as well as the predictor selection in prediction functions, are described in the works of Marc Kery (2003) <doi:10.1890/0012-9623(2003)84[92:NORDIG]2.0.CO;2> and Rainer Puhr (2017) <doi:10.1002/sim.7273>.
Implementation of the double/debiased machine learning framework of Chernozhukov et al. (2018) <doi:10.1111/ectj.12097> for partially linear regression models, partially linear instrumental variable regression models, interactive regression models and interactive instrumental variable regression models. DoubleML allows estimation of the nuisance parts in these models by machine learning methods and computation of the Neyman orthogonal score functions. DoubleML is built on top of mlr3 and the mlr3 ecosystem. The object-oriented implementation of DoubleML based on the R6 package is very flexible. More information available in the publication in the Journal of Statistical Software: <doi:10.18637/jss.v108.i03>.
This package provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated metadata and ancillary files. Individual data objects have associated system-level metadata, and data files are linked together using the OAI-ORE standard resource map, which describes the relationships between the files. The OAI-ORE standard is described at <https://www.openarchives.org/ore/>. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at <https://datatracker.ietf.org/doc/html/draft-kunze-bagit-08>.
Dates are commonly represented in many different formats: the order of day, month, and year can differ; different separators ("-", "/", or whitespace) can be used; months can be numbers, names, or abbreviations; and years can be given as two digits or four. datefixR takes dates in all these formats and converts them to R's built-in date class. If datefixR cannot standardize a date, for example because it is too malformed, the user is told which date could not be standardized and the ID of the corresponding row. datefixR also allows the imputation of missing days and months with user-controlled behavior.
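The standardization problem described above can be sketched as follows. The sketch is an independent illustration and not datefixR's algorithm. It assumes English month names, day-before-month order when both are numeric, and that two-digit years belong to the 2000s. Imputation of missing days and months is left out. All function names are hypothetical.

```python
import re
from datetime import date

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {}
for number, name in enumerate(MONTH_NAMES, start=1):
    MONTHS[name] = number        # full name
    MONTHS[name[:3]] = number    # three-letter abbreviation


def standardize(text):
    """Return a datetime.date, or None when the text cannot be interpreted."""
    parts = [p for p in re.split(r"[\s/.,-]+", text.strip().lower()) if p]
    if len(parts) != 3:
        return None
    if len(parts[0]) == 4 and parts[0].isdigit():
        year, month, day = parts          # year first, e.g. 2021-03-04
    elif parts[0] in MONTHS:
        month, day, year = parts          # month name first, e.g. March 4, 2021
    else:
        day, month, year = parts          # assumed default: day first
    month = MONTHS.get(month, month)
    try:
        year, month, day = int(year), int(month), int(day)
        if year < 100:
            year += 2000                  # assumption: two-digit years are 20xx
        return date(year, month, day)
    except ValueError:                    # non-numeric piece or impossible date
        return None


def standardize_column(rows):
    """Split (row_id, text) pairs into parsed dates and reported failures."""
    fixed, failures = {}, []
    for row_id, text in rows:
        parsed = standardize(text)
        if parsed is None:
            failures.append((row_id, text))
        else:
            fixed[row_id] = parsed
    return fixed, failures


rows = [(1, "2021-03-04"), (2, "04/03/2021"), (3, "4 Mar 21"), (4, "31/02/2021")]
fixed, failures = standardize_column(rows)
print(fixed)
print("could not standardize:", failures)
```

Returning the failures with their row IDs mirrors the reporting behavior described above, so that malformed entries can be corrected at the source.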
Lactation curve modeling plays a central role in dairy production, supporting management decisions and the selection of animals with superior productivity and resilience. The package EMOTIONS fits 47 models for lactation curves and creates ensemble models using model averaging based on the Akaike information criterion, the Bayesian information criterion, root mean square percentage error, mean squared error, variance of the predictions, cosine similarity for each model's predictions, and Bayesian model averaging. The daily production values predicted through the ensemble models can be used to estimate resilience indicators in the package. Additionally, the package allows the graphical visualization of the model ranks and the predicted lactation curves.
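Information-criterion model averaging can be illustrated with the standard Akaike weights, in which each model's weight is proportional to exp(-0.5 × (AIC − best AIC)). The description above does not say how the package turns each criterion into weights, so the sketch below is a generic example. The function names, AIC values and predictions are hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Standard Akaike weights: proportional to exp(-0.5 * (AIC - min AIC))."""
    best = min(aic_values)
    raw = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(raw)
    return [value / total for value in raw]


def ensemble_prediction(predictions, weights):
    """Weighted average of per-model prediction series of equal length."""
    n_points = len(predictions[0])
    return [
        sum(w * series[t] for w, series in zip(weights, predictions))
        for t in range(n_points)
    ]


# Hypothetical AIC values and daily milk-yield predictions (kg) from three models.
aic = [100.0, 102.0, 110.0]
predictions = [
    [30.0, 32.0, 31.0],
    [29.0, 33.0, 30.0],
    [35.0, 36.0, 34.0],
]
weights = akaike_weights(aic)
print([round(w, 3) for w in weights])
print([round(y, 2) for y in ensemble_prediction(predictions, weights)])
```

Subtracting the smallest AIC before exponentiating leaves the weights unchanged but avoids numerical underflow when AIC values are large.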
We implement various classical tests for the composite hypothesis of fit to the family of gamma distributions, such as the Kolmogorov-Smirnov test, the Cramer-von Mises test, the Anderson-Darling test and the Watson test. For each test a parametric bootstrap procedure is implemented, as considered in Henze, Meintanis & Ebner (2012) <doi:10.1080/03610926.2010.542851>. The more recent procedures presented in Henze, Meintanis & Ebner (2012) <doi:10.1080/03610926.2010.542851> and Betsch & Ebner (2019) <doi:10.1007/s00184-019-00708-7> are also implemented. Estimation of the parameters of the gamma law is implemented using the method of Bhattacharya (2001) <doi:10.1080/00949650108812100>.
Converts table-like objects to stand-alone PDF or PNG. Can be used to embed tables and arbitrary content in PDF or Word documents. Provides a low-level R interface for creating LaTeX code, e.g. command(), and a high-level interface for creating PDF documents, e.g. as.pdf.data.frame(). Extensive customization is available via mid-level functions, e.g. as.tabular(). See also package?latexpdf. Support for PNG is experimental; see as.png.data.frame. Adapted from metrumrg <https://r-forge.r-project.org/R/?group_id=1215>. Requires a compatible installation of pdflatex, e.g. <https://miktex.org/>.