Nonparametric kernel distribution function estimation is performed. Three bandwidth selectors are implemented: the plug-in selectors of Altman and Leger and of Polansky and Baker, and the cross-validation selector of Bowman, Hall and Prvan. The exceedance function, the mean return period and the return level are also computed. For details, see Quintela-del-Río and Estévez-Pérez (2012) <doi:10.18637/jss.v050.i08>.
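As a rough illustration of the quantities named above, the following Python sketch evaluates a kernel distribution function estimate with a Gaussian kernel, together with the exceedance function and the mean return period. It is not the package's code: the function names are hypothetical, the bandwidth is passed in by hand rather than chosen by any of the three selectors, and the mean return period is assumed to follow the usual definition 1 / (1 - F(x)).

```python
import math


def kernel_cdf(x, sample, bandwidth):
    """Kernel distribution estimate: the average of Gaussian CDFs centred on each observation."""
    def gaussian_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return sum(gaussian_cdf((x - xi) / bandwidth) for xi in sample) / len(sample)


def exceedance(x, sample, bandwidth):
    """Estimated probability that an observation exceeds x."""
    return 1.0 - kernel_cdf(x, sample, bandwidth)


def mean_return_period(x, sample, bandwidth):
    """Assumed definition: expected number of observations between exceedances of x."""
    return 1.0 / exceedance(x, sample, bandwidth)
```

A smaller bandwidth makes the estimate track the empirical distribution function more closely, while a larger one smooths it; choosing that trade-off from the data is what the bandwidth selectors are for.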
This package provides methods to extract information on pathways, genes and various single-nucleotide polymorphisms (SNPs) from online databases. It provides functions for data preparation and evaluation of genetic influence on a binary outcome using the logistic kernel machine test (LKMT). Three different kernel functions are offered to analyze genotype information in this variance component test: a linear kernel, a size-adjusted kernel and a network-based kernel.
This package performs meta-analysis and meta-regression using standard and robust methods with confidence intervals based on the profile likelihood. Robust methods are based on alternative distributions for the random effect, either the t-distribution (Lee and Thompson, 2008 <doi:10.1002/sim.2897> or Baker and Jackson, 2008 <doi:10.1007/s10729-007-9041-8>) or mixtures of normals (Beath, 2014 <doi:10.1002/jrsm.1114>).
Function pip3d() tests whether a point in 3D space is within, exactly on, or outside an enclosed surface defined by a triangular mesh. Function pip2d() tests whether a point in 2D space is within, exactly on, or outside a polygon. For a reference, see: Liu et al., A new point containment test algorithm based on preprocessing and determining triangles, Computer-Aided Design 42(12):1143-1150.
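The 2D test can be sketched in Python as follows. This is not the algorithm of Liu et al. and not the package's interface: it uses plain ray casting with an explicit boundary check, and the function name, the return codes (1 inside, 0 on the boundary, -1 outside) and the tolerance `eps` are assumptions made for illustration.

```python
def point_in_polygon(point, vertices, eps=1e-12):
    """Classify a point against a simple polygon given as a list of (x, y) vertices.

    Returns 1 if the point is inside, 0 if it lies on an edge, -1 if outside.
    """
    px, py = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Boundary check: the point is collinear with the edge (absolute
        # tolerance, so eps should suit the coordinate scale) and lies
        # within the edge's bounding box.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if (abs(cross) <= eps
                and min(x1, x2) - eps <= px <= max(x1, x2) + eps
                and min(y1, y2) - eps <= py <= max(y1, y2) + eps):
            return 0
        # Ray casting: toggle when a horizontal ray going right from the
        # point crosses this edge.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside
    return 1 if inside else -1
```

An odd number of crossings means the point is inside. The half-open comparison `(y1 > py) != (y2 > py)` keeps a ray that passes exactly through a vertex from being counted twice.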
The goal of safejoin is to guarantee that when performing joins extra rows are not added to your data. safejoin provides a wrapper around dplyr::left_join that will raise an error when extra rows are unexpectedly added to your data. This can be useful when working with data where you expect there to be a many to one relationship but you are not certain the relationship holds.
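The idea of a guarded left join can be illustrated outside R with a small Python sketch over lists of dictionaries. It does not use dplyr or the safejoin API; the function name `safe_left_join` and its arguments are hypothetical, and the only behaviour it mirrors is the check described above: fail loudly if the join returns more rows than the left table had.

```python
def safe_left_join(left, right, key):
    """Left-join two lists of dicts on `key`; raise if the result has extra rows."""
    # Index the right-hand rows by key so duplicates are easy to spot.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        # Unmatched left rows are kept as they are (left-join semantics).
        for match in index.get(row[key], [{}]):
            joined.append({**row, **match})

    if len(joined) > len(left):
        raise ValueError(
            f"join added {len(joined) - len(left)} row(s): "
            f"'{key}' is not unique in the right table"
        )
    return joined
```

With a many-to-one relationship every left row matches at most one right row, so the row count is unchanged. A duplicated key on the right violates that assumption and triggers the error instead of silently inflating the data.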
This package provides functions to estimate a strategic selection estimator. A strategic selection estimator is an agent error model in which the two random components are not assumed to be orthogonal. In addition this package provides generic functions to print and plot objects of its class as well as the necessary functions to create tables for LaTeX. There is also a function to create dyadic data sets.
Species sensitivity distributions are cumulative probability distributions which are fitted to toxicity concentrations for different species as described by Posthuma et al. (2001) <isbn:9781566705783>. The ssdtools package uses maximum likelihood to fit distributions such as the gamma, log-logistic, log-normal and log-normal log-normal mixture. Multiple distributions can be averaged using Akaike Information Criteria. Confidence intervals on hazard concentrations and proportions are produced by bootstrapping.
This package performs parametric synthesis of sounds with harmonic and noise components such as animal vocalizations or human voice. Also offers tools for audio manipulation and acoustic analysis, including pitch tracking, spectral analysis, audio segmentation, pitch and formant shifting, etc. Includes four interactive web apps for synthesizing and annotating audio, manually correcting pitch contours, and measuring formant frequencies. Reference: Anikin (2019) <doi:10.3758/s13428-018-1095-7>.
This package provides a novel feature-wise normalization method based on a zero-inflated negative binomial model. This method assumes that the effects of sequencing depth vary for each taxon on their mean and also incorporates a rational link of zero probability and taxon dispersion as a function of sequencing depth. Ziyue Wang, Dillon Lloyd, Shanshan Zhao, Alison Motsinger-Reif (2023) <doi:10.1101/2023.10.31.563648>.
This package performs maximum likelihood based estimation and inference on time to event data, possibly subject to non-informative right censoring. FitParaSurv() provides maximum likelihood estimates of model parameters and distributional characteristics, including the mean, median, variance, and restricted mean. CompParaSurv() compares the mean, median, and restricted mean survival experiences of two treatment groups. Candidate distributions include the exponential, gamma, generalized gamma, log-normal, and Weibull.
This package implements a generic reference Bayesian analysis of unidimensional mixture distributions obtained by a location-scale parameterisation of the model. The included functions simulate and summarize posterior samples for location-scale mixture models using a weakly informative prior. There is no need to define priors for the location-scale parameters, apart from two hyperparameters, which are associated with a Dirichlet prior for the weights and a simplex.
squallms is a Bioconductor R package that implements a "semi-labeled" approach to untargeted mass spectrometry data. It pulls in raw data from mass-spec files to calculate several metrics that are then used to label MS features in bulk as high or low quality. These metrics of peak quality are then passed to a simple logistic model that produces a fully-labeled dataset suitable for downstream analysis.
This package offers functions to process multiple ChIP-seq BAM files and detect allele-specific events. It computes allele counts at individual variants (SNPs/SNVs), implements extensive QC (quality control) steps to remove problematic variants, and utilizes a Bayesian framework to identify statistically significant allele-specific events. BaalChIP is able to account for copy number differences between the two alleles, a known phenotypical feature of cancer samples.
The first day of any MMWR week is Sunday. MMWR week numbering is sequential beginning with 1 and incrementing with each week to a maximum of 52 or 53. MMWR week #1 of an MMWR year is the first week of the year that has at least four days in the calendar year. This package provides functionality to convert dates to MMWR day, week, and year and the reverse.
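The rules above can be sketched in Python using only the standard library. This is an illustration rather than the package's implementation, and the function names are hypothetical. It relies on one observation: a Sunday-to-Saturday week has at least four days in the calendar year exactly when it contains January 4, so MMWR week 1 starts on the Sunday on or before January 4.

```python
from datetime import date, timedelta


def mmwr_year_start(year):
    """Sunday that begins MMWR week 1 of `year` (the week containing January 4)."""
    jan4 = date(year, 1, 4)
    # date.weekday() is Monday=0 ... Sunday=6, so this counts days since Sunday.
    days_since_sunday = (jan4.weekday() + 1) % 7
    return jan4 - timedelta(days=days_since_sunday)


def date_to_mmwr(d):
    """Convert a date to (MMWR year, MMWR week, MMWR day), with day 1 = Sunday."""
    year = d.year
    if d >= mmwr_year_start(year + 1):
        year += 1  # late-December dates can belong to week 1 of the next year
    elif d < mmwr_year_start(year):
        year -= 1  # early-January dates can belong to the last week of the previous year
    offset = (d - mmwr_year_start(year)).days
    return year, offset // 7 + 1, offset % 7 + 1


def mmwr_to_date(year, week, day=1):
    """Reverse conversion: MMWR year, week and day back to a calendar date."""
    return mmwr_year_start(year) + timedelta(weeks=week - 1, days=day - 1)
```

For example, `date_to_mmwr(date(2021, 1, 1))` gives `(2020, 53, 6)`: 1 January 2021 was a Friday in a week with only two days in 2021, so it belongs to the 53rd week of MMWR year 2020.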
Phylogenetic clustering (phyloclustering) is an evolutionary continuous time Markov Chain model-based approach to identify population structure from molecular data without assuming linkage equilibrium. The package phyclust provides a convenient implementation of phyloclustering for DNA and SNP data, capable of clustering individuals into subpopulations and identifying molecular sequences representative of those subpopulations. It is designed in C for performance and interfaced with R for visualization.
Nonparametric data-driven approach to discovering heterogeneous subgroups in a selection-on-observables framework. aggTrees allows researchers to assess whether there exists relevant heterogeneity in treatment effects by generating a sequence of optimal groupings, one for each level of granularity. For each grouping, we obtain point estimation and inference about the group average treatment effects. Please reference the use as Di Francesco (2022) <doi:10.2139/ssrn.4304256>.
With appRiori <doi:10.1177/25152459241293110>, users upload the research variables and the app guides them to the best set of comparisons fitting the hypotheses, for both main and interaction effects. Through a graphical explanation and empirical examples on reproducible data, it is shown that it is possible to understand both the logic behind the planned comparisons and the way to interpret them when a model is tested.
This package allows estimation and prediction for binary Gaussian process models. The mean function can be assumed to have a time-series structure. The estimation methods for the unknown parameters are based on penalized quasi-likelihood/penalized quasi-partial likelihood and restricted maximum likelihood. The predicted probability and its confidence interval are computed by the Metropolis-Hastings algorithm. More details can be found in Sung et al. (2017) <arXiv:1705.02511>.
Network meta-analysis and meta-regression (allows including up to three covariates) for individual participant data, aggregate data, and mixtures of both formats using the three-level hierarchical model. Each format can come from randomized controlled trials or non-randomized studies or mixtures of both. Estimates are generated in a Bayesian framework using JAGS. The implemented models are described by Hamza et al. 2023 <DOI:10.1002/jrsm.1619>.
This package provides a utility to quickly obtain clean and tidy college football data. It serves as a wrapper around the <https://collegefootballdata.com/> API and provides functions to access live play-by-play and box score data from ESPN <https://www.espn.com> when available. It gives users access to a plethora of endpoints and lets them supplement that data with additional information (Expected Points Added/Win Probability Added).
This package provides functions to facilitate the use of the ff package in interaction with big data in SQL databases (e.g. in Oracle, MySQL, PostgreSQL, Hive) by allowing easy importing directly into ffdf objects using DBI, RODBC and RJDBC. Also contains some basic utility functions to do fast left outer join merging based on match, factorisation of data and a basic function for re-coding vectors.
This package provides a suite of bootstrap-based models and tools for analyzing fish stocks and aquatic populations. Designed for ecologists and fisheries scientists, it supports data from length-frequency distributions, tag-and-recapture studies, and hard structure readings (e.g., otoliths). See Schwamborn et al., 2019 <doi:10.1016/j.ecolmodel.2018.12.001> for background. The package includes functions for bootstrapped fitting of growth curves and plotting.
Read data files readable by gnumeric into R. Can read a whole sheet or a range, from several file formats, including the native format of gnumeric. Reading is done by using ssconvert (a file converter utility included in the gnumeric distribution <http://www.gnumeric.org>) to convert the requested part to CSV. For gnumeric files (but not other formats), it can also list sheet names and sheet sizes or read all sheets.
This repository aims to contribute to the production of econometric models with Colombian data by providing a set of web-scraping functions for some of the main macro-financial indicators. All the sources are public and free, but the advantage of these functions is that they directly download and harmonize the information in R's environment. No need to import or download additional files. You only need an internet connection!