Core visualizations and summaries for the CRAN package database. The package provides comprehensive methods for cleaning up and organizing the information in the CRAN package database, for building package directives networks (depends, imports, suggests, enhances, linking to) and collaboration networks, producing package dependence trees, and for computing useful summaries and producing interactive visualizations from the resulting networks and summaries. The resulting networks can be coerced to igraph <https://CRAN.R-project.org/package=igraph> objects for further analyses and modelling.
Utility functions that help with common base-R problems relating to lists. Lists in base-R are very flexible. This package provides functions to quickly and easily characterize types of lists. That is, to identify if all elements in a list are null, data.frames, lists, or fully named lists. Other functionality is provided for the handling of lists, such as the easy splitting of lists into equally sized groups, and the unnesting of data.frames within fully named lists.
Defines window or bin boundaries for the analysis of genomic data. Boundaries are based on the inflection points of a cubic smoothing spline fitted to the raw data. Along with defining boundaries, a technique to evaluate results obtained from unequally-sized windows is provided. Applications are particularly pertinent for, though not limited to, genome scans for selection based on variability between populations (e.g. using Wright's fixations index, Fst, which measures variability in subpopulations relative to the total population).
High dimensional longitudinal data analysis with Markov Chain Monte Carlo(MCMC). Currently support mixed effect regression with or without missing observations by considering covariance structures. It provides estimates by missing at random and missing not at random assumptions. In this R package, we present Bayesian approaches that statisticians and clinical researchers can easily use. The functions methodology is based on the book "Bayesian Approaches in Oncology Using R and OpenBUGS
" by Bhattacharjee A (2020) <doi:10.1201/9780429329449-14>.
Perform two linear combination methods for biomarkers: (1) Empirical performance optimization for specificity (or sensitivity) at a controlled sensitivity (or specificity) level of Huang and Sanda (2022) <doi:10.1214/22-aos2210>, and (2) weighted maximum score estimator with empirical minimization of averaged false positive rate and false negative rate. Both adopt the algorithms of Huang and Sanda (2022) <doi:10.1214/22-aos2210>. MOSEK solver is used and needs to be installed; an academic license for MOSEK is free.
This package provides tools for the analysis of population differences using the Major Histocompatibility Complex (MHC) genotypes of samples having a variable number of alleles (1-4) recorded for each individual. A hierarchical Dirichlet-Multinomial model on the genotype counts is used to pool small samples from multiple populations for pairwise tests of equality. Bayesian inference is implemented via the rstan package. Bootstrapped and posterior p-values are provided for chi-squared and likelihood ratio tests of equal genotype probabilities.
Handling the microclimatic data in R. The myClim
workflow begins at the reading data primary from microclimatic dataloggers, but can be also reading of meteorological station data from files. Cleaning time step, time zone settings and metadata collecting is the next step of the work flow. With myClim
tools one can crop, join, downscale, and convert microclimatic data formats, sort them into localities, request descriptive characteristics and compute microclimatic variables. Handy plotting functions are provided with smart defaults.
NEON observational data are provided via the NEON Data Portal <https://www.neonscience.org> and NEON API, and can be downloaded and reformatted by the neonUtilities
package. NEON observational data (human-observed measurements, and analyses derived from human-collected samples, such as tree diameters and algal chemistry) are published in a format consisting of one or more tabular data files. This package provides tools for performing common operations on NEON observational data, including checking for duplicates and joining tables.
This package provides a comprehensive set of indexes and tests for social segregation analysis, as described in Tivadar (2019) - OasisR
': An R Package to Bring Some Order to the World of Segregation Measurement <doi:10.18637/jss.v089.i07>. The package is the most complete existing tool and it clarifies many ambiguities and errors regarding the definition of segregation indices. Additionally, OasisR
introduces several resampling methods that enable testing their statistical significance (randomization tests, bootstrapping, and jackknife methods).
We provide an R interface to OpenML.org
which is an online machine learning platform where researchers can access open data, download and upload data sets, share their machine learning tasks and experiments and organize them online to work and collaborate with other researchers. The R interface allows to query for data sets with specific properties, and allows the downloading and uploading of data sets, tasks, flows and runs. See <https://www.openml.org/guide/api> for more information.
The R package povmap supports small area estimation of means and poverty headcount rates. It adds several new features to the emdi package (see "The R Package emdi for Estimating and Mapping Regionally Disaggregated Indicators" by Kreutzmann et al. (2019) <doi:10.18637/jss.v091.i07>). These include new options for incorporating survey weights, ex-post benchmarking of estimates, two additional transformations, several new convenient functions to assist with reporting results, and a wrapper function to facilitate access from Stata'.
This package provides functions to aid in micro and macro economic analysis and handling of price and currency data. Includes extraction of relevant inflation and exchange rate data from World Bank API, data cleaning/parsing, and standardisation. Inflation adjustment calculations as found in Principles of Macroeconomics by Gregory Mankiw et al (2014). Current and historical end of day exchange rates for 171 currencies from the European Central Bank Statistical Data Warehouse (2020) <https://sdw.ecb.europa.eu/curConverter.do>
.
Enable optimization of parameters of ordinary differential equations. Therefore, using SUNDIALS to solve the ODE-System (see Hindmarsh, Alan C., Peter N. Brown, Keith E. Grant, Steven L. Lee, Radu Serban, Dan E. Shumaker, and Carol S. Woodward. (2005) <doi:10.1145/1089014.1089020>). Furthermore, for optimization the particle swarm algorithm is used (see: Akman, Devin, Olcay Akman, and Elsa Schaefer. (2018) <doi:10.1155/2018/9160793> and Sengupta, Saptarshi, Sanchita Basak, and Richard Peters. (2018) <doi:10.3390/make1010010>).
This package provides functions to Simultaneously Infer Causal Graphs and Genetic Architecture. Includes acyclic and cyclic graphs for data from an experimental cross with a modest number (<10) of phenotypes driven by a few genetic loci (QTL). Chaibub Neto E, Keller MP, Attie AD, Yandell BS (2010) Causal Graphical Models in Systems Genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Annals of Applied Statistics 4: 320-339. <doi:10.1214/09-AOAS288>.
Based on the illness-death model a large number of clinical trials with oncology endpoints progression-free survival (PFS) and overall survival (OS) can be simulated, see Meller, Beyersmann and Rufibach (2019) <doi:10.1002/sim.8295>. The simulation set-up allows for random and event-driven censoring, an arbitrary number of treatment arms, staggered study entry and drop-out. Exponentially, Weibull and piecewise exponentially distributed survival times can be generated. The correlation between PFS and OS can be calculated.
This package provides functions for estimation (parametric, semi-parametric and non-parametric) of copula-based dependence coefficients between a finite collection of random vectors, including phi-dependence measures and Bures-Wasserstein dependence measures. An algorithm for agglomerative hierarchical variable clustering is also implemented. Following the articles De Keyser & Gijbels (2024) <doi:10.1016/j.jmva.2024.105336>, De Keyser & Gijbels (2024) <doi:10.1016/j.ijar.2023.109090>, and De Keyser & Gijbels (2024) <doi:10.48550/arXiv.2404.07141>
.
cogeqc aims to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. cogeqc can be used to asses: i. genome assembly and annotation quality with BUSCOs and comparisons of statistics with publicly available genomes on the NCBI; ii. orthogroup inference using a protein domain-based approach and; iii. synteny detection using synteny network properties. There are also data visualization functions to explore QC summary statistics.
The HOPACH clustering algorithm builds a hierarchical tree of clusters by recursively partitioning a data set, while ordering and possibly collapsing clusters at each level. The algorithm uses the Mean/Median Split Silhouette (MSS) criteria to identify the level of the tree with maximally homogeneous clusters. It also runs the tree down to produce a final ordered list of the elements. The non-parametric bootstrap allows one to estimate the probability that each element belongs to each cluster (fuzzy clustering).
Package nethet is an implementation of statistical solid methodology enabling the analysis of network heterogeneity from high-dimensional data. It combines several implementations of recent statistical innovations useful for estimation and comparison of networks in a heterogeneous, high-dimensional setting. In particular, we provide code for formal two-sample testing in Gaussian graphical models (differential network and GGM-GSA; Stadler and Mukherjee, 2013, 2014) and make a novel network-based clustering algorithm available (mixed graphical lasso, Stadler and Mukherjee, 2013).
This package provides a daemon for checking running and not running processes. It reads the /proc
directory every n seconds and does a POSIX regexp on the process names. The daemon runs a user-provided script when it detects a program in the running processes, or an alternate script if it doesn't detect the program. The daemon can only be called by the root user, but can use sudo -u user
in the process called if needed.
Oscope is a oscillatory genes identifier in unsynchronized single cell RNA-seq. This statistical pipeline has been developed to identify and recover the base cycle profiles of oscillating genes in an unsynchronized single cell RNA-seq experiment. The Oscope pipeline includes three modules: a sine model module to search for candidate oscillator pairs; a K-medoids clustering module to cluster candidate oscillators into groups; and an extended nearest insertion module to recover the base cycle order for each oscillator group.
This package provides an optimization method based on sequential quadratic programming for maximum likelihood estimation of the mixture proportions in a finite mixture model where the component densities are known. The algorithm is expected to obtain solutions that are at least as accurate as the state-of-the-art MOSEK interior-point solver, and they are expected to arrive at solutions more quickly when the number of samples is large and the number of mixture components is not too large.
Bisulfite-treated RNA non-conversion in a set of samples is analysed as follows : each sample's non-conversion distribution is identified to a Poisson distribution. P-values adjusted for multiple testing are calculated in each sample. Combined non-conversion P-values and standard errors are calculated on the intersection of the set of samples. For further details, see C Legrand, F Tuorto, M Hartmann, R Liebers, D Jakob, M Helm and F Lyko (2017) <doi:10.1101/gr.210666.116>.
The bootstrap ARDL tests for cointegration is the main functionality of this package. It also acts as a wrapper of the most commond ARDL testing procedures for cointegration: the bound tests of Pesaran, Shin and Smith (PSS; 2001 - <doi:10.1002/jae.616>) and the asymptotic test on the independent variables of Sam, McNown
and Goh (SMG: 2019 - <doi:10.1016/j.econmod.2018.11.001>). Bootstrap and bound tests are performed under both the conditional and unconditional ARDL models.