Package to predict protein-protein interaction (PPI) networks in target organisms for which only little information about PPIs is available. Path2PPI predicts PPI networks based on sets of proteins, which can belong to a certain pathway, from well-established model organisms. It helps to combine and transfer information on a certain pathway or biological process from several reference organisms to one target organism. Path2PPI depends only on the sequence similarity of the involved proteins.
Large data files can be difficult to work with in R, where data generally resides in memory. This package encourages a style of programming where data is streamed from disk into R via a producer and through a series of consumers that typically reduce the original data to a manageable size. The package provides useful Producer and Consumer stream components for operations such as data input, sampling, indexing, and transformation; see package?Streamer for details.
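The producer/consumer style described above can be illustrated outside R. The following Python sketch is not the Streamer API: the names `produce_chunks`, `sample_every`, and `running_mean` are hypothetical and only show how chunked input can be reduced step by step without holding the full data in memory.

```python
import io
from typing import Iterable, Iterator, List


def produce_chunks(handle: io.TextIOBase, chunk_size: int) -> Iterator[List[float]]:
    """Producer: yield successive chunks of numbers read line by line."""
    chunk: List[float] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        chunk.append(float(line))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


def sample_every(chunks: Iterable[List[float]], step: int) -> Iterator[List[float]]:
    """Consumer: keep every `step`-th value of each chunk (simple downsampling)."""
    for chunk in chunks:
        yield chunk[::step]


def running_mean(chunks: Iterable[List[float]]) -> float:
    """Final consumer: reduce the stream to a single summary value."""
    total, count = 0.0, 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# An in-memory file stands in for a large file on disk (illustrative data only).
data = io.StringIO("\n".join(str(i) for i in range(1, 11)))
stream = sample_every(produce_chunks(data, chunk_size=4), step=2)
result = running_mean(stream)
```

Because each stage is a generator, only one chunk is in memory at a time; the chunk size and sampling step are arbitrary choices for the example.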
Uniquorn enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. In this package, identification is based on the locations (loci) of somatic and germline mutations or variations. The input format is VCF (.vcf or .vcf.gz), and each file has to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the VCF file).
This package finds SNV/indel differences between two closely related BAM files by pairwise comparison at each base position across the genomic region of interest. Differences are inferred by a Fisher test and Euclidean distance, whose inputs are the base counts (A, T, G, C) at a given position and the read counts for indels that span no less than 2 bp on both sides of the indel region.
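As a rough illustration of the Euclidean-distance part of this comparison only, the Python sketch below compares the base counts of two samples at a single position. It is not the package's implementation: the function names are hypothetical, and normalising counts to frequencies before computing the distance is an assumption made here so that different read depths remain comparable.

```python
import math
from typing import Dict, List

BASES = ("A", "T", "G", "C")


def base_frequencies(counts: Dict[str, int]) -> List[float]:
    """Convert raw base counts at one position into relative frequencies."""
    total = sum(counts.get(base, 0) for base in BASES)
    if total == 0:
        return [0.0] * len(BASES)
    return [counts.get(base, 0) / total for base in BASES]


def euclidean_distance(counts_a: Dict[str, int], counts_b: Dict[str, int]) -> float:
    """Euclidean distance between the base-frequency vectors of two samples."""
    freq_a = base_frequencies(counts_a)
    freq_b = base_frequencies(counts_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(freq_a, freq_b)))


# Illustrative counts: sample 1 is all A, sample 2 is half A and half G.
sample_1 = {"A": 30, "T": 0, "G": 0, "C": 0}
sample_2 = {"A": 15, "T": 0, "G": 15, "C": 0}
distance = euclidean_distance(sample_1, sample_2)
```

A distance of zero means the two samples show the same base composition at that position; larger values indicate a candidate difference that a statistical test would then have to confirm.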
This package contains functions to perform Bayesian inference using posterior simulation for a number of statistical models. Most simulation is done in compiled C++ written in the Scythe Statistical Library. All models return coda mcmc objects that can then be summarized using the coda package. Some useful utility functions such as density functions, pseudo-random number generators for statistical distributions, a general purpose Metropolis sampling algorithm, and tools for visualization are provided.
Designed to help health economic modellers when building and reviewing models. The visualisation functions allow users to more easily review the network of functions in a project, and get lay summaries of them. The asserts included are intended to check for common errors, thereby freeing up time for modellers to focus on tests specific to the individual model in development or review. For more details see Smith and colleagues (2024) <doi:10.12688/wellcomeopenres.23180.1>.
This package provides functions to get and download city bike data from the website and API service of each city bike service in Norway. The package aims to reduce time spent on getting Norwegian city bike data, and lower barriers to start analyzing it. The data is retrieved from Oslo City Bike, Bergen City Bike, and Trondheim City Bike. The data is made available under NLOD 2.0 <https://data.norge.no/nlod/en/2.0>.
Skinfold measurement is one of the most popular and practical methods for estimating percent body fat. Body composition is a term that describes the relative proportions of fat, bone, and muscle mass in the human body. Following the collection of skinfold measurements, regression analysis (a statistical procedure used to predict a dependent variable based on one or more independent or predictor variables) is used to estimate total percent body fat in humans. <doi:10.4324/9780203868744>.
Estimate one or two cutpoints of a metric or ordinal-scaled variable in the multivariable context of survival data or time-to-event data. Visualise the cutpoint estimation process using contour plots, index plots, and spline plots. It is also possible to estimate cutpoints based on the assumption of a U-shaped or inverted U-shaped relationship between the predictor and the hazard ratio. Govindarajulu, U., and Tarpey, T. (2022) <doi:10.1080/02664763.2020.1846690>.
This package performs parallel analysis (Timmerman & Lorenzo-Seva, 2011 <doi:10.1037/a0023353>) and the hull method (Lorenzo-Seva, Timmerman, & Kiers, 2011 <doi:10.1080/00273171.2011.564527>) for assessing the dimensionality of a set of variables using minimum rank factor analysis (see ten Berge & Kiers, 1991 <doi:10.1007/BF02294464> for more information). The package also includes the option to compute minimum rank factor analysis by itself, as well as the greater lower bound calculation.
Easy-to-use, very fast implementation of various functional bases that can easily be used together with other packages. A functional basis is a collection of basis functions [\phi_1, ..., \phi_n] that can represent a smooth function, i.e. $f(t) = \sum_{k=1}^{n} c_k \phi_k(t)$. First- and second-order derivatives are also included; they are computed exactly, with no approximations applied. As of version 1.1, this package includes B-splines, Fourier bases and polynomials.
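To make the basis expansion concrete, the following Python sketch evaluates a function of the form f(t) = sum of c_k phi_k(t) and its exact first and second derivatives for a monomial basis phi_k(t) = t^k. It is an illustration only, not this package's interface: the function names are hypothetical, and the basis is indexed from k = 0 here rather than from 1 as in the description.

```python
from typing import List, Sequence


def monomial_basis(t: float, n: int, deriv: int = 0) -> List[float]:
    """Evaluate the deriv-th derivative of phi_k(t) = t**k for k = 0..n-1."""
    values: List[float] = []
    for k in range(n):
        if k < deriv:
            # Differentiating t**k more than k times gives zero.
            values.append(0.0)
        else:
            # d^m/dt^m t^k = k (k-1) ... (k-m+1) t^(k-m), computed exactly.
            factor = 1.0
            for j in range(deriv):
                factor *= k - j
            values.append(factor * t ** (k - deriv))
    return values


def evaluate(coefs: Sequence[float], t: float, deriv: int = 0) -> float:
    """Evaluate sum_k c_k * phi_k^(deriv)(t) for the monomial basis."""
    basis = monomial_basis(t, len(coefs), deriv)
    return sum(c * phi for c, phi in zip(coefs, basis))


coefs = [1.0, 2.0, 3.0]             # f(t) = 1 + 2 t + 3 t^2
f_value = evaluate(coefs, 2.0)      # f(2)
f_prime = evaluate(coefs, 2.0, 1)   # f'(t) = 2 + 6 t at t = 2
f_second = evaluate(coefs, 2.0, 2)  # f''(t) = 6
```

The derivatives come from the closed-form derivative of each basis function rather than from finite differences, which is the sense in which no approximation is involved.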
This package provides complete, detailed preprocessing of two-dimensional gas chromatogram (GCxGC) samples, including baseline correction, smoothing, peak detection, and peak alignment. Also provided are some analysis functions, such as finding extracted ion chromatograms, finding mass spectral data, targeted analysis, and nontargeted analysis with either the National Institute of Standards and Technology Mass Spectral Library or with the mass data. There are also several visualization methods provided for each step of the preprocessing and analysis.
Reads data collected from wearable accelerometers as used in sleep and physical activity research. Currently supports file formats: binary data from GENEActiv <https://activinsights.com/>, .bin-format from GENEA devices (not for sale), and .cwa-format from Axivity <https://axivity.com>. Further, it has functions for reading text files with epoch level aggregates from Actical, Fitbit, Actiwatch, ActiGraph, and PhilipsHealthBand. Primarily designed to complement R package GGIR <https://CRAN.R-project.org/package=GGIR>.
This package provides a declarative language for specifying multilevel models, solving for population parameters based on specified variance-explained effect size measures, generating data, and conducting power analyses to determine sample size recommendations. The specification allows for any number of within-cluster effects, between-cluster effects, covariate effects at either level, and random coefficients. Moreover, the models do not assume orthogonal effects: predictors can correlate at either level, and models with multiple interaction effects are accommodated.
Makes it possible to create an internally consistent repository consisting of selected packages from CRAN-like repositories. The user specifies a set of desired packages, and miniCRAN recursively reads the dependency tree for these packages, then downloads only this subset. The user can then install packages from this repository directly, rather than from CRAN. This is useful in production settings, e.g. a server behind a firewall, or remote locations with slow (or zero) Internet access.
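The recursive dependency step can be pictured as a transitive closure over a package index. The Python sketch below is a language-neutral illustration, not miniCRAN's code or API: `dependency_closure` and the toy index with its package names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def dependency_closure(requested: Iterable[str],
                       dependencies: Dict[str, List[str]]) -> Set[str]:
    """Return the requested packages plus everything they depend on, recursively."""
    seen: Set[str] = set(requested)
    queue = deque(seen)
    while queue:
        pkg = queue.popleft()
        for dep in dependencies.get(pkg, ()):
            if dep not in seen:  # visit each package once, even in shared subtrees
                seen.add(dep)
                queue.append(dep)
    return seen


# Toy index: package name -> direct dependencies (made-up names).
toy_index = {
    "pkgA": ["pkgB", "pkgC"],
    "pkgB": ["pkgD"],
    "pkgC": ["pkgD"],
    "pkgD": [],
    "pkgE": ["pkgA"],
}
subset = sorted(dependency_closure(["pkgA"], toy_index))
```

Only the packages reachable from the requested set end up in the subset; `pkgE` depends on `pkgA` but is not itself required, so it is left out.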
Facilitates frequentist and Bayesian meta-analysis of diagnosis and prognosis research studies. It includes functions to summarize multiple estimates of prediction model discrimination and calibration performance (Debray et al., 2019) <doi:10.1177/0962280218785504>. It also includes functions to evaluate funnel plot asymmetry (Debray et al., 2018) <doi:10.1002/jrsm.1266>. Finally, the package provides functions for developing multivariable prediction models from datasets with clustering (de Jong et al., 2021) <doi:10.1002/sim.8981>.
This package implements the HSROC (hierarchical summary receiver operating characteristic) model developed by Ma, Lian, Chu, Ibrahim, and Chen (2018) <doi:10.1093/biostatistics/kxx025> and the hierarchical model developed by Lian, Hodges, and Chu (2019) <doi:10.1080/01621459.2018.1476239> for performing meta-analysis of 1-5 diagnostic tests, so that multiple tests can be compared simultaneously within a missing data framework. This package evaluates the accuracy of multiple diagnostic tests and also provides graphical representations of the results.
Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalises the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user who does not necessarily have experience with the R language. The package is developed within the project of the same name, PvSTATEM, which is an international project aiming for malaria elimination.
Fits a wide variety of multivariate spatio-temporal models with simultaneous and lagged interactions among variables (including vector autoregressive spatio-temporal ('VAST') dynamics) for areal, continuous, or network spatial domains. It includes time-variable, space-variable, and space-time-variable interactions using dynamic structural equation models ('DSEM') as an expressive interface, and the mgcv package to specify splines via the formula interface. See Thorson et al. (2024) <doi:10.48550/arXiv.2401.10193> for more details.
Uses a matrix layout to visualize the unique, common, or individual contribution of each predictor (or matrix of predictors) towards explained variation in different models. These contributions are derived from variation partitioning (VP) and hierarchical partitioning (HP), applying the algorithm of "Lai et al. (2022) Generalizing hierarchical and variation partitioning in multiple regression and canonical analyses using the rdacca.hp R package. Methods in Ecology and Evolution, 13: 782-788 <doi:10.1111/2041-210X.13800>".
This package facilitates the analysis of single-cell RNA-seq UMI matrices. It does this by computing partitions of a cell similarity graph into small homogeneous groups of cells, which are defined as metacells (MCs). The derived MCs are then used for building different representations of the data, allowing matrix or 2D graph visualization and forming a basis for the analysis of cell types, subtypes, transcriptional gradients, cell-cycle variation, gene modules and their regulatory models, and more.
R-based access to a large set of data variables relevant to forest ecology in British Columbia (BC), Canada. Layers are in raster format at 100m resolution in the BC Albers projection, hosted at the Federated Research Data Repository (FRDR) with <doi:10.20383/101.0283>. The collection includes: elevation; biogeoclimatic zone; wildfire; cutblocks; forest attributes from Hansen et al. (2013) <doi:10.1139/cjfr-2013-0401> and Beaudoin et al. (2017) <doi:10.1139/cjfr-2017-0184>; and rasterized Forest Insect and Disease Survey (FIDS) maps for a number of insect pest species, all covering the period 2001-2018. Users supply a polygon or point location in the province of BC, and rasterbc will download the overlapping raster tiles hosted at FRDR, merging them as needed and returning the result in R as a SpatRaster object. Metadata associated with these layers, and code for downloading them from their original sources, can be found in the GitHub repository <https://github.com/deankoch/rasterbc_src>.
Performs PCA by eigenvalue decomposition of the data correlation matrix. It automatically determines the number of factors by retaining eigenvalues greater than 1, and it returns the uncorrelated variables based on the rotated component scores, such that in each principal component the variables with high variance are selected. It will be useful for non-statisticians in the selection of variables. For more information, see the <http://www.ijcem.org/papers032013/ijcem_032013_06.pdf> web page.
Color palettes for all people, including those with color vision deficiency. Popular color palette series have been organized by type and have been scored on several properties such as color-blind-friendliness and fairness (i.e. do colors stand out equally?). User-defined palettes can also be loaded and analysed. Besides the common palette types (categorical, sequential, and diverging) it also includes cyclic and bivariate color palettes. Furthermore, a color for missing values is assigned to each palette.