Convenience functions and datasets to be used with Practical Multilevel Modeling using R. The package includes functions for calculating group means, group mean centered variables, and displaying some basic missing data information. A function for computing robust standard errors for linear mixed models based on Liang and Zeger (1986) <doi:10.1093/biomet/73.1.13> and Bell and McCaffrey (2002) <https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2002002/article/9058-eng.pdf?st=NxMjN1YZ> is included as well as a function for checking for level-one homoskedasticity (Raudenbush & Bryk, 2002, ISBN:076191904X).
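As a rough illustration of the group means and group-mean-centered variables mentioned above, here is a minimal Python sketch. It is not the package's R interface: the function names `group_means` and `group_center` and the toy data are assumptions made for this example only.

```python
from collections import defaultdict


def group_means(values, groups):
    """Return the mean of `values` within each level of `groups`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def group_center(values, groups):
    """Subtract each observation's own group mean (group mean centering)."""
    means = group_means(values, groups)
    return [value - means[group] for value, group in zip(values, groups)]


# Hypothetical level-one scores nested in two level-two units.
scores = [10.0, 12.0, 14.0, 20.0, 22.0]
school = ["a", "a", "a", "b", "b"]

means = group_means(scores, school)
centered = group_center(scores, school)
```

Within every group the centered values sum to zero, so the centered variable carries only within-group variation.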
This package provides tools for visualizing and analyzing the shape of discrete nominal frequency distributions. The package introduces centered frequency plots, in which nominal categories are ordered from the most frequent category at the center toward less frequent categories on both sides, facilitating the detection of distributional patterns such as uniformity, dominance, symmetry, skewness, and long-tail behavior. In addition, the package supports Pareto charts for the study of dominance and cumulative frequency structure in nominal data. The package is designed for exploratory data analysis and statistical teaching, offering visualizations that emphasize distributional form rather than arbitrary category ordering.
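The centered ordering described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the package's implementation: the function name `centered_order`, the rule that ties are broken alphabetically, and the choice of which side receives the second most frequent category are all made up for the example.

```python
from collections import deque


def centered_order(counts):
    """Order categories with the most frequent at the center and
    progressively rarer categories alternating to the right and left."""
    # Sort by descending frequency; break ties by name for determinism.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    layout = deque()
    for rank, name in enumerate(ranked):
        if rank % 2 == 0:
            layout.append(name)      # ranks 0, 2, 4, ... go to the right side
        else:
            layout.appendleft(name)  # ranks 1, 3, 5, ... go to the left side
    return list(layout)


# Hypothetical nominal frequencies.
counts = {"A": 5, "B": 3, "C": 2, "D": 1}
order = centered_order(counts)
```

Plotting the bars in this order places the dominant category in the middle, so a uniform distribution looks flat while a long tail shows up as a sharp central peak.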
An implementation of data analysis tools for samples of symmetric or Hermitian positive definite matrices, such as collections of covariance matrices or spectral density matrices. The tools in this package can be used to perform: (i) intrinsic wavelet transforms for curves (1D) or surfaces (2D) of Hermitian positive definite matrices with applications to dimension reduction, denoising and clustering in the space of Hermitian positive definite matrices; and (ii) exploratory data analysis and inference for samples of positive definite matrices by means of intrinsic data depth functions and rank-based hypothesis tests in the space of Hermitian positive definite matrices.
The Subsemble algorithm is a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a unique form of k-fold cross-validation to output a prediction function that combines the subset-specific fits. An oracle result provides a theoretical performance guarantee for Subsemble. The paper, "Subsemble: An ensemble method for combining subset-specific algorithm fits" is authored by Stephanie Sapp, Mark J. van der Laan & John Canny (2014) <doi:10.1080/02664763.2013.864263>.
This package provides functionalities for performing stability analysis of genotype by environment interaction (GEI) to identify superior and stable genotypes across diverse environments. It implements Eberhart and Russell's ANOVA method (1966) (<doi:10.2135/cropsci1966.0011183X000600010011x>), Finlay and Wilkinson's Joint Linear Regression method (1963) (<doi:10.1071/AR9630742>), Wricke's Ecovalence (1962, 1964), Shukla's stability variance parameter (1972) (<doi:10.1038/hdy.1972.87>), Kang's simultaneous selection for high yield and stability (1991) (<doi:10.2134/agronj1991.00021962008300010037x>), Additive Main Effects and Multiplicative Interaction (AMMI) method and Genotype plus Genotypes by Environment (GGE) Interaction methods.
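To make one of the listed statistics concrete, the sketch below computes Wricke's ecovalence for a genotype-by-environment table of means, using the standard definition W_i = sum over environments j of (y_ij - mean_i - mean_j + grand mean)^2. It is a Python illustration only; the function name `ecovalence` and the toy yields are assumptions and do not reflect the package's R functions.

```python
def ecovalence(yields):
    """Wricke's ecovalence for each genotype (row) of a complete
    genotype x environment table of mean yields."""
    n_gen = len(yields)
    n_env = len(yields[0])
    grand_mean = sum(sum(row) for row in yields) / (n_gen * n_env)
    gen_means = [sum(row) / n_env for row in yields]
    env_means = [
        sum(yields[i][j] for i in range(n_gen)) / n_gen for j in range(n_env)
    ]
    # Sum of squared interaction residuals per genotype.
    return [
        sum(
            (yields[i][j] - gen_means[i] - env_means[j] + grand_mean) ** 2
            for j in range(n_env)
        )
        for i in range(n_gen)
    ]


# Hypothetical tables: rows are genotypes, columns are environments.
additive = [[1.0, 2.0], [2.0, 3.0]]      # no interaction at all
crossover = [[1.0, 2.0], [3.0, 2.0]]     # genotypes respond differently

w_additive = ecovalence(additive)
w_crossover = ecovalence(crossover)
```

A genotype with ecovalence near zero contributes little to the interaction sum of squares and is regarded as stable under this criterion.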
This package provides density functions for the joint distribution of choice, response time and confidence for discrete confidence judgments as well as functions for parameter fitting, prediction and simulation for various dynamical models of decision confidence. All models are explained in detail by Hellmann et al. (2023; Preprint available at <https://osf.io/9jfqr/>, published version: <doi:10.1037/rev0000411>). Implemented models are the dynaViTE model, dynWEV model, the 2DSD model (Pleskac & Busemeyer, 2010, <doi:10.1037/a0019737>), and various race models. C++ code for dynWEV and 2DSD is based on the rtdists package by Henrik Singmann.
This package provides robustness checks to align estimands with the identification that they require. Given a 'dagitty' object and a model specification, DAGassist classifies variables by causal roles, flags problematic controls, and generates a report comparing the original model with minimal and canonical adjustment sets. Exports publication-grade reports in 'LaTeX', 'Word', 'Excel', 'dotwhisker', or plain text/'markdown'. DAGassist is built on 'dagitty', an R package that uses the DAGitty web tool (<https://dagitty.net/>) for creating and analyzing DAGs. Methods draw on Pearl (2009) <doi:10.1017/CBO9780511803161> and Textor et al. (2016) <doi:10.1093/ije/dyw341>.
This package performs likelihood-based extreme value inferences with adjustment for the presence of missing values based on Simpson and Northrop (2026). A Generalised Extreme Value distribution is fitted to block maxima using maximum likelihood estimation, with the location and scale parameters reflecting the numbers of non-missing raw values in each block. A Bayesian version is also provided. For the purposes of comparison, there are options to make no adjustment for missing values or to discard any block maximum for which more than a specified percentage of the underlying raw values are missing. Example datasets containing missing values are provided.
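The comparison option of discarding block maxima with too many missing raw values can be sketched as follows. This Python fragment covers only the data preparation step, not the adjusted likelihood itself; the function name `block_maxima`, the argument `max_missing_pct`, and the use of `None` for missing values are assumptions for illustration.

```python
def block_maxima(blocks, max_missing_pct=None):
    """Return (maximum, number of non-missing values) for each block.

    Missing raw values are encoded as None. If `max_missing_pct` is given,
    blocks with more than that percentage of missing values are discarded.
    Blocks with no observed values are always skipped.
    """
    result = []
    for block in blocks:
        observed = [value for value in block if value is not None]
        if not observed:
            continue
        pct_missing = 100.0 * (len(block) - len(observed)) / len(block)
        if max_missing_pct is not None and pct_missing > max_missing_pct:
            continue
        result.append((max(observed), len(observed)))
    return result


# Hypothetical raw data: three blocks of four raw values each.
blocks = [[1, 5, 3, 2], [4, None, None, None], [7, None, 6, 8]]

kept_all = block_maxima(blocks)
kept_half = block_maxima(blocks, max_missing_pct=50)
```

Keeping the count of non-missing values next to each maximum matters because an adjustment of the kind described above needs to know how many raw values each maximum was taken over.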
Functions for constructing composite indicators, imputing missing data, and evaluating imputation techniques, together with tools for data normalization. The methodologies underlying the Indicator package are detailed in: OECD/European Union/EC-JRC (2008), "Handbook on Constructing Composite Indicators: Methodology and User Guide", OECD Publishing, Paris, <DOI:10.1787/533411815016>, Matteo Mazziotta & Adriano Pareto (2018), "Measuring Well-Being Over Time: The Adjusted Mazziotta–Pareto Index Versus Other Non-compensatory Indices" <DOI:10.1007/s11205-017-1577-5> and De Muro P., Mazziotta M., Pareto A. (2011), "Composite Indices of Development and Poverty: An Application to MDGs" <DOI:10.1007/s11205-010-9727-z>.
Kernel-based methods are powerful methods for integrating heterogeneous types of data. mixKernel aims at providing methods to combine kernels for unsupervised exploratory analysis. Different solutions are provided to compute a meta-kernel, in a consensus way or in a way that best preserves the original topology of the data. mixKernel also integrates kernel PCA to visualize similarities between samples in a nonlinear space and from the multiple source point of view <doi:10.1093/bioinformatics/btx682>. A method to select (as well as functions to display) important variables is also provided <doi:10.1093/nargab/lqac014>.
This package provides tools to assist in safely applying user-generated objective and derivative functions to optimization programs. These are primarily function minimization methods with at most bounds and masks on the parameters. Provides a way to check the basic computation of objective functions that the user provides, along with proposed gradient and Hessian functions, as well as to wrap such functions to avoid failures when inadmissible parameters are provided. Check bounds and masks. Check scaling or optimality conditions. Perform an axial search to seek lower points on the objective function surface. Includes forward, central and backward gradient approximation codes.
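The three gradient approximations named above differ only in which neighboring points they evaluate. The Python sketch below shows the textbook finite-difference formulas; it is not the package's code, and the function name `numerical_gradient`, the `method` argument, and the fixed step `h` are assumptions for illustration.

```python
def numerical_gradient(f, x, method="central", h=1e-6):
    """Approximate the gradient of f at x by finite differences.

    method: "forward"  uses (f(x + h) - f(x)) / h
            "backward" uses (f(x) - f(x - h)) / h
            "central"  uses (f(x + h) - f(x - h)) / (2 h)
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        if method == "forward":
            g = (f(up) - f(x)) / h
        elif method == "backward":
            g = (f(x) - f(down)) / h
        elif method == "central":
            g = (f(up) - f(down)) / (2.0 * h)
        else:
            raise ValueError("method must be forward, backward or central")
        grad.append(g)
    return grad


def objective(p):
    """Hypothetical objective with exact gradient (2 * p[0], 3)."""
    return p[0] ** 2 + 3.0 * p[1]


point = [1.0, 2.0]
approximations = {
    m: numerical_gradient(objective, point, method=m)
    for m in ("forward", "central", "backward")
}
```

Comparing such approximations with a user-supplied analytic gradient is a common way to catch coding errors in the derivative before handing it to an optimizer.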
The anota2seq package provides analysis of translational efficiency and differential expression analysis for polysome-profiling and ribosome-profiling studies (two or more sample classes) quantified by RNA sequencing or DNA-microarray. Polysome-profiling and ribosome-profiling typically generate data for two RNA sources, translated mRNA and total mRNA. Analysis of differential expression is used to estimate changes within each RNA source. Analysis of translational efficiency aims to identify changes in translation efficiency leading to altered protein levels that are independent of total mRNA levels or buffering, a mechanism regulating translational efficiency so that protein levels remain constant despite fluctuating total mRNA levels.
iheatmapr is an R package for building complex, interactive heatmaps using modular building blocks. "Complex" heatmaps are heatmaps in which subplots along the rows or columns of the main heatmap add more information about each row or column. For example, a one column additional heatmap may indicate what group a particular row or column belongs to. Complex heatmaps may also include multiple side by side heatmaps which show different types of data for the same conditions. Interactivity can improve complex heatmaps by providing tooltips with information about each cell and enabling zooming into interesting features. iheatmapr uses the plotly library for interactivity.
19 term and 9 first trimester placental chorionic villi and matched cell-sorted samples run on Illumina HumanMethylationEPIC DNA methylation microarrays. This data was made available on GEO accession [GSE159526](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE159526). Both the raw and processed data have been made available on ExperimentHub. Raw unprocessed data is formatted as an RGChannelSet object for integration and normalization using minfi and other existing Bioconductor packages. Processed normalized data is also available as a DNA methylation matrix, with corresponding phenotype information as a data.frame object.
This package provides several functions to identify and analyse miRNA sponges, including popular methods for identifying miRNA sponge interactions, two types of global ceRNA regulation prediction methods and four types of context-specific prediction methods (Li Y et al. (2017) <doi:10.1093/bib/bbx137>), which are based on miRNA-messenger RNA regulation alone, or by integrating heterogeneous data, respectively. In addition, for predicted ceRNA relationship pairs, this package provides several downstream analysis algorithms, including regulatory network analysis and functional annotation analysis, as well as survival prognosis analysis based on expression of ceRNA ternary pairs.
Easy installation, loading and management of high-performance packages for statistical computing and data manipulation in R. The core fastverse consists of 4 packages: 'data.table', 'collapse', 'kit' and 'magrittr', that jointly only depend on 'Rcpp'. The fastverse can be freely and permanently extended with additional packages, both globally or for individual projects. Separate package verses can also be created. Fast packages for many common tasks such as time series, dates and times, strings, spatial data, statistics, data serialization, larger-than-memory processing, and compilation of R code are listed in the README file: <https://github.com/fastverse/fastverse#suggested-extensions>.
The routine twosample_test() in this package runs two-sample tests using various test statistics for multivariate data. The user can also run several tests and then find a p value adjusted for simultaneous inference. The p values are found via permutation or via the parametric bootstrap. The routine twosample_power() allows the estimation of the power of the tests. The routine run.studies() allows a user to quickly study the power of a new method and how it compares to those included in the package. For details of the methods and references see the included vignettes.
This package provides functions and datasets to support Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>, "Practical Tools for Designing and Weighting Survey Samples". Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-, and three-stage sample designs, and single-stage audit sample designs. Functions are included that will group geographic units accounting for distances apart and measures of size. Other functions compute variance components for multistage designs, sample sizes in two-phase designs, and a stopping rule for ending data collection. A number of example data sets are included.
Objects to manipulate sequential and seasonal time series. Sequential time series based on time instants and time durations are handled. Both can be regularly or unevenly spaced (overlapping durations are allowed). Only POSIX* formats are used for dates and times. The following classes are provided: 'POSIXcti', 'POSIXctp', 'TimeIntervalDataFrame', 'TimeInstantDataFrame', 'SubtimeDataFrame'; methods to switch from one class to another and to modify the time support of series (hourly time series to daily time series for instance) are also defined. The tools provided can be used, for instance, to handle environmental monitoring data (not always produced on a regular time base).
It offers functions for splitting, parsing, tokenizing and creating a vocabulary for big text data files. Moreover, it includes functions for building a document-term matrix and extracting information from it (term-associations, most frequent terms). It also provides functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and functions to work with sparse matrices. Lastly, it includes functions for Word Vector Representations (i.e. 'GloVe', 'fasttext') and incorporates functions for the calculation of (pairwise) text document dissimilarities. The source code is based on C++11 and exported in R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
Access and manage the application programming interface (API) of the Armed Conflict Location & Event Data Project (ACLED) at <https://acleddata.com/>. The package makes it easy to retrieve a user-defined sample (or all of the available data) of ACLED, enabling a seamless integration of regular data updates into the research work flow. It requires a minimal number of dependencies. See the package's README file for a note on replicability when drawing on ACLED data. When using this package, you acknowledge that you have read ACLED's terms and conditions of use, and that you agree with their attribution requirements.
Understanding the drivers of microbial diversity is an important frontier of microbial ecology, and investigating the diversity of samples from microbial ecosystems is a common step in any microbiome analysis. breakaway is the premier package for statistical analysis of microbial diversity. breakaway implements the latest and greatest estimates of species richness, described in Willis and Bunge (2015) <doi:10.1111/biom.12332>, Willis et al. (2017) <doi:10.1111/rssc.12206>, and Willis (2016) <arXiv:1604.02598>, as well as the most commonly used estimates, including the objective Bayes approach described in Barger and Bunge (2010) <doi:10.1214/10-BA527>.
Verification of continually updating time series data where we expect new values, but want to ensure previous data remains unchanged. Data previously recorded could change for a number of reasons, such as discovery of an error in model code, a change in methodology or instrument recalibration. Monitoring data sources for these changes is not always possible. Other unnoticed changes could include a jump in time or measurement frequency, due to instrument failure or software updates. Functionality is provided that can be used to check and flag changes to previous data to prevent changes going unnoticed, as well as unexpected jumps in time.
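The two checks described above, flagging changes to previously recorded values and flagging unexpected jumps in time, can be sketched in Python as follows. This is an illustrative sketch only: the function names `changed_points` and `time_jumps`, the tolerance argument, and the use of integer time stamps are assumptions, not the package's interface.

```python
def changed_points(previous, current, tol=0.0):
    """Return {time: (old, new)} for every previously recorded point whose
    value differs by more than `tol` or has disappeared (new is None)."""
    changes = {}
    for time, old in previous.items():
        new = current.get(time)
        if new is None or abs(new - old) > tol:
            changes[time] = (old, new)
    return changes


def time_jumps(timestamps, expected_step):
    """Return consecutive (earlier, later) pairs whose spacing differs
    from the expected measurement interval."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier != expected_step
    ]


# Hypothetical snapshots of the same series taken at two different times.
previous = {0: 1.0, 1: 2.0, 2: 3.0}
current = {0: 1.0, 1: 2.5, 2: 3.0, 3: 4.0, 5: 6.0}

flagged_changes = changed_points(previous, current)
flagged_jumps = time_jumps(current, expected_step=1)
```

New time stamps that appear only in the later snapshot are expected and are not flagged as changes; only the overlap with the earlier snapshot is compared.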
Package for the analysis of experimental designs (completely randomized design, DIC; randomized block design, DBC; and Latin square design, DQL), double factorial experiments (in DIC and DBC), split-plot experiments (in DIC and DBC), double factorial experiments with one additional treatment (in DIC and DBC), triple factorial experiments (in DIC and DBC) and triple factorial experiments with one additional treatment (in DIC and DBC), performing analysis of variance and multiple comparison of means (for qualitative treatments), or fitting regression models up to the third power (for quantitative treatments); residual analysis (Ferreira, Cavalcanti and Nogueira, 2014) <doi:10.4236/am.2014.519280>.