This package provides a modern view on the principal component analysis biplot with calibrated axes. Create principal component analysis biplots rendered in HTML with significant reactivity embedded within the plot. Furthermore, the traditional biplot view is enhanced by translated axes with inter-class kernel densities superimposed. For more information on biplots, see Gower, J.C., Lubbe, S. and le Roux, N.J. (2011, ISBN: 978-0-470-01255-0).
This package provides a platform is provided for interactive analyses with a goal of totally easy to develop, deploy, interact, and explore (TEDDIE). Using this package, users can create customized analyses and make them available to end users who can perform interactive analyses and save analyses to RTF or HTML files. It allows developers to focus on R code for analysis, instead of dealing with html or shiny code.
Analyses gene expression data derived from experiments to detect differentially expressed genes by employing the concept of majority voting with five different statistical models. It includes functions for differential expression analysis, significance testing, etc. It simplifies the process of uncovering meaningful patterns and trends within gene expression data, aiding researchers in downstream analysis. Boyer, R.S., Moore, J.S. (1991) <doi:10.1007/978-94-011-3488-0_5>.
By adding over-relaxation factor to PXEM (Parameter Expanded Expectation Maximization) method, the MOPXEM (Monotonically Overrelaxed Parameter Expanded Expectation Maximization) method is obtained. Compare it with the existing EM (Expectation-Maximization)-like methods. Then, distribute and process five methods and compare them, achieving good performance in convergence speed and result quality.The philosophy of the package is described in Guo G. (2022) <doi:10.1007/s00180-022-01270-z>.
This package implements the locally efficient doubly robust difference-in-differences (DiD
) estimators for the average treatment effect proposed by Sant'Anna and Zhao (2020) <doi:10.1016/j.jeconom.2020.06.003>. The estimator combines inverse probability weighting and outcome regression estimators (also implemented in the package) to form estimators with more attractive statistical properties. Two different estimation methods can be used to estimate the nuisance functions.
Decode elements of the Australian Higher Education Information Management System (HEIMS) data for clarity and performance. HEIMS is the record system of the Department of Education, Australia to record enrolments and completions in Australia's higher education system, as well as a range of relevant information. For more information, including the source of the data dictionary, see <http://heimshelp.education.gov.au/sites/heimshelp/dictionary/pages/data-element-dictionary>.
Flexible procedures to compute local density-based outlier scores for ranking outliers. Both exact and approximate nearest neighbor search can be implemented, while also accommodating multiple neighborhood sizes and four different local density-based methods. It allows for referencing a random subsample of the input data or a user specified reference data set to compute outlier scores against, so both unsupervised and semi-supervised outlier detection can be implemented.
Calculates and differentiates probabilities and density of (conditional) multivariate normal distribution and Gaussian copula (with various marginal distributions) using methods described in A. Genz (2004) <doi:10.1023/B:STCO.0000035304.20635.31>, A. Genz, F. Bretz (2009) <doi:10.1007/978-3-642-01689-9>, H. I. Gassmann (2003) <doi:10.1198/1061860032283> and E. Kossova, B. Potanin (2018) <https://ideas.repec.org/a/ris/apltrx/0346.html>.
This package provides methods and tools for deriving spatial summary functions from single-cell imaging data and performing functional data analyses. Functions can be applied to other single-cell technologies such as spatial transcriptomics. Functional regression and functional principal component analysis methods are in the refund package <https://cran.r-project.org/package=refund> while calculation of the spatial summary functions are from the spatstat package <https://spatstat.org/>.
This package provides a novel pseudo-value regression approach for the differential co-expression network analysis in expression data, which can incorporate additional clinical variables in the model. This is a direct regression modeling for the differential network analysis, and it is therefore computationally amenable for the most users. The full methodological details can be found in Ahn S et al (2023) <doi:10.1186/s12859-022-05123-w>.
Test-based Image structural similarity measure and test of independence. This package implements the key functions of two tasks: (1) computing image structural similarity measure PSSIM of Wang, Maldonado and Silwal (2011) <DOI:10.1016/j.csda.2011.04.021>; and (2) test of independence between a response and a covariate in presence of heteroscedastic treatment effects proposed by Wang, Tolos, and Wang (2010) <DOI:10.1002/cjs.10068>.
An implementation of the one-step privacy-protecting method for estimating the overall and site-specific hazard ratios using inverse probability weighted Cox models in distributed data network studies, as proposed by Shu, Yoshida, Fireman, and Toh (2019) <doi: 10.1177/0962280219869742>. This method only requires sharing of summary-level riskset tables instead of individual-level data. Both the conventional inverse probability weights and the stabilized weights are implemented.
This package provides various styles of function chaining methods: Pipe operator, Pipe object, and pipeline function, each representing a distinct pipeline model yet sharing almost a common set of features: A value can be piped to the first unnamed argument of a function and to dot symbol in an enclosed expression. The syntax is designed to make the pipeline more readable and friendly to a wide range of operations.
Support package for the textbook "An Introduction to Quantitative Text Analysis for Linguists: Reproducible Research Using R" (Francom, 2024) <doi:10.4324/9781003393764>. Includes functions to acquire, clean, and analyze text data as well as functions to document and share the results of text analysis. The package is designed to be used in conjunction with the book, but can also be used as a standalone package for text analysis.
This package provides a simple interface to recursively list files from a directory, filter them using a regular expression, read their contents, and extract lines that match a user-defined pattern. The package returns a dataframe containing the matched lines, their line numbers, file paths, and the corresponding matched substrings. Designed for quick code base exploration, log inspection, or any use case involving pattern-based file and line filtering.
Software that leverages the capabilities of Circos by manipulating data, preparing configuration files, and running the Perl-native Circos directly from the R environment with minimal user intervention. Circos is a novel software that addresses the challenges in visualizing genetic data by creating circular ideograms composed of tracks of heatmaps, scatter plots, line plots, histograms, links between common markers, glyphs, text, and etc. Please see <http://www.circos.ca>.
Imbalanced training datasets impede many popular classifiers. To balance training data, a combination of oversampling minority classes and undersampling majority classes is useful. This package implements the SCUT (SMOTE and Cluster-based Undersampling Technique) algorithm as described in Agrawal et. al. (2015) <doi:10.5220/0005595502260234>. Their paper uses model-based clustering and synthetic oversampling to balance multiclass training datasets, although other resampling methods are provided in this package.
This package provides functions for simplified emulation of time series computer model output in model parameter space using Gaussian processes. Stilt can be used more generally for Kriging of spatio-temporal fields. There are functions to predict at new parameter settings, to test the emulator using cross-validation (which includes information on 95% confidence interval empirical coverage), and to produce contour plots over 2D slices in model parameter space.
This package provides a meta-package that aims to make R easier for everyone, especially programmers who have a background in SAS® software. This set of packages brings many useful concepts to R', including data libraries, data dictionaries, formats and format catalogs, a data step, and a traceable log. The flagship package is a reporting package that can output in text, rich text, PDF', HTML', and DOCX file formats.
You can use the functions provided by the package to make various statistical tables, such as baseline data tables. Creates Table 1', i.e., a description of the baseline patient characteristics, which is essential in every medical research. Supports both continuous and categorical variables, as well as p-values and standardized mean differences. This method was described by Mary L McHugh
(2013) <doi:10.11613/bm.2013.018>.
Computes the solution path of the Terminating-LARS (T-LARS) algorithm. The T-LARS algorithm is a major building block of the T-Rex selector (see R package TRexSelector
'). The package is based on the papers Machkour, Muma, and Palomar (2022) <arXiv:2110.06048>
, Efron, Hastie, Johnstone, and Tibshirani (2004) <doi:10.1214/009053604000000067>, and Tibshirani (1996) <doi:10.1111/j.2517-6161.1996.tb02080.x>.
The best ANN structure for time series data analysis is a demanding need in the present era. This package will find the best-fitted ANN model based on forecasting accuracy. The optimum size of the hidden layers was also determined after determining the number of lags to be included. This package has been developed using the algorithm of Paul and Garai (2021) <doi:10.1007/s00500-021-06087-4>.
This package provides a suite of psychometric analysis tools for research and operation, including: (1) computation of probability, information, and likelihood for the 3PL, GPCM, and GRM; (2) parameter estimation using joint or marginal likelihood estimation method; (3) simulation of computerized adaptive testing using built-in or customized algorithms; (4) assembly and simulation of multistage testing. The full documentation and tutorials are at <https://github.com/xluo11/xxIRT>
.
Gene-environment (GÃ E) interactions have important implications to elucidate the etiology of complex diseases beyond the main genetic and environmental effects. Outliers and data contamination in disease phenotypes of GÃ E studies have been commonly encountered, leading to the development of a broad spectrum of robust penalization methods. Nevertheless, within the Bayesian framework, the issue has not been taken care of in existing studies. We develop a robust Bayesian variable selection method for GÃ E interaction studies. The proposed Bayesian method can effectively accommodate heavy-tailed errors and outliers in the response variable while conducting variable selection by accounting for structural sparsity. In particular, the spike-and-slab priors have been imposed on both individual and group levels to identify important main and interaction effects. An efficient Gibbs sampler has been developed to facilitate fast computation. The Markov chain Monte Carlo algorithms of the proposed and alternative methods are efficiently implemented in C++.