This package implements the kK-NN algorithm, an adaptive k-nearest neighbor classifier that adjusts the neighborhood size based on local data curvature. The method estimates local Gaussian curvature by approximating the shape operator of the data manifold. This approach aims to improve classification performance, particularly in datasets with limited samples.
This package provides functions to simulate data from large-scale educational assessments, including background questionnaire data and cognitive item responses that adhere to a multiple-matrix sampled design. The theoretical foundation can be found in Matta, T.H., Rutkowski, L., Rutkowski, D. et al. (2018) <doi:10.1186/s40536-018-0068-8>.
Approximate node interaction parameters of Markov Random Fields graphical networks. Models can incorporate additional covariates, allowing users to estimate how interactions between nodes in the graph are predicted to change across covariate gradients. The general methods implemented in this package are described in Clark et al. (2018) <doi:10.1002/ecy.2221>.
Topological data analysis (TDA) is a method of data analysis that uses techniques from topology to analyze high-dimensional data. Here we implement Mapper, an algorithm from this area developed by Singh, Mémoli and Carlsson (2007) which generalizes the concept of a Reeb graph <https://en.wikipedia.org/wiki/Reeb_graph>.
Estimation of the survivor function for interval-censored time-to-event data subject to misclassification using nonparametric maximum likelihood estimation, implementing the methods of Titman (2017) <doi:10.1007/s11222-016-9705-7>. Misclassification probabilities can either be specified as fixed or estimated. Models with time-dependent misclassification may also be fitted.
Analysis of molecular marker data from model and non-model systems. For the latter, it allows statistical analysis by simultaneously estimating linkage and linkage phases (genetic map construction) according to Wu and colleagues (2002) <doi:10.1006/tpbi.2002.1577>. All analyses are based on multi-point approaches using hidden Markov models.
This package provides general purpose tools for helping users to implement steepest gradient descent methods for function optimization; for details see Ruder (2016) <arXiv:1609.04747v2>. Currently, the Steepest 2-Groups Gradient Descent and the Adaptive Moment Estimation (Adam) are the methods implemented. Other methods will be implemented in the future.
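As an illustration of the Adam update rule surveyed in Ruder (2016), the following is a minimal Python sketch, not the package's own R interface. The function name `adam_minimize`, its parameters, and the quadratic test function are assumptions made for this example.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimize a function from its gradient using the Adam update rule.

    All names and default values here are illustrative assumptions.
    """
    x = list(x0)
    m = [0.0] * len(x)  # running mean of the gradients (first moment)
    v = [0.0] * len(x)  # running mean of the squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        for i in range(len(x)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            # Bias correction compensates for the zero initialization of m and v.
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def quad_grad(x):
    # Gradient of f(x, y) = (x - 3)^2 + 2 * (y + 1)^2, minimized at (3, -1).
    return [2 * (x[0] - 3), 4 * (x[1] + 1)]


solution = adam_minimize(quad_grad, [0.0, 0.0])
print(solution)
```

The per-coordinate scaling by the square root of the second-moment estimate is what distinguishes Adam from plain steepest descent with a single global step size.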
Model selection for penalized graphical models using the Stability Approach to Regularization Selection ('StARS'), with options for speed-ups including Bounded StARS (B-StARS), batch computing, and other stability metrics (e.g., graphlet stability G-StARS). See Christian L. Müller, Richard Bonneau, Zachary Kurtz (2016) <arXiv:1605.07072>.
In base R, object attributes are lost when objects are modified by common data operations such as subset, filter, slice, append, extract, etc. This package allows objects to be marked as sticky so that their attributes persist during these operations or when the objects are inserted into or extracted from list-like or table-like objects.
Tests for spatial dependence in cross-sectional qualitative data. The list of functions includes join-count tests, Q test, spatial scan test, similarity test and spatial runs test. The methodology of these tests can be found in <doi:10.1007/s10109-009-0100-1> and <doi:10.1080/13658816.2011.586327>.
Allows the user to connect with IBGE's (Instituto Brasileiro de Geografia e Estatistica, see <https://www.ibge.gov.br/> for more information) SIDRA API in a flexible way. SIDRA is the acronym for "Sistema IBGE de Recuperacao Automatica" and is the system through which IBGE makes aggregate data from its surveys available.
Create panel data consisting of independent states from 1816 to the present. The package includes the Gleditsch & Ward (G&W) and Correlates of War (COW) lists of independent states, as well as helper functions for working with state panel data and standardizing other data sources to create country-year/month/etc. data.
This package provides efficient R and C++ routines to simulate cognitive diagnostic model data for Deterministic Input, Noisy "And" Gate ('DINA') and reduced Reparameterized Unified Model ('rRUM') from Culpepper and Hudson (2017) <doi:10.1177/0146621617707511>, Culpepper (2015) <doi:10.3102/1076998615595403>, and de la Torre (2009) <doi:10.3102/1076998607309474>.
Build a custom Europe SpatialPolygonsDataFrame (if you don't know what a SpatialPolygonsDataFrame is, see SpatialPolygons() in 'sp'), for example for use with mapLayout() in 'antaresViz'. Antares is a powerful software developed by RTE to simulate and study electric power systems (more information about Antares here: <https://antares-simulator.org/>).
Effect modification occurs if a treatment effect is larger or more stable in certain subgroups defined by observed covariates. The submax or subgroup-maximum method of Lee et al. (2018) <doi:10.1111/biom.12884> does an overall test and separate tests in subgroups, correcting for multiple testing using the joint distribution.
This package provides functions for computing a standardized moderation effect in moderated regression and forming its confidence interval by nonparametric bootstrapping as proposed in Cheung, Cheung, Lau, Hui, and Vong (2022) <doi:10.1037/hea0001188>. Also includes simple-to-use functions for computing conditional effects (unstandardized or standardized) and plotting moderation effects.
Computerized Adaptive Testing simulations with dichotomous and polytomous items. Selects items with the Maximum Fisher Information method or randomly, with or without constraints (content balancing and item exposure control). Evaluates the simulation results in terms of precision, item exposure, and test length. Inspired by Magis & Barrada (2017) <doi:10.18637/jss.v076.c01>.
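Maximum Fisher Information selection can be sketched for the dichotomous case as follows. This Python example assumes a two-parameter logistic (2PL) model, which the description does not specify; the function names and the item bank are hypothetical, and it does not reproduce the package's R interface or its constraint handling.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta.

    a is the discrimination and b the difficulty (assumed parameterization).
    """
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))  # probability of a correct response
    return a * a * p * (1.0 - p)


def select_item(theta, items, administered):
    """Return the index of the unadministered item with maximum information."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


# Hypothetical item bank of (discrimination, difficulty) pairs.
item_bank = [(1.0, -2.0), (1.2, 0.0), (0.8, 0.4), (1.5, 2.0)]

first = select_item(0.0, item_bank, administered=set())
second = select_item(0.0, item_bank, administered={first})
print(first, second)
```

In a full simulation the ability estimate would be updated after each response and the selection repeated until a stopping rule on precision or test length is met.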
Generate LaTeX tables directly from R. It builds LaTeX tables in blocks in the spirit of ggplot2 using the + and / operators for concatenation in the vertical and horizontal dimensions, respectively. It exports tables in the LaTeX tabular environment using .tex code. It can compile .tex code to PDF automatically.
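The block-composition idea can be illustrated with a small Python analogue that overloads `+` for vertical stacking and `/` for horizontal joining. The `Block` class and its methods are hypothetical and written only for this sketch; they are not the package's R functions.

```python
class Block:
    """A rectangular block of table cells (hypothetical illustration)."""

    def __init__(self, rows):
        self.rows = [[str(cell) for cell in row] for row in rows]

    def __add__(self, other):
        # Vertical concatenation: the rows of `other` go below the rows of `self`.
        return Block(self.rows + other.rows)

    def __truediv__(self, other):
        # Horizontal concatenation: rows are joined side by side.
        if len(self.rows) != len(other.rows):
            raise ValueError("blocks must have the same number of rows")
        return Block([left + right for left, right in zip(self.rows, other.rows)])

    def to_tabular(self):
        # Emit the block as a LaTeX tabular environment with left-aligned columns.
        ncol = max(len(row) for row in self.rows)
        lines = ["\\begin{tabular}{" + "l" * ncol + "}"]
        lines += [" & ".join(row) + " \\\\" for row in self.rows]
        lines.append("\\end{tabular}")
        return "\n".join(lines)


header = Block([["Group", "Mean"]])
labels = Block([["A"], ["B"]])
values = Block([[1.5], [2.0]])

# `/` binds tighter than `+`, so the body is joined first, then stacked under the header.
table = header + labels / values
print(table.to_tabular())
```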
The Ultimate Microarray Prediction, Reality and Inference Engine (UMPIRE) is a package to facilitate the simulation of realistic microarray data sets with links to associated outcomes. See Zhang and Coombes (2012) <doi:10.1186/1471-2105-13-S13-S1>. Version 2.0 adds the ability to simulate realistic mixed-type clinical data.
This package provides a YAML-based mechanism for working with table metadata. Supports compact syntax for creating, modifying, viewing, exporting, importing, displaying, and plotting metadata coded as column attributes. The yamlet dialect is valid YAML with defaults and conventions chosen to improve readability. See ?yamlet, ?decorate, ?modify, ?io_csv, and ?ggplot.decorated.
This package performs the Joint and Individual Variation Explained (JIVE) decomposition on a list of data sets when the data share a dimension, returning low-rank matrices that capture the joint and individual structure of the data [O'Connell, MJ and Lock, EF (2016) <doi:10.1093/bioinformatics/btw324>]. It provides two methods of rank selection when the rank is unknown, a permutation test and a Bayesian Information Criterion (BIC) selection algorithm. Also included in the package are three plotting functions for visualizing the variance attributed to each data source: a bar plot that shows the percentages of the variability attributable to joint and individual structure, a heatmap that shows the structure of the variability, and principal component plots.
This package provides tools to support the analysis of RNA-seq expression data or other similar kinds of data. It provides exploratory plots to evaluate saturation, count distribution, expression per chromosome, type of detected features, feature length, etc. It also supports the analysis of differential expression between two experimental conditions with no parametric assumptions.
This package provides functions to compare two or more survival curves with:
The Fleming-Harrington test for right-censored data based on permutations and on counting processes.
An extension of the Fleming-Harrington test for interval-censored data based on a permutation distribution and on a score vector distribution.
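For the right-censored, two-group case, a Fleming-Harrington weighted log-rank statistic with a permutation p-value might be sketched as below. This Python example assumes the usual weights S(t-)^rho * (1 - S(t-))^lambda based on the pooled Kaplan-Meier estimate; the function names and the toy data are hypothetical, and the package's own permutation, counting-process, and interval-censored procedures may differ in detail.

```python
import random


def fh_statistic(times, events, groups, rho=0.0, lam=0.0):
    """Weighted log-rank numerator: sum over event times of w * (d1 - d * n1 / n)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv = 1.0  # pooled Kaplan-Meier estimate just before the current event time
    stat = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)  # at risk, both groups
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e)  # events at t
        d1 = sum(1 for u, e, g in zip(times, events, groups) if u == t and e and g == 1)
        weight = surv ** rho * (1.0 - surv) ** lam
        stat += weight * (d1 - d * n1 / n)
        surv *= 1.0 - d / n
    return stat


def fh_permutation_pvalue(times, events, groups, rho=0.0, lam=0.0, n_perm=500, seed=0):
    """Two-sided p-value obtained by randomly permuting the group labels."""
    rng = random.Random(seed)
    observed = abs(fh_statistic(times, events, groups, rho, lam))
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(fh_statistic(times, events, labels, rho, lam)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: four uncensored observations, group 1 failing earlier than group 0.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
groups = [1, 1, 0, 0]

stat = fh_statistic(times, events, groups)  # rho = lam = 0 gives the ordinary log-rank numerator
p_value = fh_permutation_pvalue(times, events, groups)
print(stat, p_value)
```

Choosing rho > 0 emphasizes early differences between the curves, while lam > 0 emphasizes late differences.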
This package provides functionality to run a number of tasks in the differential expression analysis workflow. This encompasses the most widely used steps, from running various enrichment analysis tools with a unified interface to creating plots and beautifying table components linking to external websites and databases. This streamlines the generation of comprehensive analysis reports.