This package provides a self-tuning spectral clustering method for single- or multi-view data. Spectrum uses a new type of adaptive, density-aware kernel that strengthens connections in the graph based on common nearest neighbours. It uses a tensor product graph data integration and diffusion procedure to integrate different data sources and reduce noise. Spectrum uses either the eigengap or multimodality gap heuristic to determine the number of clusters. The method is sufficiently flexible that a wide range of Gaussian and non-Gaussian structures can be clustered with automatic selection of K.
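As a rough illustration, a single call clusters one or several views. This is a minimal, hedged sketch: the Spectrum() entry point, its method argument (1 = eigengap, 2 = multimodality gap) and the $assignments output are assumed from the package documentation, and my_view1/my_view2 are placeholder data frames with features as rows and samples as columns.

    library(Spectrum)
    res <- Spectrum(my_view1, method = 1)          # single view, K chosen via the eigengap
    table(res$assignments)                         # cluster label per sample
    # Multi-view: pass a list of data frames sharing the same samples
    # res_mv <- Spectrum(list(my_view1, my_view2), method = 2)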
This package implements variational Bayesian algorithms to perform scalable variable selection for sparse, high-dimensional linear and logistic regression models. Features include a novel prioritized updating scheme, which uses a preliminary estimator of the variational means during initialization to generate an updating order that prioritizes larger, more relevant coefficients. Sparsity is induced via spike-and-slab priors with either Laplace or Gaussian slabs; by default, the heavier-tailed Laplace density is used. Formal derivations of the algorithms and asymptotic consistency results may be found in Kolyan Ray and Botond Szabo (JASA 2020) and Kolyan Ray, Botond Szabo, and Gabriel Clara (NeurIPS 2020).
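A minimal, hedged sketch of a fit, assuming this entry describes the sparsevb package and an svb.fit() interface with family, slab, and prioritize arguments plus a gamma component holding posterior inclusion probabilities (names are from memory and may differ); X is a placeholder n x p design matrix and Y the response vector.

    library(sparsevb)
    fit <- svb.fit(X, Y, family = "linear", slab = "laplace", prioritize = TRUE)
    selected <- which(fit$gamma > 0.5)   # variables with posterior inclusion probability > 0.5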
Recently, regularized variable selection has emerged as a powerful tool to identify and dissect gene-environment interactions. Nevertheless, in longitudinal studies with high-dimensional genetic factors, regularization methods for G×E interactions have not been systematically developed. This package provides the implementation of sparse group variable selection, based on both the quadratic inference function (QIF) and generalized estimating equations (GEE), to accommodate bi-level selection for longitudinal G×E studies with high-dimensional genomic features. Alternative methods conducting only group-level or individual-level selection are also included. The core modules of the package have been developed in C++.
Contemporary software commonly used to design stated preference experiments is expensive and closed source. This is a free software package with an easy-to-use interface for creating flexible stated preference experimental designs using state-of-the-art methods. For an overview of stated choice experimental design theory, see e.g. Rose, J. M. & Bliemer, M. C. J. (2014) in Hess, S. & Daly, A. <doi:10.4337/9781781003152>. The package website can be accessed at <https://spdesign.edsandorf.me>. We acknowledge funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant INSPiRE (Grant agreement ID: 793163).
A data visualization tour animates a linear projection of multivariate data as its basis (i.e. orientation) changes. The spinifex package generates paths for manual tours by manipulating the contribution of a single variable at a time, as in Cook & Buja (1997) <doi:10.1080/10618600.1997.10474754>. Other types of tours, such as grand (random walk) and guided (optimizing some objective function) tours, are available in the tourr package, Wickham et al. <doi:10.18637/jss.v040.i02>. spinifex builds on tourr and can render tours with gganimate and plotly graphics, allowing export as a .gif or an .html widget, respectively. This work is fully discussed in Spyrison & Cook (2020) <doi:10.32614/RJ-2020-027>.
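A hedged sketch of a manual tour, assuming the basis_pca()/play_manual_tour() interface described in Spyrison & Cook (2020); function and argument names may differ across package versions.

    library(spinifex)
    dat <- scale(tourr::flea[, 1:6])      # numeric multivariate data
    bas <- basis_pca(dat)                 # starting 2D projection basis
    play_manual_tour(basis = bas, data = dat, manip_var = 3)   # vary variable 3's contribution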
This package implements the algorithm described in Barron, M., Zhang, S. and Li, J. (2017), "A sparse differential clustering algorithm for tracing cell type changes via single-cell RNA-sequencing data", Nucleic Acids Research, gkx1113, <doi:10.1093/nar/gkx1113>. The algorithm clusters samples from two different populations, links the clusters across the conditions, and identifies marker genes for these changes. The package was designed for scRNA-Seq data but is also applicable to many other data types: just replace cells with samples and genes with variables. The package also contains functions for estimating the parameters for SparseDC as outlined in the paper. We recommend that users further select their marker genes using the magnitude of the cluster centers.
Analyse light spectra for visual and non-visual (often called melanopic) needs, wrapped up in a Shiny app. Spectran allows for the import of spectra in various CSV forms, provides a wide range of example spectra, and even supports the creation of one's own spectral power distributions. The goal of the app is to provide easy access to, and a visual overview of, the spectral calculations underlying common parameters used in the field. It is thus ideal for educational purposes or for creating presentation-ready graphs in lighting research and application. Spectran uses equations and action spectra described in CIE S026 (2018) <doi:10.25039/S026.2018>, DIN/TS 5031-100 (2021) <doi:10.31030/3287213>, and ISO/CIE 23539 (2023) <doi:10.25039/IS0.CIE.23539.2023>.
This is a simple package facilitating machine learning (ML) based analysis for physics education research (PER) purposes. The implemented machine learning technique is a random forest optimized by item response theory (IRT) for feature selection and a genetic algorithm (GA) for hyperparameter tuning. The data analyzed here have been made available in the CRAN repository through the spheredata package. SPHERE stands for Students Performance in Physics Education Research. The students are eleventh graders learning physics in the high school curriculum. We follow the stream of multidimensional student assessment as probed by several research-based assessments in PER. The goal is to predict the students' performance at the end of the learning process. Three learning domains are measured: conceptual understanding, scientific ability, and scientific attitude. Furthermore, demographic backgrounds and potential variables predicting students' performance in physics are also included.
Splines are efficiently represented through their Taylor expansion at the knots. The representation accounts for the support sets and is thus suitable for sparse functional data. Two cases of boundary conditions are considered: zero-boundary or periodic-boundary for all derivatives except the last. The periodic splines are represented graphically using polar coordinates. The B-splines and orthogonal bases of splines that reside on small total support are implemented. The orthogonal bases are referred to as splinets and are utilized for functional data analysis. A random spline generator is implemented, as well as all fundamental algebraic and calculus operations on splines. The optimal (in the least-squares sense) functional fit by splinets to data consisting of sampled values of functions, as well as of splines built over another set of knots, is obtained and used for functional data analysis. The S4 object-oriented system of R is used. <doi:10.48550/arXiv.2102.00733>, <doi:10.1016/j.cam.2022.114444>, <doi:10.48550/arXiv.2302.07552>.
This package provides a design-based approach to statistical inference, with a focus on spatial data. Spatially balanced samples are selected using the Generalized Random Tessellation Stratified (GRTS) algorithm. The GRTS algorithm can be applied to finite resources (point geometries) and infinite resources (linear / linestring and areal / polygon geometries) and flexibly accommodates a diverse set of sampling design features, including stratification, unequal inclusion probabilities, proportional (to size) inclusion probabilities, legacy (historical) sites, a minimum distance between sites, and two options for replacement sites (reverse hierarchical order and nearest neighbor). Data are analyzed using a wide range of analysis functions that perform categorical variable analysis, continuous variable analysis, attributable risk analysis, risk difference analysis, relative risk analysis, change analysis, and trend analysis. spsurvey can also be used to summarize objects, visualize objects, select samples that are not spatially balanced, select panel samples, measure the amount of spatial balance in a sample, adjust design weights, and more. For additional details, see Dumelle et al. (2023) <doi:10.18637/jss.v105.i03>.
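A minimal sketch of a GRTS selection, using the NE_Lakes example frame and the n_over and sp_balance() helpers shown in the spsurvey documentation (argument names assumed from the package vignettes).

    library(spsurvey)
    set.seed(51)
    eqprob  <- grts(NE_Lakes, n_base = 50)                # equal-probability sample of 50 sites
    withrep <- grts(NE_Lakes, n_base = 50, n_over = 10)   # adds reverse hierarchically ordered replacement sites
    sp_balance(eqprob$sites_base, NE_Lakes)               # quantify spatial balance of the selected sites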
The developed function generates spatial grids based on user-specified longitude and latitude coordinates. It first validates the input longitude and latitude values, ensuring they fall within the appropriate geographic ranges. It then creates a polygon from the coordinates and determines the appropriate Universal Transverse Mercator (UTM) zone based on the provided hemisphere and longitude values, transforming the input shapefile to the UTM projection when necessary. Finally, a spatial grid is generated with the specified interval and saved as a shapefile. For method details see Brus, D. J. (2022) <doi:10.1201/9781003258940>. The function takes into account crucial parameters such as the hemisphere (north or south), the desired grid interval, and the output shapefile path. It is an efficient tool, simplifying the generation of empty spatial grids for applications such as geo-statistical analysis and digital soil mapping product generation. Whether for environmental studies, urban planning, or any other geo-spatial analysis, this package caters to the diverse needs of users working with spatial data, enhancing the accessibility and ease of spatial data processing and visualization.
This package provides a sparse covariance estimator based on different thresholding operators.
This package performs sparse linear discriminant analysis for Gaussian and mixture-of-Gaussian models.
Computes multivariate normal (MVN) densities, and samples from MVN distributions, when the covariance or precision matrix is sparse.
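A hedged sketch, assuming this entry describes the sparseMVN package and its rmvn.sparse()/dmvn.sparse() functions, which work from a sparse Cholesky factor of the precision (or covariance) matrix; the tridiagonal precision matrix below is only an illustration.

    library(Matrix)
    library(sparseMVN)
    p  <- 5
    Q  <- bandSparse(p, k = 0:1, diagonals = list(rep(2, p), rep(-1, p - 1)), symmetric = TRUE)
    CH <- Cholesky(Q)                                             # sparse Cholesky factor of the precision matrix
    x  <- rmvn.sparse(10, mu = rep(0, p), CH = CH, prec = TRUE)   # 10 draws from the MVN
    dmvn.sparse(x, mu = rep(0, p), CH = CH, prec = TRUE)          # (log) densities of the draws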
Include interactive sparkline charts <http://omnipotent.net/jquery.sparkline> in all R contexts with the convenience of 'htmlwidgets'.
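For instance (assuming this entry describes the sparkline htmlwidget package), an inline chart can be produced with a single call; the type argument mirrors the jquery.sparkline options.

    library(sparkline)
    sparkline(c(5, 4, 5, -2, 0, 3))                  # inline line chart (renders as an htmlwidget)
    sparkline(c(5, 4, 5, -2, 0, 3), type = "bar")    # inline bar chart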
Calibration of thresholds of control charts such as CUSUM charts based on past data, taking estimation error into account.
Load Avro files into Apache Spark using 'sparklyr'. This allows reading files in the Apache Avro format <https://avro.apache.org/>.
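A hedged sketch, assuming this entry describes the sparkavro package and a spark_read_avro(sc, name, path) reader in the style of sparklyr's other spark_read_* functions; the file path is a placeholder.

    library(sparklyr)
    library(sparkavro)
    sc   <- spark_connect(master = "local")
    avro <- spark_read_avro(sc, name = "twitter", path = "data/twitter.avro")  # placeholder path
    spark_disconnect(sc)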
Spike and slab for prediction and variable selection in linear regression models. Uses a generalized elastic net for variable selection.
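If this entry corresponds to the spikeslab package, a fit might look like the following hedged sketch (the formula interface and the printed gnet summary are assumptions from memory).

    library(spikeslab)
    fit <- spikeslab(mpg ~ ., data = mtcars)   # spike-and-slab regression on a built-in data set
    print(fit)                                 # reports the gnet (generalized elastic net) model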
Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This allows reading files from the Common Crawl project <http://commoncrawl.org/>.
Calculates change points based on spectral clustering, with the option to automatically determine the number of clusters if this information is not available.
Estimate the internal consistency of your tasks with a permutation-based split-half reliability approach. Unofficial release name: "I eat stickers all the time, dude!".
Spectra viewer, organizer, data preparation and property blocks from within R or stand-alone. The binary (application) part is installed separately using spnInstallApp() from the spectrino package.
Import classification results from the RDP Classifier (Ribosomal Database Project), USEARCH sintax, vsearch sintax and the QIIME2 (Quantitative Insights into Microbial Ecology) classifiers into phyloseq tax_table objects.
This package provides a wrapper around the SVDLIBC library for (truncated) singular value decomposition of a sparse matrix. Currently, only sparse real matrices in Matrix package format are supported.
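A minimal sketch of a truncated decomposition, assuming this entry describes the sparsesvd package and its sparsesvd(M, rank) interface; rsparsematrix() just generates a random sparse example here.

    library(Matrix)
    library(sparsesvd)
    M   <- rsparsematrix(100, 50, density = 0.05)   # random sparse dgCMatrix
    dec <- sparsesvd(M, rank = 10)                  # keep the 10 largest singular values
    str(dec)                                        # list with d (singular values), u, v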