Alternating least squares is often used to resolve components contributing to data with a bilinear structure; the basic technique may be extended to alternating constrained least squares. This package provides an implementation of multivariate curve resolution alternating least squares (MCR-ALS).
Commonly applied constraints include unimodality, non-negativity, and normalization of components. Several data matrices may be decomposed simultaneously by assuming that one of the two matrices in the bilinear decomposition is shared between datasets.
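As a rough illustration of the alternating constrained least squares idea described above, the following minimal sketch resolves a single non-negative component from a bilinear data matrix. It is not the package's implementation: the function name `als_rank1`, its arguments, and the restriction to one component are assumptions made for brevity.

```python
def als_rank1(D, n_iter=50):
    """Rank-1 alternating least squares, D ~ outer(c, s), with
    non-negativity on both factors and s normalized to unit maximum.
    Hypothetical helper for illustration only."""
    n_rows, n_cols = len(D), len(D[0])
    c = [1.0] * n_rows  # initial guess for the first factor
    s = [0.0] * n_cols
    for _ in range(n_iter):
        # Fix c, solve for s by least squares, clip negatives to zero.
        cc = sum(v * v for v in c) or 1.0
        s = [max(0.0, sum(D[i][j] * c[i] for i in range(n_rows)) / cc)
             for j in range(n_cols)]
        # Normalization constraint: scale s so that its maximum is 1.
        s_max = max(s) or 1.0
        s = [v / s_max for v in s]
        # Fix s, solve for c by least squares, clip negatives to zero.
        ss = sum(v * v for v in s) or 1.0
        c = [max(0.0, sum(D[i][j] * s[j] for j in range(n_cols)) / ss)
             for i in range(n_rows)]
    return c, s
```

With a single component, clipping the unconstrained least squares solution gives the exact non-negative solution for each half-step; with several components a proper non-negative least squares solver would be needed instead.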
Rapid realistic routing on multimodal transport networks (walk, bike, public transport and car) using R5, the Rapid Realistic Routing on Real-world and Reimagined networks engine <https://github.com/conveyal/r5>. The package allows users to generate detailed routing analysis or calculate travel time and monetary cost matrices using seamless parallel computing on top of the R5 Java machine. While R5 is developed by Conveyal, the package r5r is independently developed by a team at the Institute for Applied Economic Research (Ipea) with contributions from collaborators. Apart from the documentation in this package, users will find additional information in the R5 documentation at <https://docs.conveyal.com/>. Although we try to keep new releases of r5r in synchrony with R5, the development of R5 follows Conveyal's independent update process. Hence, users should confirm that the R5 version implied by the Conveyal user manual (see <https://docs.conveyal.com/changelog>) corresponds to the R5 version that r5r depends on. This version of r5r depends on R5 v7.1.
This algorithm conducts variable selection in the classification setting. It repeatedly subsamples variables and runs linear discriminant analysis (LDA) on the subsampled variables. Variables are scored based on the AUC and the t-statistics. Variables then enter a competition, and the semi-finalist variables are evaluated in a final round of LDA classification. The algorithm then outputs a list of the selected variables. See Qiao, Sun and Fan (2017) <http://people.math.binghamton.edu/qiao/swa.html>.
This package provides tools to calculate trait probability density functions (TPD) at any scale (e.g. populations, species, communities). TPD functions are used to compute several indices of functional diversity, as well as its partition across scales. These indices constitute a unified framework that incorporates the underlying probabilistic nature of trait distributions into uni- or multidimensional functional trait-based studies. See Carmona et al. (2016) <doi:10.1016/j.tree.2016.02.003> for further information.
The elliptical factor model, as an extension of the traditional factor model, effectively overcomes the limitations of the traditional model when dealing with heavy-tailed data. This package implements sparse principal component methods (SPC) and bi-sparse online principal component estimation (SPOC) for parameter estimation. It also includes functionality for calculating mean squared error, relative error, and loading matrix sparsity. The philosophy of the package is described in Guo G. (2023) <doi:10.1007/s00180-022-01270-z>.
Framework for building evolutionary algorithms for both single- and multi-objective continuous or discrete optimization problems. A set of predefined evolutionary building blocks and operators is included. Moreover, the user can easily set up custom objective functions, operators, building blocks and representations by sticking to a few conventions. The package allows both a black-box approach for standard tasks (plug-and-play style) and a much more flexible white-box approach where the evolutionary cycle is written by hand.
This package provides functionality for constructing statistical models of transcriptomic dynamics in field conditions. It further offers a function to predict the expression of a gene given the attributes of samples and meteorological data. See Nagano, A. J., Sato, Y., Mihara, M., Antonio, B. A., Motoyama, R., Itoh, H., Naganuma, Y., and Izawa, T. (2012) <doi:10.1016/j.cell.2012.10.048> and Iwayama, K., Aisaka, Y., Kutsuna, N., and Nagano, A. J. (2017) <doi:10.1093/bioinformatics/btx049>.
This is a fast and flexible implementation of the Kalman filter and smoother, which can deal with NAs. It is entirely written in C and relies fully on linear algebra subroutines contained in BLAS and LAPACK. Due to the speed of the filter, the fitting of high-dimensional linear state space models to large datasets becomes possible. This package also contains a plot function for the visualization of the state vector and graphical diagnostics of the residuals.
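To make concrete how a Kalman filter can deal with missing observations, here is a minimal sketch for a scalar local level (random walk plus noise) model. It is an illustration under stated assumptions, not the package's C implementation: the function name `kalman_local_level` and the parameters `q`, `r`, `x0` and `p0` are hypothetical, and missing values are represented by `None`.

```python
def kalman_local_level(ys, q, r, x0=0.0, p0=1e6):
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, y_t = x_t + v_t,
    with Var(w_t) = q and Var(v_t) = r. Entries of ys equal to None
    are treated as missing. Hypothetical helper for illustration."""
    x, p = x0, p0
    filtered = []
    for y in ys:
        # Prediction step: the state mean is unchanged, variance grows by q.
        p = p + q
        if y is not None:
            # Update step, skipped entirely when the observation is missing.
            k = p / (p + r)        # Kalman gain
            x = x + k * (y - x)    # correct the mean with the innovation
            p = (1.0 - k) * p      # shrink the state variance
        filtered.append((x, p))
    return filtered
```

At a missing time point only the prediction step runs, so the filtered mean is carried forward and its variance increases by `q`.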
This package contains a function called gds() which accepts three input parameters: the lower limits, the upper limits and the frequencies of the corresponding classes. The gds() function calculates and returns the values of mean ('gmean'), median ('gmedian'), mode ('gmode'), variance ('gvar'), standard deviation ('gstdev'), coefficient of variation ('gcv'), quartiles ('gq1', 'gq2', 'gq3'), inter-quartile range ('gIQR'), skewness ('g1'), and kurtosis ('g2'), which facilitate effective data analysis. For skewness and kurtosis calculations we use moments.
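For readers unfamiliar with grouped-data statistics, the sketch below computes the mean and median from class limits and frequencies using the usual textbook formulas (class midpoints for the mean, linear interpolation within the median class). The function name `grouped_mean_median` is hypothetical and this is not the gds() source; the remaining statistics are omitted because their conventions (for example the variance divisor) are not stated in the description.

```python
def grouped_mean_median(lower, upper, freq):
    """Mean and median of grouped data given class lower limits,
    upper limits and frequencies. Assumes at least one non-zero
    frequency. Hypothetical helper for illustration only."""
    n = sum(freq)
    # Mean: frequency-weighted average of the class midpoints.
    mids = [(lo + up) / 2 for lo, up in zip(lower, upper)]
    mean = sum(f * m for f, m in zip(freq, mids)) / n
    # Median: find the class where the cumulative frequency reaches n/2,
    # then interpolate linearly inside that class.
    cum = 0
    median = None
    for lo, up, f in zip(lower, upper, freq):
        if cum + f >= n / 2:
            median = lo + (n / 2 - cum) / f * (up - lo)
            break
        cum += f
    return mean, median
```

For example, classes 0-10, 10-20 and 20-30 with frequencies 2, 5 and 3 give a mean of 16 and a median of 16.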
The need for anonymization of individual survey responses often leads to many suppressed grid cells in a regular grid. Here we provide functionality for creating multi-resolution gridded data, respecting the confidentiality rules, such as a minimum number of units and dominance by one or more units for each grid cell. The functions also include the possibility for contextual suppression of data. For more details see Skoien et al. (2025) <doi:10.48550/arXiv.2410.17601>.
This package provides an efficient implementation of the Narrowest-Over-Threshold methodology for detecting an unknown number of change-points occurring at unknown locations in one-dimensional data following a deterministic signal + noise model. Currently implemented scenarios are: piecewise-constant signal, piecewise-constant signal with heavy-tailed noise, piecewise-linear signal, piecewise-quadratic signal, and piecewise-constant signal with piecewise-constant variance of the noise. For details, see Baranowski, Chen and Fryzlewicz (2019) <doi:10.1111/rssb.12322>.
Provides data generation and estimation tools for the truncated positive normal (tpn) model discussed in Gomez, Olmos, Varela and Bolfarine (2018) <doi:10.1007/s11766-018-3354-x>, the slash tpn distribution discussed in Gomez, Gallardo and Santoro (2021) <doi:10.3390/sym13112164>, the bimodal tpn distribution discussed in Gomez et al. (2022) <doi:10.3390/sym14040665>, the flexible tpn model <doi:10.3390/math11214431> and the unit tpn distribution <doi:10.1016/j.chemolab.2025.105322>.
This package uses a Bayesian hierarchical model to detect enriched regions from ChIP-chip experiments. The common goal in analyzing ChIP-chip data is to detect DNA-protein interactions. The BAC package has mainly been tested with Affymetrix tiling array data. However, we expect it to work with other platforms (e.g. Agilent, Nimblegen, cDNA, etc.). Note that BAC does not deal with normalization, so you will have to normalize your data beforehand.
Roary is a high-speed, stand-alone pan-genome pipeline, which takes annotated assemblies in GFF3 format (produced by the Prokka program) and calculates the pan-genome. Using a standard desktop PC, it can analyse datasets with thousands of samples, without compromising the quality of the results. 128 samples can be analysed in under 1 hour using 1 GB of RAM and a single processor. Roary is not intended for metagenomics or for comparing extremely diverse sets of genomes.
This package is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers, including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, hence the name (gap).
The ggplot2 package is the state-of-the-art toolbox for creating and formatting graphs. However, it is easy to forget how certain formatting commands are named, and sometimes users find themselves asking: How do you rotate the x-axis labels again? Or how do you hide the legend? This package allows users to issue natural language commands for theme-related styling of plots (colors, font size and such), which are then translated into valid ggplot2 commands.
This package provides simple crosstab output with optional statistics (e.g., Goodman-Kruskal Gamma, Somers' d, and Kendall's tau-b) as well as two-way and one-way tables. The package is used within the statistics component of the Master of Science (MSc) in Social Science of the Internet at the Oxford Internet Institute (OII), University of Oxford, but the functions should be useful for general data analysis and especially for analysis of categorical and ordinal data.
Permutation (randomisation) test for single-case phase design data with two phases (e.g., pre- and post-treatment). Correction for dependency of observations is done through stepwise resampling of the time series while varying the distance between observations. The required distance (0, 1, 2, 3, ...) is determined based on repeated dependency testing while stepwise increasing the distance. In preparation: Vroegindeweij et al. "A Permutation distancing test for single-case observational AB phase design data: A Monte Carlo simulation study".
Implementation of the wavelet-based spatial verification method of Buschow and Friederichs "SAD: Verifying the Scale, Anisotropy and Direction of precipitation forecasts" (2020, submitted to QJRMS). Forecasts and observations are transformed by a decimated or redundant dual-tree complex wavelet transform to analyze the spatial scale, degree of anisotropy and preferred direction in each field. These structural attributes are compared by a series of scores. An experimental algorithm for the correction of these errors is included as well.
Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.
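The role that per-hypothesis weights play can be illustrated with a weighted Benjamini-Hochberg step, sketched below for weights that are already given. This is only an assumption-laden illustration: the function name `weighted_bh` is hypothetical, the weights are expected to be non-negative and to average one, and the data-driven part of IHW (learning the weights from the covariate) is not shown.

```python
def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg: apply the BH step-up rule to
    p_i / w_i. Weights should be non-negative with mean 1.
    Hypothetical helper for illustration only."""
    m = len(pvalues)
    # Weighted p-values; a zero weight means the hypothesis is never rejected.
    adj = [p / w if w > 0 else float("inf")
           for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: adj[i])
    # Step-up rule: largest rank k with adj_(k) <= alpha * k / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if adj[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected
```

With all weights equal to one this reduces to the ordinary Benjamini-Hochberg procedure; shifting weight toward hypotheses with informative covariates is what can increase power.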
Nonparametric detection of nonuniformity and dependence with Binary Expansion Testing (BET). See Kai Zhang (2019) BET on Independence, Journal of the American Statistical Association, 114:528, 1620-1637, <doi:10.1080/01621459.2018.1537921>; Kai Zhang, Wan Zhang, Zhigen Zhao, Wen Zhou (2023) BEAUTY Powered BEAST, <doi:10.48550/arXiv.2103.00674>; and Wan Zhang, Zhigen Zhao, Michael Baiocchi, Yao Li, Kai Zhang (2023) SorBET: A Fast and Powerful Algorithm to Test Dependence of Variables, Technical report.
An ensemble method for the statistical detection of a rare class in two-class classification problems. The method uses an ensemble of classifiers where the constituent models of the ensemble use disjoint subsets (phalanxes) of explanatory variables. We provide an implementation of the phalanx-formation algorithm. Please see Tomal et al. (2015) <doi:10.1214/14-AOAS778>, Tomal et al. (2016) <doi:10.1021/acs.jcim.5b00663>, and Tomal et al. (2019) <arXiv:1706.06971> for more details.
This package provides a selection of 3 different inference rules (plus the clamped variants of these rules) and 4 threshold functions for performing inference on an FCM (Fuzzy Cognitive Map). Moreover, the fcm package returns a data frame of the concept values of each state after the inference procedure. Fuzzy cognitive maps were introduced by Kosko (1986) <doi:10.1002/int.4550010405>, providing ideal causal cognition tools for modeling and simulating dynamic systems.
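To show what an FCM inference step looks like, the sketch below iterates one commonly used rule from the FCM literature (each concept is updated from the weighted sum of the other concepts) with a sigmoid threshold function. It is not the fcm package's code: the names `fcm_step` and `fcm_infer`, the parameter `lam`, and the convention that `weights[j][i]` is the influence of concept j on concept i are all assumptions.

```python
import math

def fcm_step(state, weights, lam=1.0):
    """One inference step: A_i <- sigmoid(lam * sum_{j != i} w_ji * A_j).
    weights[j][i] is assumed to hold the influence of concept j on i."""
    n = len(state)
    new_state = []
    for i in range(n):
        total = sum(weights[j][i] * state[j] for j in range(n) if j != i)
        new_state.append(1.0 / (1.0 + math.exp(-lam * total)))
    return new_state

def fcm_infer(state, weights, lam=1.0, n_steps=20):
    """Run n_steps inference steps and return the concept values of
    every state, starting with the initial one."""
    history = [list(state)]
    for _ in range(n_steps):
        state = fcm_step(state, weights, lam)
        history.append(state)
    return history
```

The returned history has one row per state, which mirrors the idea of reporting the concept values after each step of the inference procedure.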
After developing an ODK <https://opendatakit.org/> frame, we can link the frame to Google Sheets <https://www.google.com/sheets/about/> and collect data through Android <https://www.android.com/>. The collected data are uploaded to Google Sheets. The odk2spss() function helps to convert the ODK frame into an SPSS <https://www.ibm.com/analytics/us/en/technology/spss/> frame. It can also add downloaded Google Sheets data, or read data from Google Sheets by using the ODK frame's submission_url.