Computes exact observation weights for the Kalman filter and smoother, following Koopman and Harvey (2003) <www.sciencedirect.com/science/article/pii/S0165188902000611>. The package provides tools for analyzing linear Gaussian state-space models, allowing users to quantify the contribution of individual observations to filtered and smoothed state estimates. These weights can be used for interpretation, decomposition, and diagnostic analysis in time series models, including applications such as dynamic factor models. See the README for examples.
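As a rough illustration of what an observation weight is, the sketch below works through the simplest possible case: a scalar local level model (random walk plus noise). That model, and the names `filter_weights` and `filtered_estimate`, are assumptions made here for brevity and are not the package's API. The algorithm of Koopman and Harvey covers general state-space models and the smoother as well. In this special case the filtered estimate is a linear combination of the prior mean and the observations, and the weights follow directly from the Kalman gains.

```python
def filter_weights(n_obs, obs_var, state_var, p0=1e6):
    """Weights of the prior mean and of y[0..n_obs-1] in the final filtered state."""
    p = p0                # variance of the state before seeing the next observation
    prior_weight = 1.0    # weight on the prior mean a0
    weights = []
    for _ in range(n_obs):
        k = p / (p + obs_var)                       # Kalman gain
        weights = [(1.0 - k) * w for w in weights]  # older observations are discounted
        prior_weight *= 1.0 - k
        weights.append(k)                           # the newest observation gets weight k
        p = (1.0 - k) * p + state_var               # update, then predict one step ahead
    return prior_weight, weights


def filtered_estimate(y, obs_var, state_var, a0=0.0, p0=1e6):
    """Plain Kalman filter recursion for the same model, used as a cross-check."""
    a, p = a0, p0
    for obs in y:
        k = p / (p + obs_var)
        a = a + k * (obs - a)
        p = (1.0 - k) * p + state_var
    return a


y = [1.0, 2.0, 1.5, 3.0]
w0, w = filter_weights(len(y), obs_var=1.0, state_var=0.5)
weighted_sum = w0 * 0.0 + sum(wi * yi for wi, yi in zip(w, y))
direct = filtered_estimate(y, obs_var=1.0, state_var=0.5)
```

Here `weighted_sum` and `direct` agree up to floating-point error, and the prior weight plus the observation weights sum to one.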
Rapid realistic routing on multimodal transport networks (walk, bike, public transport and car) using 'R5', the Rapid Realistic Routing on Real-world and Reimagined networks engine <https://github.com/conveyal/r5>. The package allows users to generate detailed routing analysis or calculate travel time and monetary cost matrices using seamless parallel computing on top of the R5 Java machine. While R5 is developed by Conveyal, the package r5r is independently developed by a team at the Institute for Applied Economic Research (Ipea) with contributions from collaborators. Apart from the documentation in this package, users will find additional information in the R5 documentation at <https://docs.conveyal.com/>. Although we try to keep new releases of r5r in synchrony with R5, the development of R5 follows Conveyal's independent update process. Hence, users should confirm that the R5 version implied by the Conveyal user manual (see <https://docs.conveyal.com/changelog>) corresponds to the R5 version that r5r depends on. This version of r5r depends on R5 v7.1.
This algorithm conducts variable selection in the classification setting. It repeatedly subsamples variables and runs linear discriminant analysis (LDA) on the subsampled variables. Variables are scored based on the AUC and the t-statistics. Variables then enter a competition and the semi-finalist variables are evaluated in a final round of LDA classification. The algorithm then outputs a list of the selected variables. Qiao, Sun and Fan (2017) <http://people.math.binghamton.edu/qiao/swa.html>.
This package provides tools to calculate trait probability density functions (TPD) at any scale (e.g. populations, species, communities). TPD functions are used to compute several indices of functional diversity, as well as its partition across scales. These indices constitute a unified framework that incorporates the underlying probabilistic nature of trait distributions into uni- or multidimensional functional trait-based studies. See Carmona et al. (2016) <doi:10.1016/j.tree.2016.02.003> for further information.
This package uses a Bayesian hierarchical model to detect enriched regions from ChIP-chip experiments. The common goal in analyzing ChIP-chip data is to detect DNA-protein interactions. The BAC package has mainly been tested with Affymetrix tiling array data. However, we expect it to work with other platforms (e.g. Agilent, Nimblegen, cDNA). Note that BAC does not deal with normalization, so you will have to normalize your data beforehand.
Roary is a high-speed, stand-alone pan-genome pipeline that takes annotated assemblies in GFF3 format (produced by the Prokka program) and calculates the pan genome. Using a standard desktop PC, it can analyse datasets with thousands of samples, without compromising the quality of the results. 128 samples can be analysed in under 1 hour using 1 GB of RAM and a single processor. Roary is not intended for metagenomics or for comparing extremely diverse sets of genomes.
This package is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, hence the name (gap).
The irregularly-spaced data are interpolated onto regular latitude-longitude grids by weighting each station according to its distance and angle from the center of a search radius. In addition, we provide a simple way (Jones and Hulme, 1996) to grid the irregularly-spaced data points onto regular latitude-longitude grids by averaging all stations in grid-boxes. This study was supported by the National Natural Science Foundation of China (NSFC, Grant No. 42205177).
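The grid-box averaging idea can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the function name `grid_box_average`, the `(lat, lon, value)` tuple layout and the 1-degree default box size are hypothetical and are not taken from the package. The distance-and-angle weighting scheme is not reproduced here.

```python
import math
from collections import defaultdict


def grid_box_average(stations, lat_step=1.0, lon_step=1.0):
    """Average the station values that fall inside each regular lat-lon box.

    stations: iterable of (lat, lon, value) tuples.
    Returns a dict mapping each occupied box centre (lat, lon) to the mean value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in stations:
        # Integer index of the box containing the station; floor handles negatives.
        key = (math.floor(lat / lat_step), math.floor(lon / lon_step))
        sums[key] += value
        counts[key] += 1
    return {
        ((i + 0.5) * lat_step, (j + 0.5) * lon_step): sums[(i, j)] / counts[(i, j)]
        for (i, j) in sums
    }


# Two stations share the box centred at (10.5, 100.5); one sits in the next box north.
stations = [(10.2, 100.1, 1.0), (10.8, 100.9, 3.0), (11.5, 100.5, 5.0)]
grid = grid_box_average(stations)
```

Boxes without any station are simply absent from the result, so a caller would need to decide how to represent empty cells.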
The elliptical factor model, as an extension of the traditional factor model, effectively overcomes the limitations of the traditional model when dealing with heavy-tailed characteristic data. This package implements sparse principal component methods (SPC) and bi-sparse online principal component estimation (SPOC) for parameter estimation. Includes functionality for calculating mean squared error, relative error, and loading matrix sparsity. The philosophy of the package is described in Guo G. (2023) <doi:10.1007/s00180-022-01270-z>.
Framework for building evolutionary algorithms for both single- and multi-objective continuous or discrete optimization problems. A set of predefined evolutionary building blocks and operators is included. Moreover, the user can easily set up custom objective functions, operators, building blocks and representations by sticking to a few conventions. The package allows both a black-box approach for standard tasks (plug-and-play style) and a much more flexible white-box approach where the evolutionary cycle is written by hand.
This package provides a system for importing electrophysiological signals, based on the Waveform Database (WFDB) software package written by Moody et al. (2022) <doi:10.13026/gjvw-1m31>. An R-based system for using WFDB functions to read and write signal data is provided, along with functions for visualization and analysis. A stable and broadly compatible class for working with signal data is introduced; it supports reading cardiac electrophysiological files such as intracardiac electrograms.
This is a fast and flexible implementation of the Kalman filter and smoother, which can deal with NAs. It is entirely written in C and relies fully on linear algebra subroutines contained in BLAS and LAPACK. Due to the speed of the filter, the fitting of high-dimensional linear state space models to large datasets becomes possible. This package also contains a plot function for the visualization of the state vector and graphical diagnostics of the residuals.
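How a Kalman filter can cope with missing values is easiest to see in a scalar example. The sketch below assumes a local level model (random walk plus noise) and a hypothetical function name, `local_level_filter`. It is written in Python for readability and does not mirror the package's C implementation or its interface. When an observation is missing, the update step is skipped and only the prediction step runs, so the state estimate is carried forward while its variance grows.

```python
import math


def local_level_filter(y, obs_var, state_var, a0=0.0, p0=1e6):
    """Filtered state means for a local level model; None or NaN marks a missing value."""
    a, p = a0, p0  # prior mean and variance of the first state
    filtered = []
    for obs in y:
        missing = obs is None or math.isnan(obs)
        if not missing:
            k = p / (p + obs_var)   # Kalman gain
            a = a + k * (obs - a)   # measurement update of the mean
            p = (1.0 - k) * p       # measurement update of the variance
        filtered.append(a)
        p = p + state_var           # prediction step for the next time point
    return filtered


y = [1.0, float("nan"), 1.2]
out = local_level_filter(y, obs_var=1.0, state_var=0.1)
```

In this run the second filtered value equals the first, because no new information arrives at the missing time point.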
This package contains a function called gds() which accepts three input parameters: the lower limits, the upper limits and the frequencies of the corresponding classes. The gds() function calculates and returns the values of mean ('gmean'), median ('gmedian'), mode ('gmode'), variance ('gvar'), standard deviation ('gstdev'), coefficient of variation ('gcv'), quartiles ('gq1', 'gq2', 'gq3'), inter-quartile range ('gIQR'), skewness ('g1'), and kurtosis ('g2'), which facilitate effective data analysis. For skewness and kurtosis calculations we use moments.
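For orientation, the textbook formulas for the mean, median and mode of grouped data can be sketched as follows. This is an illustration only: the function name `grouped_stats` is hypothetical, the formulas are the standard interpolation formulas rather than anything confirmed from the package source, and the mode formula assumes equal class widths and a unique modal class.

```python
def grouped_stats(lower, upper, freq):
    """Return (mean, median, mode) for data grouped into classes."""
    n = sum(freq)
    mids = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]

    # Mean: frequency-weighted average of the class midpoints.
    mean = sum(f * m for f, m in zip(freq, mids)) / n

    # Median: linear interpolation inside the class where the cumulative
    # frequency first reaches n / 2.
    cum = 0
    median = None
    for lo, hi, f in zip(lower, upper, freq):
        if f > 0 and cum + f >= n / 2:
            median = lo + (n / 2 - cum) / f * (hi - lo)
            break
        cum += f

    # Mode: interpolation inside the class with the highest frequency.
    i = max(range(len(freq)), key=freq.__getitem__)
    f1 = freq[i]
    f0 = freq[i - 1] if i > 0 else 0
    f2 = freq[i + 1] if i < len(freq) - 1 else 0
    mode = lower[i] + (f1 - f0) / (2 * f1 - f0 - f2) * (upper[i] - lower[i])
    return mean, median, mode


lower = [0, 10, 20, 30]
upper = [10, 20, 30, 40]
freq = [5, 10, 20, 5]
mean, median, mode = grouped_stats(lower, upper, freq)
```

For these classes the mean is 21.25, the median is 22.5 and the mode is 24, up to floating-point rounding.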
The need for anonymization of individual survey responses often leads to many suppressed grid cells in a regular grid. Here we provide functionality for creating multi-resolution gridded data, respecting the confidentiality rules, such as a minimum number of units and dominance by one or more units for each grid cell. The functions also include the possibility for contextual suppression of data. For more details see Skoien et al. (2025) <doi:10.48550/arXiv.2410.17601>.
This package provides an efficient implementation of the Narrowest-Over-Threshold methodology for detecting an unknown number of change-points occurring at unknown locations in one-dimensional data following a deterministic signal + noise model. Currently implemented scenarios are: piecewise-constant signal, piecewise-constant signal with heavy-tailed noise, piecewise-linear signal, piecewise-quadratic signal, and piecewise-constant signal with piecewise-constant variance of the noise. For details, see Baranowski, Chen and Fryzlewicz (2019) <doi:10.1111/rssb.12322>.
Provides data generation and estimation tools for the truncated positive normal (tpn) model discussed in Gomez, Olmos, Varela and Bolfarine (2018) <doi:10.1007/s11766-018-3354-x>, the slash tpn distribution discussed in Gomez, Gallardo and Santoro (2021) <doi:10.3390/sym13112164>, the bimodal tpn distribution discussed in Gomez et al. (2022) <doi:10.3390/sym14040665>, the flexible tpn model <doi:10.3390/math11214431> and the unit tpn distribution <doi:10.1016/j.chemolab.2025.105322>.
Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.
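The role of the weights can be illustrated with a weighted Benjamini-Hochberg step, sketched below. Note the assumptions: the weights are simply supplied by the caller here, whereas IHW derives them from the covariate in a data-driven way that this sketch does not attempt to reproduce. The function name `weighted_bh` is hypothetical and is not the package's interface.

```python
def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg: apply BH to p_i / w_i, with weights averaging one."""
    m = len(pvalues)
    total = sum(weights)
    # Rescale so the weights average to one, which keeps the overall budget fixed.
    w = [wi * m / total for wi in weights]
    # A hypothesis with zero weight can never be rejected.
    q = [p / wi if wi > 0 else float("inf") for p, wi in zip(pvalues, w)]

    order = sorted(range(m), key=q.__getitem__)
    k = 0  # largest rank whose weighted p-value passes the step-up threshold
    for rank, idx in enumerate(order, start=1):
        if q[idx] <= alpha * rank / m:
            k = rank

    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


pvalues = [0.01, 0.04, 0.03, 0.5]
unweighted = weighted_bh(pvalues, [1, 1, 1, 1], alpha=0.05)
weighted = weighted_bh(pvalues, [1, 0.5, 2, 0.5], alpha=0.05)
```

With equal weights only the first hypothesis is rejected. Giving the third hypothesis a larger weight lets it be rejected as well, which is the sense in which informative weights can increase power.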
This package provides a suite of functions to test for Functional Measurement Invariance (FMI) between two groups. Implements hierarchical permutation tests for configural, metric, and scalar invariance, adapting concepts from Multi-Group Confirmatory Factor Analysis (MGCFA) to functional data. Methods are based on concepts from: Meredith, W. (1993) <doi:10.1007/BF02294825>, Yao, F., Müller, H. G., & Wang, J. L. (2005) <doi:10.1198/016214504000001745>, and Lee, K. Y., & Li, L. (2022) <doi:10.1111/rssb.12471>.
The ggplot2 package is the state-of-the-art toolbox for creating and formatting graphs. However, it is easy to forget how certain formatting commands are named and sometimes users find themselves asking: How do you rotate the x-axis labels again? Or how do you hide the legend...? This package allows users to issue natural language commands for theme-related styling of plots (colors, font size and such), which are then translated into valid ggplot2 commands.
This package provides functions for the creation, evaluation and testing of decision models based on Multi Attribute Utility Theory (MAUT). It can process and evaluate local risk aversion utilities for a set of indexes, compute utilities and weights for the whole decision tree defining the decision model, and simulate weights employing Dirichlet distributions under addition constraints in weights. It also includes other rating analysis methods, for example the Colley and Offensive-Defensive ratings and ranking aggregation with the Borda count.
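Ranking aggregation with the Borda count can be sketched as follows. This is a minimal illustration, assuming complete rankings over the same items and the common scoring rule in which an item in position k of n receives n - 1 - k points. The function name `borda_count` and the alphabetical tie-break are choices made here and are not taken from the package.

```python
def borda_count(rankings):
    """Aggregate several rankings (best item first) into one consensus ranking."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # First place earns n - 1 points, last place earns 0.
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    # Highest total first; ties are broken by item name for reproducibility.
    consensus = sorted(scores, key=lambda item: (-scores[item], item))
    return consensus, scores


rankings = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
consensus, scores = borda_count(rankings)
```

Here A collects 5 points, B collects 3 and C collects 1, so the aggregated ranking is A, B, C.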
This package provides simple crosstab output with optional statistics (e.g., Goodman-Kruskal Gamma, Somers' d, and Kendall's tau-b) as well as two-way and one-way tables. The package is used within the statistics component of the Masters of Science (MSc) in Social Science of the Internet at the Oxford Internet Institute (OII), University of Oxford, but the functions should be useful for general data analysis and especially for analysis of categorical and ordinal data.
Permutation (randomisation) test for single-case phase design data with two phases (e.g., pre- and post-treatment). Correction for dependency of observations is done through stepwise resampling of the time series while varying the distance between observations. The required distance (0, 1, 2, 3, ...) is determined based on repeated dependency testing while stepwise increasing the distance. In preparation: Vroegindeweij et al. "A Permutation distancing test for single-case observational AB phase design data: A Monte Carlo simulation study".
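The basic permutation step for two phases can be sketched as below. Several things here are assumptions for illustration only: the test statistic (difference in phase means), the thinning rule that keeps every (distance + 1)-th observation, and the function name `permutation_test`. The repeated dependency testing used to choose the distance is not reproduced, and the package's actual resampling scheme may differ.

```python
import random


def permutation_test(phase_a, phase_b, distance=0, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in phase means."""
    # Assumed thinning rule: keep every (distance + 1)-th observation per phase.
    a = list(phase_a)[:: distance + 1]
    b = list(phase_b)[:: distance + 1]
    observed = sum(b) / len(b) - sum(a) / len(a)

    pooled = a + b
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)    # randomly reassign observations to the two phases
        perm_a, perm_b = pooled[: len(a)], pooled[len(a):]
        diff = sum(perm_b) / len(perm_b) - sum(perm_a) / len(perm_a)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


pre = [1, 2, 1, 2, 1, 2]
post = [10, 11, 10, 11, 10, 11]
observed, p_value = permutation_test(pre, post)
```

With phases this clearly separated, almost no random relabelling matches the observed difference of 9, so the p-value is small.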
Implementation of the wavelet-based spatial verification method of Buschow and Friederichs "SAD: Verifying the Scale, Anisotropy and Direction of precipitation forecasts" (2020, submitted to QJRMS). Forecasts and observations are transformed by a decimated or redundant dual-tree complex wavelet transform to analyze the spatial scale, degree of anisotropy and preferred direction in each field. These structural attributes are compared by a series of scores. An experimental algorithm for the correction of these errors is included as well.
Calculates Windowed Cross Correlation for pairs of time series. Provides support for surrogate analysis for nonparametric test of significance. Calculates aggregate statistics over a range of parameter values. Plots the results as Windowed Cross Correlation plots and heat maps. The method is described in "Boker, S. M., Rotondo, J. L., Xu, M., & King, K. (2002). Windowed cross-correlation and peak picking for the analysis of variability in the association between behavioral time series. Psychological Methods, 7(3), 338.".
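A bare-bones version of windowed cross-correlation can be sketched as follows. This is an illustration under assumptions: the function names, the parameter names (`window`, `max_lag`, `step`) and the exact window indexing are choices made here, loosely following the definition in Boker et al. (2002), and do not reflect the package's interface. Surrogate testing, peak picking and plotting are omitted.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def windowed_cross_correlation(x, y, window, max_lag, step=1):
    """One row per window start; one column per lag from -max_lag to +max_lag."""
    n = min(len(x), len(y))
    rows = []
    for i in range(0, n - window - max_lag + 1, step):
        row = []
        for lag in range(-max_lag, max_lag + 1):
            if lag >= 0:
                # Positive lag: the y window starts `lag` samples after the x window.
                wx, wy = x[i:i + window], y[i + lag:i + lag + window]
            else:
                # Negative lag: the x window starts `-lag` samples after the y window.
                wx, wy = x[i - lag:i - lag + window], y[i:i + window]
            row.append(pearson(wx, wy))
        rows.append(row)
    return rows


# y is a copy of x delayed by 3 samples, so each row should peak at lag +3.
x = [math.sin(0.3 * t) for t in range(60)]
y = [math.sin(0.3 * (t - 3)) for t in range(60)]
max_lag = 5
wcc = windowed_cross_correlation(x, y, window=20, max_lag=max_lag)
peak_column = max(range(2 * max_lag + 1), key=wcc[0].__getitem__)
```

The peak column of the first row corresponds to lag +3 (column index `max_lag + 3`), matching the delay built into `y`.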