This package provides a systematic framework for neural network-based model selection and forecasting using single hidden layer feed-forward networks. It evaluates all possible combinations of predictor variables and hidden layer configurations, selecting the optimal model based on predictive accuracy criteria such as root mean squared error (RMSE) and mean absolute percentage error (MAPE). Predictors are automatically standardized, and model performance is assessed using out-of-sample validation. The package is designed for empirical modelling and forecasting in economics, agriculture, trade, climate, and related applied research domains where nonlinear relationships and robust predictive performance are of primary interest.
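The exhaustive search described above can be illustrated with a minimal Python sketch. It only enumerates the candidate configurations and defines the two accuracy criteria; it does not fit any network. The names `candidate_models`, `rmse`, and `mape` are hypothetical and are not the package's API.

```python
from itertools import combinations
from math import sqrt


def candidate_models(predictors, hidden_sizes):
    """Yield every (non-empty predictor subset, hidden layer size) pair."""
    for k in range(1, len(predictors) + 1):
        for subset in combinations(predictors, k):
            for h in hidden_sizes:
                yield subset, h


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Three predictors and two hidden layer sizes: (2**3 - 1) * 2 candidates.
models = list(candidate_models(["x1", "x2", "x3"], hidden_sizes=[1, 2]))
print(len(models))
```

The number of candidates grows as (2^p - 1) times the number of hidden layer sizes, so an exhaustive search of this kind is only practical for a modest number of predictors.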
At the Swiss Federal Statistical Office (SFSO), spatial maps of Switzerland are available free of charge as 'Cartographic bases for small-scale thematic mapping'. This package contains convenience functions to import ESRI (Environmental Systems Research Institute) shape files using the package sf and to plot them easily and quickly without having to worry too much about the technical details. It contains utilities to combine multiple areas into one single polygon and to find neighbours for single regions. For any point on a map, a special locator can be used to determine to which municipality, district or canton it belongs.
This package performs simulation-based inference as an alternative to the delta method for obtaining valid confidence intervals and p-values for regression post-estimation quantities, such as average marginal effects and predictions at representative values. This framework for simulation-based inference is especially useful when the resulting quantity is not normally distributed and the delta method approximation fails. The methodology is described in Greifer et al. (2025) <doi:10.32614/RJ-2024-015>. clarify is meant to replace some of the functionality of the archived package Zelig; see the vignette "Translating Zelig to clarify" for replicating this functionality.
This package is designed for genomic and proteomic data analysis, enabling unbiased PubMed searching, protein interaction network visualization, and comprehensive data summarization. It aims to help users identify novel targets within their data sets based on protein network interactions and on the publication precedence of each target's association with the research context. Methods in this package are described in detail in Douglas et al. (2025) <doi:10.1039/D5MO00160A>. Key functionalities of this package also leverage methodologies from previous works, such as Szklarczyk et al. (2023) <doi:10.1093/nar/gkac1000> and Winter (2017) <doi:10.32614/RJ-2017-066>.
Several web services are available that provide access to elevation data. This package provides access to many of those services and returns elevation data either as an sf simple features object from point elevation services or as a raster object from raster elevation services. In future versions, elevatr will drop support for raster and will instead return terra objects. Currently, the package supports access to the Amazon Web Services Terrain Tiles <https://registry.opendata.aws/terrain-tiles/>, the Open Topography Global Datasets API <https://opentopography.org/developers/>, and the USGS Elevation Point Query Service <https://apps.nationalmap.gov/epqs/>.
This package provides a faster implementation of Bayesian Causal Forests (BCF; Hahn et al. (2020) <doi:10.1214/19-BA1195>), which uses regression tree ensembles to estimate the conditional average treatment effect of a binary treatment on a scalar output as a function of many covariates. This implementation avoids many redundant computations and memory allocations present in the original BCF implementation, allowing the model to be fit to larger datasets. The implementation was originally developed for the 2022 American Causal Inference Conference's Data Challenge. See Kokandakar et al. (2023) <doi:10.1353/obs.2023.0024> for more details.
This package provides a framework to assist creation of marine ecosystem models, generating either R or C++ code which can then be optimised using the TMB package and standard R tools. Principally designed to reproduce gadget2 models in TMB, but can be extended beyond gadget2's capabilities. See Kristensen, Nielsen, Berg, Skaug, and Bell (2016) <doi:10.18637/jss.v070.i05>, "TMB: Automatic Differentiation and Laplace Approximation", and Begley, J., & Howell, D. (2004) <https://files01.core.ac.uk/download/pdf/225936648.pdf>, "An overview of Gadget, the globally applicable area-disaggregated general ecosystem toolbox", ICES.
The midasml package implements estimation and prediction methods for high-dimensional mixed-frequency (MIDAS) time-series and panel data regression models. The regularized MIDAS models are estimated using orthogonal (e.g. Legendre) polynomials and the sparse-group LASSO (sg-LASSO) estimator. For more information on the midasml approach see Babii, Ghysels, and Striaukas (2021, JBES forthcoming) <doi:10.1080/07350015.2021.1899933>. The package is equipped with a fast implementation of the sg-LASSO estimator by means of proximal block coordinate descent. High-dimensional mixed-frequency time-series data can also be easily manipulated with functions provided in the package.
This package provides a procedure for comparing multivariate samples associated with different groups. It uses principal component analysis to convert multivariate observations into a set of linearly uncorrelated statistical measures, which are then compared using a number of statistical methods. The procedure is independent of the distributional properties of samples and automatically selects features that best explain their differences, avoiding manual selection of specific points or summary statistics. It is appropriate for comparing samples of time series, images, spectrometric measures or similar multivariate observations. This package is described in Fachada et al. (2016) <doi:10.32614/RJ-2016-055>.
This package provides a model designed to be a reliable testbed where various gene drive interventions for mosquito-borne disease control can be tested. It is being developed to accommodate the use of various mosquito-specific gene drive systems within a population dynamics framework that allows migration of individuals between patches in a landscape. Previous work developing the population dynamics can be found in Deredec et al. (2011) <doi:10.1073/pnas.1110717108> and Hancock & Godfray (2007) <doi:10.1186/1475-2875-6-98>, and extensions to accommodate CRISPR homing dynamics in Marshall et al. (2017) <doi:10.1038/s41598-017-02744-7>.
Calculate sample size or power for hierarchical endpoints. The package can handle any type of outcome (binary, continuous, count, ordinal, time-to-event) and any number of such endpoints. It allows users to calculate sample size for a given power, or power for a given sample size, for hypothesis testing based on win ratios, win odds, net benefit, or DOOR (desirability of outcome ranking) as the treatment effect between two groups for hierarchical endpoints. The methods of this package are described further in the paper by Barnhart, H. X. et al. (2024, <doi:10.1080/19466315.2024.2365629>).
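The win statistics named above can be illustrated with a minimal Python sketch using the standard pairwise-comparison definitions. It is not the package's implementation: it assumes larger values are better on every endpoint, ignores censoring for time-to-event outcomes, and uses the hypothetical helper names `compare` and `win_statistics`.

```python
def compare(t, c):
    """Compare one treated and one control subject, endpoint by endpoint.

    Each subject is a tuple of outcomes ordered by priority; larger is
    assumed better. Returns 1 (win), -1 (loss), or 0 (tie on all endpoints).
    """
    for t_val, c_val in zip(t, c):
        if t_val > c_val:
            return 1
        if t_val < c_val:
            return -1
    return 0  # tied on every endpoint in the hierarchy


def win_statistics(treated, control):
    """Win ratio, win odds, and net benefit over all treated-control pairs."""
    wins = losses = ties = 0
    for t in treated:
        for c in control:
            result = compare(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    total = wins + losses + ties
    return {
        "win_ratio": wins / losses,  # undefined when there are no losses
        "win_odds": (wins + 0.5 * ties) / (losses + 0.5 * ties),
        "net_benefit": (wins - losses) / total,
    }


# Two endpoints per subject: the first has priority over the second.
treated = [(1, 5), (0, 3)]
control = [(1, 4), (0, 3)]
stats = win_statistics(treated, control)
print(stats)
```

In this toy data set the four pairs give two wins, one loss, and one tie; the second endpoint is consulted only when the first is tied.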
This package provides a visualization toolkit for preferential data, such as ranked-choice election results, tournament outcomes, and survey responses. The package provides methods to visualise the preference distribution of one contest with bar charts and pairwise comparisons of two contestants, as well as methods to visualise multiple contests through 2D and high-dimensional simplex plots both statically and interactively. HD simplex displays are implemented via projection methods using the tourr and detourr packages, enabling dynamic exploration of high-dimensional preference structure. For more details on HD simplex projection, see Wickham et al. (2011) <doi:10.21105/joss.03419>.
This package implements a suite of shrinkage estimators for multivariate linear regression to improve estimation stability and predictive accuracy. Provides methods including the Stein estimator, Diagonal Shrinkage, the general Shrinkage estimator (solving a Sylvester equation), and Slab Regression (Simple and Generalized). These methods address Stein's paradox by introducing structured bias to reduce variance without requiring cross-validation, except for ShrinkageRR where the intensity is chosen by minimizing an explicit Mean Squared Error (MSE) criterion. Methods are based on Asimit, V., Cidota, M. A., Chen, Z., and Asimit, J. (2025) <https://openaccess.city.ac.uk/id/eprint/35005/>.
Bayesian clustering of spatial regions with similar functional shapes using spanning trees and latent Gaussian models. The method enforces spatial contiguity within clusters and supports a wide range of latent Gaussian models, including non-Gaussian likelihoods, via the R-INLA framework. The algorithm is based on Zhong, R., Chacón-Montalván, E. A., and Moraga, P. (2024) <doi:10.48550/arXiv.2407.12633>, extending the approach of Zhang, B., Sang, H., Luo, Z. T., and Huang, H. (2023) <doi:10.1214/22-AOAS1643>. The package includes tools for model fitting, convergence diagnostics, visualization, and summarization of clustering results.
This package provides diagnostic tests for assessing the informativeness of survey weights in regression models. Implements difference-in-coefficients tests (Hausman 1978 <doi:10.2307/1913827>; Pfeffermann 1993 <doi:10.2307/1403631>), weight-association tests (DuMouchel and Duncan 1983 <doi:10.2307/2288185>; Pfeffermann and Sverchkov 1999 <https://www.jstor.org/stable/25051118>; Pfeffermann and Sverchkov 2003 <ISBN:9780470845672>; Wu and Fuller 2005 <https://www.jstor.org/stable/27590461>), estimating equations tests (Pfeffermann and Sverchkov 2003 <ISBN:9780470845672>), and non-parametric permutation tests. Includes simulation utilities replicating Wang et al. (2023 <doi:10.1111/insr.12509>) and extensions.
Modern classes for tracking and movement data, building on sf spatial infrastructure and on early theoretical work from Turchin (1998, ISBN: 9780878938476) and Calenge et al. (2009) <doi:10.1016/j.ecoinf.2008.10.002>. Tracking data are series of locations with at least 2-dimensional spatial coordinates (x,y), a time index (t), and individual identification (id) of the object being monitored; movement data are made of trajectories, i.e. the line representation of the path, composed of steps (the straight-line segments connecting successive locations). sftrack is designed to handle movement of both living organisms and inanimate objects.
R implementation of TFactS to predict which transcription factors (TFs) are regulated in a biological condition, based on lists of differentially expressed genes (DEGs) obtained from transcriptome experiments. This package is based on the TFactS concept by Essaghir et al. (2010) <doi:10.1093/nar/gkq149> and expands it. It allows users to perform a TFactS-like enrichment approach. The package can import and use the original catalogue file from TFactS as well as user-defined catalogues of interest that are not supported by TFactS (e.g., Arabidopsis).
This package provides a set of visualization tools for exploring the behavior of the components in a network meta-analysis of multi-component (complex) interventions: components descriptive analysis; heat plot of the two-by-two component combinations; leaving one component combination out scatter plot; violin plot for specific component combinations effects; density plot for components effects; waterfall plot for the interventions effects that differ by a certain component combination; network graph of components; and rank heat plot of components for multiple outcomes. The implemented tools are described by Seitidis et al. (2023) <doi:10.1002/jrsm.1617>.
This method generates a tour path by interpolating between d-D frames in p-D using Givens rotations. The algorithm arises from the problem of zeroing elements of a matrix. This interpolation method is useful for showing specific d-D frames in the tour, as opposed to d-D planes, as done by the geodesic interpolation. It is useful for projection pursuit indexes which are not rotation invariant. See more details in Buja, Cook, Asimov and Hurley (2005) <doi:10.1016/S0169-7161(04)24014-7> and Batsaikhan, Cook and Laa (2023) <doi:10.48550/arXiv.2311.08181>.
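The building block mentioned above, a Givens rotation that zeroes one element, can be sketched in a few lines of Python. This shows only the elementary 2-D rotation, not the frame interpolation itself, and the function names `givens` and `rotate` are hypothetical.

```python
from math import hypot


def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0  # zero vector: the identity rotation suffices
    return a / r, b / r


def rotate(c, s, a, b):
    """Apply the rotation [[c, s], [-s, c]] to the vector (a, b)."""
    return c * a + s * b, -s * a + c * b


c, s = givens(3.0, 4.0)
first, second = rotate(c, s, 3.0, 4.0)
print(first, second)  # the second component is zero up to rounding error
```

Applying such rotations to pairs of rows or columns in sequence zeroes the chosen elements of a matrix one at a time, and because c and s are the cosine and sine of a single angle, each rotation can be traversed gradually by scaling that angle.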
This package provides tools for large-scale protein motif analysis and visualization in R. PMScanR facilitates the identification of motifs using external tools like PROSITE's ps_scan (handling necessary file downloads and execution) and enables downstream analysis of results. Key features include parsing scan outputs, converting formats (e.g., to GFF-like structures), generating motif occurrence matrices, and creating informative visualizations such as heatmaps and sequence logos (via seqLogo/ggseqlogo). The package also offers an optional Shiny-based graphical user interface for interactive analysis, aiming to streamline the process of exploring motif patterns across multiple protein sequences.
The speckle package contains functions for the analysis of single cell RNA-seq data. It currently contains functions to analyse differences in cell type proportions. There are also functions to estimate the parameters of the Beta distribution based on a given counts matrix, and a function to normalise a counts matrix to the median library size. There are plotting functions to visualise cell type proportions and the mean-variance relationship in cell type proportions and counts. As our research into specialised analyses of single cell data continues, we anticipate that the package will be updated with new functions.
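Normalising a counts matrix to the median library size can be sketched in plain Python as follows. This is an illustration of the idea rather than the package's code; it assumes genes are rows and cells are columns, that no cell has a zero library size, and the function name is hypothetical.

```python
from statistics import median


def normalise_to_median_library_size(counts):
    """Scale each column (cell) so that its total equals the median column total.

    `counts` is a list of rows (genes), each a list of per-cell counts.
    """
    n_cells = len(counts[0])
    # Library size of a cell = sum of its counts over all genes.
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_cells)]
    target = median(lib_sizes)
    return [
        [row[j] * target / lib_sizes[j] for j in range(n_cells)]
        for row in counts
    ]


counts = [[1, 2, 6], [1, 6, 6]]  # library sizes are 2, 8, and 12
normalised = normalise_to_median_library_size(counts)
print(normalised)
```

After scaling, every cell has the same library size (here the median, 8), so counts become comparable across cells with different sequencing depths.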
This package provides maximum likelihood estimation (MLE) of count data models, along with standard errors of the estimates and the Akaike information criterion for model selection. The functions compute the MLE for the following distributions: the Bell distribution, the Borel distribution, the Poisson distribution, the zero inflated Bell distribution, the zero inflated Bell Touchard distribution, the zero inflated Poisson distribution, the zero one inflated Bell distribution and the zero one inflated Poisson distribution. Moreover, the probability mass function (PMF), distribution function (CDF), quantile function (QF) and random number generation of the Bell Touchard and zero inflated Bell Touchard distributions are also provided.
The twoStepsBenchmark() and threeRuleSmooth() functions allow you to disaggregate a low-frequency time series using a higher-frequency time series, following the French National Accounts methodology. The aggregated sum of the resulting time series is strictly equal to the low-frequency time series within the benchmarking window. Typically, the low-frequency time series is an annual one, unknown for the last year, and the high-frequency one is either quarterly or monthly. See "Methodology of quarterly national accounts", Insee Méthodes N°126, by Insee (2012, ISBN:978-2-11-068613-8, <https://www.insee.fr/en/information/2579410>).
This package provides functions for the computation of functional elastic shape means over sets of open planar curves. The package is particularly suitable for settings where these curves are only sparsely and irregularly observed. It uses a novel approach for elastic shape mean estimation, where planar curves are treated as complex functions and a full Procrustes mean is estimated from the corresponding smoothed Hermitian covariance surface. This is combined with the methods for elastic mean estimation proposed in Steyer, Stöcker, Greven (2022) <doi:10.1111/biom.13706>. See Stöcker et al. (2022) <arXiv:2203.10522> for details.