The Harmonised Index of Consumer Prices (HICP) is the key economic figure to measure inflation in the euro area. The methodology underlying the HICP is documented in the HICP Methodological Manual (<https://ec.europa.eu/eurostat/web/products-manuals-and-guidelines/w/ks-gq-24-003>). Based on the manual, this package provides functions to access and work with HICP data from Eurostat's public database (<https://ec.europa.eu/eurostat/data/database>).
This package implements the Kidney Failure Risk Equation (KFRE; Tangri and colleagues (2011) <doi:10.1001/jama.2011.451>; Tangri and colleagues (2016) <doi:10.1001/jama.2015.18202>) to compute 2- and 5-year kidney failure risk using 4-, 6-, and 8-variable models. Includes helpers to append risk columns to data frames, classify chronic kidney disease (CKD) stages and end-stage renal disease (ESRD) outcomes, and evaluate and plot model performance.
Accompanies the book "Nonparametric Statistical Methods Using R, 2nd Edition" by Kloke and McKean (2024, ISBN:9780367651350). Includes methods, datasets, and random number generation useful for the study of robust and/or nonparametric statistics. Emphasizes classical nonparametric methods for a variety of designs, especially one-sample and two-sample problems. Includes methods for general scores, including estimation and testing for the two-sample location problem as well as Hogg's adaptive method.
The qda() function from package MASS is extended to calculate weighted linear (LDA) and quadratic (QDA) discriminant analysis by changing the group variances and group means based on cell-wise uncertainties. The uncertainties can be derived, e.g., through relative errors for each individual measurement (cell), not only row-wise or column-wise uncertainties. The method can be applied to compositional data (e.g. portions of substances, concentrations) and non-compositional data.
This package provides a multivariate weather generator for daily climate variables based on weather states (Flecher et al. (2010) <doi:10.1029/2009WR008098>). It uses a Markov chain to model the succession of weather states. Conditional on the weather states, the multivariate variables are modeled using the family of Complete Skew-Normal distributions. Parameters are estimated on measured series. The input data must include the variable Rain and can include as many other variables as desired.
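The Markov-chain part of the weather generator described above can be illustrated with a minimal Python sketch. It only simulates the succession of weather states from a given transition matrix; the Complete Skew-Normal modelling of the variables within each state is not reproduced. The function name `simulate_states` and the example matrix are hypothetical and are not taken from the package.

```python
import random

def simulate_states(transition, start, n_days, seed=None):
    """Simulate a sequence of weather-state indices from a first-order Markov chain.

    transition[i][j] is the (assumed) probability of moving from state i to state j.
    """
    if n_days < 1:
        raise ValueError("n_days must be at least 1")
    rng = random.Random(seed)
    states = [start]
    for _ in range(n_days - 1):
        probs = transition[states[-1]]
        # Draw the next state using the row of the current state as weights.
        next_state = rng.choices(range(len(probs)), weights=probs)[0]
        states.append(next_state)
    return states

# Hypothetical two-state example: 0 = dry, 1 = wet.
example_transition = [[0.8, 0.2],
                      [0.4, 0.6]]
example_path = simulate_states(example_transition, start=0, n_days=10, seed=1)
```

In a full generator, each simulated state would then select the distribution from which that day's variables are drawn.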
The SCDE package implements a set of statistical methods for analyzing single-cell RNA-seq data. SCDE fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The SCDE package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify aspects of transcriptional heterogeneity among single cells.
This package offers an implementation of the Abnormal blood profile score (ABPS). The ABPS is a part of the Athlete biological passport program of the World anti-doping agency, which combines several blood parameters into a single score in order to detect blood doping. The package also contains functions to calculate other scores used in anti-doping programs, such as the ratio of hemoglobin to reticulocytes (OFF-score), as well as example data.
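As an illustration of the simpler of the two scores mentioned above, the OFF-score is usually published as hemoglobin (in g/L) minus 60 times the square root of the reticulocyte percentage, that is, a difference and not a literal ratio. The Python sketch below assumes that definition; the function name `off_score` is hypothetical and the package's own interface and unit handling may differ. The ABPS itself is not reproduced here.

```python
import math

def off_score(hgb_g_per_l, ret_percent):
    """OFF-score under the commonly published definition:
    hemoglobin (g/L) - 60 * sqrt(reticulocytes in %).
    """
    if hgb_g_per_l < 0 or ret_percent < 0:
        raise ValueError("hemoglobin and reticulocyte percentage must be non-negative")
    return hgb_g_per_l - 60.0 * math.sqrt(ret_percent)
```

For example, a hemoglobin of 150 g/L with 1% reticulocytes gives 150 - 60 = 90 under this definition.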
The adapted pair correlation function transfers the concept of the pair correlation function from point patterns to patterns of objects of finite size and irregular shape (e.g. lakes within a country). The pair correlation function describes the spatial distribution of objects, e.g. random, aggregated or regularly spaced. This is a reimplementation of the method suggested by Nuske et al. (2009) <doi:10.1016/j.foreco.2009.09.050> using the library 'GEOS'.
Uncertainty quantification and inverse estimation by probabilistic generative models from the beginning of the data analysis. An example is a Fourier basis method for inverse estimation in scattering analysis of microscopy videos. It does not require specifying a certain range of Fourier bases and it substantially reduces computational cost via the generalized Schur algorithm. See the reference: Mengyang Gu, Yue He, Xubo Liu and Yimin Luo (2023), <doi:10.48550/arXiv.2309.02468>.
This package provides a collection of LaTeX styles using Beamer customization for PDF-based presentation slides in 'RMarkdown'. At present it contains 'RMarkdown' adaptations of the LaTeX themes Metropolis (formerly 'mtheme') by Matthias Vogelgesang and others (now included in 'TeXLive'), the IQSS theme by Ista Zahn (which is included here), and the Monash theme by Rob J Hyndman. Additional (free) fonts may be needed: Metropolis prefers 'Fira', and IQSS requires 'Libertinus'.
Compute the fixed effects dynamic panel threshold model suggested by Ramírez-Rondán (2020) <doi:10.1080/07474938.2019.1624401>, and the dynamic panel linear model suggested by Hsiao et al. (2002) <doi:10.1016/S0304-4076(01)00143-9>, where maximum likelihood type estimators are used. Multiple thresholds estimation based on Markov Chain Monte Carlo (MCMC) is allowed, and model selection among the linear model, threshold model and multiple threshold model is also allowed.
This package implements an explicit exploration strategy for evolutionary algorithms in order to have a more effective search in solving optimization problems. Along with this exploration search strategy, a set of four different Estimation of Distribution Algorithms (EDAs) are also implemented for solving optimization problems in continuous domains. The implemented explicit exploration strategy in this package is described in Salinas-Gutiérrez and Muñoz Zavala (2023) <doi:10.1016/j.asoc.2023.110230>.
This package provides methods and tools for the analysis of Genome Wide Identity-by-Descent ('gwid') mapping data, focusing on testing whether there is a higher occurrence of Identity-By-Descent (IBD) segments around potential causal variants in cases compared to controls, which is crucial for identifying rare variants. To enhance its analytical power, gwid incorporates a Sliding Window Approach, allowing for the detection and analysis of signals from multiple Single Nucleotide Polymorphisms (SNPs).
Penalized regression for generalized linear models for measurement error problems (a.k.a. errors-in-variables). The package contains a version of the lasso (L1-penalization) which corrects for measurement error (Sorensen et al. (2015) <doi:10.5705/ss.2013.180>). It also contains an implementation of the Generalized Matrix Uncertainty Selector, which is a version of the (Generalized) Dantzig Selector for the case of measurement error (Sorensen et al. (2018) <doi:10.1080/10618600.2018.1425626>).
The Iterative Cumulative Sum of Squares (ICSS) algorithm by Inclan/Tiao (1994) <https://www.jstor.org/stable/2290916> detects multiple change points, i.e. structural break points, in the variance of a sequence of independent observations. For series of moderate size (i.e. 200 observations and beyond), the ICSS algorithm offers results comparable to those obtained by a Bayesian approach or by likelihood ratio tests, without the heavy computational burden required by these approaches.
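The core quantity behind the ICSS algorithm above is the centered cumulative sum of squares, D_k = C_k / C_T - k / T, where C_k is the sum of the first k squared observations. The Python sketch below computes D_k and performs a single change-point check on one segment. It is not the full iterative procedure, which repeats this step on sub-segments. The function names are hypothetical, and the default critical value 1.358 is the asymptotic 5% value usually quoted for this statistic; treat it as an assumption.

```python
import math

def centered_cusum_of_squares(x):
    """Return D_k = C_k / C_T - k / T for k = 1..T."""
    total = sum(v * v for v in x)
    if total == 0:
        raise ValueError("series has zero total sum of squares")
    n = len(x)
    d_values = []
    running = 0.0
    for k, v in enumerate(x, start=1):
        running += v * v
        d_values.append(running / total - k / n)
    return d_values

def single_break_check(x, critical=1.358):
    """One ICSS-style step: locate max |D_k| and compare sqrt(T/2) * |D_k| to a critical value.

    Returns (k_star, statistic, exceeds_critical), with k_star 1-indexed.
    The critical value is an assumed asymptotic 5% level.
    """
    d_values = centered_cusum_of_squares(x)
    n = len(x)
    idx = max(range(n), key=lambda i: abs(d_values[i]))
    statistic = math.sqrt(n / 2.0) * abs(d_values[idx])
    return idx + 1, statistic, statistic > critical
```

For a series whose variance jumps halfway through, `single_break_check` should place `k_star` at or near the last observation before the jump.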
This package provides functions to access and download data from various NASA APIs <https://api.nasa.gov/#browseAPI>, including: Astronomy Picture of the Day (APOD), Mars Rover Photos, Earth Polychromatic Imaging Camera (EPIC), Near Earth Object Web Service (NeoWs), Earth Observatory Natural Event Tracker (EONET), and NASA Earthdata CMR Search. Most endpoints require a NASA API key for access. Data is retrieved, cleaned for analysis, and returned in a dataframe-friendly format.
We connect the multi-class Neyman-Pearson classification (NP) problem to the cost-sensitive learning (CS) problem, and propose two algorithms (NPMC-CX and NPMC-ER) to solve the multi-class NP problem through cost-sensitive learning tools. Under certain conditions, the two algorithms are shown to satisfy multi-class NP properties. More details are available in the paper "Neyman-Pearson Multi-class Classification via Cost-sensitive Learning" (Ye Tian and Yang Feng, 2021).
Allows building complex SQL (Structured Query Language) queries dynamically. Classes and/or factory functions are used to produce a syntax tree from which the final character string is generated. Strings and identifiers are automatically quoted using the right quotes, using either ANSI (American National Standards Institute) quoting or the quoting style of an existing database connector. Style can be configured to set uppercase/lowercase for keywords, remove unnecessary spaces, or omit optional keywords.
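The idea of generating a query string from a syntax tree with automatic quoting, as described above, can be sketched in Python as follows. All class and function names here are hypothetical and do not mirror the package's interface; only ANSI-style quoting (double quotes for identifiers, single quotes for strings, with embedded quotes doubled) is shown.

```python
from dataclasses import dataclass
from typing import List, Optional

def quote_identifier(name: str) -> str:
    """ANSI-style identifier quoting: wrap in double quotes, double any embedded ones."""
    return '"' + name.replace('"', '""') + '"'

def quote_string(value: str) -> str:
    """ANSI-style string literal quoting: wrap in single quotes, double any embedded ones."""
    return "'" + value.replace("'", "''") + "'"

@dataclass
class Equals:
    """Tree node for a 'column = string literal' condition."""
    column: str
    value: str

    def to_sql(self) -> str:
        return f"{quote_identifier(self.column)} = {quote_string(self.value)}"

@dataclass
class Select:
    """Tree node for a simple SELECT statement with an optional WHERE condition."""
    columns: List[str]
    table: str
    where: Optional[Equals] = None

    def to_sql(self) -> str:
        parts = [
            "SELECT",
            ", ".join(quote_identifier(c) for c in self.columns),
            "FROM",
            quote_identifier(self.table),
        ]
        if self.where is not None:
            parts += ["WHERE", self.where.to_sql()]
        return " ".join(parts)
```

Keeping the query as a tree until the last moment means quoting and keyword style are applied in one place, when the string is generated.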
In Switzerland, the landscape of municipalities is changing rapidly, mainly due to mergers. The Swiss Municipal Data Merger Tool automatically detects these mutations and maps municipalities over time, i.e. municipalities of an old state to municipalities of a new state. This functionality is helpful when working with datasets that are based on different spatial references. The package's idea and use case are discussed in the following article: <doi:10.1111/spsr.12487>.
This package performs variable selection/feature reduction under a clustering or classification framework. In particular, it can be used in an automated fashion using mixture model-based methods ('teigen' and 'mclust' are currently supported). Can account for mixtures of non-Gaussian distributions via Manly transform (via 'ManlyMix'). See Andrews and McNicholas (2014) <doi:10.1007/s00357-013-9139-2> and Neal and McNicholas (2023) <doi:10.48550/arXiv.2305.16464>.
Implementation of the weighted iterative proportional fitting (WIPF) procedure for updating/adjusting an N-dimensional array (currently N<=3) given a weight structure and some target marginals. Acknowledgements: The author wishes to thank Ministerio de Ciencia, Innovación y Universidades (grant PID2021-128228NB-I00) and Fundación Mapfre (grant 'Modelización espacial e intra-anual de la mortalidad en España. Una herramienta automática para el cálculo de productos de vida') for supporting this research.
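For orientation, the classical (unweighted) iterative proportional fitting procedure for a two-dimensional table can be sketched in Python as below. This is not the weighted variant implemented by the package above: the weight structure is deliberately left out, and the function name `ipf_2d` and its parameters are hypothetical. The sketch assumes the row and column targets have the same total.

```python
def ipf_2d(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Classical 2-D iterative proportional fitting (no weights).

    Alternately rescales rows and columns of `seed` until the row sums
    match `row_targets` within `tol` (column sums match after each pass).
    """
    table = [[float(v) for v in row] for row in seed]
    for _ in range(max_iter):
        # Step 1: scale each row to its target sum.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                factor = target / row_sum
                table[i] = [v * factor for v in table[i]]
        # Step 2: scale each column to its target sum.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                factor = target / col_sum
                for row in table:
                    row[j] *= factor
        # Columns are exact after step 2, so convergence is checked on rows.
        row_error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if row_error < tol:
            break
    return table
```

A weighted version would modify how the adjustment in each step is distributed across cells; that part is specific to the package and is not guessed at here.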
Perform fast functional enrichment on feature lists (like genes or proteins) using the hypergeometric distribution. Tailored for speed, this package is ideal for interactive platforms such as Shiny. It supports the retrieval of functional data from sources like GO, KEGG, Reactome, Bioplanet and WikiPathways. By downloading and preparing data first, it allows for rapid successive tests on various feature selections without the need for repetitive, time-consuming preparatory steps typical of other packages.
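The hypergeometric test behind this kind of enrichment analysis can be written out in a few lines of Python. The sketch below computes the upper-tail probability of seeing at least the observed overlap between a selected feature list and a functional term. The function name and parameter names are hypothetical and do not reflect the package's interface.

```python
from math import comb

def enrichment_p_value(n_universe, n_term, n_selected, n_overlap):
    """P(X >= n_overlap) for X ~ Hypergeometric(n_universe, n_term, n_selected).

    n_universe: number of features in the background
    n_term:     number of background features annotated with the term
    n_selected: number of features in the user's selection
    n_overlap:  number of selected features annotated with the term
    """
    upper = min(n_term, n_selected)
    denominator = comb(n_universe, n_selected)
    numerator = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return numerator / denominator
```

Because the term annotations and the background size do not depend on the selection, they can be prepared once and reused across many successive tests, which is the speed argument made in the description.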
rifi analyses data from rifampicin time series created by microarray or RNAseq. rifi is a transcriptome data analysis tool for the holistic identification of transcription and decay associated processes. The decay constants and the delay of the onset of decay are fitted for each probe/bin. Subsequently, probes/bins of equal properties are combined into segments by dynamic programming, independent of an existing genome annotation. This allows the detection of transcript segments of different stability or transcriptional events within one annotated gene. In addition to the classic decay constant/half-life analysis, rifi detects processing sites, transcription pausing sites, internal transcription start sites in operons, and sites of partial transcription termination in operons; it also identifies areas of likely transcriptional interference by the collision mechanism and gives an estimate of the transcription velocity. All data are integrated to give an estimate of continuous transcriptional units, i.e. operons. Comprehensive output tables and visualizations of the full genome result and the individual fits for all probes/bins are produced.
This package provides functions to fit Accurate Generalized Linear Models (AGLM), visualize them, and predict for new data. AGLM is defined as a regularized GLM which applies a kind of feature transformation using a discretization of numerical features and specific coding methodologies of dummy variables. For more information on AGLM, see Suguru Fujita, Toyoto Tanaka, Kenji Kondo and Hirokazu Iwasawa (2020) <https://www.institutdesactuaires.com/global/gene/link.php?doc_id=16273&fg=1>.