Enables the user to calculate Value at Risk (VaR) and Expected Shortfall (ES) by means of various parametric and semiparametric GARCH-type models. For the latter the estimation of the nonparametric scale function is carried out by means of a data-driven smoothing approach. Model quality, in terms of forecasting VaR and ES, can be assessed by means of various backtesting methods such as the traffic light test for VaR and a newly developed traffic light test for ES. The approaches implemented in this package are described in e.g. Feng Y., Beran J., Letmathe S. and Ghosh S. (2020) <https://ideas.repec.org/p/pdn/ciepap/137.html> as well as Letmathe S., Feng Y. and Uhde A. (2021) <https://ideas.repec.org/p/pdn/ciepap/141.html>.
This package provides a novel PRS model is introduced to enhance the prediction accuracy by utilising GxE effects. This package performs Genome Wide Association Studies (GWAS) and Genome Wide Environment Interaction Studies (GWEIS) using a discovery dataset. The package has the ability to obtain polygenic risk scores (PRSs) for a target sample. Finally it predicts the risk values of each individual in the target sample. Users have the choice of using existing models (Li et al., 2015) <doi:10.1093/annonc/mdu565>, (Pandis et al., 2013) <doi:10.1093/ejo/cjt054>, (Peyrot et al., 2018) <doi:10.1016/j.biopsych.2017.09.009> and (Song et al., 2022) <doi:10.1038/s41467-022-32407-9>, as well as newly proposed models for genomic risk prediction (refer to the URL for more details).
This resource provides tools to create, compare, and post-process spatial isotope assignment models of animal origin. It generates probability-of-origin maps for individuals based on user-provided tissue and environment isotope values (e.g., as generated by IsoMAP, Bowen et al. [2013] <doi:10.1111/2041-210X.12147>) using the framework established in Bowen et al. (2010) <doi:10.1146/annurev-earth-040809-152429>). The package isocat can then quantitatively compare and cluster these maps to group individuals by similar origin. It also includes techniques for applying four approaches (cumulative sum, odds ratio, quantile only, and quantile simulation) with which users can summarize geographic origins and probable distance traveled by individuals. Campbell et al. [2020] establishes several of the functions included in this package <doi:10.1515/ami-2020-0004>.
Intended to create standard human-in-the-loop validity tests for typical automated content analysis such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. This package provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021) <doi:10.1017/pan.2021.33> tests. This package also provides functions for generating gold-standard data which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.
This package contains a function that imports data from a CSV file, or uses manually entered data from the format (x, y, weight) and plots the appropriate ACC vs LOI graph and LMA graph. The main function is plotLMA (source file, header) that takes a data set and plots the appropriate LMA and ACC graphs. If no source file (a string) was passed, a manual data entry window is opened. The header parameter indicates by TRUE/FALSE (false by default) if the source CSV file has a header row or not. The dataset should contain only one independent variable (x) and one dependent variable (y) and can contain a weight for each observation.
This package covers many important models used in marketing and micro-econometrics applications, including Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity, and Bayesian Analysis of Aggregate Random Coefficient Logit Models.
Carries out model-based clustering, classification and discriminant analysis using five different models. The models are all based on the generalized hyperbolic distribution. The first model MGHD (Browne and McNicholas (2015) <doi:10.1002/cjs.11246>) is the classical mixture of generalized hyperbolic distributions. The MGHFA (Tortora et al. (2016) <doi:10.1007/s11634-015-0204-z>) is the mixture of generalized hyperbolic factor analyzers for high dimensional data sets. The MSGHD is the mixture of multiple scaled generalized hyperbolic distributions, the cMSGHD is a MSGHD with convex contour plots and the MCGHD', mixture of coalesced generalized hyperbolic distributions is a new more flexible model (Tortora et al. (2019)<doi:10.1007/s00357-019-09319-3>. The paper related to the software can be found at <doi:10.18637/jss.v098.i03>.
In the recent past, measurement of coverage has been mainly through two-stage cluster sampled surveys either as part of a nutrition assessment or through a specific coverage survey known as Centric Systematic Area Sampling (CSAS). However, such methods are resource intensive and often only used for final programme evaluation meaning results arrive too late for programme adaptation. SLEAC, which stands for Simplified Lot Quality Assurance Sampling Evaluation of Access and Coverage, is a low resource method designed specifically to address this limitation and is used regularly for monitoring, planning and importantly, timely improvement to programme quality, both for agency and Ministry of Health (MoH) led programmes. SLEAC is designed to complement the Semi-quantitative Evaluation of Access and Coverage (SQUEAC) method. This package provides functions for use in conducting a SLEAC assessment.
This package provides a computational method that infers copy number variations (CNV) in cancer scRNA-seq data and reconstructs the tumor phylogeny. It integrates signals from gene expression, allelic ratio, and population haplotype structures to accurately infer allele-specific CNVs in single cells and reconstruct their lineage relationship. It does not require tumor/normal-paired DNA or genotype data, but operates solely on the donor scRNA-data data (for example, 10x Cell Ranger output). It can be used to:
detect allele-specific copy number variations from single-cells
differentiate tumor versus normal cells in the tumor microenvironment
infer the clonal architecture and evolutionary history of profiled tumors
For details on the method see Gao et al in Nature Biotechnology 2022.
BUSseq R package fits an interpretable Bayesian hierarchical model---the Batch Effects Correction with Unknown Subtypes for scRNA seq Data (BUSseq)---to correct batch effects in the presence of unknown cell types. BUSseq is able to simultaneously correct batch effects, clusters cell types, and takes care of the count data nature, the overdispersion, the dropout events, and the cell-specific sequencing depth of scRNA-seq data. After correcting the batch effects with BUSseq, the corrected value can be used for downstream analysis as if all cells were sequenced in a single batch. BUSseq can integrate read count matrices obtained from different scRNA-seq platforms and allow cell types to be measured in some but not all of the batches as long as the experimental design fulfills the conditions listed in our manuscript.
This package provides a set of functions to access the ARDECO (Annual Regional Database of the European Commission) data directly from the official ARDECO public repository through the exploitation of the ARDECO APIs. The APIs are completely transparent to the user and the provided functions provide a direct access to the ARDECO data. The ARDECO database is a collection of variables related to demography, employment, labour market, domestic product, capital formation. Each variable can be exposed in one or more units of measure as well as refers to total values plus additional dimensions like economic sectors, gender, age classes. Data can be also aggregated at country level according to the tercet classes as defined by EUROSTAT. The description of the ARDECO database can be found at the following URL <https://territorial.ec.europa.eu/ardeco>.
An implementation of common statistical analysis and models with differential privacy (Dwork et al., 2006a) <doi:10.1007/11681878_14> guarantees. The package contains, for example, functions providing differentially private computations of mean, variance, median, histograms, and contingency tables. It also implements some statistical models and machine learning algorithms such as linear regression (Kifer et al., 2012) <https://proceedings.mlr.press/v23/kifer12.html> and SVM (Chaudhuri et al., 2011) <https://jmlr.org/papers/v12/chaudhuri11a.html>. In addition, it implements some popular randomization mechanisms, including the Laplace mechanism (Dwork et al., 2006a) <doi:10.1007/11681878_14>, Gaussian mechanism (Dwork et al., 2006b) <doi:10.1007/11761679_29>, analytic Gaussian mechanism (Balle & Wang, 2018) <https://proceedings.mlr.press/v80/balle18a.html>, and exponential mechanism (McSherry & Talwar, 2007) <doi:10.1109/FOCS.2007.66>.
Post-construction fatality monitoring studies at wind facilities are based on data from searches for bird and bat carcasses in plots beneath turbines. Bird and bat carcasses can fall outside of the search plot. Bird and bat carcasses from wind turbines often fall outside of the searched area. To compensate, area correction (AC) estimations are calculated to estimate the percentage of fatalities that fall within the searched area versus those that fall outside of it. This package provides two likelihood based methods and one physics based method (Hull and Muir (2010) <doi:10.1080/14486563.2010.9725253>, Huso and Dalthorp (2014) <doi:10.1002/jwmg.663>) to estimate the carcass fall distribution. There are also functions for calculating the proportion of area searched within one unit annuli, log logistic distribution functions, and truncated distribution functions.
MDS is a statistic tool for reduction of dimensionality, using as input a distance matrix of dimensions n à n. When n is large, classical algorithms suffer from computational problems and MDS configuration can not be obtained. With this package, we address these problems by means of six algorithms, being two of them original proposals: - Landmark MDS proposed by De Silva V. and JB. Tenenbaum (2004). - Interpolation MDS proposed by Delicado P. and C. Pachón-Garcà a (2021) <arXiv:2007.11919> (original proposal). - Reduced MDS proposed by Paradis E (2018). - Pivot MDS proposed by Brandes U. and C. Pich (2007) - Divide-and-conquer MDS proposed by Delicado P. and C. Pachón-Garcà a (2021) <arXiv:2007.11919> (original proposal). - Fast MDS, proposed by Yang, T., J. Liu, L. McMillan and W. Wang (2006).
This package implements the conditionally symmetric multidimensional Gaussian mixture model (csmGmm) for large-scale testing of composite null hypotheses in genetic association applications such as mediation analysis, pleiotropy analysis, and replication analysis. In such analyses, we typically have J sets of K test statistics where K is a small number (e.g. 2 or 3) and J is large (e.g. 1 million). For each one of the J sets, we want to know if we can reject all K individual nulls. Please see the vignette for a quickstart guide. The paper describing these methods is "Testing a Large Number of Composite Null Hypotheses Using Conditionally Symmetric Multidimensional Gaussian Mixtures in Genome-Wide Studies" by Sun R, McCaw Z, & Lin X (Journal of the American Statistical Association 2025, <doi:10.1080/01621459.2024.2422124>).
Genome-wide association studies (GWAS) have been widely used for identifying common variants associated with complex diseases. Due to the small effect sizes of common variants, the power to detect individual risk variants is generally low. Complementary to SNP-level analysis, a variety of gene-based association tests have been proposed. However, the power of existing gene-based tests is often dependent on the underlying genetic models, and it is not known a priori which test is optimal. Here we proposed COMBined Association Test (COMBAT) to incorporate strengths from multiple existing gene-based tests, including VEGAS, GATES and simpleM. Compared to individual tests, COMBAT shows higher overall performance and robustness across a wide range of genetic models. The algorithm behind this method is described in Wang et al (2017) <doi:10.1534/genetics.117.300257>.
Chaos theory has been hailed as a revolution of thoughts and attracting ever increasing attention of many scientists from diverse disciplines. Chaotic systems are nonlinear deterministic dynamic systems which can behave like an erratic and apparently random motion. A relevant field inside chaos theory and nonlinear time series analysis is the detection of a chaotic behaviour from empirical time series data. One of the main features of chaos is the well known initial value sensitivity property. Methods and techniques related to test the hypothesis of chaos try to quantify the initial value sensitive property estimating the Lyapunov exponents. The DChaos package provides different useful tools and efficient algorithms which test robustly the hypothesis of chaos based on the Lyapunov exponent in order to know if the data generating process behind time series behave chaotically or not.
Reliability Analysis and Maintenance Optimization using Hidden Markov Models (HMM). The use of HMMs to model the state of a system which is not directly observable and instead certain indicators (signals) of the true situation are provided via a control system. A hidden model can provide key information about the system dependability, such as the reliability of the system and related measures. An estimation procedure is implemented based on the Baum-Welch algorithm. Classical structures such as K-out-of-N systems and Shock models are illustrated. Finally, the maintenance of the system is considered in the HMM context and two functions for new preventive maintenance strategies are considered. Maintenance efficiency is measured in terms of expected cost. Methods are described in Gamiz, Limnios, and Segovia-Garcia (2023) <doi:10.1016/j.ejor.2022.05.006>.
The method of synthetic controls is a widely-adopted tool for evaluating causal effects of policy changes in settings with observational data. In many settings where it is applicable, researchers want to identify causal effects of policy changes on a treated unit at an aggregate level while having access to data at a finer granularity. This package implements a simple extension of the synthetic controls estimator, developed in Gunsilius (2023) <doi:10.3982/ECTA18260>, that takes advantage of this additional structure and provides nonparametric estimates of the heterogeneity within the aggregate unit. The idea is to replicate the quantile function associated with the treated unit by a weighted average of quantile functions of the control units. The package contains tools for aggregating and plotting the resulting distributional estimates, as well as for carrying out inference on them.
HIGHT(HIGh security and light weigHT) algorithm is a block cipher encryption algorithm developed to provide confidentiality in computing environments that demand low power consumption and lightweight, such as RFID(Radio-Frequency Identification) and USN(Ubiquitous Sensor Network), or in mobile environments that require low power consumption and lightweight, such as smartphones and smart cards. Additionally, it is designed with a simple structure that enables it to be used with basic arithmetic operations, XOR, and circular shifts in 8-bit units. This algorithm was designed to consider both safety and efficiency in a very simple structure suitable for limited environments, compared to the former 128-bit encryption algorithm SEED. In December 2010, it became an ISO(International Organization for Standardization) standard. The detailed procedure is described in Hong et al. (2006) <doi:10.1007/11894063_4>.
This package provides functions for analysing eye tracking data, including event detection, visualizations and area of interest (AOI) based analyses. The package includes implementations of the IV-T, I-DT, adaptive velocity threshold, and Identification by two means clustering (I2MC) algorithms. See separate documentation for each function. The principles underlying I-VT and I-DT algorithms are described in Salvucci & Goldberg (2000) <doi:10.1145/355017.355028>. Two-means clustering is described in Hessels et al. (2017), <doi: 10.3758/s13428-016-0822-1>. The adaptive velocity threshold algorithm is described in Nyström & Holmqvist (2010),<doi:10.3758/BRM.42.1.188>. A documentation of the kollaR can be found in Kleberg et al (2026) <doi:10.3758/s13428-025-02903-z>. Cite this paper when using kollaR See a demonstration in the URL.
Create and use data frame labels for data frame objects (frame labels), their columns (name labels), and individual values of a column (value labels). Value labels include one-to-one and many-to-one labels for nominal and ordinal variables, as well as numerical range-based value labels for continuous variables. Convert value-labeled variables so each value is replaced by its corresponding value label. Add values-converted-to-labels columns to a value-labeled data frame while preserving parent columns. Filter and subset a value-labeled data frame using labels, while returning results in terms of values. Overlay labels in place of values in common R commands to increase interpretability. Generate tables of value frequencies, with categories expressed as raw values or as labels. Access data frames that show value-to-label mappings for easy reference.
Allows users to calculate pairwise Nei's Genetic Distances (Nei 1972), pairwise Fixation Indexes (Fst) (Weir & Cockerham 1984) and also Genomic Relationship matrixes following Yang et al. (2010) in mixed and single ploidy populations. Bootstrapping across loci is implemented during Fst calculation to generate confidence intervals and p-values around pairwise Fst values. StAMPP utilises SNP genotype data of any ploidy level (with the ability to handle missing data) and is coded to utilise multithreading where available to allow efficient analysis of large datasets. StAMPP is able to handle genotype data from genlight objects allowing integration with other packages such adegenet. Please refer to LW Pembleton, NOI Cogan & JW Forster, 2013, Molecular Ecology Resources, 13(5), 946-952. <doi:10.1111/1755-0998.12129> for the appropriate citation and user manual. Thank you in advance.
Analyzing Inductively Coupled Plasma - Mass Spectrometry (ICP-MS) measurement data to evaluate isotope ratios (IRs) is a complex process. The IsoCor package facilitates this process and renders it reproducible by providing a function to run a Shiny'-App locally in any web browser. In this App the user can upload data files of various formats, select ion traces, apply peak detection and perform calculation of IRs and delta values. Results are provided as figures and tables and can be exported. The App, therefore, facilitates data processing of ICP-MS experiments to quickly obtain optimal processing parameters compared to traditional Excel worksheet based approaches. A more detailed description can be found in the corresponding article <doi:10.1039/D2JA00208F>. The most recent version of IsoCor can be tested online at <https://apps.bam.de/shn00/IsoCor/>.