Many of the models encountered in applications of point process methods to the study of spatio-temporal phenomena are covered in stpp'. This package provides statistical tools for analyzing the global and local second-order properties of spatio-temporal point processes, including estimators of the space-time inhomogeneous K-function and pair correlation function. It also includes tools to get static and dynamic display of spatio-temporal point patterns. See Gabriel et al (2013) <doi:10.18637/jss.v053.i02>.
Stagewise techniques implemented with Generalized Estimating Equations to handle individual, group, bi-level, and interaction selection. Stagewise approaches start with an empty model and slowly build the model over several iterations, which yields a path of candidate models from which model selection can be performed. This slow brewing approach gives stagewise techniques a unique flexibility that allows simple incorporation of Generalized Estimating Equations; see Vaughan, G., Aseltine, R., Chen, K., Yan, J., (2017) <doi:10.1111/biom.12669> for details.
Computes the optimal sample size for various 2-group designs (e.g., when comparing the means of two groups assuming equal variances, unequal variances, or comparing proportions) when the aim is to maximize the rewards over the full decision procedure of a) running a trial (with the computed sample size), and b) subsequently administering the winning treatment to the remaining N-n units in the population. Sample sizes and expected rewards for standard t- and z- tests are also provided.
Simulation of reconstructed phylogenetic trees under tree-wide time-heterogeneous birth-death processes and estimation of diversification parameters under the same model. Speciation and extinction rates can be any function of time and mass-extinction events at specific times can be provided. Trees can be simulated either conditioned on the number of species, the time of the process, or both. Additionally, the likelihood equations are implemented for convenience and can be used for Maximum Likelihood (ML) estimation and Bayesian inference.
Penalized weighted least-squares estimate for variable selection on correlated multiply imputed data and penalized estimating equations for generalized linear models with multiple imputation. Reference: Li, Y., Yang, H., Yu, H., Huang, H., Shen, Y*. (2023) "Penalized estimating equations for generalized linear models with multiple imputation", <doi:10.1214/22-AOAS1721>. Li, Y., Yang, H., Yu, H., Huang, H., Shen, Y*. (2023) "Penalized weighted least-squares estimate for variable selection on correlated multiply imputed data", <doi:10.1093/jrsssc/qlad028>.
With the development of high-throughput techniques, more and more gene expression analysis tend to replace hybridization-based microarrays with the revolutionary technology.The novel method encodes the category again by employing the rank of samples for each gene in each class. We then consider the correlation coefficient of gene and class with rank of sample and new rank of category. The highest correlation coefficient genes are considered as the feature genes which are most effective to classify the samples.
The edge package implements methods for carrying out differential expression analyses of genome-wide gene expression studies. Significance testing using the optimal discovery procedure and generalized likelihood ratio tests (equivalent to F-tests and t-tests) are implemented for general study designs. Special functions are available to facilitate the analysis of common study designs, including time course experiments. Other packages such as sva and qvalue are integrated in edge to provide a wide range of tools for gene expression analysis.
An integrated pipeline to predict the potential synthetic lethality partners (SLPs) of tumour mutations, based on gene expression, mutation profiling and cell line genetic screens data. It has builtd-in support for data from cBioPortal
. The primary SLPs correlating with muations in WT and compensating for the loss of function of mutations are predicted by random forest based methods (GENIE3) and Rank Products, respectively. Genetic screens are employed to identfy consensus SLPs leads to reduced cell viability when perturbed.
Computes 26 financial risk measures for any continuous distribution. The 26 financial risk measures include value at risk, expected shortfall due to Artzner et al. (1999) <DOI:10.1007/s10957-011-9968-2>, tail conditional median due to Kou et al. (2013) <DOI:10.1287/moor.1120.0577>, expectiles due to Newey and Powell (1987) <DOI:10.2307/1911031>, beyond value at risk due to Longin (2001) <DOI:10.3905/jod.2001.319161>, expected proportional shortfall due to Belzunce et al. (2012) <DOI:10.1016/j.insmatheco.2012.05.003>, elementary risk measure due to Ahmadi-Javid (2012) <DOI:10.1007/s10957-011-9968-2>, omega due to Shadwick and Keating (2002), sortino ratio due to Rollinger and Hoffman (2013), kappa due to Kaplan and Knowles (2004), Wang (1998)'s <DOI:10.1080/10920277.1998.10595708> risk measures, Stone (1973)'s <DOI:10.2307/2978638> risk measures, Luce (1980)'s <DOI:10.1007/BF00135033> risk measures, Sarin (1987)'s <DOI:10.1007/BF00126387> risk measures, Bronshtein and Kurelenkova (2009)'s risk measures.
The output gap indicates the percentage difference between the actual output of an economy and its potential. Since potential output is a latent process, the estimation of the output gap poses a challenge and numerous filtering techniques have been proposed. RGAP facilitates the estimation of a Cobb-Douglas production function type output gap, as suggested by the European Commission (Havik et al. 2014) <https://ideas.repec.org/p/euf/ecopap/0535.html>. To that end, the non-accelerating wage rate of unemployment (NAWRU) and the trend of total factor productivity (TFP) can be estimated in two bivariate unobserved component models by means of Kalman filtering and smoothing. RGAP features a flexible modeling framework for the appropriate state-space models and offers frequentist as well as Bayesian estimation techniques. Additional functionalities include direct access to the AMECO <https://economy-finance.ec.europa.eu/economic-research-and-databases/economic-databases/ameco-database_en> database and automated model selection procedures. See the paper by Streicher (2022) <http://hdl.handle.net/20.500.11850/552089> for details.
This is an implementation of design methods for binomial reliability demonstration tests (BRDTs) with failure count data. The acceptance decision uncertainty of BRDT has been quantified and the impacts of the uncertainty on related reliability assurance activities such as reliability growth (RG) and warranty services (WS) are evaluated. This package is associated with the work from the published paper "Optimal Binomial Reliability Demonstration Tests Design under Acceptance Decision Uncertainty" by Suiyao Chen et al. (2020) <doi:10.1080/08982112.2020.1757703>.
This package provides a self-contained set of methods to aid clinical trial safety investigators, statisticians and researchers, in the early detection of adverse events using groupings by body-system or system organ class. This work was supported by the Engineering and Physical Sciences Research Council (UK) (EPSRC) [award reference 1521741] and Frontier Science (Scotland) Ltd. The package title c212 is in reference to the original Engineering and Physical Sciences Research Council (UK) funded project which was named CASE 2/12.
This package provides a comprehensive framework in R for modeling and forecasting economic scenarios based on multi-level dynamic factor model. The package enables users to: (i) extract global and block-specific factors using a flexible multilevel factor structure; (ii) compute asymptotically valid confidence regions for the estimated factors, accounting for uncertainty in the factor loadings; (iii) estimate factor-augmented quantile regressions; (iv) recover full predictive densities from these quantile forecasts; and (v) estimate the density when the factors are stressed.
Neural network has potential in forestry modelling. This package is designed to create and assess Artificial Intelligence based Neural Networks with varying architectures for prediction of volume of forest trees using two input features: height and diameter at breast height, as they are the key factors in predicting volume, therefore development and validation of efficient volume prediction neural network model is necessary. This package has been developed using the algorithm of Tabassum et al. (2022) <doi:10.18805/ag.D-5555>.
Ing and Lai (2011) <doi:10.5705/ss.2010.081> proposed a high-dimensional model selection procedure that comprises three steps: orthogonal greedy algorithm (OGA), high-dimensional information criterion (HDIC), and Trim. The first two steps, OGA and HDIC, are used to sequentially select input variables and determine stopping rules, respectively. The third step, Trim, is used to delete irrelevant variables remaining in the second step. This package aims at fitting a high-dimensional linear regression model via OGA+HDIC+Trim.
This package provides a set of functions is provided for 1) the stratum lengths analysis along a chosen direction, 2) fast estimation of continuous lag spatial Markov chains model parameters and probability computing (also for large data sets), 3) transition probability maps and transiograms drawing, 4) simulation methods for categorical random fields. More details on the methodology are discussed in Sartore (2013) <doi:10.32614/RJ-2013-022> and Sartore et al. (2016) <doi:10.1016/j.cageo.2016.06.001>.
Fits the regularization path of regression models (linear and logistic) with additively combined penalty terms. All possible combinations with Least Absolute Shrinkage and Selection Operator (LASSO), Smoothly Clipped Absolute Deviation (SCAD), Minimax Concave Penalty (MCP) and Exponential Penalty (EP) are supported. This includes Sparse Group LASSO (SGL), Sparse Group SCAD (SGS), Sparse Group MCP (SGM) and Sparse Group EP (SGE). For more information, see Buch, G., Schulz, A., Schmidtmann, I., Strauch, K., & Wild, P. S. (2024) <doi:10.1002/bimj.202200334>.
Fit Bayesian hierarchical models of animal abundance and occurrence via the rstan package, the R interface to the Stan C++ library. Supported models include single-season occupancy, dynamic occupancy, and N-mixture abundance models. Covariates on model parameters are specified using a formula-based interface similar to package unmarked', while also allowing for estimation of random slope and intercept terms. References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
This package provides support for transformations of numeric aggregates between statistical classifications (e.g. occupation or industry categorisations) using the Crossmaps framework. Implements classes for representing transformations between a source and target classification as graph structures, and methods for validating and applying crossmaps to transform data collected under the source classification into data indexed using the target classification codes. Documentation about the Crossmaps framework is provided in the included vignettes and in Huang (2024, <doi:10.48550/arXiv.2406.14163>
).
DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for a given region set with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for the region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of region sets to be leveraged for novel insight into the regulatory state of DNA methylation datasets.
This package provides a scale based normalization (SCBN) method to identify genes with differential expression between different species. It takes into account the available knowledge of conserved orthologous genes and the hypothesis testing framework to detect differentially expressed orthologous genes. The method on this package are described in the article A statistical normalization method and differential expression analysis for RNA-seq data between different species by Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang (2018, pending publication).
This package provides the ASUS procedure for estimating a high dimensional sparse parameter in the presence of auxiliary data that encode side information on sparsity. It is a robust data combination procedure in the sense that even when pooling non-informative auxiliary data ASUS would be at least as efficient as competing soft thresholding based methods that do not use auxiliary data. For more information, please see the paper Adaptive Sparse Estimation with Side Information by Banerjee, Mukherjee and Sun (JASA 2020).
This package provides a comprehensive approach for identifying and estimating change points in multivariate time series through various statistical methods. Implements the multiple change point detection methodology from Ryan & Killick (2023) <doi:10.1080/00401706.2023.2183261> and a novel estimation methodology from Fotopoulos et al. (2023) <doi:10.1007/s00362-023-01495-0> generalized to fit the detection methodologies. Performs both detection and estimation of change points, providing visualization and summary information of the estimation process for each detected change point.
This package provides a general, flexible framework for estimating parameters and empirical sandwich variance estimator from a set of unbiased estimating equations (i.e., M-estimation in the vein of Stefanski & Boos (2002) <doi:10.1198/000313002753631330>). All examples from Stefanski & Boos (2002) are published in the corresponding Journal of Statistical Software paper "The Calculus of M-Estimation in R with geex" by Saul & Hudgens (2020) <doi:10.18637/jss.v092.i02>. Also provides an API to compute finite-sample variance corrections.