The new yield tables developed by the Northwest German Forest Research Institute (NW-FVA) provide a forest management tool for the five main commercial tree species in northwestern Germany: oak, beech, spruce, Douglas-fir and pine. The new method for deriving the yield tables combines measurements from growth and yield trials with growth simulations using a state-of-the-art single-tree growth simulator. As a result, the new yield tables reflect the current increment level, with the recommended graduated thinning from above as the underlying management concept. The yield tables are provided along with methods for deriving the site index, for interpolating between ages and site indices, and for extrapolating beyond the age and site index ranges. The inter-/extrapolations are performed either traditionally by the rule of proportion or with a functional approach.
Interpretation methods for analyzing the behavior and individual predictions of modern neural networks in a three-step procedure: converting the model, running the interpretation method, and visualizing the results. Implemented methods are, e.g., 'Connection Weights' described by Olden et al. (2004) <doi:10.1016/j.ecolmodel.2004.03.013>, layer-wise relevance propagation ('LRP') described by Bach et al. (2015) <doi:10.1371/journal.pone.0130140>, deep learning important features ('DeepLIFT') described by Shrikumar et al. (2017) <doi:10.48550/arXiv.1704.02685> and gradient-based methods like 'SmoothGrad' described by Smilkov et al. (2017) <doi:10.48550/arXiv.1706.03825>, 'Gradient x Input' or 'Vanilla Gradient'. Details can be found in the accompanying scientific paper: Koenen & Wright (2024, Journal of Statistical Software, <doi:10.18637/jss.v111.i08>).
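The three gradient-based methods named above can be illustrated on a toy model. The following is a minimal Python sketch, not the package's interface: the one-layer logistic model and the function names `predict`, `vanilla_gradient`, `gradient_x_input` and `smoothgrad` are assumptions chosen so that the gradient has a closed form.

```python
import math
import random

def predict(w, b, x):
    """Toy model (assumption): logistic regression, sigmoid(w . x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def vanilla_gradient(w, b, x):
    """Gradient of the output with respect to each input feature."""
    p = predict(w, b, x)
    return [p * (1.0 - p) * wi for wi in w]

def gradient_x_input(w, b, x):
    """Element-wise product of the gradient and the input."""
    return [g * xi for g, xi in zip(vanilla_gradient(w, b, x), x)]

def smoothgrad(w, b, x, noise_sd=0.1, n_samples=50, seed=0):
    """Average the gradient over Gaussian-perturbed copies of the input."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, noise_sd) for xi in x]
        grad = vanilla_gradient(w, b, noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]
```

For a real network the gradient would come from automatic differentiation rather than a closed form; the averaging and multiplication steps stay the same.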
Classical tests of goodness-of-fit aim to validate the conformity of a postulated model to the data under study. In their standard formulation, however, they do not allow exploring how the hypothesized model deviates from the truth, nor do they provide any insight into how the rejected model could be improved to better fit the data. To overcome these shortcomings, we establish a comprehensive framework for goodness-of-fit which naturally integrates modeling, estimation, inference and graphics. In this package, deviance tests and comparison density plots are used to conduct LP smoothed inference, where the letter L denotes nonparametric methods based on quantiles and P stands for polynomials. Simulation methods are used to perform variance estimation, inference and post-selection adjustments. See Algeri S. and Zhang X. (2020) <arXiv:2005.13011>.
We present a novel statistical framework for identifying differential distributions in single-cell RNA-sequencing (scRNA-seq) data between treatment conditions by modeling gene expression read counts using generalized linear models (GLMs). We model each gene independently under each treatment condition using the error distributions Poisson (P), Negative Binomial (NB), Zero-inflated Poisson (ZIP) and Zero-inflated Negative Binomial (ZINB) with log link function and model-based normalization for differences in sequencing depth. Since all four distributions considered in our framework belong to the same family of distributions, we first perform a Kolmogorov-Smirnov (KS) test to select genes belonging to the family of ZINB distributions. Genes passing the KS test are then modeled using GLMs. Model selection is done by calculating the Bayesian Information Criterion (BIC) and the likelihood ratio test (LRT) statistic.
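The model-selection step can be sketched for the simplest case. The Python example below is an illustration under strong assumptions: it uses only the Poisson distribution, ignores sequencing-depth normalization, and the helper names (`poisson_loglik`, `bic`, `compare_conditions`) are hypothetical. It contrasts a null model with one shared rate against an alternative with one rate per condition.

```python
import math

def poisson_loglik(counts, rate):
    """Poisson log-likelihood of the counts at a given rate (rate > 0)."""
    return sum(c * math.log(rate) - rate - math.lgamma(c + 1) for c in counts)

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * loglik

def compare_conditions(control, treated):
    """LRT statistic and BIC for a shared-rate vs. per-condition Poisson fit."""
    pooled = list(control) + list(treated)
    n = len(pooled)
    # Null model: a single rate, estimated by the pooled mean (the MLE).
    ll_null = poisson_loglik(pooled, sum(pooled) / n)
    # Alternative model: one rate per condition, each estimated by its mean.
    ll_alt = (poisson_loglik(control, sum(control) / len(control))
              + poisson_loglik(treated, sum(treated) / len(treated)))
    return {
        "lrt": 2.0 * (ll_alt - ll_null),
        "bic_null": bic(ll_null, 1, n),
        "bic_alt": bic(ll_alt, 2, n),
    }

result = compare_conditions([0, 1, 0, 1], [8, 9, 10, 11])
```

The LRT statistic would be compared with a chi-squared reference distribution (one degree of freedom here), and the model with the lower BIC would be preferred. The sketch assumes every group has a positive mean count.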
Principal Component Analysis (PCA) extracts the fundamental structure of the data without the need to build any model to represent it. This "summary" of the data is arrived at through a process of reduction that can transform the large number of variables into a lesser number that are uncorrelated (i.e. the 'principal components'), while at the same time being capable of easy interpretation on the original data. PCAtools provides functions for data exploration via PCA, and allows the user to generate publication-ready figures. PCA is performed via BiocSingular; users can also identify an optimal number of principal components via different metrics, such as the elbow method and Horn's parallel analysis, which has relevance for data reduction in single-cell RNA-seq (scRNA-seq) and high dimensional mass cytometry data.
This package implements fast and exact computation of Gaussian stochastic processes with the Matern kernel using the forward filtering and backward smoothing algorithm. It includes efficient implementations of the inverse Kalman filter, with applications such as estimating particle interaction functions. These tools support models with or without noise. Additionally, the package offers algorithms for fast parameter estimation in latent factor models, where the factor loading matrix is orthogonal, and latent processes are modeled by Gaussian processes. See the references: 1) Mengyang Gu and Yanxun Xu (2020), Journal of Computational and Graphical Statistics; 2) Xinyi Fang and Mengyang Gu (2024), <doi:10.48550/arXiv.2407.10089>; 3) Mengyang Gu and Weining Shen (2020), Journal of Machine Learning Research; 4) Yizi Lin, Xubo Liu, Paul Segall and Mengyang Gu (2025), <doi:10.48550/arXiv.2501.01324>.
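To show why forward filtering makes the computation both fast and exact, here is a minimal Python sketch for the simplest member of the Matern family only: roughness 1/2, i.e. the exponential kernel, whose state-space form is a scalar autoregression. The function names and parameterization are assumptions for illustration and do not reflect the package's interface.

```python
import math

def filter_loglik(times, obs, variance, length_scale, noise_var):
    """Exact log-likelihood in O(n) via a scalar Kalman forward filter.

    Assumes a zero-mean process with kernel
    variance * exp(-|t - s| / length_scale) and sorted observation times.
    """
    mean, cov = 0.0, variance          # stationary prior for the first state
    loglik, prev_t = 0.0, None
    for t, y in zip(times, obs):
        if prev_t is not None:
            rho = math.exp(-(t - prev_t) / length_scale)
            mean = rho * mean                                   # predict
            cov = rho * rho * cov + variance * (1.0 - rho * rho)
        innov_var = cov + noise_var
        resid = y - mean
        loglik -= 0.5 * (math.log(2.0 * math.pi * innov_var)
                         + resid * resid / innov_var)
        gain = cov / innov_var                                  # update
        mean += gain * resid
        cov *= 1.0 - gain
        prev_t = t
    return loglik

def dense_loglik_two_points(times, obs, variance, length_scale, noise_var):
    """Reference value from the explicit 2 x 2 covariance matrix."""
    k = variance * math.exp(-abs(times[1] - times[0]) / length_scale)
    d = variance + noise_var
    det = d * d - k * k
    y0, y1 = obs
    quad = (d * y0 * y0 - 2.0 * k * y0 * y1 + d * y1 * y1) / det
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + quad)
```

The filter visits each observation once, whereas the dense formula generalizes to a cubic-cost matrix factorization. Setting `noise_var` to zero gives the noise-free case.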
Facilitates estimation of full univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation (bivariate) using Hermite series based estimators. These estimators are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. Based on: Stephanou, Michael, Varughese, Melvin and Macdonald, Iain. "Sequential quantiles via Hermite series density estimation." Electronic Journal of Statistics 11.1 (2017): 570-607 <doi:10.1214/17-EJS1245>, Stephanou, Michael and Varughese, Melvin. "On the properties of Hermite series based distribution function estimators." Metrika (2020) <doi:10.1007/s00184-020-00785-z> and Stephanou, Michael and Varughese, Melvin. "Sequential estimation of Spearman rank correlation using Hermite series estimators." Journal of Multivariate Analysis (2021) <doi:10.1016/j.jmva.2021.104783>.
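The one-pass idea behind such estimators can be sketched briefly. The Python code below is a simplified illustration, not the package's implementation: it keeps a running mean of each normalized Hermite function evaluated at the incoming observations and evaluates the truncated series as a density estimate. The class name, the default truncation order and the absence of any data standardization are assumptions.

```python
import math

def hermite_functions(x, order):
    """Normalized Hermite functions h_0..h_order at x (three-term recurrence)."""
    h = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if order >= 1:
        h.append(math.sqrt(2.0) * x * h[0])
    for k in range(1, order):
        h.append(math.sqrt(2.0 / (k + 1)) * x * h[k]
                 - math.sqrt(k / (k + 1)) * h[k - 1])
    return h

class SequentialHermiteDensity:
    """Keeps only order + 1 coefficients, updated once per observation."""

    def __init__(self, order=10):
        self.order = order
        self.n = 0
        self.coef = [0.0] * (order + 1)

    def update(self, x):
        # Running mean of h_k(x) over all observations seen so far.
        self.n += 1
        for k, hk in enumerate(hermite_functions(x, self.order)):
            self.coef[k] += (hk - self.coef[k]) / self.n

    def density(self, x):
        # Truncated Hermite series evaluated at x.
        basis = hermite_functions(x, self.order)
        return sum(a * hk for a, hk in zip(self.coef, basis))
```

Because the state is a fixed-length coefficient vector, memory use does not grow with the number of observations, which is what makes the approach suitable for streaming data.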
DepInfeR integrates two experimentally accessible input data matrices: the drug sensitivity profiles of cancer cell lines or primary tumors ex-vivo (X), and the drug affinities of a set of proteins (Y), to infer a matrix of molecular protein dependencies of the cancers (β). DepInfeR deconvolutes the protein inhibition effect on the viability phenotype by using regularized multivariate linear regression. It assigns a “dependence coefficient” to each protein and each sample, and can therefore be used to gain a causal and accurate understanding of functional consequences of genomic aberrations in a heterogeneous disease, as well as to guide the choice of pharmacological intervention for a specific cancer type, sub-type, or an individual patient. For more information, please read our preprint on bioRxiv: https://doi.org/10.1101/2022.01.11.475864.
Age-Period-Cohort (APC) analyses are used to differentiate relevant drivers for long-term developments. The APCtools package offers visualization techniques and general routines to simplify the workflow of an APC analysis. Sophisticated functions are available both for descriptive and regression model-based analyses. For the former, we use density (or ridgeline) matrices and (hexagonally binned) heatmaps as innovative visualization techniques building on the concept of Lexis diagrams. Model-based analyses build on the separation of the temporal dimensions based on generalized additive models, where a tensor product interaction surface (usually between age and period) is utilized to represent the third dimension (usually cohort) on its diagonal. Such tensor product surfaces can also be estimated while accounting for further covariates in the regression model. See Weigert et al. (2021) <doi:10.1177/1354816620987198> for methodological details.
This package provides a collection of datasets and simplified functions for an introductory (geo)statistics module at University College London. Provides functionality for compositional, directional and spatial data, including ternary diagrams, Wulff and Schmidt stereonets, and ordinary kriging interpolation. Implements logistic and (additive and centred) logratio transformations. Computes vector averages and concentration parameters for the von Mises distribution. Includes a collection of natural and synthetic fractals, and a simulator for deterministic chaos using a magnetic pendulum example. The main purpose of these functions is pedagogical. Researchers can find more complete alternatives for these tools in other packages such as 'compositions', 'robCompositions', 'sp', 'gstat' and 'RFOC'. All the functions are written in plain R, with no compiled code and a minimal number of dependencies. Theoretical background and worked examples are available at <https://tinyurl.com/UCLgeostats/>.
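The additive and centred logratio transformations mentioned above are short enough to write out. This Python sketch uses the standard textbook definitions; the function names are chosen for illustration and are not the package's own.

```python
import math

def clr(composition):
    """Centred logratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    centre = sum(logs) / len(logs)
    return [v - centre for v in logs]

def alr(composition):
    """Additive logratio: log of each part relative to the last part."""
    reference = composition[-1]
    return [math.log(p / reference) for p in composition[:-1]]

def inverse_clr(coordinates):
    """Map clr coordinates back to a composition that sums to one."""
    expd = [math.exp(c) for c in coordinates]
    total = sum(expd)
    return [e / total for e in expd]

sand_silt_clay = [0.2, 0.3, 0.5]   # illustrative composition, not real data
coords = clr(sand_silt_clay)
```

Both transformations require strictly positive parts. The clr coordinates always sum to zero, while the alr drops one dimension by using the last part as the denominator.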
Given two unbiased samples of patient level data on cost and effectiveness for a pair of treatments, make head-to-head treatment comparisons by (i) generating the bivariate bootstrap resampling distribution of ICE uncertainty for a specified value of the shadow price of health, lambda, (ii) forming the wedge-shaped ICE confidence region with specified confidence fraction within [0.50, 0.99] that is equivariant with respect to changes in lambda, (iii) coloring the bootstrap outcomes within the above confidence wedge with economic preferences from an ICE map with specified values of lambda, beta and gamma parameters, (iv) displaying VAGR and ALICE acceptability curves, and (v) illustrating variation in ICE preferences by displaying potentially non-linear indifference (iso-preference) curves from an ICE map with specified values of lambda, beta and either gamma or eta parameters.
Biodiversity is a multifaceted concept covering different levels of organization from genes to ecosystems. iNEXT.3D extends iNEXT to include three dimensions (3D) of biodiversity, i.e., taxonomic diversity (TD), phylogenetic diversity (PD) and functional diversity (FD). This package provides functions to compute standardized 3D diversity estimates with a common sample size or sample coverage. A unified framework based on Hill numbers and their generalizations (Hill-Chao numbers) is used to quantify 3D. All 3D estimates are in the same units of species/lineage equivalents and can be meaningfully compared. The package features size- and coverage-based rarefaction and extrapolation sampling curves to facilitate rigorous comparison of 3D diversity across individual assemblages. Asymptotic 3D diversity estimates are also provided. See Chao et al. (2021) <doi:10.1111/2041-210X.13682> for more details.
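As background for the taxonomic dimension, the Hill number of order q can be computed directly from relative abundances. The Python sketch below gives the plain empirical (plug-in) value only; it is not the standardized, rarefied or extrapolated estimate the package provides, and the function name is an assumption.

```python
import math

def hill_number(abundances, q):
    """Empirical Hill number of order q from raw species abundances."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

even = [5, 5, 5, 5]     # four equally common species (illustrative)
uneven = [9, 1]         # one dominant and one rare species (illustrative)
profile = [hill_number(uneven, q) for q in (0, 1, 2)]
```

Order 0 counts species, order 1 weights them by frequency, and order 2 emphasizes dominant species, so the profile never increases with q. For a perfectly even assemblage every order returns the species count, which is why the result is read as "species equivalents".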
This package provides a universal, user friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduction plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more. All with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().
The contribution of variables in Bayesian Additive Regression Trees (BART) and Bayesian Additive Regression Trees with Post-Stratification (BARP) models is computed using permutation-based Shapley values. The computed SHAP values are then utilized to visualize the contribution of each variable through various plots. The computation of SHAP values for most models follows the methodology proposed by Štrumbelj and Kononenko (2014) <doi:10.1007/s10115-013-0679-x>, while for XGBoost, the approach introduced by Lundberg et al. (2020) <doi:10.1038/s42256-019-0138-9> was also considered. The BART model was referenced based on the works of Chipman, George, and McCulloch (2010) <doi:10.1214/09-AOAS285> and Kapelner and Bleich (2013) <doi:10.18637/jss.v070.i04>, while the methodology for the BARP model was based on Bisbee (2019) <doi:10.1017/S0003055419000480>.
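The permutation view of Shapley values can be made concrete with a small Python sketch. It is an illustration under assumptions: `model` is any callable on a feature list, and for clarity the code enumerates every feature ordering and every background row exactly, whereas the sampling approach cited above draws orderings and background instances at random to approximate the same average.

```python
from itertools import permutations

def shapley_values(model, x, background):
    """Average marginal contribution of each feature of x.

    Features are switched from a background row to their value in x one at
    a time, in every possible order, and the change in the model output is
    credited to the feature that was switched.
    """
    n_features = len(x)
    phi = [0.0] * n_features
    n_terms = 0
    for order in permutations(range(n_features)):
        for row in background:
            current = list(row)
            previous = model(current)
            for j in order:
                current[j] = x[j]
                value = model(current)
                phi[j] += value - previous
                previous = value
            n_terms += 1
    return [v / n_terms for v in phi]

def linear_model(v):
    """Hypothetical stand-in for a fitted model."""
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2]

phi = shapley_values(linear_model, [1.0, 2.0, 3.0],
                     [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]])
```

The contributions sum to the difference between the prediction for `x` and the average prediction over the background rows. Exact enumeration grows factorially with the number of features, which is why random sampling of orderings is used in practice.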
The peak fitting of spectral data is performed using the framework of the EM algorithm. We adapted the EM algorithm for the peak fitting of spectral data sets by considering the weight of the intensity corresponding to the measurement energy steps (Matsumura, T., Nagamura, N., Akaho, S., Nagata, K., & Ando, Y. (2019, 2021 and 2023) <doi:10.1080/14686996.2019.1620123>, <doi:10.1080/27660400.2021.1899449>, <doi:10.1080/27660400.2022.2159753>). The package efficiently estimates the parameters of the Gaussian mixture model by iterating between the E-step and the M-step until the parameters converge to a locally optimal solution. This package can support the investigation of peak shift with two advantages: (1) a large amount of data can be processed at high speed; and (2) stable and automatic calculation can be easily performed.
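The intensity-weighted EM iteration can be sketched in a few lines of Python. This is a simplified illustration of the general idea rather than the package's algorithm: each energy step counts as a data point whose weight is the measured intensity, only Gaussian peaks are handled, and the function names and the synthetic two-peak spectrum are assumptions.

```python
import math

def gaussian(x, mu, sigma):
    """Normal density at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_peaks(energies, intensities, mus, sigmas, weights, n_iter=200):
    """Weighted EM for a 1-D Gaussian mixture; returns (mus, sigmas, weights)."""
    mus, sigmas, weights = list(mus), list(sigmas), list(weights)
    total = sum(intensities)
    n_peaks = len(mus)
    for _ in range(n_iter):
        # E-step: responsibility of each peak for each energy step.
        resp = []
        for x in energies:
            dens = [weights[k] * gaussian(x, mus[k], sigmas[k]) for k in range(n_peaks)]
            norm = sum(dens) or 1e-300      # guard against underflow to zero
            resp.append([d / norm for d in dens])
        # M-step: intensity-weighted updates of position, width and mixing ratio.
        for k in range(n_peaks):
            nk = sum(w * r[k] for w, r in zip(intensities, resp))
            mus[k] = sum(w * r[k] * x for w, r, x in zip(intensities, resp, energies)) / nk
            var = sum(w * r[k] * (x - mus[k]) ** 2
                      for w, r, x in zip(intensities, resp, energies)) / nk
            sigmas[k] = math.sqrt(var)
            weights[k] = nk / total
    return mus, sigmas, weights

# Synthetic spectrum (assumption): two peaks on a regular energy grid.
energies = [-10.0 + 0.05 * i for i in range(401)]
spectrum = [0.4 * gaussian(e, -2.0, 0.8) + 0.6 * gaussian(e, 3.0, 1.0) for e in energies]
fit_mus, fit_sigmas, fit_weights = fit_peaks(
    energies, spectrum, [-1.0, 2.0], [1.0, 1.0], [0.5, 0.5])
```

As the description notes, EM converges to a local optimum, so the result depends on the initial peak positions and widths passed in.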
Energy-Vorticity theory (EVT) is the fundamental theory to describe processes in the atmosphere by combining conserved quantities from hydrodynamics and thermodynamics. The package meteoEVT provides functions to calculate many energetic and vortical quantities, like potential vorticity, Bernoulli function and dynamic state index (DSI) [e.g. Weber and Nevir, 2008, <doi:10.1111/j.1600-0870.2007.00272.x>], for given gridded data, like ERA5 reanalyses. These quantities can be studied directly or can be used for many applications in meteorology, e.g., the objective identification of atmospheric fronts. For this purpose, separate functions are provided that allow the detection of fronts based on the thermic front parameter [Hewson, 1998, <doi:10.1017/S1350482798000553>], the F diagnostic [Parfitt et al., 2017, <doi:10.1002/2017GL073662>] and the DSI [Mack et al., 2022, <arXiv:2208.11438>].
Testing and documenting code that communicates with remote servers can be painful. Dealing with authentication, server state, and other complications can make testing seem too costly to bother with. But it doesn't need to be that hard. This package enables one to test all of the logic on the R side of the API in your package without requiring access to the remote service. Importantly, it provides three contexts that mock the network connection in different ways, as well as testing functions to assert that HTTP requests were---or were not---made. It also allows one to safely record real API responses to use as test fixtures. The ability to save responses and load them offline also enables one to write vignettes and other dynamic documents that can be distributed without access to a live server.
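The same testing pattern exists in other ecosystems. As a rough analogue, and not a depiction of this package's R interface, the Python sketch below uses the standard library's `unittest.mock` to replace the network call with a canned response and then asserts which request was made. The endpoint URL and the function names are hypothetical.

```python
import io
import json
import urllib.request
from unittest import mock

def get_user_name(user_id):
    """Client logic under test: build the URL, call the API, parse the JSON."""
    url = f"https://api.example.com/users/{user_id}"   # hypothetical endpoint
    with urllib.request.urlopen(url) as response:
        return json.load(response)["name"]

def run_offline_check():
    """Exercise the client without touching the network."""
    canned = io.BytesIO(b'{"name": "Ada"}')            # stands in for a fixture
    with mock.patch("urllib.request.urlopen", return_value=canned) as fake:
        name = get_user_name(42)
    # Assert that exactly one request was made, and to the expected URL.
    fake.assert_called_once_with("https://api.example.com/users/42")
    return name
```

The canned bytes play the role of a recorded response: the URL construction and the parsing are tested, while authentication and server state never come into play.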
A recently proposed Bayesian BIN model disentangles the underlying processes that enable forecasters and forecasting methods to improve, decomposing forecasting accuracy into three components: bias, partial information, and noise. By describing the differences between two groups of forecasters, the model allows the user to carry out useful inference, such as calculating the posterior probabilities of the treatment reducing bias, diminishing noise, or increasing information. It also provides insight into how much tamping down bias and noise in judgment or enhancing the efficient extraction of valid information from the environment improves forecasting accuracy. This package provides easy access to the BIN model. For further information refer to the paper Ville A. Satopää, Marat Salikhov, Philip E. Tetlock, and Barbara Mellers (2021) "Bias, Information, Noise: The BIN Model of Forecasting" <doi:10.1287/mnsc.2020.3882>.
Calculates conditional exact tests (Fisher's exact test, Blaker's exact test, or exact McNemar's test) and unconditional exact tests (including score-based tests on differences in proportions, ratios of proportions, and odds ratios, and Boschloo's test) with appropriate matching confidence intervals, and provides power and sample size calculations. Gives melded confidence intervals for the binomial case (Fay, et al, 2015, <DOI:10.1111/biom.12231>). Gives boundary-optimized rejection region test (Gabriel, et al, 2018, <DOI:10.1002/sim.7579>), an unconditional exact test for the situation where the controls are all expected to fail. Gives confidence intervals compatible with exact McNemar's or sign tests (Fay and Lumbard, 2021, <DOI:10.1002/sim.8829>). For a review of these kinds of exact tests see Fay and Hunsberger (2021, <DOI:10.1214/21-SS131>).
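For orientation, the conditional test for a 2x2 table can be computed from the hypergeometric distribution. The Python sketch below implements only one two-sided convention (summing the probabilities of all tables no more likely than the observed one); the matching confidence intervals, which are the package's focus, are not covered, and the function name is an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # The small relative tolerance keeps ties from being lost to rounding.
    p_value = sum(prob(k) for k in range(lowest, highest + 1)
                  if prob(k) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)

p_tea = fisher_exact_two_sided(3, 1, 1, 3)
```

Other two-sided conventions, such as doubling the smaller one-sided p-value, can give different answers for the same table, which is one reason a confidence interval should be matched to the chosen test.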
Regression methods to quantify the relation between two measurement methods are provided by this package. In particular it addresses regression problems with errors in both variables and without repeated measurements. It implements the Clinical and Laboratory Standards Institute (CLSI) recommendations (see J. A. Budd et al. (2018, <https://clsi.org/standards/products/method-evaluation/documents/ep09/>)) for analytical method comparison and bias estimation using patient samples. Furthermore, algorithms for Theil-Sen and equivariant Passing-Bablok estimators are implemented, see F. Dufey (2020, <doi:10.1515/ijb-2019-0157>) and J. Raymaekers and F. Dufey (2022, <arXiv:2202.08060>). Further, the robust M-Deming and MM-Deming (experimental) are available, see G. Pioda (2021, <arXiv:2105.04628>). A comprehensive overview of the implemented methods and references can be found in the manual pages 'mcrPioda-package' and 'mcreg'.
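Of the estimators listed, Theil-Sen is the easiest to state: the slope is the median of all pairwise slopes. The Python sketch below shows this textbook form only; the Passing-Bablok and Deming variants treat the two methods symmetrically and differ in detail. The function name and the sample values are illustrative assumptions.

```python
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Return (intercept, slope) from the median of pairwise slopes."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]                     # skip tied x values
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope

# Illustrative data: an exact line y = 1 + 2x with one gross outlier.
method_a = [0, 1, 2, 3, 4, 5, 6]
method_b = [1, 3, 5, 7, 9, 11, 100]
intercept, slope = theil_sen(method_a, method_b)
```

Because medians ignore extreme values, the single outlier does not move the fitted line, whereas ordinary least squares would be pulled strongly towards it.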
The OLStrajr package provides comprehensive functions for ordinary least squares (OLS) trajectory analysis and case-by-case OLS regression as outlined in Carrig, Wirth, and Curran (2004) <doi:10.1207/S15328007SEM1101_9> and Rogosa and Saner (1995) <doi:10.3102/10769986020002149>. It encompasses two primary functions, OLStraj() and cbc_lm(). The OLStraj() function simplifies the estimation of individual growth curves over time via OLS regression, with options for visualizing both group-level and individual-level growth trajectories and support for linear and quadratic models. The cbc_lm() function facilitates case-by-case OLS estimates and provides unbiased mean population intercept and slope estimators by averaging OLS intercepts and slopes across cases. It further offers standard error calculations across bootstrap replicates and computation of 95% confidence intervals based on empirical distributions from the resampling processes.
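The case-by-case idea is straightforward to sketch: fit one OLS line per case, then average the intercepts and slopes. The Python code below is a minimal illustration of that averaging step only, without the bootstrap standard errors; the function names and the two example cases are assumptions and do not mirror the signatures of OLStraj() or cbc_lm().

```python
def ols_line(times, values):
    """Simple linear regression for one case; returns (intercept, slope)."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope

def case_by_case_means(cases):
    """Average the per-case OLS intercepts and slopes across cases."""
    fits = [ols_line(times, values) for times, values in cases]
    mean_intercept = sum(f[0] for f in fits) / len(fits)
    mean_slope = sum(f[1] for f in fits) / len(fits)
    return mean_intercept, mean_slope, fits

# Two illustrative cases measured at the same four occasions.
occasions = [0, 1, 2, 3]
cases = [
    (occasions, [1, 3, 5, 7]),     # follows 1 + 2t
    (occasions, [3, 7, 11, 15]),   # follows 3 + 4t
]
mean_intercept, mean_slope, fits = case_by_case_means(cases)
```

Resampling the cases with replacement and repeating the averaging would give the bootstrap distribution from which standard errors and percentile intervals are taken.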
This Haskell package is intended for those who are tired of keeping long lists of dependencies to the same essential libraries in each package as well as the endless imports of the same APIs all over again.
It also supports the modern tendencies in the language.
To solve those problems this package does the following:
Reexport the original APIs under the Rebase namespace.
Export all the possible non-conflicting symbols from the Rebase.Prelude module.
Give priority to the modern practices in the conflicting cases.
The policy behind the package is only to reexport the non-ambiguous and non-controversial APIs, which the community has obviously settled on. The package is intended to rapidly evolve with the contribution from the community, with the missing features being added with pull-requests.
This package provides tools for detecting cellwise outliers and robust methods to analyze data which may contain them. Contains the implementation of the algorithms described in Rousseeuw and Van den Bossche (2018) <doi:10.1080/00401706.2017.1340909> (open access), Hubert et al. (2019) <doi:10.1080/00401706.2018.1562989> (open access), Raymaekers and Rousseeuw (2021) <doi:10.1080/00401706.2019.1677270> (open access), Raymaekers and Rousseeuw (2021) <doi:10.1007/s10994-021-05960-5> (open access), Raymaekers and Rousseeuw (2021) <doi:10.52933/jdssv.v1i3.18> (open access), Raymaekers and Rousseeuw (2022) <arXiv:2207.13493> (open access), Rousseeuw (2022) <doi:10.1016/j.ecosta.2023.01.007> (open access). Examples can be found in the vignettes: "DDC_examples", "MacroPCA_examples", "wrap_examples", "transfo_examples", "DI_examples", "cellMCD_examples", "Correspondence_analysis_examples", and "cellwise_weights_examples".
We use the Alternating Direction Method of Multipliers (ADMM) for parameter estimation in high-dimensional, single-modality mediation models. To improve the sensitivity and specificity of estimated mediation effects, we offer the sure independence screening (SIS) function for dimension reduction. The available penalty options include Lasso, Elastic Net, Pathway Lasso, and Network-constrained Penalty. The methods employed in the package are based on Boyd, S., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011) <doi:10.1561/2200000016>, Fan, J., & Lv, J. (2008) <doi:10.1111/j.1467-9868.2008.00674.x>, Li, C., & Li, H. (2008) <doi:10.1093/bioinformatics/btn081>, Tibshirani, R. (1996) <doi:10.1111/j.2517-6161.1996.tb02080.x>, Zhao, Y., & Luo, X. (2022) <doi:10.4310/21-sii673>, and Zou, H., & Hastie, T. (2005) <doi:10.1111/j.1467-9868.2005.00503.x>.