Calculate sample size or power for hierarchical endpoints. The package can handle any type of outcome (binary, continuous, count, ordinal, time-to-event) and any number of such endpoints. It allows users to calculate sample size for a given power, or power for a given sample size, for hypothesis testing based on win ratios, win odds, net benefit, or DOOR (desirability of outcome ranking) as the treatment effect between two groups for hierarchical endpoints. The methods of this package are described further in the paper by Barnhart, H. X. et al. (2024, <doi:10.1080/19466315.2024.2365629>).
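To illustrate the effect measures named above, here is a minimal Python sketch of how wins, losses and ties can be counted over all treated-versus-control pairs when endpoints are compared in priority order. It is not the package's code and does not perform any sample size or power calculation; the function names and the toy data are hypothetical.

```python
from itertools import product


def compare_pair(treated, control, higher_is_better):
    """Walk the endpoint hierarchy: 1 if treated wins, -1 if control wins, 0 if tied."""
    for t, c, hib in zip(treated, control, higher_is_better):
        if t == c:
            continue  # tied on this endpoint, fall through to the next priority
        treated_better = (t > c) if hib else (t < c)
        return 1 if treated_better else -1
    return 0  # tied on every endpoint


def win_statistics(treated_group, control_group, higher_is_better):
    """Count wins, losses and ties over all pairs and derive the summary measures."""
    wins = losses = ties = 0
    for t, c in product(treated_group, control_group):
        result = compare_pair(t, c, higher_is_better)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    total = wins + losses + ties
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        # Assumes at least one loss; otherwise the win ratio is undefined.
        "win_ratio": wins / losses,
        "win_odds": (wins + 0.5 * ties) / (losses + 0.5 * ties),
        "net_benefit": (wins - losses) / total,
    }


# Toy data: (survived: 1/0, number of hospitalisations), in priority order.
treated = [(1, 0), (1, 2), (1, 1)]
control = [(1, 1), (0, 0), (1, 2)]
stats = win_statistics(treated, control, higher_is_better=(True, False))
print(stats)
```

The win ratio ignores ties, whereas the win odds splits each tie equally between the two groups and the net benefit is the difference between the win and loss proportions.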
Bayesian clustering of spatial regions with similar functional shapes using spanning trees and latent Gaussian models. The method enforces spatial contiguity within clusters and supports a wide range of latent Gaussian models, including non-Gaussian likelihoods, via the R-INLA framework. The algorithm is based on Zhong, R., Chacón-Montalván, E. A., and Moraga, P. (2024) <doi:10.48550/arXiv.2407.12633>, extending the approach of Zhang, B., Sang, H., Luo, Z. T., and Huang, H. (2023) <doi:10.1214/22-AOAS1643>. The package includes tools for model fitting, convergence diagnostics, visualization, and summarization of clustering results.
This package provides diagnostic tests for assessing the informativeness of survey weights in regression models. Implements difference-in-coefficients tests (Hausman 1978 <doi:10.2307/1913827>; Pfeffermann 1993 <doi:10.2307/1403631>), weight-association tests (DuMouchel and Duncan 1983 <doi:10.2307/2288185>; Pfeffermann and Sverchkov 1999 <https://www.jstor.org/stable/25051118>; Pfeffermann and Sverchkov 2003 <ISBN:9780470845672>; Wu and Fuller 2005 <https://www.jstor.org/stable/27590461>), estimating equations tests (Pfeffermann and Sverchkov 2003 <ISBN:9780470845672>), and non-parametric permutation tests. Includes simulation utilities replicating Wang et al. (2023 <doi:10.1111/insr.12509>) and extensions.
Modern classes for tracking and movement data, building on sf spatial infrastructure and on early theoretical work from Turchin (1998, ISBN: 9780878938476) and Calenge et al. (2009) <doi:10.1016/j.ecoinf.2008.10.002>. Tracking data are series of locations with at least 2-dimensional spatial coordinates (x,y), a time index (t), and individual identification (id) of the object being monitored; movement data are made of trajectories, i.e. the line representation of the path, composed of steps (the straight-line segments connecting successive locations). sftrack is designed to handle movement of both living organisms and inanimate objects.
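The relationship between tracking data (locations) and movement data (steps) can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the sftrack API; the function name and the tuple layout are assumptions.

```python
import math


def locations_to_steps(locations):
    """Turn time-ordered (x, y, t) locations of one individual into steps.

    Each step is the straight-line segment between two successive locations.
    """
    steps = []
    for (x0, y0, t0), (x1, y1, t1) in zip(locations, locations[1:]):
        steps.append(
            {
                "start": (x0, y0),
                "end": (x1, y1),
                "length": math.hypot(x1 - x0, y1 - y0),
                "duration": t1 - t0,
            }
        )
    return steps


# Hypothetical track: three fixes of one individual, already sorted by time.
track = [(0.0, 0.0, 0), (3.0, 4.0, 10), (3.0, 4.0, 20)]
track_steps = locations_to_steps(track)
print(track_steps)
```

A track of n locations yields n - 1 steps, so a single location has no movement representation.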
R implementation of TFactS to predict which transcription factors (TFs) are regulated in a biological condition, based on lists of differentially expressed genes (DEGs) obtained from transcriptome experiments. This package is based on the TFactS concept by Essaghir et al. (2010) <doi:10.1093/nar/gkq149> and expands it. It allows users to perform a TFactS-like enrichment approach. The package can import and use the original catalogue file from TFactS as well as user-defined catalogues of interest that are not supported by TFactS (e.g., Arabidopsis).
This package provides visualization tools for exploring the behavior of the components in a network meta-analysis of multi-component (complex) interventions: components descriptive analysis; heat plot of the two-by-two component combinations; leaving-one-component-combination-out scatter plot; violin plot for specific component combinations effects; density plot for components effects; waterfall plot for the interventions effects that differ by a certain component combination; network graph of components; and rank heat plot of components for multiple outcomes. The implemented tools are described by Seitidis et al. (2023) <doi:10.1002/jrsm.1617>.
This method generates a tour path by interpolating between d-D frames in p-D using Givens rotations. The algorithm arises from the problem of zeroing elements of a matrix. This interpolation method is useful for showing specific d-D frames in the tour, as opposed to d-D planes, as done by the geodesic interpolation. It is useful for projection pursuit indexes which are not rotation invariant. See more details in Buja, Cook, Asimov and Hurley (2005) <doi:10.1016/S0169-7161(04)24014-7> and Batsaikhan, Cook and Laa (2023) <doi:10.48550/arXiv.2311.08181>.
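The "zeroing elements of a matrix" idea behind Givens rotations can be shown with a small Python sketch. It only demonstrates a single plane rotation that zeroes one coordinate; it is not the package's frame interpolation, and the function names are hypothetical.

```python
import math


def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0  # nothing to rotate
    return a / r, b / r


def apply_givens(c, s, a, b):
    """Apply the rotation [[c, s], [-s, c]] to the pair (a, b)."""
    return c * a + s * b, -s * a + c * b


c, s = givens(3.0, 4.0)
rotated = apply_givens(c, s, 3.0, 4.0)
print(rotated)  # second component is zero up to rounding error
```

Applying a sequence of such rotations to chosen coordinate pairs zeroes matrix entries one at a time, and the rotation angles can then be interpolated to move from one frame to another.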
This package provides tools for large-scale protein motif analysis and visualization in R. PMScanR facilitates the identification of motifs using external tools like PROSITE's ps_scan (handling necessary file downloads and execution) and enables downstream analysis of results. Key features include parsing scan outputs, converting formats (e.g., to GFF-like structures), generating motif occurrence matrices, and creating informative visualizations such as heatmaps and sequence logos (via seqLogo/ggseqlogo). The package also offers an optional Shiny-based graphical user interface for interactive analysis, aiming to streamline the process of exploring motif patterns across multiple protein sequences.
The speckle package contains functions for the analysis of single cell RNA-seq data. It currently contains functions to analyse differences in cell type proportions. There are also functions to estimate the parameters of the Beta distribution based on a given counts matrix, and a function to normalise a counts matrix to the median library size. There are plotting functions to visualise cell type proportions and the mean-variance relationship in cell type proportions and counts. As our research into specialised analyses of single cell data continues, we anticipate that the package will be updated with new functions.
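Normalising a counts matrix to the median library size can be sketched as follows. This Python example is an assumption about what such a normalisation typically does (scaling each column so its total equals the median column total); it is not speckle's implementation, and the function name and matrix layout (rows as genes, columns as cells or samples) are hypothetical.

```python
from statistics import median


def normalise_to_median_library_size(counts):
    """Scale each column so that its total equals the median column total.

    Assumes every column has a non-zero library size.
    """
    n_cols = len(counts[0])
    library_sizes = [sum(row[j] for row in counts) for j in range(n_cols)]
    target = median(library_sizes)
    return [
        [row[j] * target / library_sizes[j] for j in range(n_cols)]
        for row in counts
    ]


# Toy matrix: 2 genes (rows) by 3 cells (columns).
counts = [[1, 2, 3], [1, 2, 9]]
normalised = normalise_to_median_library_size(counts)
print(normalised)
```

After scaling, every column has the same total, so differences between columns reflect composition rather than sequencing depth.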
This package provides a unified syntax for the simulation-based comparison of different single-stage basket trial designs with a binary endpoint and equal sample sizes in all baskets. Methods include the designs by Baumann et al. (2024) <doi:10.48550/arXiv.2309.06988>, Fujikawa et al. (2020) <doi:10.1002/bimj.201800404>, Berry et al. (2013) <doi:10.1177/1740774513497539>, Neuenschwander et al. (2016) <doi:10.1002/pst.1730> and Psioda et al. (2021) <doi:10.1093/biostatistics/kxz014>. For the latter three designs, the functions are mostly wrappers for functions provided by the packages bhmbasket and bmabasket.
Maximum likelihood estimation (MLE) of count data models is provided, along with standard errors of the estimates and the Akaike information criterion for model selection. The functions compute the MLE for the following distributions: the Bell distribution, the Borel distribution, the Poisson distribution, the zero-inflated Bell distribution, the zero-inflated Bell Touchard distribution, the zero-inflated Poisson distribution, the zero-one-inflated Bell distribution and the zero-one-inflated Poisson distribution. Moreover, the probability mass function (PMF), distribution function (CDF), quantile function (QF) and random number generation of the Bell Touchard and zero-inflated Bell Touchard distributions are also provided.
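As an illustration of what "zero-inflated" means for one of the listed models, here is a minimal Python sketch of the zero-inflated Poisson PMF and its log-likelihood, using the standard mixture definition. It is not the package's code; the function names and the parameter names lam and pi are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing the count k with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: extra zeros with probability pi, Poisson otherwise."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_log_likelihood(data, lam, pi):
    """Log-likelihood of independent counts; an MLE routine would maximise this."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in data)


total_mass = sum(zip_pmf(k, 2.0, 0.3) for k in range(60))
print(zip_pmf(0, 2.0, 0.3), total_mass)
```

Setting pi to zero recovers the ordinary Poisson model, which is why information criteria such as AIC are a natural way to compare the inflated and non-inflated fits.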
The twoStepsBenchmark() and threeRuleSmooth() functions allow you to disaggregate a low-frequency time series using a higher-frequency time series, following the French National Accounts methodology. The aggregated sum of the resulting time series is strictly equal to the low-frequency time series within the benchmarking window. Typically, the low-frequency time series is an annual one, unknown for the last year, and the high-frequency one is either quarterly or monthly. See "Methodology of quarterly national accounts", Insee Méthodes N°126, by Insee (2012, ISBN:978-2-11-068613-8, <https://www.insee.fr/en/information/2579410>).
This package provides functions for the computation of functional elastic shape means over sets of open planar curves. The package is particularly suitable for settings where these curves are only sparsely and irregularly observed. It uses a novel approach for elastic shape mean estimation, where planar curves are treated as complex functions and a full Procrustes mean is estimated from the corresponding smoothed Hermitian covariance surface. This is combined with the methods for elastic mean estimation proposed in Steyer, Stöcker, Greven (2022) <doi:10.1111/biom.13706>. See Stöcker et al. (2022) <arXiv:2203.10522> for details.
The Graphical Group Ridge (GGRidge) package classifies ridge regression predictors into disjoint groups of conditionally correlated variables and derives different penalties (shrinkage parameters) for these groups of predictors. It combines the ridge regression method with the graphical model for high-dimensional data (i.e. the number of predictors exceeds the number of cases) or ill-conditioned data (e.g. in the presence of multicollinearity among predictors). The package reduces the mean square errors and the extent of over-shrinking of predictors as compared to the ridge method. See Aldahmani, S. and Zoubeidi, T. (2020) <DOI:10.1080/00949655.2020.1803320>.
This package provides tools to assist planning and monitoring of time-to-event trials under complicated censoring assumptions and/or non-proportional hazards. There are three main components: The first is analytic calculation of predicted time-to-event trial properties, providing estimates of expected hazard ratio, event numbers and power under different analysis methods. The second is simulation, allowing stochastic estimation of these same properties. Thirdly, it provides parametric event prediction using blinded trial data, including creation of prediction intervals. Methods are based upon numerical integration and a flexible object-orientated structure for defining event, censoring and recruitment distributions (Curves).
We implement and extend the Dividing Local Gaussian Process algorithm by Lederer et al. (2020) <doi:10.48550/arXiv.2006.09446>. Its main use case is in online learning, where it is used to train a network of local GPs (referred to as a tree) by cleverly partitioning the input space. In contrast to a single GP, GPTreeO is able to deal with larger amounts of data. The package includes methods to create the tree and set its parameters, to incorporate data points from a data stream, and to make joint predictions based on all relevant local GPs.
This package implements Multi-Calibration Boosting (2018) <https://proceedings.mlr.press/v80/hebert-johnson18a.html> and Multi-Accuracy Boosting (2019) <doi:10.48550/arXiv.1805.12317> for the multi-calibration of a machine learning model's prediction. MCBoost updates predictions for sub-groups in an iterative fashion in order to mitigate biases like poor calibration or large accuracy differences across subgroups. Multi-Calibration works best in scenarios where the underlying data and labels are unbiased, but the resulting models are not. This is often the case, e.g. when an algorithm fits a majority population while ignoring or under-fitting minority populations.
This package provides statistical process control tools for stochastic textured surfaces. The current version supports the following tools: (1) generic modeling of stochastic textured surfaces; (2) local defect monitoring and diagnostics in stochastic textured surfaces, which was proposed by Bui and Apley (2018a) <doi:10.1080/00401706.2017.1302362>; (3) global change monitoring in the nature of stochastic textured surfaces, which was proposed by Bui and Apley (2018b) <doi:10.1080/00224065.2018.1507559>; and (4) computation of dissimilarity matrix of stochastic textured surface images, which was proposed by Bui and Apley (2019b) <doi:10.1016/j.csda.2019.01.019>.
The goal of the package is to equip the jmcm package (current version 0.2.1) with estimations of the covariance of estimated parameters. Two methods are provided. The first method is to use the inverse of estimated Fisher's information matrix, see M. Pourahmadi (2000) <doi:10.1093/biomet/87.2.425>, M. Maadooliat, M. Pourahmadi and J. Z. Huang (2013) <doi:10.1007/s11222-011-9284-6>, and W. Zhang, C. Leng, C. Tang (2015) <doi:10.1111/rssb.12065>. The second method is bootstrap based, see Liu, R.Y. (1988) <doi:10.1214/aos/1176351062> for reference.
Extracts sentiment and sentiment-derived plot arcs from text using a variety of sentiment dictionaries conveniently packaged for consumption by R users. Implemented dictionaries include syuzhet (default) developed in the Nebraska Literary Lab, afinn developed by Finn Arup Nielsen, bing developed by Minqing Hu and Bing Liu, and nrc developed by Mohammad, Saif M. and Turney, Peter D. Applicable references are available in README.md and in the documentation for the get_sentiment function. The package also provides a hack for implementing Stanford's coreNLP sentiment parser. The package provides several methods for plot arc normalization.
R7RS-small Scheme library for reading and writing the RSV data format, a very simple binary format for storing tables of strings. It is a competitor for CSV (Comma Separated Values) and TSV (Tab Separated Values). Its main benefit is that the strings are represented as Unicode encoded as UTF-8, and the value and row separators are byte values that are never used in UTF-8, so the strings do not need any error-prone escaping and thus can be written and read verbatim.
The RSV format is specified in https://github.com/Stenway/RSV-Specification.
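A minimal Python sketch of an RSV encoder and decoder follows, to show why no escaping is needed. The byte values used here (0xFF to terminate a value, 0xFE for a null value, 0xFD to terminate a row) reflect our reading of the specification linked above and should be checked against it; the function names are hypothetical and unrelated to the Scheme library's API.

```python
VALUE_END = b"\xff"   # assumed value terminator; never occurs in valid UTF-8
NULL_VALUE = b"\xfe"  # assumed marker for a null value
ROW_END = b"\xfd"     # assumed row terminator


def encode_rsv(rows):
    """Encode a list of rows (lists of str or None) as RSV bytes."""
    out = bytearray()
    for row in rows:
        for value in row:
            out += NULL_VALUE if value is None else value.encode("utf-8")
            out += VALUE_END
        out += ROW_END
    return bytes(out)


def decode_rsv(data):
    """Decode RSV bytes back into a list of rows."""
    if data and not data.endswith(ROW_END):
        raise ValueError("incomplete RSV document: last row is not terminated")
    rows = []
    for raw_row in data.split(ROW_END)[:-1]:
        raw_values = raw_row.split(VALUE_END)[:-1]
        rows.append(
            [None if v == NULL_VALUE else v.decode("utf-8") for v in raw_values]
        )
    return rows


table = [["Hello", "🌎"], [None, ""], []]
encoded = encode_rsv(table)
print(encoded, decode_rsv(encoded) == table)
```

Because the three marker bytes cannot appear inside UTF-8 text, splitting on them is unambiguous, and empty strings, null values and empty rows remain distinguishable.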
The barbieQ package provides a series of robust statistical tools for analysing barcode count data generated from cell clonal tracking (i.e., lineage tracing) experiments. In these experiments, an initial cell and its offspring collectively form a clone (i.e., lineage). A unique barcode sequence, incorporated into the DNA of the initial cell, is inherited within the clone. This one-to-one mapping of barcodes to clones enables clonal tracking of their behaviors. By counting barcodes, researchers can quantify the population abundance of individual clones under specific experimental perturbations. barbieQ supports barcode count data preprocessing, statistical testing, and visualization.
sRACIPE implements a randomization-based method for gene circuit modeling. It allows us to study the effect of both the gene expression noise and the parametric variation on any gene regulatory circuit (GRC) using only its topology, and simulates an ensemble of models with random kinetic parameters at multiple noise levels. Statistical analysis of the generated gene expressions reveals the basin of attraction and stability of various phenotypic states and their changes associated with intrinsic and extrinsic noises. sRACIPE provides a holistic picture to evaluate the effects of both the stochastic nature of cellular processes and the parametric variation.
This package provides a series of wrapper functions to implement the 10 maximum likelihood models of animal orientation described by Schnute and Groot (1992) <DOI:10.1016/S0003-3472(05)80068-5>. The functions also include the ability to use different optimizer methods and calculate various model selection metrics (i.e., AIC, AICc, BIC). The ability to perform variants of the Hermans-Rasson test and Pycke test is also included as described in Landler et al. (2019) <DOI:10.1186/s12898-019-0246-8>. The latest version also includes a new method to calculate circular-circular and circular-linear distance correlations.
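The model selection metrics mentioned above follow standard formulas, sketched here in Python. This is not the package's code; the function name and arguments (maximised log-likelihood, number of parameters k, sample size n) are hypothetical.

```python
import math


def information_criteria(log_lik, k, n):
    """Return AIC, AICc and BIC for a model fitted by maximum likelihood.

    log_lik: maximised log-likelihood; k: number of estimated parameters;
    n: number of observations (AICc requires n > k + 1).
    """
    if n <= k + 1:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_lik
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_lik
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


criteria = information_criteria(log_lik=-100.0, k=3, n=50)
print(criteria)
```

Lower values indicate a better trade-off between fit and complexity, and AICc approaches AIC as the sample size grows relative to the number of parameters.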