Computes segregation indices, including the Index of Dissimilarity, as well as the information-theoretic indices developed by Theil (1971) <isbn:978-0471858454>, namely the Mutual Information Index (M) and Theil's Information Index (H). The M index, further described by Mora and Ruiz-Castillo (2011) <doi:10.1111/j.1467-9531.2011.01237.x> and Frankel and Volij (2011) <doi:10.1016/j.jet.2010.10.008>, is a measure of segregation that is highly decomposable. The package provides tools to decompose the index by units and groups (local segregation) and into within and between terms, as well as a method to decompose differences in segregation as described by Elbers (2021) <doi:10.1177/0049124121986204>. Standard errors are estimated by bootstrapping, which also corrects for small-sample bias, and functions for visualizing segregation patterns are included.
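As a brief illustration, the package's documented entry points can be used as follows (function and dataset names are taken from the package manual; treat them as assumptions of this sketch and check them against the installed version):

    # Total M and H between racial groups across schools, using the
    # package's bundled schools00 data:
    library(segregation)
    mutual_total(schools00, "race", "school", weight = "n")

    # Local (per-unit) segregation scores, whose weighted average
    # (by each school's share of students) recovers M:
    head(mutual_local(schools00, "race", "school", weight = "n"))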
Anscombe's quartet is a set of four two-variable datasets that share several summary statistics but have very different joint distributions. This becomes apparent when the data are plotted, which illustrates the importance of using graphical displays in statistics. This package enables the creation of datasets that have identical marginal sample means and sample variances, sample correlation, least squares regression coefficients and coefficient of determination. The user supplies an initial dataset, which is shifted, scaled and rotated in order to achieve target summary statistics; the general shape of the initial dataset is retained. The target statistics can be supplied directly or calculated from a user-supplied dataset. The datasauRus package <https://cran.r-project.org/package=datasauRus> provides further examples of datasets that have markedly different scatter plots but share many sample summary statistics.
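A minimal generic sketch of the shift-and-scale part of this idea (not the package's API; matching a target correlation additionally requires the rotation step):

    # Transform a variable so it attains target sample moments while
    # retaining its general shape.
    match_moments <- function(x, target_mean, target_sd) {
      as.numeric(scale(x)) * target_sd + target_mean
    }

    set.seed(1)
    x <- rexp(100)                       # arbitrary initial shape
    y <- match_moments(x, target_mean = 9, target_sd = 3.3)
    c(mean(y), sd(y))                    # exactly 9 and 3.3; shape of x preserved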
Computes the Buishand Range Test, Pettitt Test, SNHT, Student t-test, and Mann-Whitney Rank Test to identify breakpoints in a series; all functions allow NA values. Since each of these methods identifies only one breakpoint in a series, a general function to look for N breakpoints is provided. The Yamamoto test for climate jumps is also available. References: Alexandersson, H. (1986) <doi:10.1002/joc.3370060607>; Buishand, T. (1982) <doi:10.1016/0022-1694(82)90066-X>; Hurtado, S. I., Zaninelli, P. G., & Agosta, E. A. (2020) <doi:10.1016/j.atmosres.2020.104955>; Mann, H. B., Whitney, D. R. (1947) <doi:10.1214/aoms/1177730491>; Pettitt, A. N. (1979) <doi:10.2307/2346729>; Ruxton, G. D. (2006) <doi:10.1093/beheco/ark016>; Yamamoto, R., Iwashima, T., Kazadi, S. N., & Hoshiai, M. (1985) <doi:10.2151/jmsj1965.63.6_1157>.
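A generic sketch of the Pettitt (1979) change-point statistic, using its rank-based form (not this package's implementation; NA handling omitted for brevity):

    # U_t = 2 * sum(r_1..r_t) - t * (n + 1), where r are the ranks of x;
    # the breakpoint estimate maximizes |U_t|, with an approximate p-value.
    pettitt_simple <- function(x) {
      n <- length(x)
      r <- rank(x)
      U <- 2 * cumsum(r)[-n] - (1:(n - 1)) * (n + 1)
      K <- max(abs(U))
      p <- 2 * exp(-6 * K^2 / (n^3 + n^2))
      list(breakpoint = which.max(abs(U)), K = K, p.value = min(1, p))
    }

    set.seed(42)
    x <- c(rnorm(30, 0), rnorm(30, 2))    # mean shift after t = 30
    pettitt_simple(x)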
This package contains all of the functions necessary for the complete analysis of a continuous glucose monitoring (CGM) study and can be applied to data measured by various existing CGM devices such as 'FreeStyle Libre', 'Glutalor', 'Dexcom' and 'Medtronic'. It reads a series of data files, converts various formats of time stamps, handles missing values, calculates both regular and nonlinear statistics, conducts group comparisons, and displays results in a concise format. It also contains two features new to CGM analysis: an implementation of the strictly standardized mean difference (SSMD) and its effect-size classes, and a new type of plot called the antenna plot. It accompanies Zhang XD (2018) <doi:10.1093/bioinformatics/btx826>, 'CGManalyzer: an R package for analyzing continuous glucose monitoring studies'.
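The strictly standardized mean difference is simple to state; a generic sketch of the effect-size measure (not the package's own function):

    # SSMD for two independent groups: difference of means divided by
    # the standard deviation of the difference.
    ssmd <- function(a, b) (mean(a) - mean(b)) / sqrt(var(a) + var(b))

    set.seed(7)
    ssmd(rnorm(50, 110, 15), rnorm(50, 100, 15))   # a moderate positive effect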
Calculates overall survival and recurrence-free survival for breast cancer patients using 'NHS Predict'. The time interval for the estimation can be set up to 15 years, with a default of 10. Incremental therapy benefits are estimated for hormone therapy, chemotherapy, trastuzumab, and bisphosphonates. An additional function, suited for SCAN audits, offers a more user-friendly version of the code with fewer inputs, but requires correctly standardised inputs. This work is not affiliated with the development of 'NHS Predict' or its underlying statistical model. Details on 'NHS Predict' can be found at <doi:10.1186/bcr2464>, and its web version at <https://breast.predict.nhs.uk/>. A small dataset of 50 fictional patient observations is provided for running examples with the two main functions, and an additional dataset is provided for running examples with the dedicated SCAN function.
This package provides estimates of the bivariate and trivariate distribution and survival functions for censored gap times. Two approaches, based on existing methodologies, are considered: (i) Lin's estimator, which extends the Kaplan-Meier estimator of the distribution function for the first event time using Inverse Probability of Censoring Weights for the second time (Lin DY, Sun W, Ying Z (1999) <doi:10.1093/biomet/86.1.59>), and (ii) an estimator based on Kaplan-Meier weights (Una-Alvarez J, Meira-Machado L (2008) <https://w3.math.uminho.pt/~lmachado/Biometria_conference.pdf>). The proposed methods are landmark estimators based on a subsampling approach, and an estimator based on a weighted cumulative hazard estimator. The package also provides nonparametric estimators conditional on a given continuous covariate. All these methods have been submitted for publication.
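A generic sketch of the Kaplan-Meier weights underlying approach (ii) (not the package's API): each ordered observation receives the jump of the Kaplan-Meier estimator, and these weights can then drive weighted bivariate estimators.

    km_weights <- function(time, status) {
      n <- length(time)
      o <- order(time)                     # sort by observed time
      d <- status[o]                       # censoring indicators, sorted
      i <- seq_len(n)
      w <- (d / (n - i + 1)) *
           cumprod(c(1, 1 - d[-n] / (n - i[-n] + 1)))
      out <- numeric(n)
      out[o] <- w                          # back to the original ordering
      out
    }

    set.seed(3)
    t1 <- rexp(10); delta <- rbinom(10, 1, 0.8)
    w <- km_weights(t1, delta)
    sum(w)    # at most 1; equals 1 when the largest time is uncensored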
Finds single- and two-arm designs using stochastic curtailment, as described by Law et al. (2022) <doi:10.1080/10543406.2021.2009498> and Law et al. (2021) <doi:10.1002/pst.2067> respectively. Designs can be single-stage or multi-stage, and non-stochastic curtailment is possible as a special case. The user supplies desired error rates, a maximum sample size, and lower and upper anticipated response rates; suitable designs are returned with their operating characteristics. Stopping boundaries and visualisations are also available. The package can also find designs using other approaches, for example the designs of Simon (1989) <doi:10.1016/0197-2456(89)90015-9> and Mander and Thompson (2010) <doi:10.1016/j.cct.2010.07.008>. Other features: comparing and visualising designs using a weighted sum of the expected sample sizes under the null and alternative hypotheses and the maximum sample size, and visualising any binary-outcome design.
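A generic sketch of what a single-stage, single-arm design search involves (not the package's interface): find the smallest sample size n and cutoff r satisfying the error-rate constraints.

    # Reject H0 (declare promising) if X >= r, with X ~ Binomial(n, p).
    find_design <- function(p0, p1, alpha = 0.05, beta = 0.2, nmax = 100) {
      for (n in 1:nmax) {
        for (r in 0:n) {
          type1 <- 1 - pbinom(r - 1, n, p0)
          power <- 1 - pbinom(r - 1, n, p1)
          if (type1 <= alpha && power >= 1 - beta)
            return(c(n = n, r = r, type1 = type1, power = power))
        }
      }
      NULL   # no design within nmax
    }

    find_design(p0 = 0.1, p1 = 0.3)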
Facilitates creation and manipulation of metric graphs, such as street or river networks. Further facilitates operations and visualizations of data on metric graphs, and the creation of a large class of random fields and stochastic partial differential equations on such spaces. These random fields can be used for simulation, prediction and inference. In particular, linear mixed effects models including random field components can be fitted to data based on computationally efficient sparse matrix representations. Interfaces to the R packages INLA and inlabru are also provided, which facilitate working with Bayesian statistical models on metric graphs. The main references for the methods are Bolin, Simas and Wallin (2024) <doi:10.3150/23-BEJ1647>, Bolin, Kovacs, Kumar and Simas (2023) <doi:10.1090/mcom/3929>, and Bolin, Simas and Wallin (2023) <doi:10.48550/arXiv.2304.03190> and <doi:10.48550/arXiv.2304.10372>.
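A hedged sketch of building a small metric graph, with the constructor arguments assumed from the package's vignettes (verify against the installed documentation):

    # A triangle-shaped metric graph from an edge list of coordinate
    # matrices, then plotted.
    library(MetricGraph)

    edge1 <- rbind(c(0, 0), c(1, 0))
    edge2 <- rbind(c(1, 0), c(1, 1))
    edge3 <- rbind(c(1, 1), c(0, 0))
    graph <- metric_graph$new(edges = list(edge1, edge2, edge3))
    graph$plot()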
This package provides a collection of R functions that are widely used by the Petersen Lab. Included are functions for evaluating the accuracy of judgments and predictions, scoring assessments, generating correlation matrices, converting data between various types, data management, psychometric evaluation, extensions related to latent variable modeling, various plotting capabilities, and other miscellaneous purposes. By making the package available, we hope to make our methods reproducible and replicable by others and to help others perform their data processing and analysis more easily and efficiently. The codebase is provided in Petersen (2024) <doi:10.5281/zenodo.7602890> and on CRAN: <doi:10.32614/CRAN.package.petersenlab>. The package is described in 'Principles of Psychological Assessment: With Applied Examples in R' (Petersen, 2024, 2025) <doi:10.1201/9781003357421>, <doi:10.25820/work.007199>, <doi:10.5281/zenodo.6466589>.
Random simulation of fuzzy numbers is still a challenging problem. The aim of this package is to provide procedures to simulate fuzzy random variables, especially in the case of piecewise linear fuzzy numbers (PLFNs; see Coroianu et al. (2013) <doi:10.1016/j.fss.2013.02.005> for further details). Additionally, the special resampling algorithms known as the epistemic bootstrap are provided (see Grzegorzewski and Romaniuk (2022) <doi:10.34768/amcs-2022-0021>, Grzegorzewski and Romaniuk (2022) <doi:10.1007/978-3-031-08974-9_39>, Romaniuk et al. (2024) <doi:10.32614/RJ-2024-016>), together with functions to apply statistical tests and estimate various characteristics based on the epistemic bootstrap. The package also includes real-life datasets of epistemic fuzzy triangular and trapezoidal numbers. The fuzzy numbers used in this package are consistent with the FuzzyNumbers package.
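A rough, generic sketch of the epistemic-bootstrap idea for triangular fuzzy numbers (not the package's algorithm or API): each fuzzy observation (a, m, b) yields a crisp draw from a randomly chosen alpha-cut.

    epistemic_draw <- function(a, m, b) {
      alpha <- runif(length(a))
      lo <- a + alpha * (m - a)      # left end of the alpha-cut
      hi <- b - alpha * (b - m)      # right end of the alpha-cut
      runif(length(a), lo, hi)       # crisp value inside the cut
    }

    # three fuzzy observations, one crisp bootstrap replicate:
    set.seed(11)
    epistemic_draw(a = c(1, 2, 4), m = c(2, 3, 5), b = c(4, 5, 7))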
This package contains functions for the classification and ranking of top candidate features, reconstruction of networks from adjacency matrices and data frames, analysis of network topology and calculation of centrality measures, and identification of the most influential nodes. A function is also provided for running the SIRIR model, a combination of the leave-one-out cross-validation technique and the conventional SIR model, to rank the true influence of vertices in an unsupervised manner. Additionally, functions are provided for assessing the dependence and correlation of two network centrality measures, as well as the conditional probability of deviation from their corresponding means in opposite directions. References: Fred Viole and David Nawrocki (2013, ISBN:1490523995); Csardi G, Nepusz T (2006), "The igraph software package for complex network research", InterJournal, Complex Systems, 1695. Adopted algorithms and sources are referenced in the function documentation.
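A hedged sketch of ranking nodes (the ivi() function name is assumed from the package manual; check it against the installed version):

    # Build a random graph with igraph, then rank vertices by the
    # package's integrated value of influence (IVI) measure.
    library(igraph)
    library(influential)

    set.seed(5)
    g <- sample_gnp(50, 0.08)
    head(sort(ivi(g), decreasing = TRUE))   # most influential vertices first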
Generates the following sequential two-arm experimental designs: (1) completely randomized (Bernoulli), (2) balanced completely randomized, (3) Efron's (1971) Biased Coin, (4) Atkinson's (1982) Covariate-Adjusted Biased Coin, (5) Kapelner and Krieger's (2014) Covariate-Adjusted Matching on the Fly, (6) Kapelner and Krieger's (2021) CARA Matching on the Fly with Differential Covariate Weights (Naive), and (7) Kapelner and Krieger's (2021) CARA Matching on the Fly with Differential Covariate Weights (Stepwise). It also provides the following types of inference: (1) estimation (with both Z-style estimators and OLS estimators), (2) frequentist testing (via asymptotic distribution results and via the nonparametric randomization test) and (3) frequentist confidence intervals (currently only under the superpopulation sampling assumption). Details can be found in our publication: Kapelner and Krieger, "A Matching Procedure for Sequential Experiments that Iteratively Learns which Covariates Improve Power" (2020) <arXiv:2010.05980>.
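A generic sketch of design (3), Efron's (1971) biased coin (not this package's interface): assign the next subject to the under-represented arm with probability 2/3.

    efron_assign <- function(n_treat, n_control, p = 2/3) {
      prob <- if (n_treat < n_control) p
              else if (n_treat > n_control) 1 - p
              else 0.5                       # balanced: fair coin
      rbinom(1, 1, prob)                     # 1 = assign to treatment
    }

    # simulate a sequence of 20 assignments:
    set.seed(9)
    arm <- integer(20)
    for (i in seq_along(arm)) {
      prev <- arm[seq_len(i - 1)]
      arm[i] <- efron_assign(sum(prev), sum(1 - prev))
    }
    table(arm)    # arms stay close to balanced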
These 'Rcpp'-based functions compute the efficient score statistics for grouped time-to-event data (Prentice and Gloeckler, 1978), with optional inclusion of baseline covariates. Functions for maximum likelihood estimation of the parameter of interest and nuisance parameters, including baseline hazards, are also provided. A parallel set of functions allows for the incorporation of family structure of related individuals (e.g., trios). Note that the current implementation of the frailty model (Ripatti and Palmgren, 2000) is sensitive to departures from model assumptions and should be considered experimental. For these data, the exact proportional-hazards-model-based likelihood is computed by multidimensional integration, accomplished using the Cuba library (Hahn, 2005), whose source files are included in this package. The maximization is carried out using Brent's algorithm, with C++ code from John Burkardt and John Denker (Brent, 2002).
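For orientation, the Prentice-Gloeckler grouped-time proportional hazards model can be fitted with standard tools as well; a generic sketch (not this package's scoring functions) exploits its equivalence to a binary GLM with a complementary log-log link on person-interval data:

    set.seed(2)
    n <- 200
    x <- rnorm(n)
    t <- pmin(rexp(n, exp(0.5 * x)), 3)   # true effect 0.5, admin. censoring at 3
    interval <- ceiling(t * 2)            # group time into 6 intervals
    event <- as.integer(t < 3)

    # expand each subject into one row per interval survived:
    long <- do.call(rbind, lapply(seq_len(n), function(i) {
      k <- interval[i]
      data.frame(int = 1:k, y = c(rep(0, k - 1), event[i]), x = x[i])
    }))
    long$int <- factor(long$int)

    fit <- glm(y ~ int + x, family = binomial("cloglog"), data = long)
    coef(fit)["x"]                        # close to the true 0.5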
Fast, optimal, and reproducible clustering algorithms for circular, periodic, or framed data. The algorithms are based on a core algorithm for optimal framed clustering developed by the authors (Debnath & Song 2021) <doi:10.1109/TCBB.2021.3077573>. The runtime is O(K N log^2 N), where K is the number of clusters and N is the number of circular data points; on a desktop computer using a single processor core, millions of data points can be grouped into a few clusters within seconds. The algorithms can be applied to characterize events along circular DNA molecules, circular RNA molecules, and the circular genomes of bacteria, chloroplasts, and mitochondria, or to cluster climate data along any given longitude or latitude. Periodic data clustering can be formulated as circular clustering, so the algorithms offer a general high-performance solution to circular, periodic, or framed data clustering.
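A generic sketch of the circular formulation (this is the heuristic embedding idea only, not the package's optimal O(K N log^2 N) algorithm): map periodic data onto the unit circle before clustering, so groups that straddle the period boundary stay together.

    set.seed(4)
    hours <- c(rnorm(100, 0, 2), rnorm(100, 12, 2)) %% 24
    theta <- hours * pi / 12
    km <- kmeans(cbind(cos(theta), sin(theta)), centers = 2, nstart = 10)

    # the midnight-straddling group (hours near 0 or 24) forms one
    # cluster, which naive clustering of raw hours would split in two:
    table(km$cluster, near_midnight = hours < 6 | hours > 18)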
Computes the extended spring indices (SI-x) and false spring exposure indices (FSEI). The SI-x indices are standard indices used in spring phenology studies. The FSEI derives from research on the climatology of false springs and is adjusted here to include early and late false spring exposure indices. The indices include the first leaf index, the first bloom index, and the false spring exposure indices, along with all calculations needed for each index. The main function returns all indices, but each function can also be run separately. References: Allstadt et al. (2015) <doi:10.1088/1748-9326/10/10/104008>; Ault et al. (2015) <doi:10.1016/j.cageo.2015.06.015>; Peterson and Abatzoglou (2014) <doi:10.1002/2014GL059266>; Schwartz et al. (2006) <doi:10.1111/j.1365-2486.2005.01097.x>; Schwartz et al. (2013) <doi:10.1002/joc.3625>.
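A rough generic sketch of a false-spring exposure calculation (the package's SI-x based implementation is more involved; the -2.2 C hard-freeze threshold is the one common in this literature, and the 7-day window is an assumption of this sketch):

    # leaf_doy: first-leaf day-of-year per year (NA if none);
    # tmin: matrix of daily minimum temperatures, one row per year.
    false_spring_rate <- function(leaf_doy, tmin, window = 7) {
      hit <- mapply(function(d, i) {
        if (is.na(d)) return(NA)
        any(tmin[i, d:min(d + window, ncol(tmin))] <= -2.2)
      }, leaf_doy, seq_along(leaf_doy))
      100 * mean(hit, na.rm = TRUE)   # percent of leaf-out years with a false spring
    }

    set.seed(6)
    tmin <- matrix(rnorm(20 * 365, 5, 5), nrow = 20)   # 20 synthetic years
    leaf <- sample(80:110, 20, replace = TRUE)
    false_spring_rate(leaf, tmin)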
Detects feedback loops (cycles, circuits) between species (nodes) in ordinary differential equation (ODE) models. Feedback loops are paths from a node to itself that do not visit any other node twice, and they have important regulatory functions. Loops are reported with the order of their participating nodes and their length, and with whether each loop is a positive or a negative feedback loop. An upper limit on the number of feedback loops bounds the runtime, which scales with the feedback loop count. Model parametrizations and values of the modelled variables are taken into account. The computation uses the characteristics of the Jacobian matrix, as described e.g. in Thomas and Kaufman (2002) <doi:10.1016/s1631-0691(02)01452-x>. Input can be the Jacobian matrix of the ODE model or the ODE function definition; in the latter case, the Jacobian matrix is determined numerically using numDeriv. Graph-based algorithms from igraph are employed for path detection.
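A generic sketch of the Jacobian-based sign determination (not the package's interface): the sign of a feedback loop is the sign of the product of the Jacobian entries along its edges.

    library(numDeriv)

    # toy ODE right-hand side: a 3-node negative feedback loop
    f <- function(x) c(1 / (1 + x[3]^2) - x[1],   # node 3 represses node 1
                       x[1] - x[2],               # node 1 activates node 2
                       x[2] - x[3])               # node 2 activates node 3

    J <- jacobian(f, x = c(0.5, 0.5, 0.5))
    loop <- c(1, 2, 3)                            # the cycle 1 -> 2 -> 3 -> 1
    edges <- cbind(c(loop[-1], loop[1]), loop)    # J[i, j] encodes edge j -> i
    prod(sign(J[edges]))                          # -1: a negative feedback loop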
This package contains functions for the construction and visualization of the underlying and reflexivity graphs of three families of proximity catch digraphs (PCDs) (see Ceyhan (2005) ISBN:978-3-639-19063-2), and for computing the edge density of these PCD-based graphs, which is then used for testing patterns of segregation and association against complete spatial randomness (CSR) or uniformity in the one- and two-dimensional cases. The PCD families considered are Arc-Slice PCDs, Proportional-Edge (PE) PCDs (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>) and Central Similarity PCDs (Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). See also Ceyhan (2016) <doi:10.1016/j.stamet.2016.07.003> for the edge density of the underlying and reflexivity graphs of PE-PCDs. The package also has tools for the visualization of PCD-based graphs for one-, two-, and three-dimensional data.
The C++ header files of the Stan project are provided by this package. There is a shared object containing part of the CVODES library, but it is not accessible from R. r-stanheaders is only useful for developers who want to utilize the LinkingTo directive of their package's DESCRIPTION file to build on the Stan library without incurring unnecessary dependencies.
The Stan project develops a probabilistic programming language that implements full or approximate Bayesian statistical inference via Markov chain Monte Carlo or variational methods, and implements (optionally penalized) maximum likelihood estimation via optimization. The Stan library includes an advanced automatic differentiation scheme, templated statistical and linear algebra functions that can handle the automatically differentiable scalar types (and doubles, ints, etc.), and a parser for the Stan language. The r-rstan package provides user-facing R functions to parse, compile, test, estimate, and analyze Stan models.
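A minimal sketch of that user-facing workflow with a toy model (kept deliberately small; see the rstan documentation for full details):

    library(rstan)

    scode <- "
    data { int<lower=1> N; vector[N] y; }
    parameters { real mu; real<lower=0> sigma; }
    model { y ~ normal(mu, sigma); }
    "
    # compile the model, then sample from the posterior:
    fit <- stan(model_code = scode,
                data = list(N = 20, y = rnorm(20, 3, 1)),
                chains = 2, iter = 1000)
    print(fit)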
Org-Babel support for evaluating Rust code. Much of this is modeled after `ob-C'. Just like `ob-C', you can specify :flags headers when compiling with the "rust run" command. Unlike `ob-C', you can also specify :args, which can be a list of arguments to pass to the binary. If you quote the value passed into the list, `ob-ref' will be used to find the reference data. If you do not include a main function or a package name, `ob-rust' will provide one for you, and this is the only way to work with the current, very limited implementation: only :results output is supported. Requirements: rust and cargo must be installed, with the rust and cargo commands in your `exec-path'; rust-script is required; `rust-mode' is also recommended for syntax highlighting and formatting, although nothing here strictly needs it.
This package provides easy access to essential climate change datasets for non-climate experts. Users can download the latest raw data from authoritative sources and view it via pre-defined ggplot2 charts. Datasets include atmospheric CO2, methane, emissions, instrumental and proxy temperature records, sea levels, Arctic/Antarctic sea ice, hurricanes, and paleoclimate data. Sources include: NOAA Mauna Loa Laboratory <https://gml.noaa.gov/ccgg/trends/data.html>, Global Carbon Project <https://www.globalcarbonproject.org/carbonbudget/>, NASA GISTEMP <https://data.giss.nasa.gov/gistemp/>, National Snow and Ice Data Center <https://nsidc.org/home>, CSIRO <https://research.csiro.au/slrwavescoast/sea-level/measurements-and-data/sea-level-data/>, NOAA Laboratory for Satellite Altimetry <https://www.star.nesdis.noaa.gov/socd/lsa/SeaLevelRise/>, the HURDAT Atlantic Hurricane Database <https://www.aoml.noaa.gov/hrd/hurdat/Data_Storm.html>, and Vostok paleo carbon dioxide and temperature data <doi:10.3334/CDIAC/ATG.009>.
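A hedged sketch of working with one of these sources directly (the data-file URL and its column layout are assumptions based on NOAA's current site layout; the package's own accessors wrap this kind of step):

    library(ggplot2)

    url <- "https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_mm_mlo.txt"
    co2 <- read.table(url, comment.char = "#")[, c(3, 4)]  # decimal date, monthly mean
    names(co2) <- c("date", "ppm")
    ggplot(co2, aes(date, ppm)) + geom_line() + labs(title = "Mauna Loa CO2")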
Tool for easy prior construction and visualization. It helps formulate joint prior distributions for variance parameters in latent Gaussian models. The resulting prior is robust and can be created in an intuitive way. A graphical user interface (GUI) can be used to choose the joint prior, where the user clicks through the model and selects priors; an extensive guide is available in the GUI. The package allows for direct inference with the specified model and prior. Using a hierarchical variance decomposition, the joint variance prior takes the whole model structure into account, so existing knowledge can be incorporated intuitively at the level it applies to. Alternatively, one can use independent variance priors for each model component in the latent Gaussian model. Details can be found in the accompanying scientific paper: Hem, Fuglstad, Riebler (2024, Journal of Statistical Software, <doi:10.18637/jss.v110.i03>).
Org Ref is an Emacs library that provides rich support for citations, labels and cross-references in Org mode.
The basic idea of Org Ref is that it defines a convenient interface to insert citations from a reference database (e.g., from BibTeX files), and a set of functional Org links for citations, cross-references and labels that export properly to LaTeX, and that provide clickable functionality to the user. Org Ref interfaces with Helm BibTeX to facilitate citation entry, and it can also use RefTeX.
It also provides a fairly large number of utilities for finding bad citations, extracting BibTeX entries from citations in an Org file, and functions to create and modify BibTeX entries from a variety of sources, most notably from a DOI.
Org Ref is especially suitable for Org documents destined for LaTeX export and scientific publication. Org Ref is also useful for research documents and notes.
Simulates inventory policies with and without forecasting, and facilitates inventory analysis calculations such as stock levels, re-order points, and pricing and promotions calculations. The package includes calculations of inventory metrics, stock-out calculations, and ABC analysis, as well as revenue management techniques such as multi-product optimization and logit and polynomial model optimization. The functions are referenced from: 1- Harris, Ford W. (1913), "How many parts to make at once", Factory, The Magazine of Management. 2- Nahmias, S., Production and Operations Analysis, McGraw-Hill International Edition. 3- Silver, E.A., Pyke, D.F., Peterson, R., Inventory Management and Production Planning and Scheduling. 4- Ballou, R.H., Business Logistics Management. 5- MIT MicroMasters Program. 6- Columbia University course for supply and demand analysis. 7- Price Elasticity of Demand, MATH 104, Mark Mac Lean (with assistance from Patrick Chan), 2011W. For further details or correspondence: <www.linkedin.com/in/haythamomar>, <www.rescaleanalytics.com>.
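A worked sketch of the classic result the first reference derives, Harris's (1913) economic order quantity (generic R, not this package's function): EOQ = sqrt(2 * D * S / H), with annual demand D, fixed ordering cost S, and annual holding cost per unit H.

    eoq <- function(D, S, H) sqrt(2 * D * S / H)
    eoq(D = 12000, S = 50, H = 2.4)   # order about 707 units at a time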
Performs inference in the secondary analysis setting with linked data potentially containing mismatch errors, where only the linked data file may be accessible and information about the record linkage process may be limited or unavailable. Implements the General Framework for Regression with Mismatched Data developed by Slawski et al. (2023) <doi:10.48550/arXiv.2306.00909>. The framework uses a mixture model for pairs of linked records whose two components reflect the distributions conditional on match status, i.e., correct match or mismatch. Inference is based on composite likelihood and the Expectation-Maximization (EM) algorithm. The package currently supports Cox proportional hazards regression (right-censored data only) and generalized linear regression models (Gaussian, Gamma, Poisson, and logistic (binary models only)). Information about the underlying record linkage process can be incorporated into the method if available (e.g., an assumed overall mismatch rate, safe matches, predictors of match status, or predicted probabilities of correct matches).
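A generic sketch of the mixture/EM idea for Gaussian linear regression with mismatched links (a simplified illustration, not the package's implementation): component 1 is the regression for correct matches, component 2 a mismatch distribution that does not depend on the covariates.

    set.seed(8)
    n <- 500; x <- rnorm(n)
    true_match <- rbinom(n, 1, 0.8)                  # 20% mismatch rate
    y <- ifelse(true_match, 1 + 2 * x + rnorm(n), rnorm(n, 0, 3))

    pi_m <- 0.5; beta <- c(0, 0); s1 <- s2 <- sd(y); mu2 <- mean(y)
    for (it in 1:50) {
      # E-step: posterior probability each pair is a correct match
      d1 <- dnorm(y, beta[1] + beta[2] * x, s1)
      d2 <- dnorm(y, mu2, s2)
      w <- pi_m * d1 / (pi_m * d1 + (1 - pi_m) * d2)
      # M-step: weighted least squares for the match component,
      # moment updates for the mismatch component
      fit <- lm(y ~ x, weights = w)
      beta <- coef(fit)
      s1 <- sqrt(sum(w * resid(fit)^2) / sum(w))
      mu2 <- sum((1 - w) * y) / sum(1 - w)
      s2 <- sqrt(sum((1 - w) * (y - mu2)^2) / sum(1 - w))
      pi_m <- mean(w)
    }
    round(c(pi_m = pi_m, beta), 2)   # recovers roughly 0.8 and (1, 2)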