Tightens an observational block design into a smaller design with either smaller or fewer blocks while controlling for covariates. The method uses fine balance, optimal subset matching (Rosenbaum, 2012 <doi:10.1198/jcgs.2011.09219>) and two-criteria matching (Zhang et al 2023 <doi:10.1080/01621459.2021.1981337>). The main function is tighten()
. The suggested rrelaxiv package for solving minimum cost flow problems: (i) derives from Bertsekas and Tseng (1988) <doi:10.1007/BF02288322>, (ii) is not available on CRAN due to its academic license, (iii) may be downloaded from GitHub
at <https://github.com/josherrickson/rrelaxiv/>, (iv) is not essential to use the package.
This package provides a method for comparing the results of two binary diagnostic tests using paired data. Users can rapidly perform descriptive and inferential statistics in a single function call. Options permit users to select which parameters they are interested in comparing and methods for correction for multiple comparisons. Confidence intervals are calculated using the methods with the best coverage. Hypothesis tests use the methods with the best asymptotic performance. A summary of the methods is available in Roldán-Nofuentes (2020) <doi:10.1186/s12874-020-00988-y>. This package is targeted at clinical researchers who want to rapidly and effectively compare results from binary diagnostic tests.
Temporally autocorrelated populations are correlated in their vital rates (growth, death, etc.) from year to year. It is very common for populations, whether they be bacteria, plants, or humans, to be temporally autocorrelated. This poses a challenge for stochastic population modeling, because a temporally correlated population will behave differently from an uncorrelated one. This package provides tools for simulating populations with white noise (no temporal autocorrelation), red noise (positive temporal autocorrelation), and blue noise (negative temporal autocorrelation). The algebraic formulation for autocorrelated noise comes from Ruokolainen et al. (2009) <doi:10.1016/j.tree.2009.04.009>. Models for unstructured populations and for structured populations (matrix models) are available.
Spatial data are generally auto-correlated, meaning that if two units selected are close to each other, then it is likely that they share the same properties. For this reason, when sampling in the population it is often needed that the sample is well spread over space. A new method to draw a sample from a population with spatial coordinates is proposed. This method is called wave (Weakly Associated Vectors) sampling. It uses the less correlated vector to a spatial weights matrix to update the inclusion probabilities vector into a sample. For more details see Raphaël Jauslin and Yves Tillé (2019) <doi:10.1007/s13253-020-00407-1>.
An implementation of representation-dependent gene level operations for genetic algorithms with genes representing permutations: Initialization of genes, mutation, and crossover. The crossover operation provided is position-based crossover (Syswerda, G., Chap. 21 in Davis, L. (1991, ISBN:0-442-00173-8). For mutation, several variants are included: Order-based mutation (Syswerda, G., Chap. 21 in Davis, L. (1991, ISBN:0-442-00173-8), randomized Lin-Kernighan heuristics (Croes, G. A. (1958) <doi:10.1287/opre.6.6.791> and Lin, S. and Kernighan. B. W. (1973) <doi:10.1287/opre.21.2.498>), and randomized greedy operators. A random mix operator for mutation selects a mutation variant randomly.
Annotated HPLC-ESI-MS lipid data in positive ionization mode from an experiment in which cultures of the marine diatom Phaeodactylum tricornutum were treated with various concentrations of hydrogen peroxide (H2O2) to induce oxidative stress. The experiment is described in Graff van Creveld, et al., 2015, "Early perturbation in mitochondria redox homeostasis in response to environmental stress predicts cell fate in diatoms," ISME Journal 9:385-395. PtH2O2lipids
consists of two objects: A CAMERA xsAnnotate
object (ptH2O2lipids$xsAnnotate
) and LOBSTAHS LOBSet object (ptH2O2lipids$xsAnnotate$LOBSet
). The LOBSet includes putative compound assignments from the default LOBSTAHS database. Isomer annotation is recorded in three other LOBSet slots.
There is a long tradition of studying the flux of carbon from the biosphere to the atmosphere by following a particular cohort of litter (wood, leaves, roots, or other organic material) through time. The resulting data are mass remaining and time. A variety of functional forms may be used to fit the resulting data. Some work better empirically. Some are better connected to a process-based understanding. Some have a small number of free parameters; others have more. This package matches decomposition data to a family of these curves using likelihood--based fitting. This package is based on published research by Cornwell & Weedon (2013) <doi:10.1111/2041-210X.12138>.
iClusterPlus is developed for integrative clustering analysis of multi-type genomic data and is an enhanced version of iCluster proposed and developed by Shen, Olshen and Ladanyi (2009). Multi-type genomic data arise from the experiments where biological samples (e.g. tumor samples) are analyzed by multiple techniques, for instance, array comparative genomic hybridization (aCGH), gene expression microarray, RNA-seq and DNA-seq, and so on. In the iClusterPlus model, binary observations such as somatic mutation are modeled as Binomial processes; categorical observations such as copy number states are realizations of Multinomial random variables; counts are modeled as Poisson random processes; and continuous measures are modeled by Gaussian distributions.
Mitteroecker & Gunz (2009) <doi:10.1007/s11692-009-9055-x> describe how geometric morphometric methods allow researchers to quantify the size and shape of physical biological structures. We provide tools to extend geometric morphometric principles to the study of non-physical structures, hormone profiles, as outlined in Ehrlich et al (2021) <doi:10.1002/ajpa.24514>. Easily transform daily measures into multivariate landmark-based data. Includes custom functions to apply multivariate methods for data exploration as well as hypothesis testing. Also includes shiny web app to streamline data exploration. Developed to study menstrual cycle hormones but functions have been generalized and should be applicable to any biomarker over any time period.
This package performs demographic, bifurcation and evolutionary analysis of physiologically structured population models, which is a class of models that consistently translates continuous-time models of individual life history to the population level. A model of individual life history has to be implemented specifying the individual-level functions that determine the life history, such as development and mortality rates and fecundity. M.A. Kirkilionis, O. Diekmann, B. Lisser, M. Nool, B. Sommeijer & A.M. de Roos (2001) <doi:10.1142/S0218202501001264>. O.Diekmann, M.Gyllenberg & J.A.J.Metz (2003) <doi:10.1016/S0040-5809(02)00058-8>. A.M. de Roos (2008) <doi:10.1111/j.1461-0248.2007.01121.x>.
Aimed at applying the Harvest classification tree algorithm, modified algorithm of classic classification tree.The harvested tree has advantage of deleting redundant rules in trees, leading to a simplify and more efficient tree model.It was firstly used in drug discovery field, but it also performs well in other kinds of data, especially when the region of a class is disconnected. This package also improves the basic harvest classification tree algorithm by extending the field of data of algorithm to both continuous and categorical variables. To learn more about the harvest classification tree algorithm, you can go to http://www.stat.ubc.ca/Research/TechReports/techreports/220.pdf
for more information.
Use stem analysis data to reconstructing tree growth and carbon accumulation. Users can independently or in combination perform a number of standard tasks for any tree species. (i) Age class determination. (ii) The cumulative growth, mean annual increment, and current annual increment of diameter at breast height (DBH) with bark, tree height, and stem volume with bark are estimated. (iii) Tree biomass and carbon storage estimation from volume and allometric models are calculated. (iv) Height-diameter relationship is fitted with nonlinear models, if diameter at breast height (DBH) or tree height are available, which can be used to retrieve tree height and diameter at breast height (DBH). <https://github.com/forestscientist/StemAnalysis>
.
This package provides functions for identification and transportation of causal effects. Provides a conditional causal effect identification algorithm (IDC) by Shpitser, I. and Pearl, J. (2006) <http://ftp.cs.ucla.edu/pub/stat_ser/r329-uai.pdf>, an algorithm for transportability from multiple domains with limited experiments by Bareinboim, E. and Pearl, J. (2014) <http://ftp.cs.ucla.edu/pub/stat_ser/r443.pdf>, and a selection bias recovery algorithm by Bareinboim, E. and Tian, J. (2015) <http://ftp.cs.ucla.edu/pub/stat_ser/r445.pdf>. All of the previously mentioned algorithms are based on a causal effect identification algorithm by Tian , J. (2002) <http://ftp.cs.ucla.edu/pub/stat_ser/r309.pdf>.
For a given graph containing vertices, edges, and a signal associated with the vertices, the PathwaySpace
package performs a convolution operation, which involves a weighted combination of neighboring vertices and their associated signals. The package then uses a decay function to project these signals, creating geodesic paths on a 2D-image space. PathwaySpace
could have various applications, such as visualizing and analyzing network data in a graphical format that highlights the relationships and signal strengths between vertices. It can be particularly useful for understanding the influence of signals through complex networks. By combining graph theory, signal processing, and visualization, the PathwaySpace
package provides a novel way of representing and analyzing graph data.
Multivariate data analysis is the simultaneous observation of more than one characteristic. In contrast to the analysis of univariate data, in this approach not only a single variable or the relation between two variables can be investigated, but the relations between many attributes can be considered. For the statistical analysis of chemical data one has to take into account the special structure of this type of data. This package contains about 30 functions, mostly for regression, classification and model evaluation and includes some data sets used in the R help examples. It was designed as a R companion to the book "Introduction to Multivariate Statistical Analysis in Chemometrics" written by K. Varmuza and P. Filzmoser (2009).
This package provides functions to perform comparative causal mediation analysis to compare the mediation effects of different treatments via a common mediator. Results contain the estimates and confidence intervals for the two comparative causal mediation analysis estimands, as well as the ATE and ACME for each treatment. Functions provided in the package will automatically assess the comparative causal mediation analysis scope conditions (i.e. for each comparative causal mediation estimand, a numerator and denominator that are both estimated with the desired statistical significance and of the same sign). Results will be returned for each comparative causal mediation estimand only if scope conditions are met for it. See details in Bansak(2020)<doi:10.1017/pan.2019.31>.
This package provides functions to calculate weights, estimates of changes and corresponding variance estimates for panel data with non-response. Partially overlapping samples are handled. Initially, weights are calculated by linear calibration. By default, the survey package is used for this purpose. It is also possible to use ReGenesees
, which can be installed from <https://github.com/DiegoZardetto/ReGenesees>
. Variances of linear combinations (changes and averages) and ratios are calculated from a covariance matrix based on residuals according to the calibration model. The methodology was presented at the conference, The Use of R in Official Statistics, and is described in Langsrud (2016) <http://www.revistadestatistica.ro/wp-content/uploads/2016/06/RRS2_2016_A021.pdf>.
This package provides tools for crop breeding analysis including Genetic Coefficient of Variation (GCV), Phenotypic Coefficient of Variation (PCV), heritability, genetic advance calculations, stability analysis using the Eberhart-Russell model, two-way ANOVA for genotype-environment interactions, and Additive Main Effects and Multiplicative Interaction (AMMI) analysis. These tools are developed for crop breeding research and stability evaluation under various environmental conditions. The methods are based on established statistical and biometrical principles. Refer to Eberhart and Russell (1966) <doi:10.2135/cropsci1966.0011183X000600010011x> for stability parameters, Fisher (1935) "The Design of Experiments" <ISBN:9780198522294>, Falconer (1996) "Introduction to Quantitative Genetics" <ISBN:9780582243026>, and Singh and Chaudhary (1985) "Biometrical Methods in Quantitative Genetic Analysis" <ISBN:9788122433764> for foundational methodologies.
Create a skeleton shiny application with create_template()
that is reproducible, can be saved and meets academic standards for attribution. Forked from wallace'. Code is split into modules that are loaded and linked together automatically and each call one function. Guidance pages explain modules to users and flexible logging informs them of any errors. Options enable asynchronous operations, viewing of source code, interactive maps and data tables. Use to create complex analytical applications, following best practices in open science and software development. Includes functions for automating repetitive development tasks and an example application at run_shinyscholar()
that requires install.packages("shinyscholar", dependencies = TRUE). A guide to developing applications can be found on the package website.
This package provides a likelihood method is implemented to present evidence for evaluating bioequivalence (BE). The functions use bioequivalence data [area under the blood concentration-time curve (AUC) and peak concentration (Cmax)] from various crossover designs commonly used in BE studies including a fully replicated, a partially replicated design, and a conventional 2x2 crossover design. They will calculate the profile likelihoods for the mean difference, total standard deviation ratio, and within subject standard deviation ratio for a test and a reference drug. A plot of a standardized profile likelihood can be generated along with the maximum likelihood estimate and likelihood intervals, which present evidence for bioequivalence. See Liping Du and Leena Choi (2015) <doi:10.1002/pst.1661>.
Empirical Bayes thresholding using the methods developed by I. M. Johnstone and B. W. Silverman. The basic problem is to estimate a mean vector given a vector of observations of the mean vector plus white noise, taking advantage of possible sparsity in the mean vector. Within a Bayesian formulation, the elements of the mean vector are modelled as having, independently, a distribution that is a mixture of an atom of probability at zero and a suitable heavy-tailed distribution. The mixing parameter can be estimated by a marginal maximum likelihood approach. This leads to an adaptive thresholding approach on the original data. Extensions of the basic method, in particular to wavelet thresholding, are also implemented within the package.
This package provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNAs
) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently support five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA
design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA
designs are enabled.
Subgroup analyses are routinely performed in clinical trial analyses. From a methodological perspective, two key issues of subgroup analyses are multiplicity (even if only predefined subgroups are investigated) and the low sample sizes of subgroups which lead to highly variable estimates, see e.g. Yusuf et al (1991) <doi:10.1001/jama.1991.03470010097038>. This package implements subgroup estimates based on Bayesian shrinkage priors, see Carvalho et al (2019) <https://proceedings.mlr.press/v5/carvalho09a.html>. In addition, estimates based on penalized likelihood inference are available, based on Simon et al (2011) <doi:10.18637/jss.v039.i05>. The corresponding shrinkage based forest plots address the aforementioned issues and can complement standard forest plots in practical clinical trial analyses.
Google offers public access to global search volumes from its search engine through the Google Trends portal. The package downloads these search volumes provided by Google Trends and uses them to measure and analyze the distribution of search scores across countries or within countries. The package allows researchers and analysts to use these search scores to investigate global trends based on patterns within these scores. This offers insights such as degree of internationalization of firms and organizations or dissemination of political, social, or technological trends across the globe or within single countries. An outline of the package's methodological foundations and potential applications is available as a working paper: <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3969013>.