Spatial data are generally auto-correlated, meaning that two units close to each other are likely to share the same properties. For this reason, when sampling from a population, the sample often needs to be well spread over space. A new method to draw a sample from a population with spatial coordinates is proposed. This method is called wave (Weakly Associated Vectors) sampling. It uses the vector least correlated with a spatial weights matrix to update the vector of inclusion probabilities into a sample. For more details see Raphaël Jauslin and Yves Tillé (2019) <doi:10.1007/s13253-020-00407-1>.
An implementation of representation-dependent gene level operations for genetic algorithms with genes representing permutations: Initialization of genes, mutation, and crossover. The crossover operation provided is position-based crossover (Syswerda, G., Chap. 21 in Davis, L. (1991, ISBN:0-442-00173-8)). For mutation, several variants are included: Order-based mutation (Syswerda, G., Chap. 21 in Davis, L. (1991, ISBN:0-442-00173-8)), randomized Lin-Kernighan heuristics (Croes, G. A. (1958) <doi:10.1287/opre.6.6.791> and Lin, S. and Kernighan, B. W. (1973) <doi:10.1287/opre.21.2.498>), and randomized greedy operators. A random mix operator for mutation selects a mutation variant randomly.
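As a rough illustration of position-based crossover on permutation genes, the following Python sketch keeps the genes of the first parent at a chosen set of positions and fills the remaining positions with the missing genes in the order they occur in the second parent. The function names `position_based_crossover` and `random_positions` are hypothetical and are not part of the package's API. Selecting each position with probability 0.5 is an assumption made for the example.

```python
import random


def position_based_crossover(parent1, parent2, positions):
    """Sketch of position-based crossover for two permutations.

    Genes of parent1 at `positions` stay in place; every other slot is
    filled with the remaining genes in the order they appear in parent2.
    """
    kept = {parent1[i] for i in positions}
    # Genes not fixed by parent1, in parent2 order.
    fill = (gene for gene in parent2 if gene not in kept)
    return [
        parent1[i] if i in positions else next(fill)
        for i in range(len(parent1))
    ]


def random_positions(n, rng=random):
    """Pick each position independently with probability 0.5 (assumption)."""
    return {i for i in range(n) if rng.random() < 0.5}


p1 = [1, 2, 3, 4, 5, 6]
p2 = [6, 5, 4, 3, 2, 1]
print(position_based_crossover(p1, p2, {1, 3}))  # [6, 2, 5, 4, 3, 1]
print(position_based_crossover(p1, p2, random_positions(len(p1))))
```

The child is always a valid permutation, because each gene is taken exactly once, either from the fixed positions of the first parent or from the ordered remainder of the second.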
This package provides Python-based extensions to enhance data analytics workflows, particularly for tasks involving data preprocessing and predictive modeling. Includes tools for data sampling, transformation, feature selection, balancing strategies (e.g., SMOTE), and model construction. These capabilities leverage Python libraries via the reticulate interface, enabling seamless integration with a broader machine learning ecosystem. Supports instance selection and hybrid workflows that combine R and Python functionalities for flexible and reproducible analytical pipelines. The architecture is inspired by the Experiment Lines approach, which promotes modularity, extensibility, and interoperability across tools. More information on Experiment Lines is available in Ogasawara et al. (2009) <doi:10.1007/978-3-642-02279-1_20>.
Mitteroecker & Gunz (2009) <doi:10.1007/s11692-009-9055-x> describe how geometric morphometric methods allow researchers to quantify the size and shape of physical biological structures. We provide tools to extend geometric morphometric principles to the study of non-physical structures, namely hormone profiles, as outlined in Ehrlich et al. (2021) <doi:10.1002/ajpa.24514>. Easily transform daily measures into multivariate landmark-based data. Includes custom functions to apply multivariate methods for data exploration as well as hypothesis testing. Also includes a shiny web app to streamline data exploration. Developed to study menstrual cycle hormones, but the functions have been generalized and should be applicable to any biomarker over any time period.
For a given graph containing vertices, edges, and a signal associated with the vertices, the PathwaySpace package performs a convolution operation, which involves a weighted combination of neighboring vertices and their associated signals. The package then uses a decay function to project these signals, creating geodesic paths on a 2D-image space. PathwaySpace could have various applications, such as visualizing network data in a graphical format that highlights the relationships and signal strengths between vertices. It can be particularly useful for understanding the influence of signals through complex networks. By combining graph theory, signal processing, and visualization, the PathwaySpace package provides a novel way of representing graph data.
Simplifies regression modeling in R by integrating multiple modeling and summarization tools into a cohesive, user-friendly interface. Designed to be accessible for researchers, particularly those in Low- and Middle-Income Countries (LMIC). Built upon widely accepted statistical methods, including logistic regression (Hosmer et al. 2013, ISBN:9781118548429), log-binomial regression (Spiegelman and Hertzmark 2005 <doi:10.1093/aje/kwi188>), Poisson and robust Poisson regression (Zou 2004 <doi:10.1093/aje/kwh090>), negative binomial regression (Hilbe 2011, ISBN:9780521179515), and linear regression (Kutner et al. 2005, ISBN:9780071122214). Leverages multiple dependencies to ensure high-quality output and generate reproducible, publication-ready tables in alignment with best practices in epidemiology and applied statistics.
This package performs demographic, bifurcation and evolutionary analysis of physiologically structured population models, a class of models that consistently translates continuous-time models of individual life history to the population level. A model of individual life history has to be implemented by specifying the individual-level functions that determine the life history, such as development and mortality rates and fecundity. M.A. Kirkilionis, O. Diekmann, B. Lisser, M. Nool, B. Sommeijer & A.M. de Roos (2001) <doi:10.1142/S0218202501001264>. O. Diekmann, M. Gyllenberg & J.A.J. Metz (2003) <doi:10.1016/S0040-5809(02)00058-8>. A.M. de Roos (2008) <doi:10.1111/j.1461-0248.2007.01121.x>.
Applies the harvest classification tree algorithm, a modification of the classic classification tree algorithm. The harvested tree has the advantage of deleting redundant rules, leading to a simpler and more efficient tree model. It was first used in the field of drug discovery, but it also performs well on other kinds of data, especially when the region of a class is disconnected. This package also extends the basic harvest classification tree algorithm to handle both continuous and categorical variables. To learn more about the harvest classification tree algorithm, see http://www.stat.ubc.ca/Research/TechReports/techreports/220.pdf.
High throughput toxicokinetics ("HTTK") is the combination of 1) chemical-specific in vitro measurements or in silico predictions and 2) generic mathematical models, to predict absorption, distribution, metabolism, and excretion by the body. HTTK methods have been described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>) and Breen et al. (2021) (<doi:10.1080/17425255.2021.1935867>). Here we provide examples (vignettes) applying HTTK to solve various problems in bioinformatics, toxicology, and exposure science. In accordance with Davidson-Fritz et al. (2025) (<doi:10.1371/journal.pone.0321321>), whenever a new HTTK model is developed, the code to generate the figures evaluating that model is added as a new vignette.
Use stem analysis data to reconstruct tree growth and carbon accumulation. Users can perform a number of standard tasks, independently or in combination, for any tree species. (i) Age class determination. (ii) The cumulative growth, mean annual increment, and current annual increment of diameter at breast height (DBH) with bark, tree height, and stem volume with bark are estimated. (iii) Tree biomass and carbon storage are estimated from volume and allometric models. (iv) The height-diameter relationship is fitted with nonlinear models if diameter at breast height (DBH) or tree height is available, and the fitted model can be used to retrieve tree height and diameter at breast height (DBH). See <https://github.com/forestscientist/StemAnalysis>.
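The increments named in task (ii) follow standard forestry definitions: mean annual increment (MAI) is cumulative growth divided by age, and current annual increment (CAI) is the growth between two successive measurements divided by the years between them. The Python sketch below illustrates only these definitions. The function name `growth_increments` and the sample values are hypothetical, and this is not the package's implementation.

```python
def growth_increments(ages, cumulative):
    """Return (MAI, CAI) for cumulative growth measured at increasing ages.

    MAI[i] = cumulative[i] / ages[i]
    CAI[i] = (cumulative[i] - cumulative[i-1]) / (ages[i] - ages[i-1])
    CAI is None for the first measurement, which has no predecessor.
    """
    mai = [value / age for age, value in zip(ages, cumulative)]
    pairs = list(zip(ages, cumulative))
    cai = [None] + [
        (v1 - v0) / (a1 - a0)
        for (a0, v0), (a1, v1) in zip(pairs, pairs[1:])
    ]
    return mai, cai


# Hypothetical DBH values (cm) reconstructed at ages 5, 10 and 15 years.
mai, cai = growth_increments([5, 10, 15], [4.0, 9.0, 12.0])
print(mai)
print(cai)
```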
Multivariate data analysis is the simultaneous observation of more than one characteristic. In contrast to the analysis of univariate data, in this approach not only a single variable or the relation between two variables can be investigated, but the relations between many attributes can be considered. For the statistical analysis of chemical data one has to take into account the special structure of this type of data. This package contains about 30 functions, mostly for regression, classification and model evaluation and includes some data sets used in the R help examples. It was designed as an R companion to the book "Introduction to Multivariate Statistical Analysis in Chemometrics" written by K. Varmuza and P. Filzmoser (2009).
This package provides functions for identification and transportation of causal effects. Provides a conditional causal effect identification algorithm (IDC) by Shpitser, I. and Pearl, J. (2006) <http://ftp.cs.ucla.edu/pub/stat_ser/r329-uai.pdf>, an algorithm for transportability from multiple domains with limited experiments by Bareinboim, E. and Pearl, J. (2014) <http://ftp.cs.ucla.edu/pub/stat_ser/r443.pdf>, and a selection bias recovery algorithm by Bareinboim, E. and Tian, J. (2015) <http://ftp.cs.ucla.edu/pub/stat_ser/r445.pdf>. All of the previously mentioned algorithms are based on a causal effect identification algorithm by Tian, J. (2002) <http://ftp.cs.ucla.edu/pub/stat_ser/r309.pdf>.
adverSCarial is an R package designed for generating adversarial attacks on scRNA-seq classifiers and analyzing their vulnerability to such attacks. The package is versatile and provides a format for integrating any type of classifier. It offers functions for studying and generating two types of attacks, the single-gene attack and the max-change attack. The single-gene attack involves making a small modification to the input to alter the classification. The max-change attack involves making a large modification to the input without changing its classification. The CGD attack is based on an estimated gradient descent. The package provides a comprehensive solution for evaluating the robustness of scRNA-seq classifiers against adversarial attacks.
This package provides functions to perform comparative causal mediation analysis to compare the mediation effects of different treatments via a common mediator. Results contain the estimates and confidence intervals for the two comparative causal mediation analysis estimands, as well as the ATE and ACME for each treatment. Functions provided in the package will automatically assess the comparative causal mediation analysis scope conditions (i.e. for each comparative causal mediation estimand, a numerator and denominator that are both estimated with the desired statistical significance and of the same sign). Results will be returned for each comparative causal mediation estimand only if scope conditions are met for it. See details in Bansak (2020) <doi:10.1017/pan.2019.31>.
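The scope conditions described above can be pictured as a simple check on the numerator and denominator of an estimand. The Python sketch below is only an illustration under stated assumptions: the function name `scope_condition_met` is hypothetical, and statistical significance is taken to mean that a confidence interval at the desired level excludes zero, which may differ from how the package assesses it.

```python
def scope_condition_met(num_est, num_ci, den_est, den_ci):
    """Check one estimand's scope conditions (illustrative assumption).

    num_est, den_est: point estimates of numerator and denominator.
    num_ci, den_ci:   (lower, upper) confidence intervals at the desired level.
    """
    def excludes_zero(ci):
        lower, upper = ci
        return lower > 0 or upper < 0

    same_sign = num_est * den_est > 0
    return excludes_zero(num_ci) and excludes_zero(den_ci) and same_sign


# Hypothetical estimates: both significant and positive, so the check passes.
print(scope_condition_met(0.30, (0.10, 0.50), 0.20, (0.05, 0.35)))
# Denominator interval covers zero, so the check fails.
print(scope_condition_met(0.30, (0.10, 0.50), 0.05, (-0.10, 0.20)))
```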
This package provides tools for crop breeding analysis including Genetic Coefficient of Variation (GCV), Phenotypic Coefficient of Variation (PCV), heritability, genetic advance calculations, stability analysis using the Eberhart-Russell model, two-way ANOVA for genotype-environment interactions, and Additive Main Effects and Multiplicative Interaction (AMMI) analysis. These tools are developed for crop breeding research and stability evaluation under various environmental conditions. The methods are based on established statistical and biometrical principles. Refer to Eberhart and Russell (1966) <doi:10.2135/cropsci1966.0011183X000600010011x> for stability parameters, Fisher (1935) "The Design of Experiments" <ISBN:9780198522294>, Falconer (1996) "Introduction to Quantitative Genetics" <ISBN:9780582243026>, and Singh and Chaudhary (1985) "Biometrical Methods in Quantitative Genetic Analysis" <ISBN:9788122433764> for foundational methodologies.
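For orientation, the variability parameters listed above are commonly computed from ANOVA mean squares as in the Python sketch below. It assumes a single-environment randomized block layout in which genotypic variance is (MSG - MSE) / r and phenotypic variance is genotypic variance plus MSE, and it uses the selection differential k = 2.06 that corresponds to 5% selection intensity. The function name `variability_parameters` and the input values are hypothetical, and the package's own functions may use different conventions.

```python
import math


def variability_parameters(ms_genotype, ms_error, replications, grand_mean, k=2.06):
    """Textbook GCV, PCV, broad-sense heritability and genetic advance.

    Assumes var_g = (MSG - MSE) / r and var_p = var_g + MSE.
    A negative variance estimate is truncated to zero.
    """
    var_g = max((ms_genotype - ms_error) / replications, 0.0)
    var_p = var_g + ms_error
    heritability = var_g / var_p
    return {
        "GCV": math.sqrt(var_g) / grand_mean * 100,      # percent
        "PCV": math.sqrt(var_p) / grand_mean * 100,      # percent
        "heritability": heritability,                    # proportion
        "genetic_advance": k * math.sqrt(var_p) * heritability,
    }


# Hypothetical mean squares from an ANOVA with 4 replications.
result = variability_parameters(ms_genotype=50.0, ms_error=10.0,
                                replications=4, grand_mean=20.0)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```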
This package provides functions to calculate weights, estimates of changes and corresponding variance estimates for panel data with non-response. Partially overlapping samples are handled. Initially, weights are calculated by linear calibration. By default, the survey package is used for this purpose. It is also possible to use ReGenesees, which can be installed from <https://github.com/DiegoZardetto/ReGenesees>. Variances of linear combinations (changes and averages) and ratios are calculated from a covariance matrix based on residuals according to the calibration model. The methodology was presented at the conference, The Use of R in Official Statistics, and is described in Langsrud (2016) <http://www.revistadestatistica.ro/wp-content/uploads/2016/06/RRS2_2016_A021.pdf>.
Create a skeleton shiny application with create_template() that is reproducible, can be saved and meets academic standards for attribution. Forked from 'wallace'. Code is split into modules that are loaded and linked together automatically, and each module calls one function. Guidance pages explain modules to users and flexible logging informs them of any errors. Options enable asynchronous operations, viewing of source code, interactive maps and data tables. Use to create complex analytical applications, following best practices in open science and software development. Includes functions for automating repetitive development tasks and an example application at run_shinyscholar() that requires install.packages("shinyscholar", dependencies = TRUE). A guide to developing applications can be found on the package website.
This package provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.
This package implements a likelihood method to present evidence for evaluating bioequivalence (BE). The functions use bioequivalence data [area under the blood concentration-time curve (AUC) and peak concentration (Cmax)] from various crossover designs commonly used in BE studies, including a fully replicated design, a partially replicated design, and a conventional 2x2 crossover design. They calculate the profile likelihoods for the mean difference, total standard deviation ratio, and within-subject standard deviation ratio for a test and a reference drug. A plot of a standardized profile likelihood can be generated along with the maximum likelihood estimate and likelihood intervals, which present evidence for bioequivalence. See Liping Du and Leena Choi (2015) <doi:10.1002/pst.1661>.
Empirical Bayes thresholding using the methods developed by I. M. Johnstone and B. W. Silverman. The basic problem is to estimate a mean vector given a vector of observations of the mean vector plus white noise, taking advantage of possible sparsity in the mean vector. Within a Bayesian formulation, the elements of the mean vector are modelled as having, independently, a distribution that is a mixture of an atom of probability at zero and a suitable heavy-tailed distribution. The mixing parameter can be estimated by a marginal maximum likelihood approach. This leads to an adaptive thresholding approach on the original data. Extensions of the basic method, in particular to wavelet thresholding, are also implemented within the package.
Subgroup analyses are routinely performed in clinical trial analyses. From a methodological perspective, two key issues of subgroup analyses are multiplicity (even if only predefined subgroups are investigated) and the low sample sizes of subgroups which lead to highly variable estimates, see e.g. Yusuf et al (1991) <doi:10.1001/jama.1991.03470010097038>. This package implements subgroup estimates based on Bayesian shrinkage priors, see Carvalho et al (2009) <https://proceedings.mlr.press/v5/carvalho09a.html>. In addition, estimates based on penalized likelihood inference are available, based on Simon et al (2011) <doi:10.18637/jss.v039.i05>. The corresponding shrinkage based forest plots address the aforementioned issues and can complement standard forest plots in practical clinical trial analyses.
Google offers public access to global search volumes from its search engine through the Google Trends portal. The package downloads these search volumes provided by Google Trends and uses them to measure and analyze the distribution of search scores across countries or within countries. The package allows researchers and analysts to use these search scores to investigate global trends based on patterns within these scores. This offers insights such as degree of internationalization of firms and organizations or dissemination of political, social, or technological trends across the globe or within single countries. An outline of the package's methodological foundations and potential applications is available as a working paper: <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3969013>.
This R package provides a single procedure guix.install(), which allows users to install R packages via Guix right from within their running R session. If the requested R package does not exist in Guix at this time, the package and all its missing dependencies will be imported recursively and the generated package definitions will be written to ~/.Rguix/packages.scm. This record of imported packages can be used later to reproduce the environment, and to add the packages in question to a proper Guix channel (or Guix itself). guix.install() not only supports installing packages from CRAN, but also from Bioconductor or even arbitrary git or mercurial repositories, replacing the need for installation via devtools.
EventPointer is an R package to identify alternative splicing events that involve either simple (case-control experiment) or complex experimental designs such as time course experiments and studies including paired samples. The algorithm can be used to analyze data from either junction arrays (Affymetrix Arrays) or sequencing data (RNA-Seq). The software returns a data.frame with the detected alternative splicing events: gene name, type of event (cassette, alternative 3', etc.), genomic position, statistical significance and increment of the percent spliced in (Delta PSI) for all the events. The algorithm can generate a series of files to visualize the detected alternative splicing events in IGV. This eases the interpretation of results and the design of primers for standard PCR validation.