Selects matched samples of the original treated and control groups with similar covariate distributions. It can be used to match exactly on covariates, to match on propensity scores, or to perform a variety of other matching procedures. The package also implements a series of recommendations offered in Ho, Imai, King, and Stuart (2007) <DOI:10.1093/pan/mpl013>. (The gurobi package, which is not on CRAN, is optional and comes with an installation of the Gurobi Optimizer, available at <https://www.gurobi.com>.)
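As a rough illustration of matching on a propensity score, the sketch below pairs each treated unit with its nearest unused control unit. It is a minimal example, not the package's implementation: the function name greedy_match, the caliper argument, and the assumption that scores have already been estimated are all hypothetical choices made for this example.

```python
def greedy_match(treated, control, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping a unit id to a pre-computed score
    (for example an estimated propensity score). Hypothetical helper,
    written only to illustrate the idea.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    # Process treated units from the highest score down (one common ordering).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        # Closest remaining control on the score scale.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable match; this treated unit is dropped
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


treated = {"t1": 0.8, "t2": 0.3}
control = {"c1": 0.75, "c2": 0.35, "c3": 0.1}
pairs = greedy_match(treated, control)
```

Tightening the caliper trades sample size for closer matches: with a very small caliper, treated units without a close control are discarded rather than matched poorly.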
Self-organizing maps with built-in missing data imputation. Missing values are imputed and regularly updated during the online Kohonen algorithm. The method can be used for data visualisation, clustering or imputation of missing data. It is an extension of the online algorithm of the kohonen package. The method is described in the article "Self-Organizing Maps for Exploration of Partially Observed Data and Imputation of Missing Values" by S. Rejeb, C. Duveau, T. Rebafka (2022) <arXiv:2202.07963>.
When a network is partially observed (here, NAs in the adjacency matrix rather than 1 or 0 due to missing information between node pairs), it is possible to account for the underlying process that generates those NAs. 'missSBM', presented in Barbillon, Chiquet and Tabouy (2022) <doi:10.18637/jss.v101.i12>, fits the popular stochastic block model to network data sampled under various missing data conditions, as described in Tabouy, Barbillon and Chiquet (2019) <doi:10.1080/01621459.2018.1562934>.
Multivariate analysis, with functions that perform simple correspondence analysis (CA) and multiple correspondence analysis (MCA), principal component analysis (PCA), canonical correlation analysis (CCA), factor analysis (FA), multidimensional scaling (MDS), linear (LDA) and quadratic (QDA) discriminant analysis, hierarchical and non-hierarchical cluster analysis, simple and multiple linear regression, multiple factor analysis (MFA) for quantitative, qualitative, frequency (MFACT) and mixed data, biplot, scatter plot, projection pursuit (PP), grand tour and other useful functions for multivariate analysis.
Allows performing forward prediction for the General Unified Threshold model of Survival using compiled ODE code. This package was created to avoid a dependency on the morse package, which requires the installation of 'JAGS'. This package is based on functions from the morse package v3.3.1: Virgile Baudrot, Sandrine Charles, Marie Laure Delignette-Muller, Wandrille Duchemin, Benoit Goussen, Nils Kehrein, Guillaume Kon-Kam-King, Christelle Lopes, Philippe Ruiz, Alexander Singer and Philippe Veber (2021) <https://CRAN.R-project.org/package=morse>.
Utilizes the lme4 and optimx packages (previously the optim() function from 'stats') to estimate (generalized) linear mixed models (GLMM) with factor structures using a profile likelihood approach, as outlined in Jeon and Rabe-Hesketh (2012) <doi:10.3102/1076998611417628> and Rockwood and Jeon (2019) <doi:10.1080/00273171.2018.1516541>. Factor analysis and item response models can be extended to allow for an arbitrary number of nested and crossed random effects, making it useful for multilevel and cross-classified models.
Assessment of statistically based PPQ sampling plans, including calculating the passing probability, optimizing the baseline and high-performance cutoff points, and visualizing the PPQ plan and power dynamically. The analytical idea is based on the simulation methods from the textbook Burdick, R. K., LeBlond, D. J., Pfahler, L. B., Quiroz, J., Sidor, L., Vukovinsky, K., & Zhang, L. (2017). Statistical Methods for CMC Applications. In Statistical Applications for Chemistry, Manufacturing and Controls (CMC) in the Pharmaceutical Industry (pp. 227-250). Springer, Cham.
Provides three functions, qol, miss_qol and miss_patient, each of which takes as input a data set containing the answers to a QOL questionnaire. They compute three types of domain-based scale scores: Global, Functional, and Symptoms. When data are missing, the miss_qol and miss_patient functions make the required changes and then calculate the domain-wise scale scores. The output is the original data set with the question columns replaced by the domain-based scale scores.
Conduct various tests for evaluating implicit biases in word embeddings: Word Embedding Association Test (Caliskan et al., 2017) <doi:10.1126/science.aal4230>, Relative Norm Distance (Garg et al., 2018) <doi:10.1073/pnas.1720347115>, Mean Average Cosine Similarity (Manzini et al., 2019) <arXiv:1904.04047>, SemAxis (An et al., 2018) <arXiv:1806.05521>, Relative Negative Sentiment Bias (Sweeney & Najafian, 2019) <doi:10.18653/v1/P19-1162>, and Embedding Coherence Test (Dev & Phillips, 2019) <arXiv:1901.07656>.
United is a software tool that can be downloaded from <http://www.schroepl.net/pbm/software/united/>. It is a virtual manager game for football teams. This package contains helper functions for determining an optimal formation for a virtual match in United. For example, if the opponent is known to have a strong defense, it is advisable to beat them in the midfield. The package also contains functions for computing the optimal usage of hardness in a game.
Provides several avenues to predict and account for user-based mortality and tag loss during mark-recapture studies. When planning a study on a target species, the retentionmort_generation() function can be used to produce multiple synthetic mark-recapture datasets, which help anticipate the error associated with a planned field study and guide method development to reduce that error. Similarly, if field data have already been collected, the retentionmort() function can be used to predict the error from the existing data and adjust for user-based mortality and tag loss. The test_dataset_retentionmort() function provides an example dataset showing how data should be formatted for the function to run properly. Lastly, the retentionmort_figure() function can be used on any dataset generated by either model function to produce an rmarkdown printout of preliminary analysis associated with the model, including summary statistics and figures. Methods and results pertaining to the development of this package can be found in McCutcheon et al. (in review, "Predicting tagging-related mortality and tag loss during mark-recapture studies").
This package reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
This package provides efficient low-level and highly reusable S4 classes for storing ranges of integers, RLE vectors (Run-Length Encoding), and, more generally, data that can be organized sequentially (formally defined as Vector objects), as well as views on these Vector objects. Efficient list-like classes are also provided for storing big collections of instances of the basic classes. All classes in the package use consistent naming and share the same rich and consistent "Vector API" as much as possible.
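To make the run-length encoding (RLE) idea concrete, here is a small sketch in Python. It only illustrates the encoding itself and is not the package's S4 implementation; the function names rle_encode and rle_decode are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into parallel lists of run values and run lengths."""
    runs = [(value, sum(1 for _ in group)) for value, group in groupby(values)]
    return [v for v, _ in runs], [n for _, n in runs]


def rle_decode(run_values, run_lengths):
    """Expand run values and run lengths back into the full sequence."""
    out = []
    for value, length in zip(run_values, run_lengths):
        out.extend([value] * length)
    return out


run_values, run_lengths = rle_encode([1, 1, 1, 0, 0, 2])
restored = rle_decode(run_values, run_lengths)
```

The representation pays off when a sequence has long constant stretches, since storage grows with the number of runs rather than the number of elements.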
This package provides the data for the gene expression enrichment analysis conducted in the package ABAEnrichment. The package includes three datasets which are derived from the Allen Brain Atlas:
Gene expression data from Human Brain (adults) averaged across donors,
Gene expression data from the Developing Human Brain pooled into five age categories and averaged across donors, and
a developmental effect score based on the Developing Human Brain expression data.
All datasets are restricted to protein coding genes.
The robin-map library is a C++ implementation of a fast hash map and hash set using open-addressing and linear robin hood hashing with backward shift deletion to resolve collisions.
Four classes are provided: tsl::robin_map, tsl::robin_set, tsl::robin_pg_map and tsl::robin_pg_set. The first two are faster and use a power-of-two growth policy; the last two use a prime growth policy instead and cope better with a poor hash function.
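The terms used above (open addressing, linear robin hood probing, backward shift deletion, power-of-two growth) can be sketched with a toy map. This is an illustration written in Python under simplifying assumptions, not the library's C++ code: the class name RobinHoodMap, its method names, and the 0.5 load-factor limit are choices made for this example only.

```python
class RobinHoodMap:
    """Toy open-addressing hash map using linear robin hood probing."""

    def __init__(self, capacity=8):
        # Capacity stays a power of two so `hash & mask` can replace modulo.
        self._slots = [None] * capacity  # each slot: None or [key, value, dist]
        self._mask = capacity - 1
        self._size = 0

    def __len__(self):
        return self._size

    def _find(self, key):
        i, dist = hash(key) & self._mask, 0
        while True:
            slot = self._slots[i]
            # An empty slot, or a resident closer to its home bucket than we
            # would be here, proves the key is absent (robin hood invariant).
            if slot is None or slot[2] < dist:
                return -1
            if slot[0] == key:
                return i
            i, dist = (i + 1) & self._mask, dist + 1

    def get(self, key, default=None):
        i = self._find(key)
        return default if i < 0 else self._slots[i][1]

    def put(self, key, value):
        i = self._find(key)
        if i >= 0:
            self._slots[i][1] = value
            return
        if (self._size + 1) * 2 > len(self._slots):  # keep load factor <= 0.5
            self._grow()
        entry = [key, value, 0]
        i = hash(key) & self._mask
        while True:
            slot = self._slots[i]
            if slot is None:
                self._slots[i] = entry
                self._size += 1
                return
            if slot[2] < entry[2]:
                # The resident is closer to home: it gives up its slot and
                # becomes the entry that keeps probing.
                self._slots[i], entry = entry, slot
            i = (i + 1) & self._mask
            entry[2] += 1

    def _grow(self):
        old = self._slots
        self._slots = [None] * (len(old) * 2)  # power-of-two growth
        self._mask = len(self._slots) - 1
        self._size = 0
        for slot in old:
            if slot is not None:
                self.put(slot[0], slot[1])

    def remove(self, key):
        i = self._find(key)
        if i < 0:
            return False
        nxt = (i + 1) & self._mask
        # Backward shift: pull followers one slot closer to home until an
        # empty slot or an entry already in its home bucket is reached.
        while self._slots[nxt] is not None and self._slots[nxt][2] > 0:
            self._slots[nxt][2] -= 1
            self._slots[i] = self._slots[nxt]
            i, nxt = nxt, (nxt + 1) & self._mask
        self._slots[i] = None
        self._size -= 1
        return True


m = RobinHoodMap()
for k in (0, 8, 16, 1):  # 0, 8 and 16 share a home bucket when capacity is 8
    m.put(k, str(k))
m.remove(8)
```

Backward shift deletion avoids tombstones, so lookups after many removals do not have to skip over dead slots.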
This package contains implementations of DecontX (Yang et al. 2020), a decontamination algorithm for single-cell RNA-seq, and DecontPro (Yin et al. 2023), a decontamination algorithm for single-cell protein expression data. DecontX is a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. DecontPro is a Bayesian method that estimates the level of contamination from ambient and background sources in a CITE-seq ADT dataset and decontaminates the dataset.
This package aims to create a predictive model of regulatory sequences that scores unknown sequences based on their content of DNA motifs, next-generation sequencing (NGS) peaks and signals, and other numerical scores, using supervised classification. The package contains a workflow based on the support vector machine (SVM) algorithm that maps features to sequences, optimizes SVM parameters and feature number, and creates a model that can be stored and used to score the regulatory potential of unknown sequences.
VeloViz uses each cell’s current observed and predicted future transcriptional states inferred from RNA velocity analysis to build a nearest neighbor graph between cells in the population. Edges are then pruned based on a cosine correlation threshold and/or a distance threshold and the resulting graph is visualized using a force-directed graph layout algorithm. VeloViz can help ensure that relationships between cell states are reflected in the 2D embedding, allowing for more reliable representation of underlying cellular trajectories.
Toolbox for the experimental aquatic chemist, focused on acidification and CO2 air-water exchange. It contains all elements to model the pH, the related CO2 air-water exchange, and aquatic acid-base chemistry for an arbitrary marine, estuarine or freshwater system. It contains a suite of tools for sensitivity analysis, visualisation and modelling of chemical batches, and can be used to build dynamic models of aquatic systems. As of version 1.0-4, it also contains functions to calculate the buffer factors.
Intended to facilitate acoustic analysis of (animal) sound propagation experiments, which typically aim to quantify changes in signal structure when transmitted in a given habitat by broadcasting and re-recording animal sounds at increasing distances. The package offers a workflow with functions to prepare the data set for analysis as well as to calculate and visualize several degradation metrics, including blur ratio, signal-to-noise ratio, excess attenuation and envelope correlation, among others (Dabelsteen et al. 1993 <doi:10.1121/1.406682>).
This package provides tools for measuring the compositionality of signalling systems (in particular the information-theoretic measure due to Spike (2016) <http://hdl.handle.net/1842/25930> and the Mantel test for distance matrix correlation (after Dietz 1983) <doi:10.1093/sysbio/32.1.21>), functions for computing string and meaning distance matrices as well as an implementation of the Page test for monotonicity of ranks (Page 1963) <doi:10.1080/01621459.1963.10500843> with exact p-values up to k = 22.
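The Mantel test mentioned above can be sketched as a permutation test on the correlation between two distance matrices. The version below is a minimal one-sided illustration in plain Python, not the package's implementation; the function names, the default of 999 permutations, and the use of Pearson correlation are assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: is the correlation between d1 and d2 unusually high?"""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper(d1, identity)
    observed = pearson(x, upper(d2, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the objects of d2, rows and columns together
        if pearson(x, upper(d2, perm)) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


points = [0, 1, 3, 6, 10]
d1 = [[abs(a - b) for b in points] for a in points]
d2 = [[2 * abs(a - b) for b in points] for a in points]
r, p = mantel(d1, d2)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent, which is why an ordinary significance test for a correlation coefficient does not apply.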
Estimates mutation and selection coefficients on synonymous codon usage bias based on models of ribosome overhead cost (ROC). Multinomial logistic regression and Markov chain Monte Carlo are used to estimate and predict protein production rates with or without the presence of expressions and measurement errors. Workflows with examples for simulation, estimation and prediction processes are also provided, with parallelization speedup. The whole framework is tested with the yeast genome and gene expression data of Yassour et al. (2009) <doi:10.1073/pnas.0812841106>.
This package provides functions for range estimation in birds based on Pennycuick (2008) and Pennycuick (1975). The Flight program, which complements Pennycuick (2008), requires manual entry of birds, which can be tedious when there are hundreds of birds to estimate. Implemented are two ODE methods discussed in Pennycuick (1975) and time-marching computation methods as in Pennycuick (1998) and Pennycuick (2008). See Pennycuick (1975, ISBN:978-0-12-249405-5), Pennycuick (1998) <doi:10.1006/jtbi.1997.0572>, and Pennycuick (2008, ISBN:9780080557816).
Visualise sequential distributions using a range of plotting styles. Sequential distribution data can be input as either simulations or values corresponding to percentiles over time. Plots are added to existing graphic devices using the fan function. Users can choose from four different styles, including fan chart type plots, where a set of coloured polygons, with shadings corresponding to the percentile values, is layered to represent different uncertainty levels. Full details are given in the R Journal article by Abel (2015) <doi:10.32614/RJ-2015-002>.
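The relationship between the two input forms (simulations versus percentiles over time) can be illustrated by reducing a set of simulated paths to percentile bands, which is the data a fan chart layers as polygons. The sketch below is in Python rather than R and is not the package's code; the function names and the linear-interpolation percentile rule are assumptions made for this example.

```python
def percentile(sorted_values, p):
    """Percentile by linear interpolation between order statistics, p in [0, 100]."""
    k = (len(sorted_values) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def fan_bands(simulations, probs=(10, 50, 90)):
    """Reduce simulated paths to one percentile series per requested probability.

    simulations: list of paths, each a list with one value per time point.
    Returns a dict mapping each probability to its series over time.
    """
    bands = {p: [] for p in probs}
    for t in range(len(simulations[0])):
        column = sorted(path[t] for path in simulations)  # all paths at time t
        for p in probs:
            bands[p].append(percentile(column, p))
    return bands


simulations = [[0, 0], [1, 2], [2, 4], [3, 6], [4, 8]]
bands = fan_bands(simulations, probs=(0, 50, 100))
```

Each adjacent pair of percentile series bounds one shaded polygon, so more probabilities give a smoother gradation of uncertainty.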