The goal of statcodelists is to promote the reuse and exchange of statistical information and related metadata by making the internationally standardized SDMX code lists available to R users. SDMX has been published as an ISO International Standard (ISO 17369). The metadata definitions, including the code lists, are updated regularly according to the standard. The authoritative version of the code lists made available in this package is at <https://sdmx.org/?page_id=3215/>.
planttfhunter is used to identify plant transcription factors (TFs) from protein sequence data and classify them into families and subfamilies using the classification scheme implemented in PlantTFDB. TFs are identified using pre-built hidden Markov model profiles for DNA-binding domains. Auxiliary and forbidden domains are then used together with DNA-binding domains to classify TFs into families and subfamilies (when applicable). Currently, TFs can be classified into 58 different TF families/subfamilies.
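A minimal usage sketch in R follows; the function names annotate_pfam() and classify_tfs() and the input file path are assumptions based on the workflow described above, not a verified API:

```r
# Sketch of the described workflow; function names are assumptions.
library(planttfhunter)
library(Biostrings)

proteome <- readAAStringSet("proteome.fasta")  # placeholder protein FASTA

domains  <- annotate_pfam(proteome)   # assumed: annotate DNA-binding, auxiliary,
                                      # and forbidden domains with HMM profiles
tf_table <- classify_tfs(domains)     # assumed: assign TF families/subfamilies
head(tf_table)
```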
This Python module provides line editing functions similar to the default Emacs-style ones of GNU Readline. Unlike the Python standard library's readline package, this one allows access to those capabilities in settings outside of a standard command-line interface. It is especially well suited to interfacing with Urwid, due to a shared syntax for describing key inputs.
Currently, all stateless Readline commands are implemented. Yanking and history are not supported.
With Serverspec, you can write RSpec tests that check whether your servers are configured correctly.
Serverspec tests your servers' actual state by executing commands locally, via SSH, via WinRM, via the Docker API, and so on. You therefore don't need to install any agent software on your servers, and you can use any configuration management tool: Puppet, Ansible, CFEngine, Itamae, and so on.
But the true aim of Serverspec is to help you refactor infrastructure code.
This package implements an approach to community detection in social networks based on association rules learning. It provides tools for processing graph and rules objects, generating association rules, and detecting communities from node interactions, and is designed to facilitate advanced research in social network analysis. The approach is described in El-Moussaoui et al. (2021) <doi:10.1007/978-3-030-66840-2_3>.
Loads and displays images, selectively masks specified background colors, bins pixels by color using either data-dependent or automatically generated color bins, quantitatively measures color similarity among images using one of several distance metrics for comparing pixel color clusters, and clusters images by object color similarity. Uses CIELAB, RGB, or HSV color spaces. Originally written for use with organism coloration (reef fish color diversity, butterfly mimicry, etc.), but easily applicable to any image set.
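A compact sketch of this end-to-end workflow in R, assuming the functions come from the colordistance package and that imageClusterPipeline() accepts a folder of images plus background color bounds (treat both as assumptions to check against the documentation):

```r
# Sketch; package name, function, and arguments are assumptions.
library(colordistance)

results <- imageClusterPipeline(
  "path/to/images",            # folder of images to compare
  lower = c(0.8, 0.8, 0.8),    # placeholder lower bound of background color (RGB, 0-1)
  upper = c(1, 1, 1)           # placeholder upper bound of background color
)
```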
The user can directly compute and display false discovery rates from input p-values or z-scores under a variety of assumptions. p.fdr() computes FDRs, adjusted p-values, and decision reject vectors from input p-values or z-values. get.pi0() estimates the proportion of the data that are truly null. plot.p.fdr() plots the FDRs, adjusted p-values, and raw p-values against their rejection threshold lines.
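A minimal sketch of how these functions fit together, assuming the providing package is attached and that p.fdr() and get.pi0() take the p-values through a pvalues-style first argument (an assumption to verify against the documentation):

```r
# Sketch; argument names are assumptions.
set.seed(1)
pvals <- c(runif(90), rbeta(10, 1, 50))  # mostly null p-values plus a few signals

fdr_out <- p.fdr(pvalues = pvals)  # FDRs, adjusted p-values, decision reject vector
pi0_hat <- get.pi0(pvals)          # estimated proportion of truly null tests
plot(fdr_out)                      # dispatches to plot.p.fdr()
```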
Check concordance of a vector of mutation impacts with standard dictionaries such as Sequence Ontology (SO) <http://www.sequenceontology.org/>, Mutation Annotation Format (MAF) <https://docs.gdc.cancer.gov/Encyclopedia/pages/Mutation_Annotation_Format_TCGAv2/> or Prediction and Annotation of Variant Effects (PAVE) <https://github.com/hartwigmedical/hmftools/tree/master/pave>. It enables conversion between SO/PAVE and MAF terms and selection of the most severe consequence where multiple ampersand (&)-delimited impacts are given.
This package provides functions for testing randomness of a univariate time series with arbitrary distribution (discrete, continuous, or a mixture of both) and for testing independence between random variables with arbitrary distributions. The test statistics are based on the multilinear empirical copula, and multipliers are used to compute p-values. The test of independence between random variables appeared in Genest, Nešlehová, Rémillard & Murphy (2019) and the test of randomness appeared in Nasri (2022).
Split Knockoff is a data-adaptive variable selection framework for controlling the (directional) false discovery rate (FDR) in structural sparsity, where variable selection on a linear transformation of the parameters is of concern. The proposed scheme relaxes the linear subspace constraint to its neighborhood, a technique often known as variable splitting in optimization. Simulation experiments can be reproduced following the vignette. Split Knockoffs were first defined in Cao et al. (2021) <doi:10.48550/arXiv.2103.16159>.
Calculates federal and state income taxes in the United States. It acts as a wrapper to the NBER's TAXSIM 35 (<http://taxsim.nber.org/taxsim35/>) tax simulator. TAXSIM 35 conducts the calculations, while usincometaxes prepares the data for TAXSIM 35, sends the data to TAXSIM 35's server or communicates with the WebAssembly file, retrieves the results, and places them into a data frame, all without the user having to worry about this process.
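A minimal sketch of the described workflow; taxsim_calculate_taxes() and the input column names below follow TAXSIM 35 conventions and should be treated as assumptions to check against the package documentation:

```r
# Sketch; column names and accepted values are assumptions based on TAXSIM 35.
library(usincometaxes)

households <- data.frame(
  taxsimid = 1:2,                             # unique row identifier
  year     = c(2022, 2022),                   # tax year
  mstat    = c("single", "married, jointly"), # marital status
  state    = c("NC", "CA"),
  pwages   = c(50000, 120000)                 # primary earner wages
)

taxes <- taxsim_calculate_taxes(households)   # sends data to TAXSIM 35, returns a data frame
head(taxes)
```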
This package implements the chain binomial model for analysis of infectious disease data. It contains functions for calculating probabilities of the final size of infectious disease outbreaks using the method from D. Ludwig (1975) <doi:10.1016/0025-5564(75)90119-4>, and, for outbreaks that are not concluded, the method from Lindstrøm et al. (2024) <doi:10.48550/arXiv.2403.03948>. The package also contains methods for estimation and regression analysis of secondary attack rates.
This package implements Data Envelopment Analysis (DEA) with a hyperbolic orientation using a non-linear programming solver. It enables flexible estimations with weight restrictions, non-discretionary variables, and a generalized distance function. Additionally, it allows for the calculation of slacks and super-efficiency scores. The methods are detailed in Öttl et al. (2023) <doi:10.1016/j.dajour.2023.100343>. Furthermore, the package provides a non-linear profitability estimation built upon the DEA framework.
The Programme for International Student Assessment (PISA) is a global study conducted by the Organisation for Economic Co-operation and Development (OECD) in member and non-member countries to evaluate educational systems by assessing 15-year-old school students' academic performance in mathematics, science, and reading. This dataset contains information on their scores and other socioeconomic characteristics, information about their schools and school infrastructure, as well as the countries taking part in the program.
This package provides a collection of sparse and regularized discriminant analysis methods intended for small-sample, high-dimensional data sets. The package features the High-Dimensional Regularized Discriminant Analysis (HDRDA) classifier from Ramey et al. (2017) <arXiv:1602.01182>. Other classifiers include those from Dudoit et al. (2002) <doi:10.1198/016214502753479248>, Pang et al. (2009) <doi:10.1111/j.1541-0420.2009.01200.x>, and Tong et al. (2012) <doi:10.1093/bioinformatics/btr690>.
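A small sketch of fitting the HDRDA classifier, assuming the package is sparsediscrim and that hdrda() accepts a feature matrix x and a class factor y (an assumed, unverified signature):

```r
# Sketch; package name and signature are assumptions.
library(sparsediscrim)

set.seed(42)
n <- 40; p <- 200                           # small-sample, high-dimensional setting
x <- matrix(rnorm(n * p), nrow = n)
y <- factor(rep(c("A", "B"), each = n / 2))
x[y == "B", 1:5] <- x[y == "B", 1:5] + 2    # shift a few features for class B

fit  <- hdrda(x = x, y = y)                 # High-Dimensional RDA classifier
pred <- predict(fit, x)                     # predictions on training data, for illustration
```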
Table 1 is the classical way to describe the patients in a clinical study. The number of splits of the data in such a table is limited. Table1Heatmap draws a heatmap of all crosstables that can be generated with the data. Users can choose between showing the actual crosstables or the direction of effect of associations, and can highlight associations by number of patients or by p-values. Version 1.2 fixes the "no visible global function definition for ..." check NOTE.
RNA-seq data generated by some library preparation methods, such as rRNA-depletion-based methods and the SMART-seq method, might be contaminated by genomic DNA (gDNA) if DNase I digestion is not performed properly during RNA preparation. CleanUpRNAseq was developed to check whether RNA-seq data suffer from gDNA contamination. If so, it can correct for gDNA contamination and reduce the false discovery rate of differentially expressed genes.
Import gaze data from EDF files generated by the SR Research <https://www.sr-research.com/> EyeLink eye tracker. Gaze data, both recorded events and samples, are imported per trial. The package allows the user to extract events of interest, such as saccades, blinks, etc., as well as recorded variables and custom events (areas of interest, triggers), into separate tables. The package requires the EDF API library, which can be obtained at <https://www.sr-research.com/support/>.
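A minimal sketch of importing a recording; the read_edf() function, its import_samples argument, and the list elements accessed below are assumptions about this package's interface, and the SR Research EDF API library must be installed:

```r
# Sketch; package, function, and field names are assumptions.
library(eyelinkReader)   # assumed package name

recording <- read_edf("experiment.edf", import_samples = TRUE)  # placeholder file

recording$saccades   # per-trial saccade events
recording$blinks     # per-trial blink events
recording$samples    # raw gaze samples, if imported
```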
This package provides functions and classes for spatial resampling to use with the rsample package, such as spatial cross-validation (Brenning, 2012) <doi:10.1109/IGARSS.2012.6352393>. The scope of rsample and spatialsample is to provide the basic building blocks for creating and analyzing resamples of a spatial data set; neither package includes functions for modeling or computing statistics. The resampled spatial data sets created by spatialsample carry little memory overhead.
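A minimal sketch of creating spatial folds with spatialsample, assuming the boston_canopy example data set ships with the package:

```r
# Sketch; boston_canopy is assumed to be an sf example data set in spatialsample.
library(spatialsample)

data(boston_canopy)                                   # tree canopy metrics as an sf data frame
folds <- spatial_clustering_cv(boston_canopy, v = 5)  # 5 spatially clustered folds
folds                                                 # an rsample-style rset object

# Each split can then be passed to rsample::analysis()/assessment() for modeling.
```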
This package provides functions to test whether a dataset comes from a symmetric distribution when the center of symmetry is unknown, including a Wilcoxon test and a sign test procedure. In addition, sample size determination for both tests is provided. The Wilcoxon test procedure is described in Vexler et al. (2023) <https://www.sciencedirect.com/science/article/abs/pii/S0167947323000579>, and the sign test is outlined in Gastwirth (1971) <https://www.jstor.org/stable/2284233>.
Understanding spatial association is essential for spatial statistical inference, including factor exploration and spatial prediction. The geographically optimal similarity (GOS) model is an effective method for spatial prediction, as described in Yongze Song (2022) <doi:10.1007/s11004-022-10036-8>. GOS was developed from the geographical similarity principle, as described in Axing Zhu (2018) <doi:10.1080/19475683.2018.1534890>. GOS achieves more accurate spatial prediction using fewer samples, with substantially reduced prediction uncertainty.
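A minimal sketch assuming the GOS implementation exposes a gos() function with a formula/data/newdata interface; the package name, function signature, and data objects below are assumptions, not a verified API:

```r
# Sketch; names below are placeholders/assumptions.
library(geosimilarity)   # assumed package name

# 'samples': observed locations with the outcome and covariates (placeholder)
# 'grid'   : prediction locations with the same covariates (placeholder)
pred <- gos(outcome ~ cov1 + cov2,
            data    = samples,
            newdata = grid,
            kappa   = 0.25)   # fraction of most similar observations used per prediction
head(pred)
```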
This package provides the ability to create dynamic citations in which the bibliographic information is pulled from the web rather than having to be entered into a local database such as BibTeX ahead of time. The package is primarily aimed at authoring in the R Markdown format, and can provide outputs for web-based authoring, such as linked text for inline citations. Cite using a DOI, URL, or BibTeX file key. See the package URL for details.
Visualize confounder control in meta-analysis. metaconfoundr is an approach to evaluating bias in studies used in meta-analyses, based on the causal inference framework. Study groups create a causal diagram displaying their assumptions about the scientific question. From this, they develop a list of important confounders. Then, they evaluate how well studies controlled for these variables. metaconfoundr is a toolkit to facilitate this process and visualize the results as heat maps, traffic light plots, and more.
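A minimal sketch of the described workflow; the ipi example data set and the mc_heatmap()/mc_trafficlight() helpers are assumptions about the package interface:

```r
# Sketch; data set and helper names are assumptions.
library(metaconfoundr)

summary_df <- metaconfoundr(ipi)   # standardize study-level confounder control data
mc_heatmap(summary_df)             # heat map of confounder control by study
mc_trafficlight(summary_df)        # traffic light plot alternative
```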
HybridExpress can be used to perform comparative transcriptomic analyses of hybrids (or allopolyploids) relative to their progenitor species. The package features functions to perform exploratory analyses of sample grouping, identify differentially expressed genes in hybrids relative to their progenitors, classify genes into expression categories (N = 12) and classes (N = 5), and perform functional analyses. We also provide users with graphical functions for the seamless creation of publication-ready figures that are commonly used in the literature.
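A minimal sketch of the described workflow; the function names below are assumptions based on the steps listed above, not a verified API, and se stands in for a SummarizedExperiment holding counts for both progenitors and the hybrid:

```r
# Sketch; function names are assumptions; 'se' is a placeholder SummarizedExperiment.
library(HybridExpress)

se   <- add_midparent_expression(se)     # assumed: add in silico mid-parent expression
deg  <- get_deg_list(se)                 # assumed: DEGs in hybrid vs. each progenitor
part <- expression_partitioning(deg)     # assumed: assign genes to categories/classes
plot_expression_partitions(part)         # assumed: publication-ready summary figure
```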