Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
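As a minimal sketch of how such a request URL could be assembled, the Python snippet below builds the query string from the three parameters described above. The base URL (https://example.org) and the function name build_search_url are placeholders chosen for illustration; they are not taken from this page. The names of the pagination headers are not stated here either, so the sketch stops at building the URL.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the address of the actual service.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Return the GET URL for /api/packages with the given parameters."""
    # urlencode percent-encodes reserved characters, so "gcc@10"
    # becomes "gcc%4010" in the query string.
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


if __name__ == "__main__":
    print(build_search_url("hello"))
    print(build_search_url("gcc@10", page=2, limit=50))
```

The resulting URL can be passed to any HTTP client; the pagination details would then be read from the headers of the response.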
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides tools for dealing with Unique Molecular Identifiers (UMIs) and Random Molecular Tags (RMTs) in genetic sequences. There are six tools: the extract and whitelist commands are used to prepare a fastq containing UMIs +/- cell barcodes for alignment. The remaining commands, group, dedup, and count/count_tab, are used to identify PCR duplicates using the UMIs and perform different levels of analysis depending on the needs of the user.
Entrez Direct (EDirect) is a method for accessing the National Center for Biotechnology Information's (NCBI) set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a terminal. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process.
EDirect also provides an argument-driven function that simplifies the extraction of data from document summaries or other results that are returned in structured XML format. This can eliminate the need for writing custom software to answer ad hoc questions.
Trinity assembles transcript sequences from Illumina RNA-Seq data. Trinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.
This package provides graphical user interfaces to organize and visualize Nanopore sequencing data.
This is an R package providing additional capabilities and speed for GenomicRanges operations.
This is a package for normalizing Hi-C contact counts efficiently.
QTLtools is a tool set for molecular QTL discovery and analysis. It allows going from the raw genetic sequence data to a collection of molecular Quantitative Trait Loci (QTLs) in a few easy-to-perform steps.
CENTIPEDE applies a hierarchical Bayesian mixture model to infer regions of the genome that are bound by particular transcription factors. It starts by identifying a set of candidate binding sites, and then aims to classify the sites according to whether each site is bound or not bound by a transcription factor. CENTIPEDE is an unsupervised learning algorithm that discriminates between two different types of motif instances using as much relevant information as possible.
Ritornello is a ChIP-seq peak calling algorithm based on signal processing that can accurately call binding events without the need for a paired total DNA input or IgG control sample. It has been tested for use with narrow binding events such as transcription factor ChIP-seq.
This package provides data for the book "Computational Genomics with R".
twobitreader is a Python library for reading .2bit files as used by the UCSC genome browser.
IMP's broad goal is to contribute to a comprehensive structural characterization of biomolecules ranging in size and complexity from small peptides to large macromolecular assemblies, by integrating data from diverse biochemical and biophysical experiments. IMP provides a C++ and Python toolbox for solving complex modeling problems, and a number of applications for tackling some common problems in a user-friendly way.
CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
Biopython is a set of tools for biological computation including parsers for bioinformatics files into Python data structures; interfaces to common bioinformatics programs; a standard sequence class and tools for performing common operations on sequences; code to perform data classification; code for dealing with alignments; code making it easy to split up parallelizable tasks into separate processes; and more.
This package provides an implementation of the BITS (Binary Interval Search) algorithm, an approach to interval set intersection. It is especially suited for the comparison of diverse genomic datasets and the exploration of large datasets of genome intervals (e.g. genes, sequence alignments).
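The counting idea behind binary interval search can be sketched in a few lines of Python. This is an illustrative sketch only, not the package's implementation: it assumes closed intervals, and the function name count_overlaps is hypothetical. An interval misses the query only if it ends before the query starts or starts after the query ends, so two binary searches over the sorted start and end coordinates give the number of overlaps without enumerating them.

```python
from bisect import bisect_left, bisect_right


def count_overlaps(sorted_starts, sorted_ends, q_start, q_end):
    """Count closed intervals that overlap the closed query [q_start, q_end].

    sorted_starts and sorted_ends are the start and end coordinates of the
    same interval set, each sorted independently in ascending order.
    """
    n = len(sorted_starts)
    # Intervals whose end lies strictly before the query start.
    ends_before = bisect_left(sorted_ends, q_start)
    # Intervals whose start lies strictly after the query end.
    starts_after = n - bisect_right(sorted_starts, q_end)
    # The two groups are disjoint, so the remainder overlaps the query.
    return n - ends_before - starts_after


if __name__ == "__main__":
    intervals = [(1, 5), (3, 8), (10, 12), (15, 20)]
    starts = sorted(s for s, _ in intervals)
    ends = sorted(e for _, e in intervals)
    print(count_overlaps(starts, ends, 4, 10))
```

Each query costs two binary searches, so counting stays logarithmic in the number of intervals once the coordinate lists are sorted.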
CPAT is a method to distinguish coding and noncoding RNA by using a logistic regression model based on four pure sequence-based, linguistic features: ORF size, ORF coverage, Fickett TESTCODE, and hexamer usage bias. A method based on linguistic features does not require other genomes or protein databases to perform alignment and is more robust. Because it is alignment-free, it runs much faster and is also easier to use.
This package makes it easy to implement Sankey, alluvial, and Sankey bump plots in ggplot2.
t-distributed Stochastic Neighbor Embedding (t-SNE) is a method for dimensionality reduction and visualization of high-dimensional datasets. A popular implementation of t-SNE uses the Barnes-Hut algorithm to approximate the gradient at each iteration of gradient descent. This implementation differs in these ways:
Instead of approximating the N-body simulation using Barnes-Hut, we interpolate onto an equispaced grid and use FFT to perform the convolution.
Instead of computing nearest neighbors using vantage-point trees, we approximate nearest neighbors using the Annoy library. The neighbor lookups are multithreaded to take advantage of machines with multiple cores.
TADbit is a complete Python library to deal with all steps to analyze, model, and explore 3C-based data. With TADbit the user can map FASTQ files to obtain raw interaction binned matrices (Hi-C like matrices), normalize and correct interaction matrices, identify and compare the so-called Topologically Associating Domains (TADs), build 3D models from the interaction matrices, and finally, extract structural properties from the models. TADbit is complemented by TADkit for visualizing 3D models.
This package contains a multicore Barnes-Hut implementation of the t-SNE algorithm. The implementation is described here: http://lvdmaaten.github.io/publications/papers/JMLR_2014.pdf.
This package implements methods to project single-cell RNA-seq data onto a reference atlas, enabling interpretation of unknown cell transcriptomic states in the context of known, reference states.
MUSIC is an algorithm for identification of enriched regions at multiple scales in the read depth signals from ChIP-Seq experiments.
CoolBox is a toolkit for visual analysis of genomics data. It aims to be highly compatible with the Python ecosystem, easy to use, and highly customizable with a well-designed user interface. It can be used in various visualization situations, for example, to produce high-quality genome track plots, to fetch commonly used genomic data files with a Python script or command line, or to interactively explore genomic data within a Jupyter environment or web browser.
This package provides extra utility functions to perform common tasks in the analysis of omics data, leveraging and enhancing features provided by Bioconductor packages.