Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items per page. Pagination information (such as the total number of pages) is returned in the response headers.
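As a minimal sketch of how such a request URL could be assembled, the helper below encodes the three documented parameters. The host in BASE_URL is a hypothetical placeholder (the text above gives only the path), and the function name build_search_url is an illustration, not part of the service.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: only the path /api/packages is documented, not the host.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Return the URL for GET /api/packages with the search, page and limit parameters."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # urlencode percent-encodes reserved characters such as '@' in 'gcc@10'.
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


if __name__ == "__main__":
    print(build_search_url("hello"))
    print(build_search_url("gcc@10", page=2, limit=50))
```

The names of the pagination headers are not given here, so a client would need to inspect the headers of a real response to find them.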
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides utility functions for manipulating BAM files.
A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides. Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. In order to use the program, the user submits a sequence in FASTA format. The output consists of two files: a repeat table file and an alignment file. Submitted sequences may be of arbitrary length. Repeats with pattern size in the range from 1 to 2000 bases are detected.
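To make the definition concrete, here is a toy scan for exact tandem repeats. It is not the algorithm used by Tandem Repeats Finder, which also detects approximate copies; the function name and the parameters max_period and min_copies are assumptions made for this illustration only.

```python
def find_exact_tandem_repeats(seq, max_period=6, min_copies=2):
    """Return (start, pattern, copies) tuples for exact tandem repeats in seq.

    Toy approach: at each position, try every period up to max_period and
    count how many exact back-to-back copies of the pattern follow.
    """
    hits = []
    i = 0
    n = len(seq)
    while i < n:
        best = None  # (span, pattern, copies) of the longest repeat starting at i
        for period in range(1, max_period + 1):
            pattern = seq[i:i + period]
            if len(pattern) < period:
                break  # not enough sequence left for this period
            copies = 1
            while seq[i + copies * period:i + (copies + 1) * period] == pattern:
                copies += 1
            span = copies * period
            # Strict '>' keeps the shortest period when spans tie.
            if copies >= min_copies and (best is None or span > best[0]):
                best = (span, pattern, copies)
        if best:
            hits.append((i, best[1], best[2]))
            i += best[0]  # skip past the reported repeat
        else:
            i += 1
    return hits


if __name__ == "__main__":
    print(find_exact_tandem_repeats("CATATATG"))
```

For the sequence CATATATG this reports the pattern AT repeated three times starting at index 1.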
Bioinformaticians often have to convert sequence files between formats and do little manipulations on them, and it's not worth writing scripts for that. Seqmagick is a utility to expose the file format conversion in BioPython in a convenient way. Instead of having a big mess of scripts, there is one that takes arguments.
This package has been developed under ROpenSci guidelines to integrate conventional and cutting edge cytometry analysis tools under a unified framework. It aims to represent an intuitive and interactive approach to analysing cytometry data in R.
BayesPrism includes deconvolution and embedding learning modules. The deconvolution module models a prior from cell type-specific expression profiles from scRNA-seq to jointly estimate the posterior distribution of cell type composition and cell type-specific gene expression from bulk RNA-seq expression of tumor samples. The embedding learning module uses expectation-maximization (EM) to approximate the tumor expression using a linear combination of malignant gene programs while conditioning on the inferred expression and fraction of non-malignant cells estimated by the deconvolution module.
The porechop package is a tool for finding and removing adapters from Oxford Nanopore reads. Adapters on the ends of reads are trimmed off, and when a read has an adapter in its middle, it is treated as chimeric and chopped into separate reads. Porechop performs thorough alignments to effectively find adapters, even at low sequence identity. Porechop also supports demultiplexing of Nanopore reads that were barcoded with the Native Barcoding Kit, PCR Barcoding Kit or Rapid Barcoding Kit.
This package is used for demultiplexing single-cell sequencing experiments of pooled cells. These cells are labeled with barcode oligonucleotides. The package implements methods to fit regression mixture models for a probabilistic classification of cells, including multiplet detection. Demultiplexing error rates can be estimated, and methods for quality control are provided.
The Spliced Transcripts Alignment to a Reference (STAR) software is based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences.
ParDRe is a parallel tool to remove duplicate genetic sequence reads. Duplicate reads can be seen as identical or nearly identical sequences with some mismatches. This tool lets users avoid the analysis of unnecessary reads, reducing the time of subsequent procedures with the dataset (e.g. assemblies and mappings). The tool is implemented with MPI in order to exploit the parallel capabilities of multicore clusters. It is faster than multithreaded counterparts (as of the end of 2015) for the same number of cores and, thanks to the message-passing technology, it can be executed on clusters.
PyLiftover is a library for quick and easy conversion of genomic (point) coordinates between different assemblies.
This package implements the custom CRAM codecs used for "EXTERNAL" block types. These consist of two variants of the rANS codec (8-bit and 16-bit renormalisation, with run-length encoding and bit-packing also supported in the latter), a dynamic arithmetic coder, and custom codecs for name/ID compression and quality score compression derived from fqzcomp.
This package lets you read and write files in Generic Feature Format (GFF) with Biopython integration.
ReadWriter is a set of R functions to read and write files conveniently.
RAxML is a tool for phylogenetic analysis and post-analysis of large phylogenies.
MOSAIK is a program for mapping second and third-generation sequencing reads to a reference genome. MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific BioSciences SMRT.
genomepy is designed to provide a simple and straightforward way to download and use genomic data. This includes
searching available data,
showing the available metadata,
automatically downloading, preprocessing and matching data, and
generating optional aligner indexes.
All with sensible, yet controllable defaults.
PyEGA3 is a tool for viewing and downloading files from authorized EGA datasets. It uses the EGA data API and has several key features:
Files are transferred over secure HTTPS connections and received unencrypted, so there is no need for decryption after download.
Downloads resume from where they left off in the event that the connection is interrupted.
Supports file segmenting and parallelized download of segments, improving overall performance.
After download completes, file integrity is verified using checksums.
Implements the GA4GH-compliant htsget protocol for download of genomic ranges for data files with accompanying index files.
deMULTIplex is an R package for analyzing single-cell RNA sequencing data generated with the MULTI-seq sample multiplexing method. The package includes software to
Convert raw MULTI-seq sample barcode library FASTQs into a sample barcode UMI count matrix, and
Classify cell barcodes into sample barcode groups.
Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.
VoltRon is a novel spatial omic analysis toolbox for multi-omics integration using spatial image registration. VoltRon is capable of analyzing multiple types and modalities of spatially-aware datasets. VoltRon visualizes and analyzes regions of interest (ROIs), spots, cells and even molecules.
ravanan is a CWL implementation that is powered by GNU Guix and provides strong reproducibility guarantees. ravanan provides strong caching of intermediate results so the same step of a workflow is never run twice. ravanan captures logs from every step of the workflow for easy tracing back in case of job failures. ravanan currently runs on single machines and on Slurm via its API.
Mash is a fast sequence distance estimator that uses the MinHash algorithm and is designed to work with genomes and metagenomes in the form of assemblies or reads.
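The MinHash idea can be sketched in a few lines: keep only the smallest hashed k-mers of each sequence and estimate the Jaccard similarity from those small sketches. This is a toy illustration, not Mash's implementation; the function names, the SHA-1 based hash, and the default k and sketch size are assumptions chosen for readability.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Stable 64-bit hash of a k-mer (first 8 bytes of its SHA-1 digest)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def bottom_sketch(seq, k=4, size=16):
    """Keep the `size` smallest k-mer hashes as the sketch of seq."""
    return sorted(hash_kmer(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(sketch_a, sketch_b, size=16):
    """Estimate Jaccard similarity from two bottom sketches.

    Take the `size` smallest hashes of the merged sketches and count how
    many of them are present in both inputs.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    a = bottom_sketch("ACGTACGT")
    b = bottom_sketch("ACGTTTTT")
    print(jaccard_estimate(a, b))
```

When a sequence has fewer distinct k-mers than the sketch size, the sketch holds all of them and the estimate equals the exact Jaccard similarity; for larger inputs it is an approximation whose accuracy depends on the sketch size.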
The NCBI-VDB library implements a highly compressed columnar data warehousing engine that is most often used to store genetic information. Databases are stored in a portable image within the file system, and can be accessed/downloaded on demand across HTTP.
Pybiomart provides a simple Pythonic interface to BioMart.