Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
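As a rough illustration, the request above could be built and sent from Python as sketched below. This is only a sketch under stated assumptions: BASE_URL is a hypothetical placeholder (this page does not state the host name), the helper names build_search_url and search_packages are made up for the example, and the body is returned as raw text because the response format is not described here. The names of the pagination headers are not documented either, so the sketch simply returns every response header.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Hypothetical placeholder: replace with the host that serves this page.
BASE_URL = "https://example.invalid"


def build_search_url(query, page=1, limit=20, base_url=BASE_URL):
    """Build the GET URL for /api/packages from the three documented parameters."""
    # urlencode percent-encodes special characters, so "gcc@10" becomes "gcc%4010".
    params = urlencode({"search": query, "page": page, "limit": limit})
    return urljoin(base_url, "/api/packages") + "?" + params


def search_packages(query, page=1, limit=20, base_url=BASE_URL):
    """Fetch one page of results and return (raw_body, headers).

    The header names that carry pagination information are not documented
    here, so all response headers are returned for inspection.
    """
    with urlopen(build_search_url(query, page, limit, base_url)) as resp:
        body = resp.read().decode("utf-8")
        headers = dict(resp.headers.items())
    return body, headers


if __name__ == "__main__":
    # Only prints the URLs; call search_packages() once BASE_URL is a real host.
    print(build_search_url("hello"))
    print(build_search_url("gcc@10", page=2, limit=5))
```

Building the URL in a separate function keeps the encoding of versioned queries such as gcc@10 easy to check without making a network request.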
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht that adds your channel as an entry in channels.scm.
Bioawk is an extension to Brian Kernighan's awk, adding the support of several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names. It also adds a few built-in functions and a command line option to use TAB as the input/output delimiter. When the new functionality is not used, bioawk is intended to behave exactly the same as the original BWK awk.
MethylDackel will process a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extract per-base methylation metrics from them. MethylDackel requires an indexed fasta file containing the reference genome as well.
This package implements methods to project single-cell RNA-seq data onto a reference atlas, enabling interpretation of unknown cell transcriptomic states in the context of known reference states.
LAMMPS is a classical molecular dynamics simulator designed to run efficiently on parallel computers. LAMMPS has potentials for solid-state materials (metals, semiconductors), soft matter (biomolecules, polymers), and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.
wfmash is a DNA sequence read mapper based on mash distances and the wavefront alignment algorithm. It is a fork of MashMap that implements base-level alignment via the wflign tiled wavefront global alignment algorithm. It completes MashMap with a high-performance alignment module capable of computing base-level alignments for very large sequences.
Forester is a collection of Java libraries for phylogenomics and evolutionary biology research. It includes support for reading, writing, and exporting phylogenetic trees.
SNAP is a fast and accurate aligner for short DNA reads. It is optimized for modern read lengths of 100 bases or higher, and takes advantage of these reads to align data quickly through a hash-based indexing scheme.
Pyani provides a package and script for calculation of genome-scale average nucleotide identity.
This package facilitates the analysis of single-cell RNA-seq UMI matrices. It does this by computing partitions of a cell similarity graph into small homogeneous groups of cells, which are defined as metacells (MCs). The derived MCs are then used for building different representations of the data, allowing matrix or 2D graph visualization and forming a basis for analysis of cell types, subtypes, transcriptional gradients, cell-cycle variation, gene modules and their regulatory models, and more.
Grassroots DICOM (GDCM) is an implementation of the DICOM standard designed to be open source so that researchers may access clinical data directly. GDCM includes a file format definition and a network communications protocol, both of which should be extended to provide a full set of tools for a researcher or small medical imaging vendor to interface with an existing medical database.
Minimap2 is a versatile sequence alignment program that aligns DNA or mRNA sequences against a large reference database. Typical use cases include:
mapping PacBio or Oxford Nanopore genomic reads to the human genome;
finding overlaps between long reads with error rate up to ~15%;
splice-aware alignment of PacBio Iso-Seq or Nanopore cDNA or Direct RNA reads against a reference genome;
aligning Illumina single- or paired-end reads;
assembly-to-assembly alignment;
full-genome alignment between two closely related species with divergence below ~15%.
Entrez Direct (EDirect) is a method for accessing the National Center for Biotechnology Information's (NCBI) set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a terminal. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process.
EDirect also provides an argument-driven function that simplifies the extraction of data from document summaries or other results that are returned in structured XML format. This can eliminate the need for writing custom software to answer ad hoc questions.
SCENIC (Single-cell regulatory network inference and clustering) is an R package to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.
The Maxprobes package collects cross-reactive probes of the Illumina 450K and EPIC/850K methylation arrays.
This package provides tools for analysing intercellular and intracellular signaling from single-cell RNA-seq (scRNA-seq) data.
SeqAn is a C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data. It contains algorithms and data structures for string representation and their manipulation, online and indexed string search, efficient I/O of bioinformatics file formats, sequence alignment, and more.
This package provides a convenient interface to minimap2, a fast and accurate C program to align genomic and transcribed nucleotide sequences.
The metacells package implements the improved metacell algorithm for single-cell RNA sequencing (scRNA-seq) data analysis within the scipy framework, and a projection algorithm based on it. The original metacell algorithm was implemented in R. The Python package contains various algorithmic improvements and scales to larger data sets (millions of cells).
This package provides a converter between .hic files (from juicer) and single-resolution or multi-resolution .cool files (for cooler). Both hic and cool files describe Hi-C contact matrices.
StringTie is a fast and efficient assembler of RNA-Seq sequence alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the alignments of raw reads used by other transcript assemblers, but also alignments of longer sequences that have been assembled from those reads. To identify differentially expressed genes between experiments, StringTie's output can be processed either by the Cuffdiff or Ballgown programs.
Fxtract extracts sequences from a protein or nucleotide fastx (FASTA or FASTQ) file given a subsequence. It uses a simple substring search for basic tasks but can change to using POSIX regular expressions, PCRE, hash lookups or multi-pattern searching as required. By default fxtract looks in the sequence of each record but can also be told to look in the header, comment or quality sections.
This library implements an efficient loopless multiset combination generation algorithm which is (approximately) described in "Loopless algorithms for generating permutations, combinations, and other combinatorial configurations.", G. Ehrlich - Journal of the ACM (JACM), 1973. (Algorithm 7.)
MACS (Model-based Analysis of ChIP-Seq) is an implementation of a ChIP-Seq analysis algorithm for identifying transcription factor binding sites. MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and it improves the spatial resolution of binding sites by combining the information of both sequencing tag position and orientation.
Prodigal runs smoothly on finished genomes, draft genomes, and metagenomes, providing gene predictions in GFF3, Genbank, or Sequin table format. It runs quickly, in an unsupervised fashion, handles gaps, handles partial genes, and identifies translation initiation sites.