Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items per page. Pagination information (such as the number of pages) is returned in the response headers.
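As a rough illustration of the request above, the following Python sketch builds the query URL from the three documented parameters. The host name is a placeholder (the text gives only the path), and the helper name build_search_url is hypothetical. The names of the pagination headers are not stated here, so the sketch does not try to read them.

```python
from urllib.parse import urlencode

# Placeholder host: the documentation gives only the path, so substitute
# the address of the site you are actually querying.
BASE_URL = "https://example.org"


def build_search_url(query, page=1, limit=20, base_url=BASE_URL):
    """Build the URL for GET /api/packages (hypothetical helper)."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # urlencode percent-encodes reserved characters such as '@',
    # so a versioned query like "gcc@10" is transmitted safely.
    params = urlencode({"search": query, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{params}"


print(build_search_url("hello"))
print(build_search_url("gcc@10", page=2, limit=50))
```

The resulting URL can be passed to any HTTP client; the pagination details would then be read from the headers of the response.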
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This is the reference implementation of the CWL standards. The CWL open standards are for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry. The cwltool is intended to be feature complete and to provide comprehensive validation of CWL files as well as provide other tools related to working with CWL descriptions.
This package provides Python bindings for spoa, a C++ implementation of the partial order alignment (POA) algorithm (as described in DOI 10.1093/bioinformatics/18.3.452), which is used to generate consensus sequences.
This tool detects batch effects in high-dimensional data based on a chi^2 test.
The preseq package is aimed at predicting and estimating the complexity of a genomic sequencing library, equivalent to predicting and estimating the number of redundant reads from a given sequencing depth and how many will be expected from additional sequencing using an initial sequencing experiment. The estimates can then be used to examine the utility of further sequencing, optimize the sequencing depth, or to screen multiple libraries to avoid low complexity samples.
t-Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization of high dimensional datasets. A popular implementation of t-SNE uses the Barnes-Hut algorithm to approximate the gradient at each iteration of gradient descent. This implementation differs in these ways:
Instead of approximating the N-body simulation using Barnes-Hut, we interpolate onto an equispaced grid and use FFT to perform the convolution.
Instead of computing nearest neighbors using vantage-point trees, we approximate nearest neighbors using the Annoy library. The neighbor lookups are multithreaded to take advantage of machines with multiple cores.
This package provides Python bindings to the UCSC Big Binary (bigWig/bigBed) file library. It provides read-level access to local and remote bigWig and bigBed files but no write capabilities. The main feature is fast retrieval of range queries into numpy arrays.
This is an R package to build generic .loom files aligning with the default naming convention of the .loom format and to integrate other data types, such as regulons (SCENIC), clusters from Seurat, and trajectory information. The package can also be used to extract data from .loom files.
Sickle is a tool that trims reads based on quality and length thresholds. It uses sliding windows to detect low-quality bases at the 3'-end and high-quality bases at the 5'-end. Additionally, it discards reads based on the length threshold.
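The sliding-window idea described above can be sketched in a few lines of Python. This is a simplified illustration and not Sickle's actual algorithm: the function name trim_read, the default threshold, window size, and minimum length, and the Phred+33 quality encoding are all assumptions chosen for the example.

```python
def trim_read(seq, qual, threshold=20, window=5, min_length=20):
    """Trim a read by mean window quality (simplified, hypothetical sketch).

    Returns (seq, qual) for the kept region, or None if the read is discarded.
    """
    scores = [ord(c) - 33 for c in qual]  # assumes Phred+33 encoding
    n = len(scores)
    if n == 0:
        return None
    w = min(window, n)
    # Mean quality of every window of width w along the read.
    means = [sum(scores[i:i + w]) / w for i in range(n - w + 1)]
    # 5' end: keep from the first window whose mean reaches the threshold.
    start = next((i for i, m in enumerate(means) if m >= threshold), None)
    if start is None:
        return None  # no window is good enough, discard the read
    # 3' end: cut at the first later window whose mean falls below the threshold.
    end = next((i for i in range(start, len(means)) if means[i] < threshold), n)
    # Length filter: discard reads that end up too short.
    if end - start < min_length:
        return None
    return seq[start:end], qual[start:end]


read = "A" * 30
quality = "I" * 25 + "#" * 5  # 25 high-quality bases, then 5 low-quality ones
print(trim_read(read, quality))
```

Cutting at the start of the first failing window is the simplest possible choice; a real trimmer may place the cut more precisely within that window.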
The Spliced Transcripts Alignment to a Reference (STAR) software is based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences.
BioRuby comes with a comprehensive set of Ruby development tools and libraries for bioinformatics and molecular biology. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO.
Bio++ is a set of C++ libraries for Bioinformatics, including sequence analysis, phylogenetics, molecular evolution and population genetics. This package provides command line tools using the Bio++ library.
NanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies’ MinION, GridION or PromethION instruments, or Pacific Biosciences RSII or Sequel sequencers.
Bio-vcf provides a DSL for processing the VCF format. Record named fields can be queried with regular expressions. Bio-vcf is a new generation VCF parser, filter and converter. Bio-vcf is not only very fast for genome-wide (WGS) data, it also comes with a filtering, evaluation and rewrite language and can output any type of textual data, including VCF header and contents in RDF and JSON.
PySnpTools is a library for reading and manipulating genetic data. It can, for example, efficiently read whole PLINK *.bed/bim/fam files or parts of those files. It can also efficiently manipulate ranges of integers using set operators such as union, intersection, and difference.
This package provides a toolbox to process, analyze and visualize spatial single-cell expression data.
Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data. It was developed for use in a clinical research setting, so short runtimes and high sensitivity were important design criteria. It is based on the fast STAR aligner, and the post-alignment runtime is typically just around two minutes. In contrast to many other fusion detection tools that build on STAR, Arriba does not require reducing the alignIntronMax parameter of STAR to detect small deletions.
Blasr is a genomic sequence aligner for processing PacBio long reads.
This package contains the Battenberg R package for subclonal copy number estimation, as described by Nik-Zainal et al.
Express Beta Diversity (EBD) calculates ecological beta diversity (dissimilarity) measures between biological communities. EBD implements a variety of diversity measures including those that make use of phylogenetic similarity of community members.
SeqAn is a C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data. It contains algorithms and data structures for string representation and their manipulation, online and indexed string search, efficient I/O of bioinformatics file formats, sequence alignment, and more.
Samtools implements various utilities for post-processing nucleotide sequence alignments in the SAM, BAM, and CRAM formats, including indexing, variant calling (in conjunction with bcftools), and a simple alignment viewer.
Sailfish is a tool for genomic transcript quantification from RNA-seq data. It requires a set of target transcripts (either from a reference or de-novo assembly) to quantify. All you need to run sailfish is a fasta file containing your reference transcripts and a (set of) fasta/fastq file(s) containing your reads.
This package provides a fast and accurate analysis toolkit for single cell ATAC-seq (Assay for transposase-accessible chromatin using sequencing). Single cell ATAC-seq can resolve the heterogeneity of a complex tissue and reveal cell-type specific regulatory landscapes. However, the extreme sparsity of the data poses unique challenges for analysis. The r-snapatac package is an end-to-end bioinformatics pipeline for analyzing large-scale single cell ATAC-seq data, which includes quality control, normalization, clustering analysis, differential analysis, motif inference and exploration of single cell ATAC-seq sequencing data.
Mash is a fast sequence distance estimator that uses the MinHash algorithm and is designed to work with genomes and metagenomes in the form of assemblies or reads.
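To make the MinHash idea concrete, here is a minimal Python sketch of a bottom-s sketch, the Jaccard estimate derived from two sketches, and a conversion of that estimate to a distance. It is an illustration under stated assumptions, not Mash's implementation: SHA-1 stands in for the hash function, k-mers are not canonicalized, the function names and the tiny k and sketch size are chosen for the example, and the distance formula D = -(1/k) ln(2j / (1 + j)) is the one given in the Mash publication as far as I am aware.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """64-bit hash of a k-mer (SHA-1 is a stand-in chosen for this sketch)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k=5, size=50):
    """Bottom-s sketch: the `size` smallest k-mer hashes of seq."""
    return sorted(hash64(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(a, b, size=50):
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(a) | set(b))[:size]  # smallest hashes of the union
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j, k):
    """Convert a Jaccard estimate j to a distance for k-mer size k."""
    if j <= 0:
        return 1.0
    return -math.log(2 * j / (1 + j)) / k


s1 = sketch("ACGTACGTTGCAACGTTAGC")
s2 = sketch("ACGTACGTTGCAACGTTAGG")
j = jaccard_estimate(s1, s2)
print(j, mash_distance(j, k=5))
```

Real inputs would use a much larger k and sketch size than these toy values; the small defaults only keep the example readable.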