Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
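As a minimal sketch, the request URL for this endpoint could be assembled as follows. The base URL `https://example.org` and the function name `build_search_url` are placeholders chosen for illustration, not part of the service; only the path and the three query parameters come from the description above. The names of the pagination headers are not stated here, so the sketch does not try to read them.

```python
from urllib.parse import urlencode

# Placeholder host: substitute the address of the actual service.
BASE_URL = "https://example.org"


def build_search_url(query, page=1, limit=20, base_url=BASE_URL):
    """Build the GET URL for /api/packages with search, page and limit."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # urlencode percent-encodes reserved characters such as '@' in 'gcc@10'.
    params = urlencode({"search": query, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{params}"


if __name__ == "__main__":
    print(build_search_url("hello"))
    print(build_search_url("gcc@10", page=2, limit=50))
```

The resulting URL can be passed to any HTTP client; the pagination details would then be read from whatever headers the response carries.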
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
SAIGE is a package for efficiently controlling for case-control imbalance and sample relatedness in single-variant association tests (SAIGE) and controlling for sample relatedness in region-based association tests in large cohorts and biobanks (SAIGE-GENE+).
Salad is a schema language for describing JSON or YAML structured linked data documents. Salad schema describes rules for preprocessing, structural validation, and hyperlink checking for documents described by a Salad schema. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. Salad provides a bridge between document and record oriented data modeling and the Semantic Web.
python-cwlformat is a specification and a reference implementation for a very opinionated CWL code formatter. It outputs CWL in a standardized YAML format.
SQUID is Sean Eddy's personal library of C functions and utility programs for sequence analysis.
DoubletFinder identifies doublets by generating artificial doublets from existing scRNA-seq data and defining which real cells preferentially co-localize with artificial doublets in gene expression space. Other DoubletFinder package functions are used for fitting DoubletFinder to different scRNA-seq datasets. For example, ideal DoubletFinder performance in real-world contexts requires optimal pK selection and homotypic doublet proportion estimation. pK selection is achieved using pN-pK parameter sweeps and maxima identification in mean-variance-normalized bimodality coefficient distributions. Homotypic doublet proportion estimation is achieved by finding the sum of squared cell annotation frequencies.
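The homotypic doublet proportion described above, the sum of squared cell annotation frequencies, can be sketched in a few lines. This is an illustration of the arithmetic only, not DoubletFinder's own code (DoubletFinder is an R package); the function name `homotypic_proportion` is hypothetical.

```python
from collections import Counter


def homotypic_proportion(annotations):
    """Return the sum of squared annotation frequencies.

    `annotations` holds one cell-type label per cell.
    """
    total = len(annotations)
    if total == 0:
        raise ValueError("at least one annotated cell is required")
    counts = Counter(annotations)
    # Frequency of each annotation, squared, then summed over annotations.
    return sum((n / total) ** 2 for n in counts.values())


if __name__ == "__main__":
    # Two equally frequent cell types: 0.5**2 + 0.5**2 = 0.5
    print(homotypic_proportion(["A", "A", "B", "B"]))
```

With a single annotation the value is 1, and it falls as cells are spread over more, evenly sized annotations.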
This is a package for including transposable elements in differential enrichment analysis of sequencing datasets. TEtranscripts and TEcount take RNA-seq (and similar) data and annotate reads to both genes and transposable elements. TEtranscripts then performs differential analysis using DESeq2. Note that TEtranscripts and TEcount rely on specially curated GTF files, which are not included due to their size.
Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format. Different from mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final unitig sequences. Thus the per-base error rate is similar to the raw input reads.
VSEARCH supports DNA sequence searching, clustering, chimera detection, dereplication, pairwise alignment, shuffling, subsampling, sorting and masking. The tool takes advantage of parallelism in the form of SIMD vectorization as well as multiple threads to perform accurate alignments at high speed. VSEARCH uses an optimal global aligner (full dynamic programming Needleman-Wunsch).
ctxcore is part of the SCENIC suite of tools. It provides core functions for pycisTarget and SCENIC.
Samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. When marking duplicates, samblaster will require approximately 20MB of memory per 1M read pairs.
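The memory figure quoted above (approximately 20MB per 1M read pairs when marking duplicates) gives a quick back-of-the-envelope estimate. The helper below is a hypothetical convenience, not part of samblaster, and assumes the requirement scales linearly with the number of read pairs.

```python
# Approximate figure taken from the description above.
MB_PER_MILLION_READ_PAIRS = 20


def estimate_memory_mb(read_pairs):
    """Rough memory estimate in MB for marking duplicates in `read_pairs` pairs."""
    if read_pairs < 0:
        raise ValueError("read_pairs must be non-negative")
    return MB_PER_MILLION_READ_PAIRS * read_pairs / 1_000_000


if __name__ == "__main__":
    # 50 million read pairs: 20 * 50 = 1000 MB, roughly 1 GB
    print(estimate_memory_mb(50_000_000))
```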
Biosoup is a C++ collection of header-only data structures used for storage and logging in bioinformatics tools.
PiGX ChIPseq is an analysis pipeline for preprocessing, peak calling and reporting for ChIP sequencing experiments. It is easy to use and produces high quality reports. The inputs are read files from the sequencing experiment and a configuration file that describes the experiment. In addition to quality control of the experiment, the pipeline enables setting up multiple peak calling analyses and allows the generation of a UCSC track hub in an easily configurable manner.
This is a fast parser for minimap2 PAF (Pairwise mApping Format) files.
libmaus2 is a collection of data structures and algorithms. It contains:
I/O classes (single byte and UTF-8);
bitio classes (input, output and various forms of bit level manipulation);
text indexing classes (suffix and LCP array, fulltext and minute (FM), etc.);
BAM sequence alignment files input/output (simple and collating); and
many lower level support classes.
This is a package providing efficient operations for single cell ATAC-seq fragments and RNA counts matrices. It is interoperable with standard file formats, and introduces efficient bit-packed formats that allow large storage savings and increased read speeds.
Logomaker is a Python package for generating publication-quality sequence logos. Logomaker can generate both standard and highly customized logos illustrating the properties of DNA, RNA, or protein sequences. Logos are rendered as vector graphics embedded within native matplotlib Axes objects, making them easy to style and incorporate into multi-panel figures.
modbedtools is a Python command-line tool to generate modbed files for visualization on the WashU Epigenome Browser.
This package provides a library and collection of scripts to work with Illumina paired-end data (for CASAVA 1.8+).
Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq). It analyzes alignment files generated following the assay and produces a file detailing peaks of significant enrichment.
NanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies’ MinION, GridION or PromethION instruments, or Pacific Biosciences RSII or Sequel sequencers.
Isolator analyzes RNA-Seq experiments. Isolator has a particular focus on producing stable, consistent estimates. It implements a full hierarchical Bayesian model of an entire RNA-Seq experiment. It saves all the samples generated by the sampler, which can be processed to compute posterior probabilities for arbitrarily complex questions, far beyond the confines of pairwise tests. It aggressively corrects for technical effects, such as random priming bias, GC-bias, 3' bias, and fragmentation effects. Compared to other MCMC approaches, it is exceedingly efficient, though generally slower than modern maximum likelihood approaches.
CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
FAN-C provides a pipeline for analysing Hi-C data starting at mapped paired-end sequencing reads.
SeqAn is a C++ library of efficient algorithms and data structures for the analysis of sequences, with a focus on biological data. It contains algorithms and data structures for string representation and manipulation, online and indexed string search, efficient I/O of bioinformatics file formats, sequence alignment, and more.