Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
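As a rough illustration of the request above, here is a minimal Python sketch. The host in BASE_URL is a placeholder (the page does not state the base URL), and the helper names build_search_url and fetch_packages are made up for this example. The names of the pagination headers are not documented here, so the sketch simply returns all response headers for inspection.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder host, replace it with the address of this site.
BASE_URL = "https://example.invalid"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Build the GET /api/packages URL from the three documented parameters."""
    # urlencode percent-encodes special characters, so "gcc@10" becomes "gcc%4010".
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


def fetch_packages(search, page=1, limit=20, base_url=BASE_URL):
    """Send the request and return (raw_body, headers).

    The pagination details are expected somewhere in the headers; the exact
    header names are not given on this page, so all of them are returned.
    """
    with urlopen(build_search_url(search, page, limit, base_url)) as response:
        return response.read(), dict(response.headers.items())


if __name__ == "__main__":
    # Only prints the URL; call fetch_packages() once BASE_URL points at a real host.
    print(build_search_url("gcc@10"))
```

Keeping the URL construction separate from the network call makes it easy to check the query string before sending anything.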
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides data structures, algorithms and educational resources for bioinformatics.
This package implements a bioinformatics algorithm for demultiplexing multiplexed single cell datasets. It is built on a statistical model of tag read counts derived from the physical mechanism of tag cross-contamination.
WhatsHap is software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.
This package provides bioinformatic tools to align, deduplicate, reformat, filter and normalize DNA and RNA-seq data. It includes the following tools: BBMap, a short read aligner for DNA and RNA-seq data; BBNorm, a kmer-based error-correction and normalization tool; Dedupe, a tool to simplify assemblies by removing duplicate or contained subsequences that share a target percent identity; Reformat, to convert reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64, at over 500 MB/s; and BBDuk, a tool to filter, trim, or mask reads with kmer matches to an artifact/contaminant file.
ravanan is a CWL implementation that is powered by GNU Guix and provides strong reproducibility guarantees. ravanan provides strong caching of intermediate results so the same step of a workflow is never run twice. ravanan captures logs from every step of the workflow for easy tracing back in case of job failures. ravanan currently runs on single machines and on slurm via its API.
MoFax is a Python package for transcription factor motif analysis. It provides convenience functions to load and visualize factor models trained with MOFA+ in Python.
This package contains the Battenberg R package for subclonal copy number estimation, as described by Nik-Zainal et al.
ChIPKernels is an R package for building different string kernels used for DNA sequence analysis. A dictionary of the desired kernel must be built, and this dictionary can then be used for determining kernels for DNA sequences.
SeqGL is a group lasso based algorithm to extract transcription factor sequence signals from ChIP, DNase and ATAC-seq profiles. This package presents a method which uses group lasso to discriminate between bound and non-bound genomic regions to accurately identify transcription factors bound at the specific regions.
Scregseg (Single-Cell REGulatory landscape SEGmentation) is a tool that facilitates the analysis of single cell ATAC-seq data by an HMM-based segmentation algorithm. Scregseg uses an HMM with Dirichlet-Multinomial emission probabilities to segment the genome either according to distinct relative cross-cell accessibility profiles or (after collapsing the single-cell tracks to pseudo-bulk tracks) to capture distinct cross-cluster accessibility profiles.
ccwl is a concise syntax to express CWL workflows. It is also a compiler that generates CWL workflows from concise descriptions written in ccwl. It is implemented as an EDSL in the Scheme programming language.
This package provides an assortment of R functions that are suitable for all types of microbial diversity analyses.
This tool detects batch effects in high-dimensional data based on a chi^2 test.
Delly is an integrated structural variant prediction method that can discover and genotype deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome.
Samtools implements various utilities for post-processing nucleotide sequence alignments in the SAM, BAM, and CRAM formats, including indexing, variant calling (in conjunction with bcftools), and a simple alignment viewer.
Bio::Kseq provides Ruby bindings to the kseq.h FASTA and FASTQ parsing code. It provides a fast iterator over sequences and their quality scores.
MACS (Model-based Analysis of ChIP-Seq) is an implementation of a ChIP-Seq analysis algorithm for identifying transcription factor binding sites. MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and it improves the spatial resolution of binding sites by combining the information of both sequencing tag position and orientation.
Splicekit is a modular platform for splicing analysis from short-read RNA-seq datasets. The platform also integrates pybio for genomic operations and scanRBP for RNA-protein binding studies. The whole analysis is self-contained (one single directory) and the platform is written in Python, in a modular way.
MOSAIK is a program for mapping second- and third-generation sequencing reads to a reference genome. MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific Biosciences SMRT.
This package implements the custom CRAM codecs used for "EXTERNAL" block types. These consist of two variants of the rANS codec (8-bit and 16-bit renormalisation, with run-length encoding and bit-packing also supported in the latter), a dynamic arithmetic coder, and custom codecs for name/ID compression and quality score compression derived from fqzcomp.
Telomerecat is a tool for estimating the average telomere length (TL) for a paired end, whole genome sequencing (WGS) sample.
Telomerecat is adaptable, accurate and fast. The algorithm accounts for sequencing amplification artifacts, aneuploidy (common in cancer samples) and noise generated by WGS. For a high coverage WGS BAM file of around 100 GB, telomerecat can produce an estimate in ~1 hour.
This package provides a toolbox to process, analyze and visualize spatial single-cell expression data.
Entrez Direct (EDirect) is a method for accessing the National Center for Biotechnology Information's (NCBI) set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a terminal. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process.
EDirect also provides an argument-driven function that simplifies the extraction of data from document summaries or other results that are returned in structured XML format. This can eliminate the need for writing custom software to answer ad hoc questions.
The Shaman package implements functions for resampling Hi-C matrices in order to generate expected contact distributions given constraints on marginal coverage and contact-distance probability distributions. The package also provides support for visualizing normalized matrices and statistical analysis of contact distributions around selected landmarks.