Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
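As a rough illustration of the request above, the following Python sketch builds the query URL from the three documented parameters and shows one way to read the response headers. The base URL `https://example.org` is a placeholder, since this page does not state the API's host, and the helper names `build_search_url` and `fetch_page` are hypothetical. The names of the pagination headers and the format of the response body are not documented here either, so the sketch returns all headers and the raw body rather than guessing.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Placeholder host (assumption): the page does not state the API's base URL.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Return the GET URL for /api/packages with the documented parameters."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # urlencode percent-encodes reserved characters such as '@'.
    query = urlencode({"search": search, "page": page, "limit": limit})
    return urljoin(base_url, "/api/packages") + "?" + query


def fetch_page(search, page=1, limit=20, base_url=BASE_URL):
    """Perform the request; return the raw body and all response headers.

    The pagination header names are not documented on this page, so the
    caller should inspect the returned dictionary to find them.
    """
    with urlopen(build_search_url(search, page, limit, base_url)) as response:
        return response.read(), dict(response.headers.items())


print(build_search_url("hello"))
```

Whether the API accepts the same @ version syntax as the search form is an assumption; if it does, a query such as gcc@10 is percent-encoded automatically by `build_search_url("gcc@10")`.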
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides a method to detect and enable removal of doublets from single-cell RNA-sequencing.
HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis. It is a collection of command line programs written in Perl and C++. HOMER was primarily written as a de novo motif discovery algorithm and is well suited for finding 8-20 bp motifs in large scale genomics data. HOMER contains many useful tools for analyzing ChIP-Seq, GRO-Seq, RNA-Seq, DNase-Seq, Hi-C and numerous other types of functional genomics sequencing data sets.
SAIGE is a package for efficiently controlling for case-control imbalance and sample relatedness in single-variant association tests (SAIGE) and for controlling for sample relatedness in region-based association tests in large cohorts and biobanks (SAIGE-GENE+).
Smithlab CPP is a C++ library that includes functions used in many of the Smith lab bioinformatics projects, such as a wrapper around Samtools data structures, classes for genomic regions, mapped sequencing reads, etc.
FLASH (Fast Length Adjustment of SHort reads) is a tool to merge paired-end reads from next-generation sequencing experiments. FLASH is designed to merge pairs of reads when the original DNA fragments are shorter than twice the length of reads. The resulting longer reads can significantly improve genome assemblies. They can also improve transcriptome assembly when FLASH is used to merge RNA-seq data.
SQUID is Sean Eddy's personal library of C functions and utility programs for sequence analysis.
MOFA is a factor analysis model that provides a general framework for the integration of multi-omic data sets in an unsupervised fashion. Intuitively, MOFA can be viewed as a versatile and statistically rigorous generalization of principal component analysis to multi-omics data. Given several data matrices with measurements of multiple -omics data types on the same or on overlapping sets of samples, MOFA infers an interpretable low-dimensional representation in terms of a few latent factors. These learnt factors represent the driving sources of variation across data modalities, thus facilitating the identification of cellular states or disease subgroups.
Piranha is a peak-caller for genomic data produced by CLIP-seq and RIP-seq experiments. It takes input in BED or BAM format and identifies regions of statistically significant read enrichment. Additional covariates may optionally be provided to further inform the peak-calling process.
TSIS is used for detecting transcript isoform switches in time-series data. Transcript isoform switches occur when a pair of alternatively spliced isoforms reverse the order of their relative expression levels. TSIS characterizes the transcript switch by defining the isoform switch time-points for any pair of transcript isoforms within a gene. In addition, this tool describes the switch using five different features or metrics. It also filters the results according to the user's specifications and visualizes them in different plots so that the user can examine the switches in further detail.
This package provides a collection of useful functions for working with DNA methylation micro-array data.
Kraken is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken examines the k-mers within a query sequence and uses the information within those k-mers to query a database. That database maps k-mers to the lowest common ancestor (LCA) of all genomes known to contain a given k-mer.
Telomerecat is a tool for estimating the average telomere length (TL) for a paired end, whole genome sequencing (WGS) sample.
Telomerecat is adaptable, accurate and fast. The algorithm accounts for sequencing amplification artifacts, aneuploidy (common in cancer samples) and noise generated by WGS. For a high coverage WGS BAM file of around 100 GB, Telomerecat can produce an estimate in ~1 hour.
This program searches for and removes remnant adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low quality bases from the 3' end of reads following adapter removal. AdapterRemoval can analyze both single-end and paired-end data, and can be used to merge overlapping paired-end reads into (longer) consensus sequences. Additionally, AdapterRemoval may be used to recover a consensus adapter sequence for paired-end data for which this information is not available.
This package provides an automated pipeline for spatial mapping of unique transcripts.
Kaiju is a program for sensitive taxonomic classification of high-throughput sequencing reads from metagenomic whole genome sequencing experiments.
This package computes informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays.
Parabam is a tool for processing sequencing files in parallel. It uses Python's native multiprocessing framework to apply a user defined rule on an input file.
This package provides tools for dealing with Unique Molecular Identifiers (UMIs) and Random Molecular Tags (RMTs) in genetic sequences. There are six tools: the extract and whitelist commands are used to prepare a fastq containing UMIs +/- cell barcodes for alignment. The remaining commands, group, dedup, and count/count_tab, are used to identify PCR duplicates using the UMIs and perform different levels of analysis depending on the needs of the user.
Sambamba is a high-performance, modern, robust, and fast tool (and library), written in the D programming language, for working with SAM and BAM files. Its current parallelised functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth.
Kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment. Pseudoalignment of reads preserves the key information needed for quantification, and kallisto is therefore not only fast, but also as accurate as existing quantification tools.
Ngesh is a Python library and CLI tool for simulating phylogenetic trees and data. It is intended for benchmarking phylogenetic methods, especially in historical linguistics and stemmatology. The generation of stochastic phylogenetic trees also goes by the name simulation methods for phylogenetic trees, synthetic data generation, or just phylogenetic tree simulation.
This package lets you read and write files in Generic Feature Format (GFF) with Biopython integration.
RSeQC provides a number of modules that can comprehensively evaluate high throughput sequence data, especially RNA-seq data. Some basic modules inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while RNA-seq specific modules evaluate sequencing saturation, mapped reads distribution, coverage uniformity, strand specificity, etc.
Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired read information, if available, to retrieve the repeated areas between contigs.