Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
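As a rough sketch, the request above could be assembled from Python with the standard library alone. The host is not given here, so `https://example.org` is a placeholder, and the names `build_search_url` and `fetch_page` are hypothetical helpers introduced only for illustration. The names of the pagination headers are also not stated above, so the sketch returns every response header and leaves it to the caller to inspect them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, page=1, limit=20):
    """Build a /api/packages search URL (hypothetical helper).

    base_url is an assumption: substitute the address of the site
    that actually serves the API.
    """
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # urlencode percent-encodes reserved characters such as '@'.
    params = urlencode({"search": query, "page": page, "limit": limit})
    return f"{base_url.rstrip('/')}/api/packages?{params}"


def fetch_page(url):
    """Fetch one page; return the raw body and all response headers.

    The pagination header names are not documented above, so every
    header is returned as a plain dict for the caller to examine.
    """
    with urlopen(url) as response:
        return response.read(), dict(response.headers.items())


if __name__ == "__main__":
    # Placeholder host; this only prints the URL and sends nothing.
    print(build_search_url("https://example.org", "hello", page=1, limit=20))
```

Keeping URL construction separate from the network call makes the query parameters easy to check without contacting the server.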
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
python-cwl-upgrader is a standalone upgrader for CWL documents from version draft-3, v1.0, and v1.1 to v1.2.
This package generates a Miami plot with centered chromosome labels. The output is a ggplot2 object. Users can specify which data they want plotted on top vs. bottom, whether to display significance line(s), what colors to give chromosomes, and what points to label.
Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq). It analyzes alignment files generated following the assay and produces a file detailing peaks of significant enrichment.
Infernal ("INFERence of RNA ALignment") is a tool for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence.
Entrez Direct (EDirect) is a method for accessing the National Center for Biotechnology Information's (NCBI) set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a terminal. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process.
EDirect also provides an argument-driven function that simplifies the extraction of data from document summaries or other results that are returned in structured XML format. This can eliminate the need for writing custom software to answer ad hoc questions.
MafFilter is a program dedicated to the analysis of genome alignments. It parses and manipulates MAF files as well as simpler FASTA files. This package can be used to design a pipeline as a series of consecutive filters, each performing a dedicated analysis. Many filters are available, from alignment cleaning to phylogeny reconstruction and population genetics analysis. Besides various filtering options and format conversion tools, MafFilter can compute a wide range of statistics (phylogenetic trees, nucleotide diversity, inference of selection, etc.).
Samtools implements various utilities for post-processing nucleotide sequence alignments in the SAM, BAM, and CRAM formats, including indexing, variant calling (in conjunction with bcftools), and a simple alignment viewer.
Pypairix is a Python module for fast querying on a pairix-indexed bgzipped text file that contains a pair of genomic coordinates per line.
eXpress is a streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences. Example applications include transcript-level RNA-Seq quantification, allele-specific/haplotype expression analysis (from RNA-Seq), transcription factor binding quantification in ChIP-Seq, and analysis of metagenomic data.
SAIGE is a package for efficiently controlling for case-control imbalance and sample relatedness in single-variant association tests (SAIGE) and controlling for sample relatedness in region-based association tests in large cohorts and biobanks (SAIGE-GENE+).
This package provides a new batch effect correction method based on Projection to Latent Structures Discriminant Analysis named “PLSDA-batch” to correct data prior to any downstream analysis. PLSDA-batch estimates latent components related to treatment and batch effects to remove batch variation. The method is multivariate, non-parametric and performs dimension reduction. Combined with centered log ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far.
PHYLIP (the PHYLogeny Inference Package) is a package of programs for inferring phylogenies (evolutionary trees).
This package provides a method to detect and enable removal of doublets from single-cell RNA-sequencing data.
PyEGA3 is a tool for viewing and downloading files from authorized EGA datasets. It uses the EGA data API and has several key features:
Files are transferred over secure HTTPS connections and received unencrypted, so no decryption is needed after download.
Downloads resume from where they left off in the event that the connection is interrupted.
Supports file segmenting and parallelized download of segments, improving overall performance.
After download completes, file integrity is verified using checksums.
Implements the GA4GH-compliant htsget protocol for download of genomic ranges for data files with accompanying index files.
Psupertime is supervised pseudotime for single cell RNAseq data. It uses single cell RNAseq data, where the cells have a known ordering. This ordering helps to identify a small number of genes which place cells in that known order. It can be used for discovery of relevant genes, for identification of subpopulations, and characterization of further unknown or differently labelled data.
This package provides a collection of useful functions for working with DNA methylation micro-array data.
MOSAIK is a program for mapping second- and third-generation sequencing reads to a reference genome. MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific Biosciences SMRT.
Circus is an R package for annotation, analysis and visualization of circRNA data. Users can annotate their circRNA candidates with host genes, gene features they are spliced from, and discriminate between known and yet unknown splice junctions. Circular-to-linear ratios of circRNAs can be calculated, and a number of descriptive plots easily generated.
Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format. Different from mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final unitig sequences. Thus the per-base error rate is similar to the raw input reads.
Biopython is a set of tools for biological computation including parsers for bioinformatics files into Python data structures; interfaces to common bioinformatics programs; a standard sequence class and tools for performing common operations on them; code to perform data classification; code for dealing with alignments; code making it easy to split up parallelizable tasks into separate processes; and more.
The FASTX-Toolkit is a collection of command line tools for preprocessing short-read FASTA/FASTQ files.
Next-generation sequencing machines usually produce FASTA or FASTQ files containing multiple short-read sequences. The main processing of such FASTA/FASTQ files is mapping the sequences to reference genomes. However, it is sometimes more productive to preprocess the files before mapping the sequences to the genome, manipulating the sequences to produce better mapping results. The FASTX-Toolkit tools perform some of these preprocessing tasks.
MinCED is a program to find Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in DNA sequences. It can be used for unassembled metagenomic reads, but is mainly designed for full genomes and assembled metagenomic sequence.
Fluff is a Python package that contains several scripts to produce pretty, publication-quality figures for next-generation sequencing experiments.
CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.