Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number and limit is the number of items on a single page. Pagination information (such as the number of pages) is returned in the response headers.
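A minimal sketch of calling this endpoint from Python, using only the standard library. The host in BASE_URL is a placeholder (the text above gives only the path), and the names of the pagination headers are not documented here, so the sketch returns all response headers instead of guessing at them. It also assumes the response body is JSON, which the text above does not state.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder host: replace with the address of the site serving this page.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Build the URL for GET /api/packages with the three documented parameters."""
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


def search_packages(search, page=1, limit=20, base_url=BASE_URL):
    """Fetch one page of results.

    Returns (body, headers). The body is parsed as JSON (an assumption);
    the headers are returned in full because the pagination header names
    are not specified.
    """
    with urlopen(build_search_url(search, page, limit, base_url)) as response:
        body = json.loads(response.read().decode("utf-8"))
        headers = dict(response.headers)
    return body, headers


# Only builds and prints the URL; no network request is made here.
print(build_search_url("hello"))
```

Calling search_packages("hello") against the real host and inspecting the returned headers is the simplest way to find out which ones carry the page count.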
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
Picard is a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Picard is implemented using the HTSJDK Java library to support accessing file formats that are commonly used for high-throughput sequencing data such as SAM, BAM, CRAM and VCF.
MinCED is a program to find Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in DNA sequences. It can be used for unassembled metagenomic reads, but is mainly designed for full genomes and assembled metagenomic sequence.
This is a package for including transposable elements in differential enrichment analysis of sequencing datasets. TEtranscripts and TEcount take RNA-seq (and similar) data and annotate reads to both genes and transposable elements. TEtranscripts then performs differential analysis using DESeq2. Note that TEtranscripts and TEcount rely on specially curated GTF files, which are not included due to their size.
Bio++ is a set of C++ libraries for Bioinformatics, including sequence analysis, phylogenetics, molecular evolution and population genetics. This library provides sequence-related modules.
t-Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization of high dimensional datasets. A popular implementation of t-SNE uses the Barnes-Hut algorithm to approximate the gradient at each iteration of gradient descent. This implementation differs in these ways:
Instead of approximating the N-body simulation using Barnes-Hut, we interpolate onto an equispaced grid and use FFT to perform the convolution.
Instead of computing nearest neighbors using vantage-point trees, we approximate nearest neighbors using the Annoy library. The neighbor lookups are multithreaded to take advantage of machines with multiple cores.
This package addresses the challenge of handling the large amounts of data that are now routinely generated by DNA sequencing centers. deepTools contains useful modules to process mapped reads data for multiple quality checks and to create normalized coverage files in the standard bedGraph and bigWig file formats, which allow comparison between different files. Finally, using such normalized and standardized files, deepTools can create many publication-ready visualizations to identify enrichments and for functional annotations of the genome.
BayesPrism includes deconvolution and embedding learning modules. The deconvolution module models a prior from cell type-specific expression profiles from scRNA-seq to jointly estimate the posterior distribution of cell type composition and cell type-specific gene expression from bulk RNA-seq expression of tumor samples. The embedding learning module uses expectation-maximization (EM) to approximate the tumor expression using a linear combination of malignant gene programs while conditioning on the inferred expression and fraction of non-malignant cells estimated by the deconvolution module.
SCENIC (Single-cell regulatory network inference and clustering) is an R package to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.
This package removes batch effects from a taxa read count table using a conditional quantile regression method. The distributional attributes of microbiome data (zero-inflation and over-dispersion) are considered simultaneously.
PAML (for Phylogenetic Analysis by Maximum Likelihood) contains a few programs for model fitting and phylogenetic tree reconstruction using nucleotide or amino-acid sequence data.
This is a collection of utility functions for Seurat. These functions allow the automation and multiplexing of plotting, 3D plotting, visualization of statistics and QC, and interaction with the Seurat object. Some functionalities require functions from the CodeAndRoll and MarkdownReports libraries.
SortMeRNA is a biological sequence analysis tool for filtering, mapping and OTU picking of NGS reads. The core algorithm is based on approximate seeds and allows for fast and sensitive analyses of nucleotide sequences. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data.
Mudskipper is a tool for projecting genomic alignments to transcriptomic coordinates.
Bio-locus is a tabix-like tool for fast querying of genome locations. Many file formats in bioinformatics contain records that start with a chromosome name and a position for a SNP, or a start-end position for indels. Bio-locus allows users to store this chr+pos or chr+pos+alt information in a database.
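The idea of storing chr+pos or chr+pos+alt keys in a database and querying them later can be sketched as follows. This is only an illustration of the concept using Python's built-in sqlite3 module. It is not Bio-locus's own storage format or command-line interface, and the table and function names are made up for the example.

```python
import sqlite3


def make_store():
    """Create an in-memory table keyed on chromosome, position and alt allele."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE loci ("
        "chrom TEXT NOT NULL, pos INTEGER NOT NULL, alt TEXT NOT NULL, "
        "PRIMARY KEY (chrom, pos, alt))"
    )
    return conn


def store_locus(conn, chrom, pos, alt=""):
    """Record a locus; an empty alt means only chr+pos is stored."""
    conn.execute("INSERT OR IGNORE INTO loci VALUES (?, ?, ?)", (chrom, pos, alt))


def has_locus(conn, chrom, pos, alt=""):
    """Return True if the exact chr+pos(+alt) key was stored earlier."""
    row = conn.execute(
        "SELECT 1 FROM loci WHERE chrom = ? AND pos = ? AND alt = ?",
        (chrom, pos, alt),
    ).fetchone()
    return row is not None


conn = make_store()
store_locus(conn, "chr1", 12345)        # chr+pos only
store_locus(conn, "chr2", 67890, "T")   # chr+pos+alt
print(has_locus(conn, "chr1", 12345))
print(has_locus(conn, "chr2", 67890, "G"))
```

The primary key on all three columns lets each lookup use an index instead of scanning the table, which is what makes this kind of querying fast.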
ChIPKernels is an R package for building different string kernels used for DNA sequence analysis. A dictionary for the desired kernel must be built first; this dictionary can then be used to determine kernels for DNA sequences.
MultiQC is a tool to aggregate bioinformatics results across many samples into a single report. It contains modules for a large number of common bioinformatics tools.
This package provides string parsing functionalities for generating plotnames, filenames and paths.
Telomerecat is a tool for estimating the average telomere length (TL) for a paired end, whole genome sequencing (WGS) sample.
Telomerecat is adaptable, accurate and fast. The algorithm accounts for sequencing amplification artifacts, aneuploidy (common in cancer samples) and noise generated by WGS. For a high-coverage WGS BAM file of around 100 GB, Telomerecat can produce an estimate in ~1 hour.
MafFilter is a program dedicated to the analysis of genome alignments. It parses and manipulates MAF files as well as simpler FASTA files. This package can be used to design a pipeline as a series of consecutive filters, each performing a dedicated analysis. Many filters are available, from alignment cleaning to phylogeny reconstruction and population genetics analysis. Besides various filtering options and format conversion tools, MafFilter can compute a wide range of statistics (phylogenetic trees, nucleotide diversity, inference of selection, etc.).
BLESS (Bloom-filter-based error correction solution for high-throughput sequencing reads) is a correction tool for genomic reads produced by next-generation sequencing (NGS) that uses a single minimum-sized Bloom filter. BLESS produces accurate correction results with much less memory compared with previous solutions and is also able to tolerate a higher false-positive rate. BLESS can extend reads like DNA assemblers to correct errors at the end of reads.
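For readers unfamiliar with the data structure, here is a minimal Bloom filter sketch that stores the k-mers of a read. It illustrates the general technique only and is not BLESS's implementation: the filter size, the number of hash functions, the use of SHA-256 and the helper names are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """A fixed-size bit array with several hash functions per item.

    Membership tests never give false negatives, but they can give
    false positives.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read, k):
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


seen = BloomFilter()
for kmer in kmers("ACGTACGTGA", 4):
    seen.add(kmer)

print("ACGT" in seen)  # a stored k-mer is always reported as present
```

The memory saving comes from storing only bits, never the k-mers themselves. The cost is that a k-mer that was never added may occasionally be reported as present.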
pySCENIC is a Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
This package provides an implementation of chunked, compressed, N-dimensional arrays for R, following Zarr specification version 2 (2024) <doi:10.5281/zenodo.11320255>.
python-scanrbp is a Python package that provides the scanRBP tool, which loads RNA-protein binding motif PWMs, computes the log-odds scores for all the loaded RBPs across a given genomic sequence, and draws a heatmap of the scores.
ngshmmalign is a profile HMM aligner for NGS reads designed particularly for small genomes (such as those of RNA viruses like HIV-1 and HCV) that experience substantial biological insertions and deletions.