Enter the query into the form above. You can look for a specific version of a package by using the @ symbol, like this: gcc@10.
API method:
GET /api/packages?search=hello&page=1&limit=20
where search is your query, page is the page number, and limit is the number of items per page. Pagination information (such as the number of pages) is returned in the response headers.
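As a rough illustration, the request above could be built and sent from Python as sketched below. The host in BASE_URL is a placeholder and an assumption (this page does not name it), and the helper names build_search_url and fetch_packages are hypothetical. Because the body format and the names of the pagination headers are not documented here, the sketch returns the raw body and all response headers without interpreting them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder host; replace it with the site that serves this page.
BASE_URL = "https://example.org"


def build_search_url(search, page=1, limit=20, base_url=BASE_URL):
    """Build the GET /api/packages URL from the three documented parameters."""
    query = urlencode({"search": search, "page": page, "limit": limit})
    return f"{base_url}/api/packages?{query}"


def fetch_packages(search, page=1, limit=20, base_url=BASE_URL):
    """Send the request and return (raw body bytes, response headers as a dict).

    The pagination details are said to be in the response headers, but their
    names are not documented here, so every header is returned for inspection.
    """
    with urlopen(build_search_url(search, page, limit, base_url)) as response:
        return response.read(), dict(response.headers.items())
```

Calling build_search_url("hello") reproduces the path and query string of the example request; fetch_packages only works once BASE_URL points at the real service.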
If you'd like to join our channel webring, send a patch to ~whereiseveryone/toys@lists.sr.ht adding your channel as an entry in channels.scm.
This package provides different statistical methods to extract biological activities from omics data within a unified framework.
This R package lets you estimate signatures of mutational processes and their activities on mutation count data. Starting from a set of single-nucleotide variants (SNVs), it allows both estimation of the exposure of samples to predefined mutational signatures (including whether the signatures are present at all), and identification of signatures de novo from the mutation counts.
A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides. Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. In order to use the program, the user submits a sequence in FASTA format. The output consists of two files: a repeat table file and an alignment file. Submitted sequences may be of arbitrary length. Repeats with pattern size in the range from 1 to 2000 bases are detected.
The alignment module of BioJava provides an API that contains:
implementations of dynamic programming algorithms for sequence alignment;
reading and writing of popular alignment file formats;
a single- or multi-threaded multiple sequence alignment algorithm.
PiGX scRNAseq is an analysis pipeline for preprocessing and quality control for single cell RNA sequencing experiments. The inputs are read files from the sequencing experiment, and a configuration file which describes the experiment. It produces processed files for downstream analysis and interactive quality reports. The pipeline is designed to work with UMI based methods.
This package lets you perform unsupervised clustering of amplicon sequencing data in microbiome studies with the Dirichlet-tree Multinomial Mixtures.
BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. MetaBAT is an automated metagenome binning software, which integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency.
The Spliced Transcripts Alignment to a Reference (STAR) software is based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences.
BSeq-sc is a bioinformatics analysis pipeline that leverages single-cell sequencing data to estimate cell type proportion and cell type-specific gene expression differences from RNA-seq data from bulk tissue samples. This is a companion package to the publication "A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure." Baron et al. Cell Systems (2016) https://www.ncbi.nlm.nih.gov/pubmed/27667365.
Trinity assembles transcript sequences from Illumina RNA-Seq data. Trinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.
CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts.
This package can be used to normalize cytometry samples when a control sample is taken along in each of the batches. This is done by first identifying multiple clusters/cell types, learning the batch effects from the control samples and applying quantile normalization on all markers of interest.
This helper package implements the HiCMatrix class for the HiCExplorer and pyGenomeTracks packages.
The EIGENSOFT package provides tools for population genetics and stratification correction. EIGENSOFT implements methods commonly used in population genetics analyses such as PCA, computation of Tracy-Widom statistics, and finding related individuals in structured populations. It comes with a built-in plotting script and supports multiple file formats and quantitative phenotypes.
TopHat is a fast splice junction mapper for nucleotide sequence reads produced by the RNA-Seq method. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.
CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
Pairtools is a simple and fast command-line framework to process sequencing data from a Hi-C experiment. It processes paired-end sequence alignments and performs the following operations:
detect ligation junctions (a.k.a. Hi-C pairs) in aligned paired-end sequences of Hi-C DNA molecules
sort .pairs files for downstream analyses
detect, tag and remove PCR/optical duplicates
generate extensive statistics of Hi-C datasets
select Hi-C pairs given flexibly defined criteria
restore .sam alignments from Hi-C pairs.
PhyML is a software package that uses modern statistical approaches to analyse alignments of nucleotide or amino acid sequences in a phylogenetic framework. The main tool in this package builds phylogenies under the maximum likelihood criterion. It implements a large number of substitution models coupled with efficient options to search the space of phylogenetic tree topologies. PhyREX fits the spatial-Lambda-Fleming-Viot model to geo-referenced genetic data. This model is similar to the structured coalescent but assumes that individuals are distributed along a spatial continuum rather than discrete demes. PhyREX can be used to estimate population densities and rates of dispersal. Its output can be processed by treeannotator (from the BEAST package) as well as SPREAD.
JAMM is a peak finder for next generation sequencing datasets (ChIP-Seq, ATAC-Seq, DNase-Seq, etc.) that can integrate replicates and assign peak boundaries accurately. JAMM is applicable to both broad and narrow datasets.
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
MOFA is a factor analysis model that provides a general framework for the integration of multi-omic data sets in an unsupervised fashion. Intuitively, MOFA can be viewed as a versatile and statistically rigorous generalization of principal component analysis to multi-omics data. Given several data matrices with measurements of multiple -omics data types on the same or on overlapping sets of samples, MOFA infers an interpretable low-dimensional representation in terms of a few latent factors. These learnt factors represent the driving sources of variation across data modalities, thus facilitating the identification of cellular states or disease subgroups.
This is an R package that integrates the installation of eight doublet-detection methods. It can also be used to run and benchmark those methods.
PLINK is a whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.