The drizzle library is a Python package for combining dithered images into a single image. This library is derived from code used in DrizzlePac. Like DrizzlePac, most of the code is implemented in the C language. The biggest change from DrizzlePac is that this code passes an array that maps the input to output image into the C code, while the DrizzlePac code computes the mapping by using a Python callback. Switching to using an array allowed the code to be greatly simplified.
python-astroid provides a common base representation of Python source code for projects such as pychecker, pyreverse, pylint, etc. It provides a compatible representation which comes from the _ast module. It rebuilds the tree generated by the builtin _ast module by recursively walking down the AST and building an extended ast. The new node classes have additional methods and attributes for different usages. They include some support for static inference and local name scopes. Furthermore, astroid builds partial trees by inspecting living objects.
Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, M4A, Monkey’s Audio, MP3, Musepack, Ogg FLAC, Ogg Speex, Ogg Theora, Ogg Vorbis, True Audio, WavPack and OptimFROG audio files. All versions of ID3v2 are supported, and all standard ID3v2.4 frames are parsed. It can read Xing headers to accurately calculate the bitrate and length of MP3s. ID3 and APEv2 tags can be edited regardless of audio format. It can also manipulate Ogg streams on an individual packet/page level.
Boltons is a set of over 230 pure-Python utilities in the same spirit as — and yet conspicuously missing from — the standard library, including:
Atomic file saving, bolted on with fileutils
A highly-optimized OrderedMultiDict, in dictutils
Two types of PriorityQueue, in queueutils
Chunked and windowed iteration, in iterutils
Recursive data structure iteration and merging, with iterutils.remap
Exponential backoff functionality, including jitter, through iterutils.backoff
A full-featured TracebackInfo type, for representing stack traces, in tbutils
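The iteration and backoff ideas in the list above can be sketched with the standard library alone. This is not the boltons implementation or its API: the names chunked_sketch, windowed_sketch and backoff_sketch are hypothetical helpers that only illustrate what chunked iteration, windowed iteration and exponential backoff with optional jitter mean.

```python
import random
from itertools import islice


def chunked_sketch(iterable, size):
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def windowed_sketch(iterable, size):
    """Yield overlapping tuples of exactly `size` items (hypothetical helper)."""
    window = []
    for item in iterable:
        window.append(item)
        if len(window) > size:
            window.pop(0)  # drop the oldest item to slide the window
        if len(window) == size:
            yield tuple(window)


def backoff_sketch(start, stop, factor=2.0, jitter=False):
    """Yield delays growing geometrically from `start`, capped at `stop`."""
    if start <= 0 or factor <= 1:
        raise ValueError("start must be positive and factor greater than 1")
    delay, stop = float(start), float(stop)
    while True:
        capped = min(delay, stop)
        # With jitter, pick a random delay below the cap to spread out retries.
        yield random.uniform(0, capped) if jitter else capped
        if delay >= stop:
            return
        delay *= factor


chunks = list(chunked_sketch(range(5), 2))
windows = list(windowed_sketch(range(4), 2))
delays = list(backoff_sketch(1, 10))
print(chunks)   # [[0, 1], [2, 3], [4]]
print(windows)  # [(0, 1), (1, 2), (2, 3)]
print(delays)   # [1.0, 2.0, 4.0, 8.0, 10.0]
```

Jitter is left off in the printed example so that the delays are deterministic; with jitter=True each delay is drawn at random below the capped value.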
This project documents instruction sets in a format convenient for tools development. An instruction set is represented by three files:
an XML file that describes instructions;
an XSD file that describes the structure of the XML file;
a Python module that reads the XML file and represents it as a set of Python objects;
It currently provides descriptions for most user-mode x86, x86_64, and k1om instructions up to AVX-512 and SHA (including 3dnow!+, XOP, FMA3, FMA4, TBM and BMI2).
This package is a Cython wrapper for khash-sets/maps. It brings the functionality of khash to Python and Cython and can be used seamlessly in numpy or pandas. Numpy's world is lacking the concept of a (hash-)set. This package fixes that shortcoming and implements efficient (memory- and speedwise compared to pandas) unique and isin. Python's set and dict have a big memory footprint. For some datatypes, using khash can reduce the overhead by a factor of 4-8.
This library provides fast, memory-efficient, pythonic (and command-line) access to fasta sequence files. It stores a flattened version of a fasta sequence file without spaces or headers and uses either a mmap in numpy binary format or fseek/fread so the sequence data is never read into memory. It saves a pickle (.gdx) of the start and stop (for fseek/mmap) locations of each header in the fasta file for internal use.
Note that this package has been deprecated in favor of pyfaidx.
Natsort lets you apply natural sorting on lists instead of lexicographical sorting. If you use the built-in sorted function in python on a list such as ['a20', 'a9', 'a1', 'a4', 'a10'], it would be returned as ['a1', 'a10', 'a20', 'a4', 'a9']. Natsort provides a function natsorted that identifies numbers and sorts them separately from strings. It can also sort version numbers, real numbers, mixed types and more, and comes with a shell command natsort that exposes this functionality in the command line.
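The difference between the two orderings can be illustrated with a minimal standard-library sketch. It does not use natsort itself: natural_key is a hypothetical helper that handles only unsigned integers, whereas the description above says natsort also covers version numbers, real numbers and mixed types.

```python
import re


def natural_key(text):
    """Split text into alternating string and integer chunks (hypothetical helper)."""
    parts = re.split(r"(\d+)", text)
    # re.split with a capture group puts the digit runs at odd indexes,
    # so those are converted to int and compared numerically.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


names = ["a20", "a9", "a1", "a4", "a10"]
lexicographic = sorted(names)
natural = sorted(names, key=natural_key)
print(lexicographic)  # ['a1', 'a10', 'a20', 'a4', 'a9']
print(natural)        # ['a1', 'a4', 'a9', 'a10', 'a20']
```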
MOFA is a factor analysis model that provides a general framework for the integration of multi-omic data sets in an unsupervised fashion. Intuitively, MOFA can be viewed as a versatile and statistically rigorous generalization of principal component analysis to multi-omics data. Given several data matrices with measurements of multiple -omics data types on the same or on overlapping sets of samples, MOFA infers an interpretable low-dimensional representation in terms of a few latent factors. These learnt factors represent the driving sources of variation across data modalities, thus facilitating the identification of cellular states or disease subgroups.
This Python module can be used to generate and parse RFC 5451/7001/7601 Authentication-Results email headers. It supports extensions such as:
RFC 5617 DKIM/ADSP
RFC 6008 DKIM signature identification (header.b)
RFC 6212 VBR
RFC 6577 SPF
RFC 7281 Authentication-Results registration for S/MIME
RFC 7293 The Require-Recipient-Valid-Since header field
RFC 7489 DMARC
ARC (draft-ietf-dmarc-arc-protocol-08)
This package provides a simple Python test runner for unittest that outputs Test Anything Protocol (TAP) results to standard output. Contrary to other TAP runners for Python, pycotap...
prints TAP (and only TAP) to standard output instead of to a separate file, allowing you to pipe it directly to TAP pretty printers and processors;
only contains a TAP reporter, so no parsers, no frameworks, no dependencies, etc;
is configurable: you can choose how you want the test output and test result diagnostics to end up in your TAP output (as TAP diagnostics, YAML blocks, or attachments).
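What "printing TAP for unittest results" means can be sketched with the standard library alone. This is not pycotap's API: TapResultSketch, make_demo_case and run_as_tap are hypothetical names, and the sketch emits only the basic "ok"/"not ok" lines and the plan line, with no diagnostics, YAML blocks or attachments.

```python
import io
import unittest


class TapResultSketch(unittest.TestResult):
    """Hypothetical result class that writes one TAP line per finished test."""

    def __init__(self, stream):
        super().__init__()
        self.stream = stream
        self.count = 0

    def _report(self, status, test):
        self.count += 1
        name = test.id().split(".")[-1]  # keep only the test method name
        self.stream.write(f"{status} {self.count} - {name}\n")

    def addSuccess(self, test):
        super().addSuccess(test)
        self._report("ok", test)

    def addFailure(self, test, err):
        super().addFailure(test, err)
        self._report("not ok", test)

    def addError(self, test, err):
        super().addError(test, err)
        self._report("not ok", test)


def make_demo_case():
    """Build a small test case with one passing and one failing test."""

    class Demo(unittest.TestCase):
        def test_addition(self):
            self.assertEqual(1 + 1, 2)

        def test_broken(self):
            self.assertEqual(1 + 1, 3)  # fails on purpose

    return Demo


def run_as_tap(test_case_class):
    """Run the tests and return the TAP text, ending with the plan line."""
    stream = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    result = TapResultSketch(stream)
    suite.run(result)
    stream.write(f"1..{result.count}\n")
    return stream.getvalue()


tap_output = run_as_tap(make_demo_case())
print(tap_output, end="")
```

The sketch collects the text in a StringIO buffer so it can be inspected; writing to sys.stdout instead would let the output be piped to a TAP processor, as described above.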
This is the reference implementation of the CWL standards. The CWL open standards are for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry. The cwltool is intended to be feature complete and to provide comprehensive validation of CWL files as well as provide other tools related to working with CWL descriptions.
Pathlib offers a set of classes to handle file system paths. It offers the following advantages over using string objects:
No more cumbersome use of os and os.path functions. Everything can be done easily through operators, attribute accesses, and method calls.
Embodies the semantics of different path types. For example, comparing Windows paths ignores casing.
Well-defined semantics, eliminating any inconsistencies or ambiguities (forward vs. backward slashes, etc.).
Note: As of Python 3.4, pathlib is part of the standard library. For other Python versions, please consider python-pathlib2 instead, which tracks the standard library module. This module (python-pathlib) is no longer maintained.
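The advantages listed above can be illustrated with a short sketch. It uses the pathlib module from the standard library (Python 3.4 and later) rather than this unmaintained package, and the paths are made up for illustration. Pure path classes are used so that nothing touches the file system.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The / operator joins components; no os.path.join needed.
config = PurePosixPath("/etc") / "myapp" / "settings.ini"
print(config)         # /etc/myapp/settings.ini
print(config.name)    # settings.ini
print(config.suffix)  # .ini
print(config.parent)  # /etc/myapp

# Windows path semantics: comparison ignores case and slash direction.
same = PureWindowsPath("C:/Users/Docs") == PureWindowsPath("c:\\users\\docs")

# POSIX path semantics: comparison is case-sensitive.
different = PurePosixPath("/Users/Docs") == PurePosixPath("/users/docs")

print(same)       # True
print(different)  # False
```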
python-pandera provides a flexible and expressive API for performing data validation on dataframe-like objects to make data processing pipelines more readable and robust. Dataframes contain information that python-pandera explicitly validates at runtime. This is useful in production-critical data pipelines or reproducible research settings. With python-pandera, you can:
Define a schema once and use it to validate different dataframe types.
Check the types and properties of columns.
Perform more complex statistical validation like hypothesis testing.
Seamlessly integrate with existing data pipelines via function decorators.
Define dataframe models with the class-based API with pydantic-style syntax.
Synthesize data from schema objects for property-based testing.
Lazily validate dataframes so that all validation rules are executed.
Integrate with a rich ecosystem of tools like python-pydantic, python-fastapi and python-mypy.
Cheetah is a text-based template engine and Python code generator.
Cheetah can be used as a standalone templating utility or referenced as a library from other Python applications. It has many potential uses, but web developers looking for a viable alternative to ASP, JSP, PHP and PSP are expected to be its principal user group.
Features:
Generates HTML, SGML, XML, SQL, Postscript, form email, LaTeX, or any other text-based format.
Cleanly separates content, graphic design, and program code.
Blends the power and flexibility of Python with a simple template language that non-programmers can understand.
Gives template writers full access to any Python data structure, module, function, object, or method in their templates.
Makes code reuse easy by providing an object-orientated interface to templates that is accessible from Python code or other Cheetah templates. One template can subclass another and selectively reimplement sections of it.
Provides a simple, yet powerful, caching mechanism that can dramatically improve the performance of a dynamic website.
Compiles templates into optimized, yet readable, Python code.
Python inotify.
Engine.IO server
Socket.IO server
Python humanize utilities