Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, M4A, Monkey’s Audio, MP3, Musepack, Ogg FLAC, Ogg Speex, Ogg Theora, Ogg Vorbis, True Audio, WavPack and OptimFROG audio files. All versions of ID3v2 are supported, and all standard ID3v2.4 frames are parsed. It can read Xing headers to accurately calculate the bitrate and length of MP3s. ID3 and APEv2 tags can be edited regardless of audio format. It can also manipulate Ogg streams on an individual packet/page level.
Boltons is a set of over 230 pure-Python utilities in the same spirit as — and yet conspicuously missing from — the standard library, including:
Atomic file saving, bolted on with fileutils
A highly-optimized OrderedMultiDict, in dictutils
Two types of PriorityQueue, in queueutils
Chunked and windowed iteration, in iterutils
Recursive data structure iteration and merging, with iterutils.remap
Exponential backoff functionality, including jitter, through iterutils.backoff
A full-featured TracebackInfo type, for representing stack traces, in tbutils
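As a rough illustration of what chunked and windowed iteration mean, here is a standard-library sketch. It is not the boltons implementation; the helper names chunked_sketch and windowed_sketch are hypothetical and exist only for this example.

```python
from collections import deque
from itertools import islice


def chunked_sketch(iterable, size):
    """Yield consecutive lists of up to `size` items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def windowed_sketch(iterable, size):
    """Yield overlapping tuples of exactly `size` items (hypothetical helper)."""
    window = deque(maxlen=size)  # old items fall off the left automatically
    for item in iterable:
        window.append(item)
        if len(window) == size:
            yield tuple(window)


chunks = list(chunked_sketch(range(7), 3))
windows = list(windowed_sketch(range(5), 3))
print(chunks)
print(windows)
```

Chunking splits the input into disjoint pieces (the last one may be shorter), while windowing slides a fixed-size view across the input one item at a time.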
This project documents instruction sets in a format convenient for tools development. An instruction set is represented by three files:
an XML file that describes instructions;
an XSD file that describes the structure of the XML file;
a Python module that reads the XML file and represents it as a set of Python objects.
It currently provides descriptions for most user-mode x86, x86_64, and k1om instructions up to AVX-512 and SHA (including 3dnow!+, XOP, FMA3, FMA4, TBM and BMI2).
This package is a Cython wrapper for khash sets and maps. It brings the functionality of khash to Python and Cython and can be used seamlessly with numpy or pandas. Numpy lacks the concept of a (hash) set; this package fills that gap and implements unique and isin efficiently (in both memory and speed, compared to pandas). Python's set and dict have a large memory footprint; for some data types, using khash can reduce the overhead by a factor of 4-8.
This package provides a Pythonic Jupyter-friendly Python API for the HepMC3 library.
pyhepmc has been optimised for safety, usability, and efficiency by a human expert, something that an automatic tool cannot provide. It brings these unique features:
Python idioms are supported where appropriate.
Simple IO with pyhepmc.open.
An alternative Numpy API which accelerates event processing.
The public API is fully documented with Python docstrings.
Objects are inspectable in Jupyter notebooks.
Events render as graphs in Jupyter notebooks.
Natsort lets you apply natural sorting on lists instead of lexicographical sorting. If you use the built-in sorted function in Python on a list such as [a20, a9, a1, a4, a10], it would be returned as [a1, a10, a20, a4, a9]. Natsort provides a function natsorted that identifies numbers and sorts them separately from strings. It can also sort version numbers, real numbers, mixed types and more, and comes with a shell command natsort that exposes this functionality in the command line.
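The difference between lexicographical and natural ordering can be sketched with the standard library alone. This is only an illustration of the idea, not how natsort is implemented; natural_key is a hypothetical helper that handles plain non-negative integers and none of the other cases natsort supports.

```python
import re


def natural_key(text):
    """Split text into alternating non-digit and digit runs (hypothetical helper).

    re.split with a capturing group always returns text at even indexes and
    digit runs at odd indexes, so the keys compare position by position
    without mixing strings and integers.
    """
    parts = re.split(r"(\d+)", text)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


names = ["a20", "a9", "a1", "a4", "a10"]
lexicographic = sorted(names)
natural = sorted(names, key=natural_key)
print(lexicographic)
print(natural)
```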
MOFA is a factor analysis model that provides a general framework for the integration of multi-omic data sets in an unsupervised fashion. Intuitively, MOFA can be viewed as a versatile and statistically rigorous generalization of principal component analysis to multi-omics data. Given several data matrices with measurements of multiple -omics data types on the same or on overlapping sets of samples, MOFA infers an interpretable low-dimensional representation in terms of a few latent factors. These learnt factors represent the driving sources of variation across data modalities, thus facilitating the identification of cellular states or disease subgroups.
This Python module can be used to generate and parse RFC 5451/7001/7601 Authentication-Results email headers. It supports extensions such as:
RFC 5617 DKIM/ADSP
RFC 6008 DKIM signature identification (header.b)
RFC 6212 VBR
RFC 6577 SPF
RFC 7281 Authentication-Results registration for S/MIME
RFC 7293 The Require-Recipient-Valid-Since header field
RFC 7489 DMARC
ARC (draft-ietf-dmarc-arc-protocol-08)
This package provides a simple Python test runner for unittest that outputs Test Anything Protocol (TAP) results to standard output. Contrary to other TAP runners for Python, pycotap...
prints TAP (and only TAP) to standard output instead of to a separate file, allowing you to pipe it directly to TAP pretty printers and processors;
only contains a TAP reporter, so no parsers, no frameworks, no dependencies, etc;
is configurable: you can choose how you want the test output and test result diagnostics to end up in your TAP output (as TAP diagnostics, YAML blocks, or attachments).
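To make the idea of TAP on standard output concrete, the following standard-library sketch turns unittest results into TAP lines. It is not pycotap's code and does not use its API; TapResultSketch, run_tap_sketch and demo are hypothetical names, and skips, diagnostics and YAML blocks are deliberately left out.

```python
import io
import unittest


class TapResultSketch(unittest.TestResult):
    """Write one TAP result line per finished test (hypothetical class)."""

    def __init__(self, stream):
        super().__init__()
        self.stream = stream
        self.count = 0

    def _report(self, passed, test):
        self.count += 1
        status = "ok" if passed else "not ok"
        name = test.id().rsplit(".", 1)[-1]  # keep only the method name
        self.stream.write(f"{status} {self.count} - {name}\n")

    def addSuccess(self, test):
        super().addSuccess(test)
        self._report(True, test)

    def addFailure(self, test, err):
        super().addFailure(test, err)
        self._report(False, test)

    def addError(self, test, err):
        super().addError(test, err)
        self._report(False, test)


def run_tap_sketch(case_class):
    """Run one TestCase class and return its TAP report as a string."""
    stream = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case_class)
    result = TapResultSketch(stream)
    suite.run(result)
    stream.write(f"1..{result.count}\n")  # TAP plan, emitted at the end
    return stream.getvalue()


def demo():
    """Build a tiny test case with one pass and one failure, then run it."""

    class DemoCase(unittest.TestCase):
        def test_addition(self):
            self.assertEqual(1 + 1, 2)

        def test_broken(self):
            self.assertEqual(1 + 1, 3)

    return run_tap_sketch(DemoCase)


tap_output = demo()
print(tap_output, end="")
```

Because the report is plain text on a single stream, it can be piped straight into any TAP consumer.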
Pyogrio provides a GeoPandas-oriented API to OGR vector data sources, such as ESRI Shapefile, GeoPackage, and GeoJSON. Vector data sources have geometries, such as points, lines, or polygons, and associated records with potentially many columns worth of data. Pyogrio uses a vectorized approach for reading and writing GeoDataFrames to and from OGR vector data sources in order to give you faster interoperability. It uses pre-compiled bindings for GDAL/OGR so that the performance is primarily limited by the underlying I/O speed of data source drivers in GDAL/OGR rather than multiple steps of converting to and from Python data types within Python.
fgivenx is a Python package for plotting posteriors of functions. It is currently used in astronomy, but will be of use to any scientists performing Bayesian analyses which have predictive posteriors that are functions.
This package allows one to plot a predictive posterior of a function, dependent on sampled parameters. It assumes one has a Bayesian posterior Post(theta|D,M) described by a set of posterior samples theta_i~Post. If there is a function parameterised by theta y=f(x;theta), then this script will produce a contour plot of the conditional posterior P(y|x,D,M) in the (x,y) plane.
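The underlying idea can be sketched without any plotting: for each x, evaluate f(x;theta_i) over all posterior samples and summarise the resulting spread of y. The code below is a minimal standard-library illustration, not fgivenx's API or algorithm; the straight-line model, the sample values and the helper name conditional_bands are all assumptions made up for this example, and a (min, median, max) band is a much cruder summary than a contour plot.

```python
import statistics


def line_model(x, theta):
    """Hypothetical model y = f(x; theta): a straight line."""
    slope, intercept = theta
    return slope * x + intercept


def conditional_bands(f, xs, samples):
    """Map each x to (min, median, max) of f(x; theta_i) over the samples."""
    bands = {}
    for x in xs:
        ys = [f(x, theta) for theta in samples]
        bands[x] = (min(ys), statistics.median(ys), max(ys))
    return bands


# Made-up posterior samples theta_i = (slope, intercept).
samples = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
bands = conditional_bands(line_model, [0.0, 1.0, 2.0], samples)
for x, band in bands.items():
    print(x, band)
```

The band widens as x moves away from 0 because the uncertainty in this toy example is entirely in the slope.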
python-pandera provides a flexible and expressive API for performing data validation on dataframe-like objects to make data processing pipelines more readable and robust. Dataframes contain information that python-pandera explicitly validates at runtime. This is useful in production-critical data pipelines or reproducible research settings. With python-pandera, you can:
Define a schema once and use it to validate different dataframe types.
Check the types and properties of columns.
Perform more complex statistical validation like hypothesis testing.
Seamlessly integrate with existing data pipelines via function decorators.
Define dataframe models with the class-based API with pydantic-style syntax.
Synthesize data from schema objects for property-based testing.
Lazily validate dataframes so that all validation rules are executed.
Integrate with a rich ecosystem of tools like python-pydantic, python-fastapi and python-mypy.
WebLogo is a web based application designed to make the generation of sequence logos as easy and painless as possible.
WebLogo can create output in several common graphics formats, including the bitmap formats GIF and PNG, suitable for on-screen display, and the vector formats EPS and PDF, more suitable for printing, publication, and further editing. Additional graphics options include bitmap resolution, titles, optional axis, and axis labels, antialiasing, error bars, and alternative symbol formats.
A sequence logo is a graphical representation of an amino acid or nucleic acid multiple sequence alignment. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. The width of the stack is proportional to the fraction of valid symbols in that position.
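The stack heights described above can be sketched with the usual information-content calculation: the stack height at a position is the maximum entropy of the alphabet minus the observed entropy, and each symbol takes a share of that height proportional to its frequency. The code below is a simplified illustration, not WebLogo's code; it assumes a four-letter nucleic acid alphabet, ignores gaps and stack width, and applies no small-sample correction. The function names are hypothetical.

```python
import math
from collections import Counter


def column_heights(column, alphabet_size=4):
    """Return {symbol: height in bits} for one alignment column."""
    counts = Counter(column)
    total = len(column)
    freqs = {symbol: count / total for symbol, count in counts.items()}
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    conservation = math.log2(alphabet_size) - entropy  # total stack height
    return {symbol: p * conservation for symbol, p in freqs.items()}


def logo_heights(alignment):
    """Compute symbol heights for every position of equal-length sequences."""
    return [column_heights(column) for column in zip(*alignment)]


# Made-up alignment: position 0 is fully conserved, position 1 is uniform.
alignment = ["AA", "AC", "AG", "AT"]
heights = logo_heights(alignment)
print(heights)
```

A fully conserved DNA position reaches the maximum of 2 bits, while a position where all four bases are equally frequent has zero height.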
Cheetah is a text-based template engine and Python code generator.
Cheetah can be used as a standalone templating utility or referenced as a library from other Python applications. It has many potential uses, but web developers looking for a viable alternative to ASP, JSP, PHP and PSP are expected to be its principal user group.
Features:
Generates HTML, SGML, XML, SQL, Postscript, form email, LaTeX, or any other text-based format.
Cleanly separates content, graphic design, and program code.
Blends the power and flexibility of Python with a simple template language that non-programmers can understand.
Gives template writers full access to any Python data structure, module, function, object, or method in their templates.
Makes code reuse easy by providing an object-orientated interface to templates that is accessible from Python code or other Cheetah templates. One template can subclass another and selectively reimplement sections of it.
Provides a simple, yet powerful, caching mechanism that can dramatically improve the performance of a dynamic website.
Compiles templates into optimized, yet readable, Python code.
Python Netlink library.
Python humanize utilities
Spike sorting pipeline.
Software Heritage Authentication Utilities.
Software Heritage core utilities
Convert bioinformatics data to Zarr.
HTTP plugin for DVC.
Zope Template Application Language (TAL).
Software Heritage virtual file system.
Python wrapper for the Zotero API