This package provides a speech recognition toolkit based on Kaldi.
It supports more than 20 languages and dialects: English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, and Polish. The program works offline, even on lightweight devices. Portable per-language models are about 50 MB each, and much larger, more accurate models are also available.
Vosk API provides a streaming API for recognizing speech on the fly, along with bindings for several programming languages. The vocabulary can be reconfigured quickly for better accuracy, and speaker identification is supported in addition to plain speech recognition.
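
As an illustration of the streaming interface and vocabulary restriction described above, here is a minimal sketch using the Python binding. It assumes the `vosk` Python package is installed, that a model has been unpacked to a local directory, and that the input is a 16-bit mono PCM WAV file. The helper names `build_grammar` and `transcribe` are made up for this example and are not part of Vosk; phrase lists may only take effect with models that support a dynamic vocabulary.

```python
import json
import wave


def build_grammar(phrases):
    """Return a JSON phrase list for restricting the vocabulary."""
    # "[unk]" gives the recognizer a token for speech outside the list.
    return json.dumps(list(phrases) + ["[unk]"])


def transcribe(wav_path, model_dir, phrases=None, chunk_frames=4000):
    """Yield ("partial" | "final", text) pairs while reading a WAV file."""
    # Imported lazily so build_grammar() works without vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        # Assumption: the file is 16-bit mono PCM.
        model = Model(model_dir)
        if phrases:
            rec = KaldiRecognizer(model, wf.getframerate(), build_grammar(phrases))
        else:
            rec = KaldiRecognizer(model, wf.getframerate())

        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            # AcceptWaveform() returns True once an utterance is complete.
            if rec.AcceptWaveform(data):
                yield "final", json.loads(rec.Result()).get("text", "")
            else:
                yield "partial", json.loads(rec.PartialResult()).get("partial", "")

        # Flush whatever audio is still buffered in the recognizer.
        yield "final", json.loads(rec.FinalResult()).get("text", "")
```

A call such as `for kind, text in transcribe("sample.wav", "model", ["yes", "no"]): print(kind, text)` would print partial hypotheses as audio is fed in, followed by final results; both paths here are placeholders.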