New models for Russian speech recognition in the Vosk library

The developers of the Vosk library have published new models for Russian speech recognition: the server model vosk-model-ru-0.22 and the mobile model vosk-model-small-ru-0.22. The models are trained on new speech data and use a new neural network architecture, which together increased recognition accuracy by 10-20%. The code and data are distributed under the Apache 2.0 license.

Important changes:

  • New data collected from smart speakers significantly improves the recognition of voice commands spoken from a distance.
  • The new sound extraction scheme has significantly improved recognition accuracy for wideband recordings. Recognition accuracy for telephony audio has improved as well.
  • The dictionary add-on package allows you to customize the recognition of complex technical entries.

For best accuracy, it is recommended to update Vosk to version 0.3.32 as well. You may also be interested in the new Vosk features: integration with Unity, NativeScript, and Jigasi, and models for Kazakh and Ukrainian speech recognition. The server model needs a modern processor and 8 GB of memory to work. The mobile model can be used on phones and on the Raspberry Pi 3+.
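
As an illustration of how one of these models might be used, here is a minimal Python sketch. It assumes the vosk package is installed (pip install vosk), that the model archive has been unpacked into a local directory named vosk-model-ru-0.22, and that the input is a 16-bit mono PCM WAV file. The helper names read_chunks and transcribe and the chunk size are illustrative choices, not part of Vosk.

```python
import json
import wave


def read_chunks(wav_path, frames_per_chunk=4000):
    """Yield (sample_rate, raw_pcm_bytes) pairs from a 16-bit mono WAV file."""
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono PCM WAV file")
        rate = wf.getframerate()
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            yield rate, data


def transcribe(wav_path, model_dir="vosk-model-ru-0.22"):
    """Return the recognized text for wav_path (model_dir is an assumed path)."""
    # Imported lazily so read_chunks can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)
    rec = None
    pieces = []
    for rate, data in read_chunks(wav_path):
        if rec is None:
            # The recognizer is created with the sample rate of the recording.
            rec = KaldiRecognizer(model, rate)
        # AcceptWaveform returns True when an utterance has been finalized.
        if rec.AcceptWaveform(data):
            pieces.append(json.loads(rec.Result()).get("text", ""))
    if rec is not None:
        pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

Calling transcribe("recording.wav") would return the recognized text as a single string; the same sketch should work with the mobile model by passing model_dir="vosk-model-small-ru-0.22".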

Source: opennet.ru
