Open source AI model hertz-dev for full-duplex voice communication published

Standard Intelligence has announced the release of hertz-dev, the first open source AI model for full-duplex speech synthesis. It can serve as the basis for real-time voice communication or for generating conversational speech. The model produces speech that closely resembles the voice data it was trained on, and it responds quickly enough to feel like a natural conversation rather than a laggy, choppy phone call. The project's work is distributed under the Apache 2.0 license.

On a system with an NVIDIA GeForce RTX 4090 GPU, the average latency before generation starts is 120 ms (with a theoretical minimum of 65 ms), which is about half that of existing publicly available models. The published version is built on a transformer architecture, has 8.5 billion parameters, and was trained on 500 billion tokens. The model's context size (the number of tokens it can take into account when generating speech) is 2048 tokens, or about 4 minutes of speech.
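The context figures above imply a rough token rate, which helps put the latency numbers in perspective. The following is a minimal sketch of that arithmetic in Python. It uses only the two figures quoted in the article (2048 tokens, about 4 minutes), and because the duration is approximate, the results are estimates too. The function names are illustrative and are not part of the hertz-dev code.

```python
# Figures quoted in the article; "about 4 minutes" is approximate,
# so everything derived from it is an estimate.
CONTEXT_TOKENS = 2048
CONTEXT_MINUTES = 4


def implied_token_rate(tokens: int, minutes: float) -> float:
    """Tokens per second implied by a context length and its duration."""
    return tokens / (minutes * 60)


def ms_per_token(tokens: int, minutes: float) -> float:
    """Milliseconds of speech covered by a single token."""
    return 1000 / implied_token_rate(tokens, minutes)


rate = implied_token_rate(CONTEXT_TOKENS, CONTEXT_MINUTES)
step = ms_per_token(CONTEXT_TOKENS, CONTEXT_MINUTES)
print(f"{rate:.1f} tokens/s, {step:.0f} ms per token")
```

Under these assumptions, each token covers a little over a tenth of a second of speech, which is the same order of magnitude as the 120 ms average latency reported above.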

Source: opennet.ru

Yuri Gagarin

Author of the article

A technical specialist at ProHoster with over six years of experience in server administration, VPN solutions, and network security. I manage infrastructure setup and support, monitor service stability, and implement solutions to protect client data. I also contribute to performance optimization and compliance with modern security and privacy requirements.
