VoxCPM
by OpenBMB
About
VoxCPM is a tokenizer-free text-to-speech (TTS) system that generates continuous speech representations directly through an end-to-end diffusion-autoregressive architecture, producing highly natural and expressive speech. VoxCPM2, the latest model with 2B parameters, is trained on over 2 million hours of multilingual speech data and supports 30 languages, Voice Design, Controllable Voice Cloning, and 48kHz studio-quality audio output with built-in super-resolution.
Features
- 30-Language Multilingual Support
- Reference-Free Voice Design
- Controllable & Ultimate Voice Cloning
- 48kHz Studio-Quality Audio Output
- Real-Time Streaming & Production Deployment
Supported Platforms
- Web
- Desktop