OmniVoice
by k2-fsa
About
OmniVoice is a state-of-the-art, massively multilingual, zero-shot text-to-speech (TTS) model that supports over 600 languages. Built on a diffusion language model-style architecture, it generates high-quality speech with fast inference. Key capabilities include voice cloning, fine-grained voice design through attributes such as gender, age, pitch, and accent, and precise control through non-verbal symbols and Chinese pinyin pronunciation correction. Its broad language coverage and inference speed make OmniVoice well suited to multilingual content creation, personalized voice synthesis, and real-time applications.
Supported Platforms
Linux, macOS