Abstract: Noise robustness is critical when applying automatic speech recognition (ASR) in real-world scenarios. One solution involves using speech enhancement (SE) models as the front end of ASR.
🌐 24+ Languages: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish ...
VietTTS is an open-source toolkit providing the community with a powerful Vietnamese TTS model, capable of natural voice synthesis and robust voice cloning. Designed for effective experimentation, ...
Abstract: Selective state space models (SSMs) represented by Mamba have demonstrated their computational efficiency and promising outcomes in various tasks, including automatic speech recognition (ASR ...