Augmenting Images for ASR and TTS Through Single-Loop and Dual-Loop Multimodal Chain Framework

Previous research has proposed a machine speech chain to enable automatic speech recognition (ASR) and text-to-speech synthesis (TTS) to assist each other in semi-supervised learning and to avoid the need for a large amount of paired speech and text data. However, that framework still requires a large amount of unpaired (speech …