Ask a Question

Prefer a chat interface with context about you and your work?

Listening While Speaking and Visualizing: Improving ASR Through Multimodal Chain

Listening While Speaking and Visualizing: Improving ASR Through Multimodal Chain

Previously, a machine speech chain, which is based on sequence-to-sequence deep learning, was proposed to mimic speech perception and production behavior. Such chains separately processed listening and speaking by automatic speech recognition (ASR) and text-to-speech synthesis (TTS) and simultaneously enabled them to teach each other in semi-supervised learning when they …