DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering
Spoken Question Answering (SQA) is the task of finding the answer to a question in a spoken document, which is crucial for personal assistants when replying to user queries. Existing SQA methods all rely on Automatic Speech Recognition (ASR) transcripts. Not only does ASR need to be trained with massive annotated …