SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding
Spoken language understanding (SLU) requires a model to analyze an input acoustic signal to understand its linguistic content and make predictions. To boost model performance, various pre-training methods have been proposed to learn rich representations from large-scale unannotated speech and text. However, the inherent disparities between the two modalities necessitate …