End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder
End-to-end (E2E) modeling of automatic speech recognition (ASR) blends all the components of a traditional speech recognition system into a single, unified model. Although this simplifies the ASR system, the unified model is hard to adapt when the training and test data are mismatched. In this work, we focus on contextual speech …