Speech Recognition Rescoring with Large Speech-Text Foundation Models

Large language models (LLMs) have demonstrated the ability to understand human language by leveraging large amounts of text data. Automatic speech recognition (ASR) systems are often limited by the available transcribed speech data and benefit from second-pass rescoring with an LLM. Recently, multi-modal large language models, particularly speech and text …