FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization

Streaming automatic speech recognition (ASR) aims to emit each hypothesized word as quickly and accurately as possible. However, emitting fast without degrading quality, as measured by word error rate (WER), is highly challenging. Existing approaches, including Early and Late Penalties [1] and Constrained Alignments [2], [3], penalize emission delay by …