Endpoint Detection for Streaming End-to-End Multi-Talker ASR
Streaming end-to-end multi-talker speech recognition aims to transcribe overlapped speech from conversations or meetings with an all-neural model in a streaming fashion. This is fundamentally different from a modular approach, which usually cascades independently trained speech separation and speech recognition models. Previously, we proposed the Streaming Unmixing …