Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with …