Very Deep Self-Attention Networks for End-to-End Speech Recognition

Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community. While previous architecture choices revolved around time-delay neural networks (TDNN) and long short-term memory (LSTM) recurrent neural networks, we propose to use self-attention via the Transformer architecture as an alternative. Our analysis shows that deep Transformer networks …
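
As a rough illustration of the core idea (not the authors' implementation), the sketch below stacks standard PyTorch Transformer encoder layers over log-Mel filterbank features. All names and hyperparameters (`feat_dim=80`, `d_model=512`, `num_layers=24`, and so on) are illustrative assumptions, and positional encoding is omitted for brevity.

```python
import torch
import torch.nn as nn

class SpeechTransformerEncoder(nn.Module):
    """Deep self-attention encoder over acoustic features.

    A generic sketch: hyperparameters are assumptions, not the
    paper's reported configuration. Positional encoding omitted.
    """

    def __init__(self, feat_dim=80, d_model=512, nhead=8,
                 num_layers=24, dim_feedforward=2048, dropout=0.1):
        super().__init__()
        # Project per-frame acoustic features to the model dimension.
        self.input_proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=dim_feedforward, dropout=dropout,
            batch_first=True)
        # Stack many identical self-attention layers ("very deep").
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feats):
        # feats: (batch, time, feat_dim) log-Mel filterbank frames
        return self.encoder(self.input_proj(feats))

# Example: encode a batch of 2 utterances, 300 frames of 80-dim features.
x = torch.randn(2, 300, 80)
model = SpeechTransformerEncoder()
print(model(x).shape)  # torch.Size([2, 300, 512])
```

In an encoder-decoder setup, the resulting frame-level representations would be consumed by a Transformer decoder that emits output tokens autoregressively.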