Ask a Question

Prefer a chat interface with context about you and your work?

AST: Audio Spectrogram Transformer

AST: Audio Spectrogram Transformer

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for endto-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.To better capture long-range global context, a recent trend is to add a self-attention mechanism on …