Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks

Self-attention networks (SANs) can benefit significantly from bi-directional representation learning through unsupervised pretraining paradigms such as BERT and XLNet. In this paper, we present an XLNet-like pretraining scheme, "Speech-XLNet", to learn speech representations with SANs. Firstly, we find that by shuffling the speech frame orders, Speech-XLNet serves as a …
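
To make the frame-order shuffling idea concrete, here is a minimal sketch (not the authors' code; the function name, tensor shapes, and feature dimensions are assumptions) that permutes the time axis of a batch of acoustic feature frames and returns the permutation, as an XLNet-style permuted-prediction objective would need during pretraining.

```python
# Illustrative sketch only: permute the frame (time) order of each utterance
# so a permutation-based pretraining objective can predict frames given the
# shuffled context. Shapes and names are assumptions, not the paper's code.
import torch

def shuffle_frames(features: torch.Tensor):
    """Randomly permute the frame order of each utterance.

    features: (batch, num_frames, feat_dim) acoustic features, e.g. filterbanks.
    Returns the shuffled features and the per-utterance permutation, which the
    pretraining objective needs in order to recover the original frame order.
    """
    batch, num_frames, _ = features.shape
    perms = torch.stack([torch.randperm(num_frames) for _ in range(batch)])
    shuffled = torch.gather(
        features, 1, perms.unsqueeze(-1).expand(-1, -1, features.size(-1))
    )
    return shuffled, perms

if __name__ == "__main__":
    feats = torch.randn(2, 100, 80)      # 2 utterances, 100 frames, 80-dim features (assumed)
    shuffled, perms = shuffle_frames(feats)
    print(shuffled.shape, perms.shape)   # torch.Size([2, 100, 80]) torch.Size([2, 100])
```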