Unsupervised Pre-Training of Bidirectional Speech Encoders via Masked Reconstruction

We propose an approach for pre-training speech representations via a masked reconstruction loss. Our pre-trained encoder networks are bidirectional and can therefore be used directly in typical bidirectional speech recognition models. The pre-trained networks can then be fine-tuned on a smaller amount of supervised data for speech recognition. Experiments with …
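The masked reconstruction objective described above can be sketched in a few lines: randomly mask a fraction of input frames, have the encoder predict the original values, and compute the loss only over the masked positions. The sketch below is illustrative only; the masking ratio, mask value, and function names are assumptions, not details taken from the paper.

```python
import random

def apply_mask(frames, mask_ratio=0.15, mask_value=0.0, seed=0):
    """Randomly mask a fraction of frames.

    Returns the masked copy and the set of masked indices, so the loss
    can later be restricted to exactly those positions.
    """
    rng = random.Random(seed)
    n = len(frames)
    k = max(1, int(n * mask_ratio))
    masked_idx = set(rng.sample(range(n), k))
    masked = [mask_value if i in masked_idx else f
              for i, f in enumerate(frames)]
    return masked, masked_idx

def masked_reconstruction_loss(predicted, target, masked_idx):
    """Mean squared error computed only over the masked positions."""
    errs = [(predicted[i] - target[i]) ** 2 for i in masked_idx]
    return sum(errs) / len(errs)
```

In practice the encoder (here, a bidirectional network) would map the masked sequence to predictions; a perfect reconstruction of the masked frames drives the loss to zero.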