Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning

We introduce a simple neural encoder architecture that can be trained with an unsupervised contrastive learning objective whose positive samples come from a data-augmented k-Nearest Neighbors search. We show that, when built on top of recent self-supervised audio representations [1, 2, 3], this method can be applied iteratively and yields competitive …
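To make the idea of nearest-neighbor positives concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes cosine similarity for the neighbor search and an InfoNCE-style loss with in-batch negatives; neither choice is stated in the abstract. The encoder, the data augmentation, and the iterative retraining are omitted, and the names `knn_positives`, `info_nce`, and `temperature` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def knn_positives(embeddings, k=1):
    # Assumption: neighbors are found by cosine similarity (hypothetical helper).
    z = l2_normalize(embeddings)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # a sequence must not retrieve itself
    return np.argsort(-sim, axis=1)[:, :k]

def info_nce(anchors, positives, temperature=0.1):
    # Assumption: InfoNCE-style objective; row i of `positives` is the positive
    # for row i of `anchors`, and every other row acts as a negative.
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = (a @ p.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy stand-in for pooled speech-sequence embeddings: 8 sequences, 16 dims.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))

neighbors = knn_positives(embeddings, k=1)            # shape (8, 1)
loss = info_nce(embeddings, embeddings[neighbors[:, 0]])
```

In an actual training loop the embeddings would come from the encoder, the loss would be backpropagated through it, and the neighbor search would be rerun with the updated encoder on each iteration.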