Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model