Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target