Ask a Question

Prefer a chat interface with context about you and your work?

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Large language models (LLMs) have exhibited remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Despite the recent success, current LLMs are not capable of processing complex audio information or conducting spoken conversations (like Siri or Alexa). In this work, we propose a …