InstructTTS: Modelling Expressive TTS in Discrete Latent Space With Natural Language Style Prompt

Expressive text-to-speech (TTS) aims to synthesize speech with varying speaking styles to better reflect human speech patterns. In this study, we attempt to use natural language as a style prompt to control the styles in the synthetic speech, e.g., "Sigh tone in full of sad mood with …