Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Although neural text-to-speech (TTS) models have attracted considerable attention and succeeded in generating human-like speech, there is still room for improvement in their naturalness and architectural efficiency. In this work, we propose a novel non-autoregressive TTS model, namely Diff-TTS, which achieves highly natural and efficient speech synthesis. Given the text, …