Ask a Question

Prefer a chat interface with context about you and your work?

Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder

Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder

Neural networks have been able to generate high-quality single-sentence speech. However, it remains a challenge concerning audio-book speech synthesis due to the intra-paragraph correlation of semantic and acoustic features as well as variable styles. In this paper, we propose a highly expressive paragraph speech synthesis system with a multi-step variational …