Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous
Type: Preprint
Publication Date: 2018-03-24
Citations: 56
Locations: arXiv (Cornell University)
Similar Works
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron (2018). RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech (2020). Sri Karlapati, Alexis Moinet, Arnaud Joly, Viacheslav Klimkov, Daniel Sáez-Trigueros, Thomas Drugman
Fine-grained robust prosody transfer for single-speaker neural text-to-speech (2019). Viacheslav Klimkov, Srikanth Ronanki, Jonas Rohnke, Thomas Drugman
Fine-Grained Robust Prosody Transfer for Single-Speaker Neural Text-To-Speech (2019). Viacheslav Klimkov, Srikanth Ronanki, Jonas Rohnke, Thomas Drugman
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis (2021). Julian Zaïdi, Hugo Seuté, Benjamin van Niekerk, Marc-André Carbonneau
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis (2022). Julian Zaïdi, Hugo Seuté, Benjamin van Niekerk, Marc-André Carbonneau
Robust and fine-grained prosody control of end-to-end speech synthesis (2018). Younggun Lee, Taesu Kim
Robust and Fine-grained Prosody Control of End-to-end Speech Synthesis (2019). Younggun Lee, Taesu Kim
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer (2022). Sri Karlapati, Penny Karanasou, Mateusz Łajszczak, Syed Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman
Cross-lingual Prosody Transfer for Expressive Machine Dubbing (2023). Jakub Świątkowski, Duo Wang, Mikolaj Babianski, Patrick Lumban Tobing, Ravichander Vipperla, Vincent Pollet
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer (2022). Sri Karlapati, Penny Karanasou, Mateusz Łajszczak, Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman
Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis (2021). Julian Zaïdi, Hugo Seuté, Benjamin van Niekerk, Marc-André Carbonneau
Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows (2021). Iván Vallés-Pérez, Julian Roth, Grzegorz Beringer, Roberto Barra-Chicote, Jasha Droppo
Improving Multi-Speaker TTS Prosody Variance with a Residual Encoder and Normalizing Flows (2021). Iván Vallés-Pérez, Julian Roth, Grzegorz Beringer, Roberto Barra-Chicote, Jasha Droppo
Do Prosody Transfer Models Transfer Prosody? (2023). Atli Thor Sigurgeirsson, Simon King
eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer (2023). Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman
Expressive TTS Training with Frame and Style Reconstruction Loss (2020). Rui Liu, Berrak Şişman, Guanglai Gao, Haizhou Li
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis (2018). Daisy Stanton, Yuxuan Wang, RJ Skerry-Ryan
Works That Cite This (39)
A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities (2019). Deepali Aneja, Daniel McDuff, Shital Shah
Expressive Neural Voice Cloning (2021). Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
Controllable neural text-to-speech synthesis using intuitive prosodic features (2020). Tuomo Raitio, Ramya Rasipuram, Dan Castellani
One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization (2019). Ju-Chieh Chou, Cheng-chieh Yeh, Hung-yi Lee
Towards Fine-Grained Prosody Control for Voice Conversion (2019). Zheng Lian, Zhengqi Wen
Sample Efficient Adaptive Text-to-Speech (2018). Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie
Building a mixed-lingual neural TTS system with only monolingual data (2019). Liumeng Xue, Wei Song, Guanghui Xu, Lei Xie, Zhizheng Wu
Pitchtron: Towards audiobook generation from ordinary people's voices (2020). Sung-Hee Jung, Hoirin Kim
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features (2019). Siddharth Gururani, Kilol Gupta, Dhaval Shah, Zahra Shakeri, Jervis Pinto
Towards Fine-Grained Prosody Control for Voice Conversion (2021). Zheng Lian, Rongxiu Zhong, Zhengqi Wen, Bin Liu, Jianhua Tao
Works Cited by This (11)
Generating Sequences With Recurrent Neural Networks (2013). Alex Graves
Deep Voice 2: Multi-Speaker Neural Text-to-Speech (2017). Sercan Ö. Arık, Gregory Diamos, Andrew Gibiansky, J. J. Miller, Kainan Peng, Wei Ping, Jonathan Raiman, Yanqi Zhou
Uncovering Latent Style Factors for Expressive Speech Synthesis (2017). Yuxuan Wang, RJ Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark, Rif A. Saurous
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions (2017). Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan
On Using Backpropagation for Speech Texture Generation and Voice Conversion (2017). Jan Chorowski, Ron J. Weiss, Rif A. Saurous, Samy Bengio
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis (2018). Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Jia Ye, Rif A. Saurous
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015). Sergey Ioffe, Christian Szegedy
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (2014). Kyunghyun Cho, Bart van Merriënboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
Tacotron: Towards End-to-End Speech Synthesis (2017). Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio
Neural Discrete Representation Learning (2017). Aäron van den Oord, Oriol Vinyals, Koray Kavukcuoglu