One-Shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization

Recently, voice conversion (VC) without parallel data has been successfully adapted to the multi-target scenario, in which a single model is trained to convert the input voice to many different speakers. However, such a model suffers from the limitation that it can only convert the voice to the speakers in the training data, …