Expressive Neural Voice Cloning

Type: Preprint

Publication Date: 2021-01-01

Citations: 15

DOI: https://doi.org/10.48550/arxiv.2102.00151

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

Action Title Year Authors
+ SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer 2023 Daegyeom Kim
Seongho Hong
Yong-Hoon Choi
+ Self-supervised learning for robust voice cloning 2022 Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
Georgios Vamvoukakis
Panos Kakoulidis
Κωνσταντίνος Μαρκόπουλος
Spyros Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
+ Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis 2018 Daisy Stanton
Yuxuan Wang
RJ Skerry-Ryan
+ Neural Voice Cloning with a Few Samples 2018 Sercan Ö. Arık
Jitong Chen
Kainan Peng
Wei Ping
Yanqi Zhou
+ Uncovering Latent Style Factors for Expressive Speech Synthesis 2017 Yuxuan Wang
RJ Skerry-Ryan
Ying Xiao
Daisy Stanton
Joel Shor
Eric Battenberg
Rob Clark
Rif A. Saurous
+ Data Efficient Voice Cloning for Neural Singing Synthesis 2019 Merlijn Blaauw
Jordi Bonada
Ryunosuke Daido
+ Style Mixture of Experts for Expressive Text-To-Speech Synthesis 2024 Ahad Jawaid
Shreeram Suresh Chandra
Junchen Lu
Berrak Şişman
+ Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis 2018 Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Ying Xiao
Fei Ren
Jia Ye
Rif A. Saurous
+ Make-A-Voice: Unified Voice Synthesis With Discrete Representation 2023 Rongjie Huang
Chunlei Zhang
Yongqi Wang
Dongchao Yang
Luping Liu
Zhenhui Ye
Ziyue Karen Jiang
Chao Weng
Zhou Zhao
Dong Yu
+ Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning 2021 Songxiang Liu
Dan Su
Dong Yu
+ GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech 2022 Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
+ DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech 2024 Jan Melechovský
Ambuj Mehrish
Berrak Şişman
Dorien Herremans
+ ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec 2024 Shengpeng Ji
Jialong Zuo
Minghui Fang
Siqi Zheng
Qian Chen
Wen Wang
Ziyue Karen Jiang
Hai Huang
Xize Cheng
Rongjie Huang
+ Adapting TTS models For New Speakers using Transfer Learning 2021 Paarth Neekhara
Jason Li
Boris Ginsburg

Works Cited by This (16)

Action Title Year Authors
+ Generalized End-to-End Loss for Speaker Verification 2017 Li Wan
Quan Wang
Alan Papir
Ignacio López Moreno
+ Deep Voice 3: 2000-Speaker Neural Text-to-Speech 2017 Wei Ping
Kainan Peng
Andrew Gibiansky
Sercan Ö. Arık
Ajay Kannan
Sharan Narang
Jonathan Raiman
J. J. Miller
+ Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis 2018 Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Ying Xiao
Fei Ren
Jia Ye
Rif A. Saurous
+ Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron 2018 RJ Skerry-Ryan
Eric Battenberg
Ying Xiao
Yuxuan Wang
Daisy Stanton
Joel Shor
Ron J. Weiss
Rob Clark
Rif A. Saurous
+ VoxCeleb2: Deep Speaker Recognition 2018 Joon Son Chung
Arsha Nagrani
Andrew Zisserman
+ WaveNet: A Generative Model for Raw Audio 2016 Aäron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
Andrew Senior
Koray Kavukcuoglu
+ Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis 2018 Daisy Stanton
Yuxuan Wang
RJ Skerry-Ryan
+ Waveglow: A Flow-based Generative Network for Speech Synthesis 2019 Ryan Prenger
Rafael Valle
Bryan Catanzaro
+ Tacotron: Towards End-to-End Speech Synthesis 2017 Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
Navdeep Jaitly
Zongheng Yang
Ying Xiao
Zhifeng Chen
Samy Bengio
+ Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning 2017 Wei Ping
Kainan Peng
Andrew Gibiansky
Sercan Ö. Arık
Ajay Kannan
Sharan Narang
Jonathan Raiman
J. J. Miller