Expressive Neural Voice Cloning

Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley

Type: Preprint

Publication Date: 2021-01-01

Citations: 15

DOI: https://doi.org/10.48550/arxiv.2102.00151

Locations

arXiv (Cornell University)
DataCite API

Similar Works

Title	Year	Authors
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer	2023	Daegyeom Kim Seongho Hong Yong-Hoon Choi
Self-supervised learning for robust voice cloning	2022	Konstantinos Klapsas Nikolaos Ellinas Karolos Nikitaras Georgios Vamvoukakis Panos Kakoulidis Κωνσταντίνος Μαρκόπουλος Spyros Raptis June Sig Sung Gunu Jho Aimilios Chalamandaris
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis	2018	Daisy Stanton Yuxuan Wang RJ Skerry-Ryan
Neural Voice Cloning with a Few Samples	2018	Sercan Ö. Arık Jitong Chen Kainan Peng Wei Ping Yanqi Zhou
Uncovering Latent Style Factors for Expressive Speech Synthesis	2017	Yuxuan Wang RJ Skerry-Ryan Ying Xiao Daisy Stanton Joel Shor Eric Battenberg Rob Clark Rif A. Saurous
Data Efficient Voice Cloning for Neural Singing Synthesis	2019	Merlijn Blaauw Jordi Bonada Ryunosuke Daido
Style Mixture of Experts for Expressive Text-To-Speech Synthesis	2024	Ahad Jawaid Shreeram Suresh Chandra Junchen Lu Berrak Şişman
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis	2018	Yuxuan Wang Daisy Stanton Yu Zhang RJ Skerry-Ryan Eric Battenberg Joel Shor Ying Xiao Fei Ren Jia Ye Rif A. Saurous
Make-A-Voice: Unified Voice Synthesis With Discrete Representation	2023	Rongjie Huang Chunlei Zhang Yongqi Wang Dongchao Yang Luping Liu Zhenhui Ye Ziyue Karen Jiang Chao Weng Zhou Zhao Dong Yu
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning	2021	Songxiang Liu Dan Su Dong Yu
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech	2022	Rongjie Huang Yi Ren Jinglin Liu Chenye Cui Zhou Zhao
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech	2024	Jan Melechovský Ambuj Mehrish Berrak Şişman Dorien Herremans
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec	2024	Shengpeng Ji Jialong Zuo Minghui Fang Siqi Zheng Qian Chen Wen Wang Ziyue Karen Jiang Hai Huang Xize Cheng Rongjie Huang
Adapting TTS models For New Speakers using Transfer Learning	2021	Paarth Neekhara Jason Li Boris Ginsburg

Works That Cite This (7)

Title	Year	Authors
A Survey on Neural Speech Synthesis	2021	Xu Tan Tao Qin Frank K. Soong Tie‐Yan Liu
Fine-Grained Emotional Control of Text-to-Speech: Learning to Rank Inter- and Intra-Class Emotion Intensities	2023	Shijun Wang Jón Guðnason Damian Borth
The Role of Vocal Persona in Natural and Synthesized Speech	2023	Camille Noufi Lloyd May Jonathan Berger
Improved Prosodic Clustering for Multispeaker and Speaker-Independent Phoneme-Level Prosody Control	2021	Myrsini Christidou Alexandra Vioni Nikolaos Ellinas Georgios Vamvoukakis Κωνσταντίνος Μαρκόπουλος Panos Kakoulidis June Sig Sung Hyoung-Min Park Aimilios Chalamandaris Pirros Tsiakoulis
Adapting TTS models For New Speakers using Transfer Learning	2021	Paarth Neekhara Jason Li Boris Ginsburg
Context, Perception, Production: A Model of Vocal Persona	2023	Camille Noufi Lloyd May Jonathan Berger
Distribution Augmentation for Low-Resource Expressive Text-To-Speech	2022	Mateusz Łajszczak Animesh Prasad Arent van Korlaar Bajibabu Bollepalli Antonio Bonafonte Arnaud Joly Marco Nicolis Alexis Moinet Thomas Drugman Trevor Wood

Works Cited by This (16)

Title	Year	Authors
Generalized End-to-End Loss for Speaker Verification	2017	Li Wan Quan Wang Alan Papir Ignacio López Moreno
Deep Voice 3: 2000-Speaker Neural Text-to-Speech	2017	Wei Ping Kainan Peng Andrew Gibiansky Sercan Ö. Arık Ajay Kannan Sharan Narang Jonathan Raiman J. J. Miller
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis	2018	Yuxuan Wang Daisy Stanton Yu Zhang RJ Skerry-Ryan Eric Battenberg Joel Shor Ying Xiao Fei Ren Jia Ye Rif A. Saurous
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron	2018	RJ Skerry-Ryan Eric Battenberg Ying Xiao Yuxuan Wang Daisy Stanton Joel Shor Ron J. Weiss Rob Clark Rif A. Saurous
VoxCeleb2: Deep Speaker Recognition	2018	Joon Son Chung Arsha Nagrani Andrew Zisserman
WaveNet: A Generative Model for Raw Audio	2016	Aäron van den Oord Sander Dieleman Heiga Zen Karen Simonyan Oriol Vinyals Alex Graves Nal Kalchbrenner Andrew Senior Koray Kavukcuoglu
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis	2018	Daisy Stanton Yuxuan Wang RJ Skerry-Ryan
Waveglow: A Flow-based Generative Network for Speech Synthesis	2019	Ryan Prenger Rafael Valle Bryan Catanzaro
Tacotron: Towards End-to-End Speech Synthesis	2017	Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss Navdeep Jaitly Zongheng Yang Ying Xiao Zhifeng Chen Samy Bengio
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning	2017	Wei Ping Kainan Peng Andrew Gibiansky Sercan Ö. Arık Ajay Kannan Sharan Narang Jonathan Raiman J. J. Miller