+
|
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
|
2023
|
Gary Wang
Ekin D. Cubuk
Andrew Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel Park
|
+
|
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
|
2022
|
Gary Wang
Ekin D. Cubuk
Andrew E. Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel Park
|
+
|
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation
|
2021
|
Scott Wisdom
Aren Jansen
Ron J. Weiss
Hakan Erdoğan
John R. Hershey
|
+
|
Multitask Training with Text Data for End-to-End Speech Recognition
|
2021
|
Peidong Wang
Tara N. Sainath
Ron J. Weiss
|
+
|
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
|
2021
|
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
|
+
|
Parallel Tacotron: Non-Autoregressive and Controllable TTS
|
2021
|
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Jia Ye
Ron J. Weiss
Yonghui Wu
|
+
|
Wave-Tacotron: Spectrogram-Free End-to-End Text-to-Speech Synthesis
|
2021
|
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
|
+
|
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
|
2020
|
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
|
+
|
Multitask Training with Text Data for End-to-End Speech Recognition
|
2020
|
Peidong Wang
Tara N. Sainath
Ron J. Weiss
|
+
|
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior
|
2020
|
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
|
+
|
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis
|
2020
|
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Yonghui Wu
|
+
|
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
|
2020
|
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
|
+
|
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
|
2020
|
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew E. Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
|
+
|
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
|
2020
|
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Yonghui Wu
|
+
|
Parallel Tacotron: Non-Autoregressive and Controllable TTS
|
2020
|
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Jia Ye
Ron J. Weiss
Yonghui Wu
|
+
|
Unsupervised Sound Separation Using Mixture Invariant Training
|
2020
|
Scott Wisdom
Efthymios Tzinis
Hakan Erdoğan
Ron J. Weiss
Kevin Wilson
John R. Hershey
|
+
|
WaveGrad: Estimating Gradients for Waveform Generation
|
2020
|
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
|
+
|
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
|
2019
|
Heiga Zen
Viet Chau Dang
Rob Clark
Yu Zhang
Ron J. Weiss
Jia Ye
Zhifeng Chen
Yonghui Wu
|
+
|
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
|
2019
|
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Zhifeng Chen
RJ Skerry-Ryan
Jia Ye
Andrew Rosenberg
Bhuvana Ramabhadran
|
+
|
Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model
|
2019
|
Jia Ye
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhifeng Chen
Yonghui Wu
|
+
|
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
|
2019
|
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
Dimitri Kanevsky
Jia Ye
|
+
|
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
|
2019
|
Quan Wang
Hannah Muckenhirn
Kevin Wilson
Prashant Sridhar
Zelin Wu
John R. Hershey
Rif A. Saurous
Ron J. Weiss
Jia Ye
Ignacio López Moreno
|
+
|
Unsupervised Speech Representation Learning Using WaveNet Autoencoders
|
2019
|
Jan Chorowski
Ron J. Weiss
Samy Bengio
Aäron van den Oord
|
+
|
A Spelling Correction Model for End-to-end Speech Recognition
|
2019
|
Jinxi Guo
Tara N. Sainath
Ron J. Weiss
|
+
|
Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation
|
2019
|
Jia Ye
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung‐Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
|
+
|
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
|
2019
|
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
Dimitri Kanevsky
Jia Ye
|
+
|
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
|
2019
|
Jonathan Shen
Patrick Nguyen
Yonghui Wu
Zhifeng Chen
Mia Xu Chen
Jia Ye
Anjuli Kannan
Tara N. Sainath
Yuan Cao
Chung‐Cheng Chiu
|
+
|
Direct speech-to-speech translation with a sequence-to-sequence model
|
2019
|
Jia Ye
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhifeng Chen
Yonghui Wu
|
+
|
A spelling correction model for end-to-end speech recognition
|
2019
|
Jinxi Guo
Tara N. Sainath
Ron J. Weiss
|
+
|
Hierarchical Generative Modeling for Controllable Speech Synthesis
|
2018
|
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Yuxuan Wang
Yuan Cao
Jia Ye
Zhifeng Chen
Jonathan Shen
|
+
|
Metrics for Signal Temporal Logic Formulae
|
2018
|
Curtis Madsen
Prashant Vaidyanathan
Sadra Sadraddini
Cristian-Ioan Vasile
Nicholas A. DeLateur
Ron J. Weiss
Douglas Densmore
Călin Belta
|
+
|
Synthesizing Diverse, High-Quality Audio Textures
|
2018
|
Joseph M. Antognini
Matt Hoffman
Ron J. Weiss
|
+
|
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
|
2018
|
Chung‐Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
Zhifeng Chen
Anjuli Kannan
Ron J. Weiss
Kanishka Rao
Ekaterina Gonina
|
+
|
On Using Backpropagation for Speech Texture Generation and Voice Conversion
|
2018
|
Jan Chorowski
Ron J. Weiss
Rif A. Saurous
Samy Bengio
|
+
|
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
|
2018
|
Jonathan Shen
Ruoming Pang
Ron J. Weiss
Mike Schuster
Navdeep Jaitly
Zongheng Yang
Zhifeng Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
|
+
|
Multilingual Speech Recognition with a Single End-to-End Model
|
2018
|
Shubham Toshniwal
Tara N. Sainath
Ron J. Weiss
Bo Li
Pedro J. Moreno
Eugene Weinstein
Kanishka Rao
|
+
|
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
|
2018
|
RJ Skerry-Ryan
Eric Battenberg
Ying Xiao
Yuxuan Wang
Daisy Stanton
Joel Shor
Ron J. Weiss
Rob Clark
Rif A. Saurous
|
+
|
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
|
2018
|
Jia Ye
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
Fei Ren
Zhifeng Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
|
+
|
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
|
2018
|
Quan Wang
Hannah Muckenhirn
Kevin Wilson
Prashant Sridhar
Zelin Wu
John R. Hershey
Rif A. Saurous
Ron J. Weiss
Jia Ye
Ignacio López Moreno
|
+
|
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
|
2018
|
Jia Ye
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung‐Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
|
+
|
Synthesizing Diverse, High-Quality Audio Textures
|
2018
|
Joseph F. Antognini
Matt Hoffman
Ron J. Weiss
|
+
|
On Using Backpropagation for Speech Texture Generation and Voice Conversion
|
2017
|
Jan Chorowski
Ron J. Weiss
Rif A. Saurous
Samy Bengio
|
+
|
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
|
2017
|
Chung‐Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
Zhifeng Chen
Anjuli Kannan
Ron J. Weiss
Kanishka Rao
Ekaterina Gonina
|
+
|
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
|
2017
|
Ron J. Weiss
Jan Chorowski
Navdeep Jaitly
Yonghui Wu
Zhifeng Chen
|
+
|
Tacotron: Towards End-to-End Speech Synthesis
|
2017
|
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
Navdeep Jaitly
Zongheng Yang
Ying Xiao
Zhifeng Chen
Samy Bengio
|
+
|
Online and Linear-Time Attention by Enforcing Monotonic Alignments
|
2017
|
Colin Raffel
Minh-Thang Luong
Peter J. Liu
Ron J. Weiss
Douglas Eck
|
+
|
Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
|
2017
|
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
Navdeep Jaitly
Zongheng Yang
Ying Xiao
Zhifeng Chen
Samy Bengio
|
+
|
CNN architectures for large-scale audio classification
|
2017
|
Shawn Hershey
Sourish Chaudhuri
Daniel P. W. Ellis
Jort F. Gemmeke
Aren Jansen
Robert C. Moore
Manoj Plakal
Devin Platt
Rif A. Saurous
Bryan Seybold
|
+
|
Multilingual Speech Recognition With A Single End-To-End Model
|
2017
|
Shubham Toshniwal
Tara N. Sainath
Ron J. Weiss
Bo Li
Pedro J. Moreno
Eugene Weinstein
Kanishka Rao
|
+
|
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
|
2017
|
Jonathan Shen
Ruoming Pang
Ron J. Weiss
Mike Schuster
Navdeep Jaitly
Zongheng Yang
Zhifeng Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
|
+
|
CNN Architectures for Large-Scale Audio Classification
|
2016
|
Shawn Hershey
Sourish Chaudhuri
Daniel P. W. Ellis
Jort F. Gemmeke
Aren Jansen
Robert C. Moore
Manoj Plakal
Devin Platt
Rif A. Saurous
Bryan Seybold
|
+
|
Affinity Weighted Embedding
|
2013
|
Jason Weston
Ron J. Weiss
Hector Yee
|
+
|
Scikit-learn: Machine Learning in Python
|
2012
|
Fabián Pedregosa
Gaël Varoquaux
Alexandre Gramfort
Vincent Michel
Bertrand Thirion
Olivier Grisel
Mathieu Blondel
Peter Prettenhofer
Ron J. Weiss
Vincent Dubourg
|
+
|
Latent Collaborative Retrieval
|
2012
|
Jason Weston
Chong Wang
Ron J. Weiss
Adam Berenzweig
|
+
|
Scikit-learn: Machine Learning in Python
|
2011
|
Fabián Pedregosa
Gaël Varoquaux
Alexandre Gramfort
Vincent Michel
Bertrand Thirion
Olivier Grisel
Mathieu Blondel
Andreas Müller
Joel Nothman
Gilles Louppe
|