HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Type: Preprint

Publication Date: 2021-06-14

Citations: 249

Locations

  • arXiv (Cornell University)

Similar Works

  • HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units (2021). Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed
  • w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training (2021). Yu-An Chung, Yu Zhang, Wei Han, Chung‐Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu
  • MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations (2024). Hemant Yadav, Sunayana Sitaram, Rajiv Ratn Shah
  • DistilHuBERT: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit BERT (2022). Heng-Jui Chang, Shu-Wen Yang, Hung-yi Lee
  • DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT (2021). Heng-Jui Chang, Shuwen Yang, Hung-yi Lee
  • Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications (2023). Varun Krishna, Tarun Sai, Sriram Ganapathy
  • Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation (2023). Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen
  • Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT (2024). Ryota Komatsu, Takahiro Shinozaki
  • Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks (2021). Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf
  • Conformer-Based Self-Supervised Learning For Non-Speech Audio Tasks (2022). Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf
  • wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (2020). Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli
  • Self-supervised Learning with Random-projection Quantizer for Speech Recognition (2022). Chung‐Cheng Chiu, James Qin, Yu Zhang, Jiahui Yu, Yonghui Wu
  • Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation (2024). Hemant Yadav, Sunayana Sitaram, Rajiv Ratn Shah
  • LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks (2024). Amit Meghanani, Thomas Hain
  • Revisiting speech segmentation and lexicon learning with better features (2024). Herman Kamper, Benjamin van Niekerk

Works That Cite This (191)

  • Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset (2021). Aly Moustafa, Salah A. Aly
  • Direct Speech-to-Speech Translation With Discrete Units (2022). Ann Lee, Peng‐Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang
  • SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities (2022). Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zi-Li Huang, Kushal Lakhotia, Shuwen Yang, Shuyan Dong, Andy Liu, Cheng-I Lai, Jiatong Shi
  • Perceive and Predict: Self-Supervised Speech Representation Based Loss Functions for Speech Enhancement (2023). George Close, William Ravenscroft, Thomas Hain, Stefan Goetze
  • Generalization Ability of MOS Prediction Networks (2022). Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi
  • ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet (2022). Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay V. Kumar, K. Ganesan, Brian Yan
  • Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models (2021). Liang-Hsuan Tseng, Yu-Kuan Fu, Heng-Jui Chang, Hung-yi Lee
  • Word Order does not Matter for Speech Recognition (2022). Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert
  • PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models (2023). Tiantian Feng, Shrikanth Narayanan
  • CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training (2023). Linhao Dong, Zhecheng An, Peihao Wu, Jun Zhang, Lu Lu, Zejun Ma

Works Cited by This (44)

  • Scikit-learn: Machine Learning in Python (2012). Fabián Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron J. Weiss, Vincent Dubourg
  • Representation Learning with Contrastive Predictive Coding (2018). Aäron van den Oord, Yazhe Li, Oriol Vinyals
  • Deep Clustering for Unsupervised Learning of Visual Features (2018). Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze
  • An Unsupervised Autoregressive Model for Speech Representation Learning (2019). Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass
  • fairseq: A Fast, Extensible Toolkit for Sequence Modeling (2019). Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli
  • Wav2Letter++: A Fast Open-source Speech Recognition System (2019). Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert
  • Deep Contextualized Word Representations (2018). Matthew E. Peters, Mark E Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
  • Learning Latent Representations for Speech Generation and Transformation (2017). Wei-Ning Hsu, Yu Zhang, James Glass
  • Neural Discrete Representation Learning (2017). Aäron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
  • RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019). Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov