Zhuo Chen

All published works
Action Title Year Authors
+ PDF Chat COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning 2024 Jing Pan
Jian Wu
Yashesh Gaur
Sunit Sivasankaran
Zhuo Chen
Shujie Liu
Jinyu Li
+ PDF Chat Rethinking the Soft Conflict Pseudo Boolean Constraint on MaxSAT Local Search Solvers 2024 Jiongzhi Zheng
Zhuo Chen
Chu-Min Li
Kun He
+ PDF Chat Photonic probabilistic machine learning using quantum vacuum noise 2024 Seou Choi
Yannick Salamin
Charles Roques‐Carmes
Rumen Dangovski
Di Luo
Zhuo Chen
Michael Horodynski
Jamison Sloan
Shiekh Zia Uddin
Marin Soljačić
+ ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge 2024 He Wang
Pengcheng Guo
Yue Li
Ao Zhang
Jiayao Sun
Lei Xie
Wei Chen
Pan Zhou
Hui Bu
Xin Xu
+ PDF Chat SpeechX: Neural Codec Language Model as a Versatile Speech Transformer 2024 Xiaofei Wang
Manthan Thakker
Zhuo Chen
Naoyuki Kanda
Şefik Emre Eskimez
Sanyuan Chen
Min Tang
Shujie Liu
Jinyu Li
Takuya Yoshioka
+ PDF Chat The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR 2023 Yuhao Liang
Mohan Shi
Yu Fan
Yangze Li
Shiliang Zhang
Zhihao Du
Qian Chen
Lei Xie
Yanmin Qian
Jian Wu
+ PDF Chat On Decoder-Only Architecture For Speech-to-Text and Large Language Model Integration 2023 Jian Wu
Yashesh Gaur
Zhuo Chen
Long Zhou
Yimeng Zhu
Tianrui Wang
Jinyu Li
Shujie Liu
Bo Ren
Linquan Liu
+ Simulating Realistic Speech Overlaps Improves Multi-Talker ASR 2023 Muqiao Yang
Naoyuki Kanda
Xiaofei Wang
Jian Wu
Sunit Sivasankaran
Zhuo Chen
Jinyu Li
Takuya Yoshioka
+ Target Sound Extraction with Variable Cross-Modality Clues 2023 Chenda Li
Yao Qian
Zhuo Chen
Dongmei Wang
Takuya Yoshioka
Shujie Liu
Yanmin Qian
Michael Zeng
+ Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition 2023 Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
+ VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition 2023 Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
+ Speaker Change Detection For Transformer Transducer ASR 2023 Jian Wu
Zhuo Chen
Min Hu
Xiong Xiao
Jinyu Li
+ Speech Separation with Large-Scale Self-Supervised Learning 2023 Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
Sunit Sivasankaran
Şefik Emre Eskimez
+ PDF Chat Exploring WavLM on Speech Enhancement 2023 Hyungchan Song
Sanyuan Chen
Zhuo Chen
Yu Wu
Takuya Yoshioka
Min Tang
Jong Won Shin
Shujie Liu
+ Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers 2023 Chengyi Wang
Sanyuan Chen
Yu Wu
Ziqiang Zhang
Long Zhou
Shujie Liu
Zhuo Chen
Yanqing Liu
Huaming Wang
Jinyu Li
+ Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling 2023 Ziqiang Zhang
Long Zhou
Chengyi Wang
Sanyuan Chen
Yu Wu
Shujie Liu
Zhuo Chen
Yanqing Liu
Huaming Wang
Jinyu Li
+ VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation 2023 Tianrui Wang
Long Zhou
Ziqiang Zhang
Yu Wu
Shujie Liu
Yashesh Gaur
Zhuo Chen
Jinyu Li
Furu Wei
+ Adapting Multi-Lingual ASR Models for Handling Multiple Talkers 2023 Chenda Li
Yao Qian
Zhuo Chen
Naoyuki Kanda
Dongmei Wang
Takuya Yoshioka
Yanmin Qian
Michael Zeng
+ SpeechX: Neural Codec Language Model as a Versatile Speech Transformer 2023 Xiaofei Wang
Manthan Thakker
Zhuo Chen
Naoyuki Kanda
Şefik Emre Eskimez
Sanyuan Chen
Min Tang
Shujie Liu
Jinyu Li
Takuya Yoshioka
+ COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning 2023 Jing Pan
Jian Wu
Yashesh Gaur
Sunit Sivasankaran
Zhuo Chen
Shujie Liu
Jinyu Li
+ PDF Chat Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition? 2022 Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
Peidong Wang
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
+ PDF Chat Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR 2022 Naoyuki Kanda
Xiong Xiao
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
+ PDF Chat UniSpeech-SAT: Universal Speech Representation Learning With Speaker Aware Pre-Training 2022 Sanyuan Chen
Yu Wu
Chengyi Wang
Zhengyang Chen
Zhuo Chen
Shujie Liu
Jian Wu
Yao Qian
Furu Wei
Jinyu Li
+ PDF Chat Personalized Speech Enhancement: New Models and Comprehensive Evaluation 2022 Şefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Xiaofei Wang
Zhuo Chen
Xuedong Huang
+ PDF Chat One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement 2022 Hassan Taherian
Şefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Zhuo Chen
Xuedong Huang
+ PDF Chat All-Neural Beamformer for Continuous Speech Separation 2022 Zhuohuang Zhang
Takuya Yoshioka
Naoyuki Kanda
Zhuo Chen
Xiaofei Wang
Dongmei Wang
Şefik Emre Eskimez
+ DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning 2022 Zhuo Chen
Yu‐Feng Huang
Jiaoyan Chen
Yuxia Geng
Wen Zhang
Yin Fang
Jeff Z. Pan
Wenting Song
Huajun Chen
+ Streaming Multi-Talker ASR with Token-Level Serialized Output Training 2022 Naoyuki Kanda
Jian Wu
Yu Wu
Xiong Xiao
Zhong Meng
Xiaofei Wang
Yashesh Gaur
Zhuo Chen
Jinyu Li
Takuya Yoshioka
+ The Microsoft System for VoxCeleb Speaker Recognition Challenge 2022 2022 Gang Liu
Tianyan Zhou
Yongheng Zhao
Yu Wu
Zhuo Chen
Yao Qian
Jian Wu
+ Tele-Knowledge Pre-training for Fault Analysis 2022 Zhuo Chen
Wen Zhang
Yu‐Feng Huang
Mingyang Chen
Yuxia Geng
Hongtao Yu
Zhen Bi
Yichi Zhang
Zhen Yao
Wenting Song
+ Simulating realistic speech overlaps improves multi-talker ASR 2022 Muqiao Yang
Naoyuki Kanda
Xiaofei Wang
Jian Wu
Sunit Sivasankaran
Zhuo Chen
Jinyu Li
Takuya Yoshioka
+ Speech separation with large-scale self-supervised learning 2022 Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
Sunit Sivasankaran
Şefik Emre Eskimez
+ Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition 2022 Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
+ Exploring WavLM on Speech Enhancement 2022 Hyungchan Song
Sanyuan Chen
Zhuo Chen
Yu Wu
Takuya Yoshioka
Min Tang
Jong Won Shin
Shujie Liu
+ MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid 2022 Zhuo Chen
Jiaoyan Chen
Wen Zhang
Lingbing Guo
Yin Fang
Yu‐Feng Huang
Yichi Zhang
Yuxia Geng
Jeff Z. Pan
Wenting Song
+ Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings 2022 Naoyuki Kanda
Jian Wu
Yu Wu
Xiong Xiao
Zhong Meng
Xiaofei Wang
Yashesh Gaur
Zhuo Chen
Jinyu Li
Takuya Yoshioka
+ PDF Chat Modeling and Reasoning in Event Calculus using Goal-Directed Constraint Answer Set Programming 2021 Joaquín Arias
Manuel Carro
Zhuo Chen
Gopal Gupta
+ PDF Chat Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone 2021 Naoyuki Kanda
Guoli Ye
Yu Wu
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
+ PDF Chat AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario 2021 Yihui Fu
Luyao Cheng
Shubo Lv
Yukai Jv
Yuxiang Kong
Zhuo Chen
Yanxin Hu
Lei Xie
Jian Wu
Hui Bu
+ PDF Chat End-to-End Speaker-Attributed ASR with Transformer 2021 Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
+ PDF Chat Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement 2021 Şefik Emre Eskimez
Xiaofei Wang
Min Tang
Hemin Yang
Zirun Zhu
Zhuo Chen
Huaming Wang
Takuya Yoshioka
+ PDF Chat Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR 2021 Naoyuki Kanda
Zhong Meng
Liang Lu
Yashesh Gaur
Xiaofei Wang
Zhuo Chen
Takuya Yoshioka
+ PDF Chat Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020 2021 Xiong Xiao
Naoyuki Kanda
Zhuo Chen
Tianyan Zhou
Takuya Yoshioka
Sanyuan Chen
Yong Zhao
Gang Liu
Yu Wu
Jian Wu
+ PDF Chat Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis 2021 Desh Raj
Pavel Denisov
Zhuo Chen
Hakan Erdoğan
Zili Huang
Maokui He
Shinji Watanabe
Jun Du
Takuya Yoshioka
Yi Luo
+ PDF Chat Exploring End-to-End Multi-Channel ASR with Bias Information for Meeting Transcription 2021 Xiaofei Wang
Naoyuki Kanda
Yashesh Gaur
Zhuo Chen
Zhong Meng
Takuya Yoshioka
+ PDF Chat Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement 2021 Zhong-Qiu Wang
Hakan Erdoğan
Scott Wisdom
Kevin Wilson
Desh Raj
Shinji Watanabe
Zhuo Chen
John R. Hershey
+ Investigation of Practical Aspects of Single Channel Speech Separation for ASR 2021 Jian Wu
Zhuo Chen
Sanyuan Chen
Yu Wu
Takuya Yoshioka
Naoyuki Kanda
Shujie Liu
Jinyu Li
+ Personalized Speech Enhancement: New Models and Comprehensive Evaluation 2021 Şefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Xiaofei Wang
Zhuo Chen
Xuedong Huang
+ One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement 2021 Hassan Taherian
Şefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Zhuo Chen
Xuedong Huang
+ All-neural beamformer for continuous speech separation 2021 Zhuohuang Zhang
Takuya Yoshioka
Naoyuki Kanda
Zhuo Chen
Xiaofei Wang
Dongmei Wang
Şefik Emre Eskimez
+ UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training 2021 Sanyuan Chen
Yu Wu
Chengyi Wang
Zhengyang Chen
Zhuo Chen
Shujie Liu
Jian Wu
Yao Qian
Furu Wei
Jinyu Li
+ PDF Chat Task Offloading for Large-Scale Asynchronous Mobile Edge Computing: An Index Policy Approach 2020 Yizhen Xu
Peng Cheng
Zhuo Chen
Ming Ding
Yonghui Li
Branka Vucetic
+ PDF Chat Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles 2020 Yuxin Lu
Peng Cheng
Zhuo Chen
Wai Ho Mow
Yonghui Li
Branka Vucetic
+ PDF Chat An End-to-End Architecture of Online Multi-Channel Speech Separation 2020 Jian Wu
Zhuo Chen
Jinyu Li
Takuya Yoshioka
Zhili Tan
Edward Lin
Yi Luo
Lei Xie
+ PDF Chat Justifications for Goal-Directed Constraint Answer Set Programming 2020 Joaquín Arias
Manuel Carro
Zhuo Chen
Gopal Gupta
+ PDF Chat Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation 2020 Yi Luo
Zhuo Chen
Takuya Yoshioka
+ Spectrum Intelligent Radio: Technology, Development, and Future Trends 2020 Peng Cheng
Zhuo Chen
Ming Ding
Yonghui Li
Branka Vucetic
Dusit Niyato
+ Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020 2020 Xiong Xiao
Naoyuki Kanda
Zhuo Chen
Tianyan Zhou
Takuya Yoshioka
Sanyuan Chen
Yong Zhao
Gang Liu
Yu Wu
Jian Wu
+ Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer 2020 Sanyuan Chen
Yu Wu
Zhuo Chen
Takuya Yoshioka
Shujie Liu
Jinyu Li
+ Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR 2020 Naoyuki Kanda
Zhong Meng
Liang Lu
Yashesh Gaur
Xiaofei Wang
Zhuo Chen
Takuya Yoshioka
+ Exploring End-to-End Multi-channel ASR with Bias Information for Meeting Transcription 2020 Xiaofei Wang
Naoyuki Kanda
Yashesh Gaur
Zhuo Chen
Zhong Meng
Takuya Yoshioka
+ Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis 2020 Desh Raj
Pavel Denisov
Zhuo Chen
Hakan Erdoğan
Zili Huang
Maokui He
Shinji Watanabe
Jun Du
Takuya Yoshioka
Yi Luo
+ PDF Chat A Learning-Based Two-Stage Spectrum Sharing Strategy With Multiple Primary Transmit Power Levels 2019 Rui Zhang
Peng Cheng
Zhuo Chen
Yonghui Li
Branka Vucetic
+ PDF Chat Low-latency Speaker-independent Continuous Speech Separation 2019 Takuya Yoshioka
Zhuo Chen
Changliang Liu
Xiong Xiao
Hakan Erdoğan
Dimitrios Dimitriadis
+ Understanding the Impact of Label Granularity on CNN-based Image Classification 2019 Zhuo Chen
Ruizhou Ding
Ting-Wu Chin
Diana Marculescu
+ Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation 2019 Yi Luo
Zhuo Chen
Takuya Yoshioka
+ Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement 2019 Zhong-Qiu Wang
Hakan Erdoğan
Scott Wisdom
Kevin Wilson
Desh Raj
Shinji Watanabe
Zhuo Chen
John R. Hershey
+ PDF Chat Understanding the Impact of Label Granularity on CNN-Based Image Classification 2018 Zhuo Chen
Ruizhou Ding
Ting-Wu Chin
Diana Marculescu
+ PDF Chat Mobile Collaborative Spectrum Sensing for Heterogeneous Networks: A Bayesian Machine Learning Approach 2018 Yizhen Xu
Peng Cheng
Zhuo Chen
Yonghui Li
Branka Vucetic
+ PDF Chat Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks 2018 Takuya Yoshioka
Hakan Erdoğan
Zhuo Chen
Xiong Xiao
Fil Alleva
+ PDF Chat Speaker-Invariant Training Via Adversarial Learning 2018 Zhong Meng
Jinyu Li
Zhuo Chen
Yang Zhao
Vadim Mazalov
Yifan Gong
Biing‐Hwang Juang
+ PDF Chat Developing Far-Field Speaker System Via Teacher-Student Learning 2018 Jinyu Li
Rui Zhao
Zhuo Chen
Changliang Liu
Xiong Xiao
Guoli Ye
Yifan Gong
+ Speaker-Independent Speech Separation With Deep Attractor Network 2018 Yi Luo
Zhuo Chen
Nima Mesgarani
+ Cracking the cocktail party problem by multi-beam deep attractor network 2018 Zhuo Chen
Jinyu Li
Xiong Xiao
Takuya Yoshioka
Huaming Wang
Zhenghao Wang
Yifan Gong
+ PDF Chat Unsupervised adaptation with domain separation networks for robust speech recognition 2017 Zhong Meng
Zhuo Chen
Vadim Mazalov
Jinyu Li
Yifan Gong
+ Task Scheduling for Heterogeneous Multicore Systems 2017 Zhuo Chen
Diana Marculescu
+ End-to-End Attention based Text-Dependent Speaker Verification 2017 Shi-Xiong Zhang
Zhuo Chen
Yong Zhao
Jinyu Li
Yifan Gong
+ PDF Chat End-to-End attention based text-dependent speaker verification 2016 Shi-Xiong Zhang
Zhuo Chen
Yong Zhao
Jinyu Li
Yifan Gong
+ PDF Chat A Physician Advisory System for Chronic Heart Failure management based on knowledge patterns 2016 Zhuo Chen
Kyle Marple
Elmer Salazar
Gopal Gupta
Lakshman S. Tamil
+ PDF Chat Single-Channel Multi-Speaker Separation Using Deep Clustering 2016 Yusuf Ziya Işık
Jonathan Le Roux
Zhuo Chen
Shinji Watanabe
John R. Hershey
+ PDF Chat Deep clustering: Discriminative embeddings for segmentation and separation 2016 John R. Hershey
Zhuo Chen
Jonathan Le Roux
Shinji Watanabe
Commonly Cited References
Action Title Year Authors # of times referenced
+ PDF Chat Deep clustering: Discriminative embeddings for segmentation and separation 2016 John R. Hershey
Zhuo Chen
Jonathan Le Roux
Shinji Watanabe
19
+ PDF Chat Continuous Speech Separation: Dataset and Analysis 2020 Zhuo Chen
Takuya Yoshioka
Liang Lu
Tianyan Zhou
Zhong Meng
Yi Luo
Jian Wu
Xiong Xiao
Jinyu Li
19
+ PDF Chat Advances in Online Audio-Visual Meeting Transcription 2019 Takuya Yoshioka
Igor Abramovski
Cem Aksoylar
Zhuo Chen
Moshe David
Dimitrios Dimitriadis
Yifan Gong
Ilya Gurvich
Xuedong Huang
Yan Huang
16
+ PDF Chat Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks 2017 Morten Kolbæk
Dong Yu
Zheng‐Hua Tan
Jesper Jensen
11
+ PDF Chat SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition 2019 Daniel Park
William Chan
Yu Zhang
Chung‐Cheng Chiu
Barret Zoph
Ekin D. Cubuk
Quoc V. Le
10
+ Attention Is All You Need 2017 Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
10
+ CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings 2020 Shinji Watanabe
Michael Mandel
Jon Barker
Emmanuel Vincent
Ashish Arora
Xuankai Chang
Sanjeev Khudanpur
Vimal Manohar
Daniel Povey
Desh Raj
10
+ PDF Chat Permutation invariant training of deep models for speaker-independent multi-talker speech separation 2017 Dong Yu
Morten Kolbæk
Zheng‐Hua Tan
Jesper Jensen
9
+ PDF Chat Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers 2020 Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Tianyan Zhou
Takuya Yoshioka
9
+ PDF Chat Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis 2021 Desh Raj
Pavel Denisov
Zhuo Chen
Hakan Erdoğan
Zili Huang
Maokui He
Shinji Watanabe
Jun Du
Takuya Yoshioka
Yi Luo
9
+ CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings 2020 Shinji Watanabe
Michael Mandel
Jon Barker
Emmanuel Vincent
Ashish Arora
Xuankai Chang
Sanjeev Khudanpur
Vimal Manohar
Daniel Povey
Desh Raj
9
+ PDF Chat VoxCeleb: A Large-Scale Speaker Identification Dataset 2017 Arsha Nagrani
Joon Son Chung
Andrew Zisserman
9
+ PDF Chat Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings 2021 Naoyuki Kanda
Xuankai Chang
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
9
+ PDF Chat Conformer: Convolution-augmented Transformer for Speech Recognition 2020 Anmol Gulati
James Qin
Chung‐Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
8
+ PDF Chat VoxCeleb2: Deep Speaker Recognition 2018 Joon Son Chung
Arsha Nagrani
Andrew Zisserman
8
+ PDF Chat Recognizing Multi-Talker Speech with Permutation Invariant Training 2017 Dong Yu
Xuankai Chang
Yanmin Qian
8
+ Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation 2019 Yi Luo
Nima Mesgarani
8
+ PDF Chat Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks 2018 Takuya Yoshioka
Hakan Erdoğan
Zhuo Chen
Xiong Xiao
Fil Alleva
8
+ PDF Chat VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking 2019 Quan Wang
Hannah Muckenhirn
Kevin Wilson
Prashant Sridhar
Zelin Wu
John R. Hershey
Rif A. Saurous
Ron J. Weiss
Jia Ye
Ignacio López Moreno
7
+ PDF Chat Libri-Light: A Benchmark for ASR with Limited or No Supervision 2020 Jacob Kahn
Maude Rivière
Wenlong Zheng
Eugene Kharitonov
Qinmei Xu
Pierre-Emmanuel Mazaré
Julien Karadayi
Vitaliy Liptchinsky
Ronan Collobert
Christian Fuegen
7
+ PDF Chat Serialized Output Training for End-to-End Overlapped Speech Recognition 2020 Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka
7
+ PDF Chat Streaming End-to-End Multi-Talker Speech Recognition 2021 Liang Lu
Naoyuki Kanda
Jinyu Li
Yifan Gong
6
+ PDF Chat DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement 2020 Yanxin Hu
Yun Liu
Shubo Lv
Mengtao Xing
Shimin Zhang
Yihui Fu
Jian Wu
Bihong Zhang
Lei Xie
6
+ PDF Chat Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates 2018 Taku Kudo
6
+ Attention-Based Models for Speech Recognition 2015 Jan Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
6
+ Conformer: Convolution-augmented Transformer for Speech Recognition 2020 Anmol Gulati
James Qin
Chung‐Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
6
+ PDF Chat Low-latency Speaker-independent Continuous Speech Separation 2019 Takuya Yoshioka
Zhuo Chen
Changliang Liu
Xiong Xiao
Hakan Erdoğan
Dimitrios Dimitriadis
6
+ PDF Chat A Purely End-to-End System for Multi-speaker Speech Recognition 2018 Hiroshi Seki
Takaaki Hori
Shinji Watanabe
Jonathan Le Roux
John R. Hershey
6
+ PDF Chat End-to-end Monaural Multi-speaker ASR System without Pretraining 2019 Xuankai Chang
Yanmin Qian
Kai Yu
Shinji Watanabe
6
+ PDF Chat Joint Speech Recognition and Speaker Diarization via Sequence Transduction 2019 Laurent El Shafey
Hagen Soltau
Izhak Shafran
5
+ PDF Chat wav2vec: Unsupervised Pre-Training for Speech Recognition 2019 Steffen Schneider
Alexei Baevski
Ronan Collobert
Michael Auli
5
+ PDF Chat ResNeXt and Res2Net Structures for Speaker Verification 2021 Tianyan Zhou
Yong Zhao
Jian Wu
5
+ PDF Chat Deep long short-term memory adaptive beamforming networks for multichannel robust speech recognition 2017 Zhong Meng
Shinji Watanabe
John R. Hershey
Hakan Erdoğan
5
+ Attention is All you Need 2017 Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
5
+ PDF Chat Improving Speaker Discrimination of Target Speech Extraction With Time-Domain Speakerbeam 2020 Marc Delcroix
Tsubasa Ochiai
Kateřina Žmolíková
Keisuke Kinoshita
Naohiro Tawara
Tomohiro Nakatani
Shoko Araki
5
+ PDF Chat Deep attractor network for single-microphone speaker separation 2017 Zhuo Chen
Yi Luo
Nima Mesgarani
5
+ PDF Chat Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective 2019 Zhong-Qiu Wang
Ke Tan
DeLiang Wang
5
+ wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations 2020 Alexei Baevski
Henry Zhou
Abdelrahman Mohamed
Michael Auli
5
+ PDF Chat All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis 2019 Thilo von Neumann
Keisuke Kinoshita
Marc Delcroix
Shoko Araki
Tomohiro Nakatani
Reinhold Haeb‐Umbach
5
+ Achieving Human Parity in Conversational Speech Recognition 2016 Wayne Xiong
Jasha Droppo
Xuedong Huang
Frank Seide
Mike Seltzer
Andreas Stolcke
Dong Yu
Geoffrey Zweig
5
+ PDF Chat WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing 2022 Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu Wu
Shujie Liu
Zhuo Chen
Jinyu Li
Naoyuki Kanda
Takuya Yoshioka
Xiong Xiao
5
+ Advances in Online Audio-Visual Meeting Transcription 2019 Takuya Yoshioka
Igor Abramovski
Cem Aksoylar
Zhuo Chen
Moshe David
Dimitrios Dimitriadis
Yifan Gong
Ilya Gurvich
Xuedong Huang
Yan Huang
5
+ PDF Chat Universal Sound Separation 2019 Ilya Kavalerov
Scott Wisdom
Hakan Erdoğan
Brian Patton
Kevin Wilson
Jonathan Le Roux
John R. Hershey
5
+ PDF Chat Continuous Speech Separation with Conformer 2021 Sanyuan Chen
Yu Wu
Zhuo Chen
Jian Wu
Jinyu Li
Takuya Yoshioka
Chengyi Wang
Shujie Liu
Ming Zhou
5
+ Recent Advances in End-to-End Automatic Speech Recognition 2022 Jinyu Li
5
+ PDF Chat Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap 2019 Tae Jin Park
Kyu J. Han
Manoj Kumar
Shrikanth Narayanan
4
+ PDF Chat A Comparative Study on Transformer vs RNN in Speech Applications 2019 Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
Ziyan Jiang
Masao Someki
Nelson Enrique Yalta Soplin
Ryuichi Yamamoto
Xiaofei Wang
4
+ PDF Chat SDR – Half-baked or Well Done? 2019 Jonathan Le Roux
Scott Wisdom
Hakan Erdoğan
John R. Hershey
4
+ Speaker-Independent Speech Separation With Deep Attractor Network 2018 Yi Luo
Zhuo Chen
Nima Mesgarani
4
+ PDF Chat Speech Recognition and Multi-Speaker Diarization of Long Conversations 2020 Huanru Henry Mao
S. X. Li
Julian McAuley
Garrison W. Cottrell
4