+
|
Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer
|
2025
|
Hu Hu
Sabato Marco Siniscalchi
Chao-Han Huck Yang
Chin‐Hui Lee
|
+
|
Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization
|
2024
|
Maokui He
Jun Du
Shutong Niu
Qingfeng Liu
Chin‐Hui Lee
|
+
|
An Explicit Consistency-Preserving Loss Function for Phase Reconstruction and Speech Enhancement
|
2024
|
Pin-Jui Ku
Chun-Wei Ho
Hao Yen
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition
|
2024
|
Hao Yen
Pin-Jui Ku
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios
|
2024
|
Ya Jiang
Qing Wang
Jun Du
Maocheng Hu
Pengfei Hu
Zeyan Liu
Shi Cheng
Zhaoxu Nian
Yuxuan Dong
Mingqi Cai
|
+
|
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction
|
2024
|
Shilong Wu
Chenxi Wang
Hang Chen
Yusheng Dai
Chenyue Zhang
Ruoyu Wang
Hongbo Lan
Jun Du
Chin‐Hui Lee
Jingdong Chen
|
+
|
Boosting End-to-End Multilingual Phoneme Recognition Through Exploiting Universal Speech Attributes Constraints
|
2024
|
Hao Yen
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture
|
2024
|
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Yanyan Yue
Shuangqing Qian
Shilong Wu
Jun Du
Chin‐Hui Lee
|
+
|
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
|
2024
|
Yusheng Dai
Hang Chen
Jun Du
Ruoyu Wang
Shihao Chen
Jiefeng Ma
Haotian Wang
Chin‐Hui Lee
|
+
|
Bayesian adaptive learning to latent variables via Variational Bayes and Maximum a Posteriori
|
2024
|
Hu Hu
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
A Variance-Preserving Interpolation Approach for Diffusion Models With Applications to Single Channel Speech Enhancement and Recognition
|
2024
|
Zilu Guo
Qing Wang
Jun Du
Jia Pan
Qingfeng Liu
Chin‐Hui Lee
|
+
|
Semi-Supervised Multi-Channel Speaker Diarization With Cross-Channel Attention
|
2023
|
Shilong Wu
Jun Du
Maokui He
Shutong Niu
Hang Chen
Haitao Tang
Chin‐Hui Lee
|
+
|
A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models
|
2023
|
Pin-Jui Ku
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement
|
2023
|
Zilu Guo
Jun Du
Chin‐Hui Lee
Yu Gao
Wenbin Zhang
|
+
|
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder
|
2023
|
Yusheng Dai
Hang Chen
Jun Du
Xiaofei Ding
Ning Ding
Feijun Jiang
Chin‐Hui Lee
|
+
|
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition
|
2023
|
Chao-Han Huck Yang
Bo Li
Yu Zhang
Nanxin Chen
Tara N. Sainath
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition
|
2023
|
Zhe Wang
Shilong Wu
Hang Chen
Maokui He
Jun Du
Chin‐Hui Lee
Jingdong Chen
Shinji Watanabe
Sabato Marco Siniscalchi
Odette Scharenborg
|
+
|
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition
|
2023
|
Chao-Han Huck Yang
I‐Ming Chen
Andreas Stolcke
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
|
2023
|
Qing Wang
Jun Du
Huaxin Wu
Jia Pan
Feng Ma
Chin‐Hui Lee
|
+
|
The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge
|
2023
|
Ruoyu Wang
Maokui He
Jun Du
Hengshun Zhou
Shutong Niu
Hang Chen
Yanyan Yue
Gaobin Yang
Shilong Wu
Lei Sun
|
+
|
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction
|
2023
|
Shilong Wu
Chenxi Wang
Hang Chen
Yusheng Dai
Chenyue Zhang
Ruoyu Wang
Hongbo Lan
Jun Du
Chin‐Hui Lee
Jingdong Chen
|
+
|
Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints
|
2023
|
Hao Yen
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture
|
2023
|
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Yanyan Yue
Shuangqing Qian
Shilong Wu
Jun Du
Chin‐Hui Lee
|
+
|
A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification
|
2022
|
Qing Wang
Jun Du
Siyuan Zheng
Yunqing Li
Yajian Wang
Yuzhong Wu
Hu Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Yannan Wang
|
+
|
An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition
|
2022
|
Chao-Han Huck Yang
Jun Qi
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function
|
2022
|
Qing Wang
Hang Chen
Ya Jiang
Zhe Wang
Yuyang Wang
Jun Du
Chin‐Hui Lee
|
+
|
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer
|
2022
|
Hu Hu
Sabato Marco Siniscalchi
Chao-Han Huck Yang
Chin‐Hui Lee
|
+
|
A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning
|
2022
|
Hengshun Zhou
Jun Du
Chao-Han Huck Yang
Shifu Xiong
Chin‐Hui Lee
|
+
|
The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2MeT) Challenge
|
2022
|
Maokui He
Xiang Lv
Weilin Zhou
Jingjing Yin
Xiaoqi Zhang
Yuxuan Wang
Shutong Niu
Yuhang Cao
Heng Lu
Jun Du
|
+
|
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition
|
2022
|
Chao-Han Huck Yang
I‐Ming Chen
Andreas Stolcke
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition
|
2022
|
Chao-Han Huck Yang
Bo Li
Yu Zhang
Nanxin Chen
Tara N. Sainath
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Information Fusion in Attention Networks Using Adaptive and Multi-level Factorized Bilinear Pooling for Audio-visual Emotion Recognition
|
2021
|
Hengshun Zhou
Jun Du
Yuanyuan Zhang
Qing Wang
Qingfeng Liu
Chin‐Hui Lee
|
+
|
PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification
|
2021
|
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification
|
2021
|
Chao-Han Huck Yang
Hu Hu
Sabato Marco Siniscalchi
Qing Wang
Yuyang Wang
Xianjun Xia
Yuanjun Zhao
Yuzhong Wu
Yannan Wang
Jun Du
|
+
|
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement
|
2021
|
Hang Chen
Jun Du
Yu Hu
Li-Rong Dai
Baocai Yin
Chin‐Hui Lee
|
+
|
A Two-Stage Approach to Device-Robust Acoustic Scene Classification
|
2021
|
Hu Hu
Chao-Han Huck Yang
Xianjun Xia
Xue Bai
Xin Tang
Yajian Wang
Shutong Niu
Li Chai
Juanjuan Li
Hongning Zhu
|
+
|
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition
|
2021
|
Chao-Han Huck Yang
Jun Qi
Samuel Yen-Chi Chen
Pin‐Yu Chen
Sabato Marco Siniscalchi
Xiaoli Ma
Chin‐Hui Lee
|
+
|
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
|
2021
|
Qing Wang
Jun Du
Huaxin Wu
Jia Pan
Feng Ma
Chin‐Hui Lee
|
+
|
USTC-NELSLIP System Description for DIHARD-III Challenge
|
2021
|
Yuxuan Wang
Maokui He
Shutong Niu
Lei Sun
Tian Gao
Xin Fang
Jia Pan
Jun Du
Chin‐Hui Lee
|
+
|
Separation Guided Speaker Diarization in Realistic Mismatched Conditions
|
2021
|
Shutong Niu
Jun Du
Lei Sun
Chin‐Hui Lee
|
+
|
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer
|
2021
|
Hu Hu
Sabato Marco Siniscalchi
Chao-Han Huck Yang
Chin‐Hui Lee
|
+
|
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition
|
2020
|
Chao-Han Huck Yang
Jun Qi
Samuel Yen-Chi Chen
Pin‐Yu Chen
Sabato Marco Siniscalchi
Xiaoli Ma
Chin‐Hui Lee
|
+
|
Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement
|
2020
|
Jun Qi
Hu Hu
Yannan Wang
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification
|
2020
|
Hu Hu
Sabato Marco Siniscalchi
Yannan Wang
Chin‐Hui Lee
|
+
|
An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances
|
2020
|
Hu Hu
Sabato Marco Siniscalchi
Yannan Wang
Xue Bai
Jun Du
Chin‐Hui Lee
|
+
|
Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement
|
2020
|
Chao-Han Huck Yang
Jun Qi
Pin‐Yu Chen
Xiaoli Ma
Chin‐Hui Lee
|
+
|
Enhanced Adversarial Strategically-Timed Attacks Against Deep Reinforcement Learning
|
2020
|
Chao-Han Huck Yang
Jun Qi
Pin‐Yu Chen
Yi Ouyang
I-Te Danny Hung
Chin‐Hui Lee
Xiaoli Ma
|
+
|
L-Vector: Neural Label Embedding for Domain Adaptation
|
2020
|
Zhong Meng
Hu Hu
Jinyu Li
Changliang Liu
Yan Huang
Yifan Gong
Chin‐Hui Lee
|
+
|
Tensor-To-Vector Regression for Multi-Channel Speech Enhancement Based on Tensor-Train Network
|
2020
|
Jun Qi
Hu Hu
Yannan Wang
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Chin‐Hui Lee
|
+
|
Performance Analysis for Tensor-Train Decomposition to Deep Neural Network Based Vector-to-Vector Regression
|
2020
|
Jun Qi
Xiaoli Ma
Chin‐Hui Lee
Jun Du
Sabato Marco Siniscalchi
|
+
|
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network-Based Vector-to-Vector Regression
|
2020
|
Jun Qi
Jun Du
Sabato Marco Siniscalchi
Xiaoli Ma
Chin‐Hui Lee
|
+
|
Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation
|
2020
|
Hu Hu
Chao-Han Huck Yang
Xianjun Xia
Xue Bai
Xin Tang
Yajian Wang
Shutong Niu
Li Chai
Juanjuan Li
Hongning Zhu
|
+
|
Correlating Subword Articulation with Lip Shapes for Embedding Aware Audio-Visual Speech Enhancement
|
2020
|
Hang Chen
Jun Du
Yu Hu
Li-Rong Dai
Baocai Yin
Chin‐Hui Lee
|
+
|
On Mean Absolute Error for Deep Neural Network Based Vector-to-Vector Regression
|
2020
|
Jun Qi
Jun Du
Sabato Marco Siniscalchi
Xiaoli Ma
Chin‐Hui Lee
|
+
|
Lip-reading with Hierarchical Pyramidal Convolution and Self-Attention
|
2020
|
Hang Chen
Jun Du
Yu Hu
Li-Rong Dai
Chin‐Hui Lee
Baocai Yin
|
+
|
Riemannian Stochastic Gradient Descent for Tensor-Train Recurrent Neural Networks
|
2018
|
Jun Qi
Chin‐Hui Lee
Javier Tejedor
|
+
|
Convolutional-Recurrent Neural Networks for Speech Enhancement
|
2018
|
Han Zhao
Shuayb Zarar
Ivan Tashev
Chin‐Hui Lee
|
+
|
Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR
|
2018
|
Li Chai
Jun Du
Chin‐Hui Lee
|
+
|
Multi-Objective Learning and Mask-Based Post-Processing for Deep Neural Network Based Speech Enhancement
|
2017
|
Yong Xu
Jun Du
Zhen Ying Huang
Li-Rong Dai
Chin‐Hui Lee
|
+
|
Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement
|
2015
|
Yong Xu
Jun Du
Zhen Huang
Li-Rong Dai
Chin‐Hui Lee
|
+
|
Maximum a posteriori adaptation of network parameters in deep models
|
2015
|
Zhen Huang
Sabato Marco Siniscalchi
I‐Ming Chen
Jinyu Li
Jiadong Wu
Chin‐Hui Lee
|
+
|
Maximum a Posteriori Adaptation of Network Parameters in Deep Models
|
2015
|
Zhen Huang
Sabato Marco Siniscalchi
I‐Ming Chen
Jiadong Wu
Chin‐Hui Lee
|
+
|
A Probabilistic Framework for Representing Dialog Systems and Entropy-Based Dialog Management through Dynamic Stochastic State Evolution
|
2015
|
Ji Wu
Miao Li
Chin‐Hui Lee
|
+
|
A Minimax Classification Approach With Application To Robust Speech Recognition
|
2005
|
Neri Merhav
Chin‐Hui Lee
|
+
|
Ordinary and Proper Location M-Estimates for Autoregressive-Moving Average Models
|
1986
|
Chin‐Hui Lee
R. Douglas Martin
|