The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2met) Challenge
The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2met) Challenge
We propose two improvements to target-speaker voice activity detection (TS-VAD), the core component in our proposed speaker diarization system that was submitted to the 2022 Multi-Channel Multi-Party Meeting Transcription (M2MeT) challenge. These techniques are designed to handle multi-speaker conversations in real-world meeting scenarios with high speaker-overlap ratios and under heavy …