Multimodal Speaker Segmentation and Diarization Using Lexical and Acoustic Cues via Sequence to Sequence Neural Networks

While there has been a substantial amount of work on speaker diarization recently, few efforts have jointly employed lexical and acoustic information for speaker segmentation. To that end, we investigate a speaker diarization system using a sequence-to-sequence neural network trained on both lexical and acoustic features. We also propose a loss function …