Multistage linguistic conditioning of convolutional layers for speech emotion recognition

Introduction: The effective fusion of text and audio information for categorical and dimensional speech emotion recognition (SER) remains an open issue, especially given the vast potential of deep neural networks (DNNs) to provide a tighter integration of the two.

Methods: In this contribution, we investigate the effectiveness of deep fusion …