Multitask Learning From Augmented Auxiliary Data for Improving Speech Emotion Recognition

Type: Article

Publication Date: 2022-11-14

Citations: 13

DOI: https://doi.org/10.1109/taffc.2022.3221749

Abstract

Despite the recent progress in speech emotion recognition (SER), state-of-the-art systems lack generalisation across different conditions. A key underlying reason for poor generalisation is the scarcity of emotion datasets, which is a significant roadblock to designing robust machine learning (ML) models. Recent works in SER focus on utilising multitask learning (MTL) methods to improve generalisation by learning shared representations. However, most of these studies propose MTL solutions that require meta labels for auxiliary tasks, which limits the training of SER systems. This paper proposes an MTL framework (MTL-AUG) that learns generalised representations from augmented data. We utilise augmentation-type classification and unsupervised reconstruction as auxiliary tasks, which allow training SER systems on augmented data without requiring any meta labels for auxiliary tasks. The semi-supervised nature of MTL-AUG allows for the exploitation of the abundant unlabelled data to further boost the performance of SER. We comprehensively evaluate the proposed framework in the following settings: (1) within corpus, (2) cross-corpus and cross-language, (3) noisy speech, and (4) adversarial attacks. Our evaluations using the widely used IEMOCAP, MSP-IMPROV, and EMODB datasets show improved results compared to existing state-of-the-art methods.
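
The abstract names one primary task (emotion classification) and two auxiliary tasks (augmentation-type classification and unsupervised reconstruction) but does not state how their losses are combined. The following is a minimal sketch, assuming a simple weighted sum; the function names and the weights `alpha` and `beta` are hypothetical and are not taken from the paper. It illustrates why no meta labels are needed: the augmentation-type label is known from whichever augmentation was applied, and the reconstruction target is the input itself, so unlabelled speech can still contribute through the auxiliary terms.

```python
import math
from typing import Optional, Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(max(probs[label], 1e-12))


def mse(target: Sequence[float], estimate: Sequence[float]) -> float:
    """Mean squared error between input features and their reconstruction."""
    return sum((t - e) ** 2 for t, e in zip(target, estimate)) / len(target)


def mtl_aug_loss(
    emotion_probs: Sequence[float],
    emotion_label: Optional[int],   # None for unlabelled speech
    aug_probs: Sequence[float],
    aug_label: int,                 # known for free: which augmentation was applied
    features: Sequence[float],
    reconstruction: Sequence[float],
    alpha: float = 1.0,             # assumed weight, not from the paper
    beta: float = 1.0,              # assumed weight, not from the paper
) -> float:
    """Assumed weighted-sum combination of primary and auxiliary losses."""
    auxiliary = alpha * cross_entropy(aug_probs, aug_label) + beta * mse(
        features, reconstruction
    )
    if emotion_label is None:
        # Unlabelled utterance: only the auxiliary tasks provide a training signal.
        return auxiliary
    return cross_entropy(emotion_probs, emotion_label) + auxiliary


# A labelled utterance contributes all three terms; an unlabelled one only two.
labelled = mtl_aug_loss([0.7, 0.1, 0.1, 0.1], 0, [0.8, 0.2], 0, [1.0, 2.0], [1.0, 2.0])
unlabelled = mtl_aug_loss([0.7, 0.1, 0.1, 0.1], None, [0.8, 0.2], 0, [1.0, 2.0], [1.0, 2.0])
```

In this sketch the unlabelled loss is always the labelled loss minus the emotion term, which is one way to read the semi-supervised setting described in the abstract; the actual model architecture and loss weighting are specified in the paper itself.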

Locations

  • IEEE Transactions on Affective Computing
  • arXiv (Cornell University)
  • OPUS (Augsburg University)

Similar Works

  • Multitask Learning from Augmented Auxiliary Data for Improving Speech Emotion Recognition (2022). Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn Schüller
  • CopyPaste: An Augmentation Method for Speech Emotion Recognition (2021). Raghavendra Pappagari, Jesús Villalba, Piotr Żelasko, Laureano Moro‐Velazquez, Najim Dehak
  • CopyPaste: An Augmentation Method for Speech Emotion Recognition (2020). Raghavendra Pappagari, Jesús Villalba, Piotr Żelasko, Laureano Moro‐Velazquez, Najim Dehak
  • Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations (2024). Bulat Khaertdinov, Pedro Jeuris, Annanda Sousa, Enrique Hortal
  • Recognizing More Emotions with Less Data Using Self-supervised Transfer Learning (2020). Jonathan Boigne, Biman Najika Liyanage, Ted Östrem
  • A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition (2022). Ravi Shankar, Abdouh Harouna Kenfack, Arjun Somayazulu, Archana Venkataraman
  • Towards Speech Emotion Recognition “in the Wild” Using Aggregated Corpora and Deep Multi-Task Learning (2017). Jaebok Kim, Gwenn Englebienne, Khiet P. Truong, Vanessa Evers
  • Generative Data Augmentation Guided by Triplet Loss for Speech Emotion Recognition (2022). Shijun Wang, Hamed Hemati, Jón Guðnason, Damian Borth
  • Contrastive Unsupervised Learning for Speech Emotion Recognition (2021). Mao Li, Bo Yang, Joshua J. Levy, Andreas Stolcke, Viktor Rozgić, Spyros Matsoukas, Constantinos Papayiannis, Daniel Bone, Chao Wang
  • MSAC: Multiple Speech Attribute Control Method for Reliable Speech Emotion Recognition (2023). Yu Pan
  • Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting (2024). Tiantian Feng, Shrikanth Narayanan
  • Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting (2023). Tiantian Feng, Shrikanth Narayanan
  • SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios (2024). Hazim Bukhari, Soham Deshmukh, Hira Dhamyal, Bhiksha Raj, Rita Singh
  • Learning Transferable Features for Speech Emotion Recognition (2017). Alison Marczewski, Adriano Veloso, Nívio Ziviani

Works Cited by This (28)

  • Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network (2014). Sijin Li, Zhi-Qiang Liu, Antoni B. Chan
  • Transfer Learning for Improving Speech Emotion Classification Accuracy (2018). Siddique Latif, Rajib Rana, Muhammad Shahzad Younis, Junaid Qadir, Julien Epps
  • Cross Lingual Speech Emotion Recognition: Urdu vs. Western Languages (2018). Siddique Latif, Adnan Qayyum, Muhammad Usman, Junaid Qadir
  • Automated screening for distress: A perspective for the future (2019). Rajib Rana, Siddique Latif, Raj Gururajan, Anthony Gray, Geraldine Mackenzie, Gerry Humphris, Jeff Dunn
  • SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition (2019). Daniel Park, William Chan, Yu Zhang, Chung‐Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le
  • Attention-augmented End-to-end Multi-task Learning for Emotion Prediction from Speech (2019). Zixing Zhang, Bingwen Wu, Björn Schüller
  • Attentive Convolutional Neural Network Based Speech Emotion Recognition: A Study on the Impact of Input Features, Signal Length, and Acted Speech (2017). Michael Neumann, Ngoc Thang Vu
  • On Enhancing Speech Emotion Recognition Using Generative Adversarial Networks (2018). Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson
  • Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study (2018). Siddique Latif, Rajib Rana, Junaid Qadir, Julien Epps
  • Progressive Neural Networks for Transfer Learning in Emotion Recognition (2017). John Gideon, Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis, Emily Mower Provost