Audio-Based Activities of Daily Living (ADL) Recognition with Large-Scale Acoustic Embeddings from Online Videos

Type: Article

Publication Date: 2019-03-29

Citations: 62

DOI: https://doi.org/10.1145/3314404

Abstract

Activity sensing and recognition have been shown to play a key enabling role in a wide range of applications, from sustainability and human-computer interaction to health care. While many recognition tasks have traditionally employed inertial sensors, acoustic methods capture rich contextual information that can help discriminate complex activities. Given the emergence of deep learning techniques and of new, large-scale multimedia datasets, this paper revisits the opportunity of training audio-based classifiers without the onerous and time-consuming task of annotating audio data. We propose a framework for audio-based activity recognition that makes use of millions of embedding features from the sound clips of public online videos. Because it combines oversampling with deep learning, our framework does not require the further feature processing or outlier filtering used in prior work. We evaluated our approach in the context of Activities of Daily Living (ADL) by recognizing 15 everyday activities with 14 participants in their own homes, achieving average within-subject accuracies of 64.2% (top-1) and 83.6% (top-3). The paper also examines individual class performance to further study the co-occurrence characteristics of the activities and the robustness of the framework.
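
The top-1 and top-3 figures in the abstract are top-k accuracies: a prediction counts as correct when the true activity is among the k classes the classifier scores highest. The sketch below illustrates that metric only; it is not the authors' evaluation code. The function name `top_k_accuracy` and the toy scores and labels are hypothetical and were invented for illustration.

```python
import numpy as np


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: array of shape (n_samples, n_classes) with per-class scores.
    labels: array of shape (n_samples,) with integer class indices.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Negate the scores so that the k smallest entries of -scores are the
    # k largest scores; argpartition puts their indices in the first k columns.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    # A sample is a hit if its true label appears anywhere in its top-k set.
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data (hypothetical): 4 samples, 4 activity classes.
scores = np.array([
    [0.60, 0.30, 0.08, 0.02],
    [0.15, 0.50, 0.30, 0.05],
    [0.40, 0.35, 0.20, 0.05],
    [0.05, 0.15, 0.20, 0.60],
])
labels = np.array([0, 2, 3, 3])

top1 = top_k_accuracy(scores, labels, k=1)
top3 = top_k_accuracy(scores, labels, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

A within-subject average, as reported in the abstract, would presumably apply such a metric to each participant's data separately and then average the per-participant values; the exact protocol is described in the paper itself.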

Locations

  • Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • AudioAR: Audio-Based Activity Recognition with Large-Scale Acoustic Embeddings from YouTube Videos (2018). Dawei Liang, Edison Thomaz
  • Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network (2023). Yufei Zeng, Yanxiong Li, Zhenfeng Zhou, Ruiqi Wang, Difeng Lu
  • Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities (2020). Kaixuan Chen, Dalin Zhang, Lina Yao, Bin Guo, Zhiwen Yu, Yunhao Liu
  • Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network (2021). Yufei Zeng, Yanxiong Li, Zhenfeng Zhou, Ruiqi Wang, Difeng Lu
  • Domestic Activity Clustering from Audio via Depthwise Separable Convolutional Autoencoder Network (2022). Yanxiong Li, Wenchang Cao, Konstantinos Drossos, Tuomas Virtanen
  • DCASE 2018 Challenge - Task 5: Monitoring of domestic activities based on multi-channel acoustics (2018). Gert Dekkers, Lode Vuegen, Toon van Waterschoot, Bart Vanrumste, Peter Karsmakers
  • Classifying Human Activities with Inertial Sensors: A Machine Learning Approach (2021). Hamza Ali Imran, Saad Wazir, Usman Iftikhar, Usama Latif
  • Introducing IHARDS-CNN: A Cutting-Edge Deep Learning Method for Human Activity Recognition Using Wearable Sensors (2024). Nazanin Sedaghati, Masoud Kargar, Sina Abbaskhani
  • Temporal Action Localization for Inertial-based Human Activity Recognition (2024). Marius Bock, Michael Moeller, Kristof Van Laerhoven
  • Sensor Data for Human Activity Recognition: Feature Representation and Benchmarking (2020). Flávia Alves, Martin Gairing, Frans A. Oliehoek, Thanh-Toan Do
  • Domestic Activities Clustering From Audio Recordings Using Convolutional Capsule Autoencoder Network (2021). Ziheng Lin, Yanxiong Li, Zhangjin Huang, Wenhao Zhang, Yufeng Tan, Yi-Chun Chen, Qianhua He
  • A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities (2024). Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah, Satoshi Nishimura
  • Attend And Discriminate: Beyond the State-of-the-Art for Human Activity Recognition using Wearable Sensors (2020). Alireza Abedin, Mahsa Ehsanpour, Qinfeng Shi, Hamid Rezatofighi, Damith C. Ranasinghe
  • Human Activity Recognition using Wearable Sensors: Review, Challenges, Evaluation Benchmark (2021). Reem Abdel‐Salam, Rana Mostafa, Mayada Hadhood

Works That Cite This (10)

  • Heads-Up Computing Moving Beyond the Device-Centered Paradigm (2023). Shengdong Zhao, Felicia Fang-Yi Tan, Katherine Fennedy
  • Seeing and Hearing Egocentric Actions: How Much Can We Learn? (2019). Alejandro Cartas, Jordi Luque, Petia Radeva, Carlos Segura, Mariella Dimiccoli
  • SensiX++: Bringing MLOps and Multi-tenant Model Serving to Sensory Edge Devices (2023). Chulhong Min, Akhil Mathur, Utku Günay Acer, Alessandro Montanari, Fahim Kawsar
  • Human Action Recognition From Various Data Modalities: A Review (2022). Zehua Sun, Qiuhong Ke, Hossein Rahmani, Mohammed Bennamoun, Gang Wang, Jun Liu
  • Human Action Recognition from Various Data Modalities: A Review (2021). Hossein Rahmani, Mohammed Bennamoun, Qiuhong Ke
  • SensiX: A Platform for Collaborative Machine Learning on the Edge (2020). Chulhong Min, Akhil Mathur, Alessandro Montanari, Utku Günay Acer, Fahim Kawsar
  • SMART-vision: survey of modern action recognition techniques in vision (2024). Ali K. AlShami, Ryan Rabinowitz, Khang Nhứt Lâm, Yousra Shleibik, Melkamu Mersha, Terrance E. Boult, Jugal Kalita
  • Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study (2021). Dawei Liang, Yangyang Shi, Yun Wang, Nayan Singhal, Alex Xiao, Jonathan Shaw, Edison Thomaz, Ozlem Kalinli, Mike Seltzer

Works Cited by This (11)

  • Scikit-learn: Machine Learning in Python (2012). Fabián Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron J. Weiss, Vincent Dubourg
  • SMOTE: Synthetic Minority Over-sampling Technique (2002). Nitesh V. Chawla, Kevin W. Bowyer, Lawrence Hall, W. Philip Kegelmeyer
  • Augur (2016). Ethan Fast, W. R. McGrath, Pranav Rajpurkar, Michael S. Bernstein
  • Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning (2017). Guillaume Lemaître, Fernando Nogueira, Christos K. Aridas
  • CNN architectures for large-scale audio classification (2017). Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, Robert C. Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold
  • SoundNet: Learning Sound Representations from Unlabeled Video (2016). Yusuf Aytar, Carl Vondrick, Antonio Torralba
  • Audio Set classification with attention model: A probabilistic perspective (2017). Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley
  • Audio Set Classification with Attention Model: A Probabilistic Perspective (2018). Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley
  • Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification (2017). Justin Salamon, Juan Pablo Bello