Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds

Type: Preprint

Publication Date: 2020-01-01

Citations: 25

DOI: https://doi.org/10.48550/arxiv.2011.01143

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds (2021). Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Dan Ellis, John R. Hershey
  • AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation (2022). Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey
  • Improving On-Screen Sound Separation for Open-Domain Videos with Audio-Visual Self-Attention (2021). Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey
  • Co-Separating Sounds of Visual Objects (2019). Ruohan Gao, Kristen Grauman
  • Learning to Separate Object Sounds by Watching Unlabeled Video (2018). Ruohan Gao, Rogério Feris, Kristen Grauman
  • Weakly-Supervised Audio-Visual Sound Source Detection and Separation (2021). Tanzila Rahman, Leonid Sigal
  • Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation (2023). Yiyang Su, Ali Vosoughi, Shijian Deng, Yapeng Tian, Chenliang Xu
  • Visual Scene Graphs for Audio Source Separation (2021). Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian
  • Multiple Sound Sources Localization from Coarse to Fine (2020). Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu, Weiyao Lin
  • Leveraging Category Information for Single-Frame Visual Sound Source Separation (2021). Lingyu Zhu, Esa Rahtu
  • Leveraging Category Information for Single-Frame Visual Sound Source Separation (2020). Lingyu Zhu, Esa Rahtu
  • Visually Guided Sound Source Separation With Audio-Visual Predictive Coding (2023). Zengjie Song, Zhaoxiang Zhang
  • Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation (2022). Moitreya Chatterjee, Narendra Ahuja, Anoop Cherian
  • A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition (2023). Shentong Mo, Pedro Morgado
  • Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (2022). Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Works Cited by This (31)

  • YFCC100M (2016). Bart Thomée, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas N. Poland, Damian Borth, Li-Jia Li
  • MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (2017). Andrew Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam
  • Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks (2018). Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang
  • Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures Using Spatial Information (2019). Efthymios Tzinis, Shrikant Venkataramani, Paris Smaragdis
  • Unsupervised Training of a Deep Clustering Model for Multichannel Blind Source Separation (2019). Lukas Drude, Daniel Hasenklever, Reinhold Haeb-Umbach
  • Audio-Visual Scene Analysis with Self-Supervised Multisensory Features (2018). Andrew Owens, Alexei A. Efros
  • Differentiable Consistency Constraints for Improved Deep Speech Enhancement (2019). Scott Wisdom, John R. Hershey, Kevin Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous
  • Learning to Separate Object Sounds by Watching Unlabeled Video (2018). Ruohan Gao, Rogério Feris, Kristen Grauman
  • Learning to Localize Sound Source in Visual Scenes (2018). Arda Senocak, Tae-Hyun Oh, Junsik Kim, Ming-Hsuan Yang, In So Kweon
  • Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input (2018). David Harwath, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba, James Glass