Audio-Visual Event Localization in Unconstrained Videos

Type: Book-Chapter

Publication Date: 2018-01-01

Citations: 376

DOI: https://doi.org/10.1007/978-3-030-01216-8_16

Locations

  • Lecture Notes in Computer Science
  • arXiv (Cornell University)

Similar Works

  • Audio-Visual Event Localization in Unconstrained Videos (2018). Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu
  • Towards Long Form Audio-visual Video Understanding (2023). Wenxuan Hou, Guangyao Li, Yapeng Tian, Di Hu
  • Towards Open-Vocabulary Audio-Visual Event Localization (2024). Jinxing Zhou, Dan Guo, Ruohao Guo, Yuxin Mao, Jingjing Hu, Yiran Zhong, Xiaojun Chang, Meng Wang
  • Towards Long Form Audio-visual Video Understanding (2024). Wenxuan Hou, Guangyao Li, Yapeng Tian, Di Hu
  • Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (2023). Tiantian Geng, Teng Wang, Jinming Duan, Runmin Cong, Feng Zheng
  • AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization (2022). Tanvir Mahmud, Diana Marculescu
  • AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization (2023). Tanvir Mahmud, Diana Marculescu
  • AVECL-UMONS database for audio-visual event classification and localization (2020). Mathilde Brousmiche, Stéphane Dupont, Jean Rouat
  • CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization (2024). X. He, Xiangxi Liu, Yang Li, Dongcheng Zhao, Guobin Shen, Qingqun Kong, Xin Yang, Yi Zeng
  • Multi-Modulation Network for Audio-Visual Event Localization (2021). Hao Wang, Zheng-Jun Zha, Liang Li, Xuejin Chen, Jiebo Luo
  • Temporal Label-Refinement for Weakly-Supervised Audio-Visual Event Localization (2023). Kalyan Ramakrishnan
  • MPN: Multimodal Parallel Network for Audio-Visual Event Localization (2021). Jiashuo Yu, Ying Cheng, Rui Feng
  • What Makes Audio Event Detection Harder than Classification? (2018). Fabrice Katzberg, Philipp Koch, Marco Maaß, Radoslaw Mazur, Ian McLoughlin, Alfred Mertins, Huy P. Phan
  • Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration (2024). Ziheng Zhou, Jinxing Zhou, Wei Qian, Shengeng Tang, Xiaojun Chang, Dan Guo
  • Eventness: Object Detection on Spectrograms for Temporal Localization of Audio Events (2017). Phuong Pham, Juncheng Li, Joseph Szurley, Samarjit Das
  • Eventness: Object Detection on Spectrograms for Temporal Localization of Audio Events (2018). Phuong Pham, Juncheng Li, Joseph Szurley, Samarjit Das
  • Dual-modality Seq2Seq Network for Audio-visual Event Localization (2019). Yan-Bo Lin, Yu-Jhe Li, Yu-Chiang Frank Wang

Works Cited by This (31)

  • Deep multimodal learning for Audio-Visual Speech Recognition (2015). Youssef Mroueh, Etienne Marcheret, Vaibhava Goel
  • Learning Spatiotemporal Features with 3D Convolutional Networks (2015). Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri
  • ImageNet Large Scale Visual Recognition Challenge (2015). Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein
  • Recurrent neural networks for polyphonic sound event detection in real life recordings (2016). Giambattista Parascandolo, Heikki Huttunen, Tuomas Virtanen
  • Multimodal Residual Learning for Visual QA (2016). Jin-Hwa Kim, Sangwoo Lee, Donghyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, Byoung‐Tak Zhang
  • Anticipating Visual Representations from Unlabeled Video (2016). Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
  • Ambient Sound Provides Supervision for Visual Learning (2016). Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, Antonio Torralba
  • CNN architectures for large-scale audio classification (2017). Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, Robert C. Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold
  • SoundNet: Learning Sound Representations from Unlabeled Video (2016). Yusuf Aytar, Carl Vondrick, Antonio Torralba
  • Temporal Convolutional Networks for Action Segmentation and Detection (2017). Colin Lea, M. D. Flynn, René Vidal, Austin Reiter, Gregory D. Hager