Shallow Feature Matters for Weakly Supervised Object Localization

Type: Article

Publication Date: 2021-06-01

Citations: 69

DOI: https://doi.org/10.1109/cvpr46437.2021.00593

Abstract

Weakly supervised object localization (WSOL) aims to localize objects using only image-level labels. Class activation maps (CAMs) are the features most commonly used for WSOL. However, previous CAM-based methods did not take full advantage of shallow features, despite their importance for WSOL, because shallow features are easily buried in background noise under conventional fusion. In this paper, we propose a simple but effective Shallow feature-aware Pseudo supervised Object Localization (SPOL) model for accurate WSOL, which makes the most of the low-level features embedded in shallow layers. In practice, our SPOL model first generates the CAMs through a novel element-wise multiplication of shallow and deep feature maps, which filters out background noise and robustly produces sharper boundaries. We further propose a general class-agnostic segmentation model that produces an accurate object mask using only the initial CAMs as pseudo labels, without any extra annotation. Finally, a bounding box extractor is applied to the object mask to locate the target. Experiments verify that SPOL outperforms the state of the art on both the CUB-200 and ImageNet-1K benchmarks, achieving 93.44% and 67.15% Top-5 localization accuracy (i.e., improvements of 3.93% and 2.13%), respectively.
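
The fusion and box-extraction steps named in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' SPOL implementation: the real model works on CNN feature maps with learned layers, while the sketch below uses hand-made 2-D arrays. The helper names (`resize_nearest`, `normalize`, `multiplicative_fusion`, `mask_to_bbox`), the nearest-neighbour upsampling, and the 0.5 threshold are all assumptions made for illustration.

```python
import numpy as np

def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour upsampling of a 2-D map to (out_h, out_w)."""
    h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[rows][:, cols]

def normalize(fmap):
    """Min-max scale a map to [0, 1]; a constant map becomes all zeros."""
    lo, hi = fmap.min(), fmap.max()
    if hi == lo:
        return np.zeros_like(fmap, dtype=float)
    return (fmap - lo) / (hi - lo)

def multiplicative_fusion(shallow, deep):
    """Fuse a high-res shallow map and a low-res deep map by element-wise product."""
    deep_up = resize_nearest(deep, *shallow.shape)
    return normalize(shallow) * normalize(deep_up)

def mask_to_bbox(mask):
    """Tightest box (x_min, y_min, x_max, y_max) around True pixels, or None."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy inputs (hypothetical): an 8x8 shallow map with a sharp object response
# plus a background-noise blob, and a 4x4 deep map with a coarse object response.
shallow = np.zeros((8, 8))
shallow[2:6, 3:7] = 1.0   # object, sharp boundary
shallow[0:2, 0:2] = 0.8   # background noise in the shallow layer
deep = np.zeros((4, 4))
deep[1:3, 1:4] = 1.0      # coarse, semantically correct region

fused = multiplicative_fusion(shallow, deep)
bbox = mask_to_bbox(fused >= 0.5)  # 0.5 is an assumed threshold
```

In this toy case the product is non-zero only where both maps respond, so the shallow-layer noise blob is suppressed (the deep map is zero there) and the coarse deep response is trimmed to the sharper shallow boundary before the box is extracted.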

Locations

  • arXiv (Cornell University)
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Works That Cite This (22)

  • TCAM: Temporal Class Activation Maps for Object Localization in Weakly-Labeled Unconstrained Videos (2023). Soufiane Belharbi, Ismail Ben Ayed, Luke McCaffrey, Éric Granger
  • Scaling Novel Object Detection with Weakly Supervised Detection Transformers (2023). Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi
  • FDCNet: Feature Drift Compensation Network for Class-Incremental Weakly Supervised Object Localization (2023). Sejin Park, Tae-Hyung Lee, Yeejin Lee, Byeongkeun Kang
  • LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization (2022). Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei Zhang, Liujuan Cao
  • CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection (2023). Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi
  • Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets (2023). Xiangyu Chen, Qinghao Hu, Kaidong Li, Cuncong Zhong, Guanghui Wang
  • Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning (2023). Hanjae Kim, Jiyoung Lee, Seongheon Park, Kwanghoon Sohn
  • Fine-Grained Self-Supervised Learning with Jigsaw puzzles for medical image classification (2024). Wongi Park, Jongbin Ryu
  • Multiscale Feature Learning Using Co-Tuplet Loss for Offline Handwritten Signature Verification (2023). Fu-Hsien Huang, Hsin-Min Lu
  • Guided Interpretable Facial Expression Recognition via Spatial Action Unit Cues (2024). Soufiane Belharbi, Marco Pedersoli, Alessandro L. Koerich, Simon Bacon, Éric Granger