BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

Chenyu Yang, Yuntao Chen, Hao Tian, Chenxin Tao, Xizhou Zhu, Zhaoxiang Zhang, Gao Huang, Hongyang Li, Yu Qiao, Lewei Lu

Type: Article

Publication Date: 2023-06-01

Citations: 120

DOI: https://doi.org/10.1109/cvpr52729.2023.01710

Abstract

We present a novel bird's-eye-view (BEV) detector with perspective supervision, which converges faster and better suits modern image backbones. Existing state-of-the-art BEV detectors are often tied to certain depth pretrained backbones like VoVNet, hindering the synergy between booming image backbones and BEV detectors. To address this limitation, we prioritize easing the optimization of BEV detectors by introducing perspective view supervision. To this end, we propose a two-stage BEV detector, where proposals from the perspective head are fed into the bird's-eye-view head for final predictions. To evaluate the effectiveness of our model, we conduct extensive ablation studies focusing on the form of supervision and the generality of the proposed detector. The proposed method is verified with a wide spectrum of traditional and modern image backbones and achieves new SoTA results on the large-scale nuScenes dataset. The code shall be released soon.

Locations

arXiv (Cornell University)
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Similar Works

Title	Year	Authors
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision	2022	Chenyu Yang, Yuntao Chen, Hao Tian, Chenxin Tao, Xizhou Zhu, Zhaoxiang Zhang, Gao Huang, Hongyang Li, Yu Qiao, Lewei Lu
VoxelFormer: Bird's-Eye-View Feature Generation based on Dual-view Attention for Multi-view 3D Object Detection	2023	Zhuoling Li, Chuanrui Zhang, Wei-Chiu Ma, Zhou Yipin, Linyan Huang, Haoqian Wang, Ser-Nam Lim, Hengshuang Zhao
BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection	2023	Yinhao Li, Zheng Ge, Guanyi Yu, Jinrong Yang, Zengran Wang, Yukang Shi, Jianjian Sun, Zeming Li
BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection	2022	Yinhao Li, Zheng Ge, Guanyi Yu, Jinrong Yang, Zengran Wang, Yukang Shi, Jianjian Sun, Zeming Li
SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos	2023	Haisong Liu, Yao Teng, Tao Lu, Haiguang Wang, Limin Wang
DualBEV: CNN is All You Need in View Transformation	2024	Peidong Li, Wancheng Shen, Qihao Huang, Dixiao Cui
MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones	2022	Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction	2023	Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu
M$^2$BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation	2022	Enze Xie, Zhiding Yu, Daquan Zhou, Jonah Philion, Anima Anandkumar, Sanja Fidler, Ping Luo, José Manuel González y Fernández Valles
Bird’s-Eye-View Panoptic Segmentation Using Monocular Frontal View Images	2022	Nikhil Gosala, Abhinav Valada
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection	2022	Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
Towards Generalizable Multi-Camera 3D Object Detection via Perspective Debiasing	2023	Hao Lu, Yunpeng Zhang, Qing Lian, Dalong Du, Yingcong Chen
FB-BEV: BEV Representation from Forward-Backward View Transformations	2023	Zhiqi Li, Zhiding Yu, Wenhai Wang, Anima Anandkumar, Tong Lü, Jose M. Álvarez
Focus on BEV: Self-calibrated Cycle View Transformation for Monocular Birds-Eye-View Segmentation	2024	Jiawei Zhao, Qixing Jiang, Xiang Li, Junfeng Luo
SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects	2024	Abhinav Kumar, Yuliang Guo, Xinyu Huang, Liu Ren, Xiaoming Liu
Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View	2023	Shuo Wang, Xinhai Zhao, Haiming Xu, Zehui Chen, Dameng Yu, Jiahao Chang, Zhen Yang, Feng Zhao
Introducing Depth into Transformer-based 3D Object Detection	2023	Hao Zhang, Hongyang Li, Xingyu Liao, Feng Li, Shilong Liu, Lionel M. Ni, Lei Zhang
QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection	2023	Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lü, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang

Works That Cite This (29)

Title	Year	Authors
Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection	2023	Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang
Exploring Recurrent Long-Term Temporal Fusion for Multi-View 3D Perception	2024	Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, Hongyu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang
SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos	2023	Haisong Liu, Yao Teng, Tao Lu, Haiguang Wang, Limin Wang
Far3D: Expanding the Horizon for Surround-View 3D Object Detection	2024	Xiaohui Jiang, Shuailin Li, Yingfei Liu, Shihao Wang, Fan Jia, Tiancai Wang, Lijin Han, Xiangyu Zhang
FB-BEV: BEV Representation from Forward-Backward View Transformations	2023	Zhiqi Li, Zhiding Yu, Wenhai Wang, Anima Anandkumar, Tong Lü, Jose M. Álvarez
Fully Sparse Fusion for 3D Object Detection	2023	Yingyan Li, Lue Fan, Yang Liu, Zehao Huang, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, Ying Tan
Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection	2024	Jiahui Fu, Chen Gao, Zitian Wang, Lirong Yang, Xiaofei Wang, Beipeng Mu, Sifeng Liu
Calibration-Free BEV Representation for Infrastructure Perception	2023	Siqi Fan, Zhe Wang, Xiaoliang Huo, Yan Wang, Jingjing Liu
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection	2024	Zhongyu Xia, Jishuo Li, Zhiwei Lin, Xinhao Wang, Yongtao Wang, Ming-Hsuan Yang
Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking with Transformer	2024	Zhipeng Luo, Changqing Zhou, Liang Pan, Gongjie Zhang, Tianrui Liu, Yueru Luo, Haiyu Zhao, Ziwei Liu, Shijian Lu

Works Cited by This (36)

Title	Year	Authors
Deep Residual Learning for Image Recognition	2016	Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Multi-view 3D Object Detection Network for Autonomous Driving	2017	Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia
Decoupled Weight Decay Regularization	2017	Ilya Loshchilov, Frank Hutter
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving	2019	Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger
Focal Loss for Dense Object Detection	2017	Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
Joint 3D Proposal Generation and Object Detection from View Aggregation	2018	Jason S. Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, Steven L. Waslander
Frustum PointNets for 3D Object Detection from RGB-D Data	2018	Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, Leonidas Guibas
Disentangling Monocular 3D Object Detection	2019	Andrea Simonelli, Samuel Rota Bulò, Lorenzo Porzi, Manuel López-Antequera, Peter Kontschieder
FCOS: Fully Convolutional One-Stage Object Detection	2019	Zhi Tian, Chunhua Shen, Hao Chen, Tong He
CenterMask: Real-Time Anchor-Free Instance Segmentation	2020	Youngwan Lee, Jongyoul Park