Boosting Monocular 3D Object Detection With Object-Centric Auxiliary Depth Supervision

Type: Article

Publication Date: 2022-01-01

Citations: 9

DOI: https://doi.org/10.1109/tits.2022.3224082

Abstract

Recent advances in monocular 3D detection leverage a depth estimation network explicitly as an intermediate stage of the 3D detection network. Depth map approaches estimate object depth more accurately than other methods because the depth estimation network is trained on a large-scale dataset. However, depth map approaches can be limited by the accuracy of the depth map, and sequentially using two separate networks for depth estimation and 3D detection significantly increases computational cost and inference time. In this work, we propose a method to boost the RGB image-based 3D detector by jointly training the detection network with a depth prediction loss analogous to the depth estimation task. In this way, our 3D detection network receives additional depth supervision from raw LiDAR points, which requires no human annotation, and learns to estimate accurate depth without explicitly predicting a depth map. Our novel object-centric depth prediction loss focuses on depth around foreground objects, which is important for 3D object detection, to leverage pixel-wise depth supervision in an object-centric manner. Our depth regression model is further trained to predict the uncertainty of depth to represent the 3D confidence of objects. To effectively train the 3D detector with raw LiDAR points and to enable end-to-end training, we revisit the regression target of 3D objects and design a network architecture. Extensive experiments on KITTI and nuScenes benchmarks show that our method can significantly boost the monocular image-based 3D detector to outperform depth map approaches while maintaining real-time inference speed.
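
The abstract does not give the loss formula, so the following is only a minimal sketch of what an object-centric, uncertainty-aware depth loss could look like. It assumes a Laplacian uncertainty-weighted L1 loss (a common choice in the depth regression literature, not confirmed as the paper's formulation), sparse LiDAR depth projected into the image with 0.0 marking pixels without a return, and foreground regions given as 2D boxes. The function name `object_centric_depth_loss` and all parameter names are hypothetical.

```python
import math


def object_centric_depth_loss(pred_depth, pred_log_sigma, lidar_depth, boxes):
    """Mean uncertainty-weighted L1 depth loss over LiDAR pixels inside 2D boxes.

    Assumptions (not taken from the paper):
      - pred_depth, pred_log_sigma, lidar_depth are H x W nested lists.
      - lidar_depth holds 0.0 where no LiDAR point projects onto the pixel.
      - boxes is a list of (x1, y1, x2, y2) pixel boxes; x2 and y2 are exclusive.
    """
    total, count = 0.0, 0
    height = len(lidar_depth)
    width = len(lidar_depth[0]) if height else 0
    for v in range(height):
        for u in range(width):
            target = lidar_depth[v][u]
            if target <= 0.0:
                continue  # no LiDAR supervision at this pixel
            in_object = any(
                x1 <= u < x2 and y1 <= v < y2 for x1, y1, x2, y2 in boxes
            )
            if not in_object:
                continue  # background pixel: ignored by the object-centric loss
            log_sigma = pred_log_sigma[v][u]
            abs_err = abs(pred_depth[v][u] - target)
            # Laplacian negative log-likelihood up to a constant:
            # sqrt(2) / sigma * |error| + log(sigma)
            total += math.sqrt(2.0) * math.exp(-log_sigma) * abs_err + log_sigma
            count += 1
    return total / count if count else 0.0


# Toy 2 x 3 example: only column 1 lies inside the single box.
lidar = [[0.0, 10.0, 0.0], [0.0, 12.0, 30.0]]
pred = [[5.0, 11.0, 5.0], [5.0, 12.0, 20.0]]
log_sigma = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
boxes = [(1, 0, 2, 2)]
loss = object_centric_depth_loss(pred, log_sigma, lidar, boxes)
```

In this sketch the pixel at row 1, column 2 has a LiDAR return but falls outside the box, so its large error does not contribute. A real detector would compute this on tensors and add it to the detection losses; the predicted `log_sigma` could then serve as a depth confidence signal.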

Locations

  • IEEE Transactions on Intelligent Transportation Systems
  • arXiv (Cornell University)

Similar Works

  • Boosting Monocular 3D Object Detection with Object-Centric Auxiliary Depth Supervision (2022) - Young-Seok Kim, Sanmin Kim, Sangmin Sim, Jun Won Choi, Dongsuk Kum
  • Is Pseudo-Lidar needed for Monocular 3D Object detection? (2021) - Dennis Park, Rareș Ambruș, Vitor Guizilini, Jie Li, Adrien Gaidon
  • AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features (2025) - Ruochen Zhang, Hyeung‐Sik Choi, Dong-Wook Jung, Phan Huy Nam Anh, Sang-Ki Jeong, Zihao Zhu
  • Depth Is All You Need for Monocular 3D Detection (2023) - Dennis Park, Jie Li, Dian Chen, Vitor Guizilini, Adrien Gaidon
  • Depth Is All You Need for Monocular 3D Detection (2022) - Dennis Park, Jie Li, Dian Chen, Vitor Guizilini, Adrien Gaidon
  • Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth (2021) - Chenhang He, Jianqiang Huang, Xian-Sheng Hua, Lei Zhang
  • MonoCD: Monocular 3D Object Detection with Complementary Depths (2024) - Longfei Yan, Yan Pei, Shengzhou Xiong, Xuanyu Xiang, Yihua Tan
  • MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer (2022) - Kuan‐Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu
  • Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection (2021) - Yinmin Zhang, Xinzhu Ma, Shuai Yi, Jun Hou, Zhihui Wang, Wanli Ouyang, Dan Xu
  • MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection (2022) - Qing Lian, Peiliang Li, Xiaozhi Chen
  • OBMO: One Bounding Box Multiple Objects for Monocular 3D Object Detection (2023) - C.C. Huang, Tong He, Haidong Ren, Wenxiao Wang, Binbin Lin, Deng Cai
  • OBMO: One Bounding Box Multiple Objects for Monocular 3D Object Detection (2022) - C.C. Huang, Tong He, Haidong Ren, Wenxiao Wang, Binbin Lin, Deng Cai
  • Learning Depth-Guided Convolutions for Monocular 3D Object Detection (2019) - Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo
  • OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection (2024) - Jinghua Hou, Tong Wang, Xiaoqing Ye, Zhe Liu, Shi Gong, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai
  • Center3D: Center-based Monocular 3D Object Detection with Joint Depth Understanding (2020) - Yunlei Tang, Sebastian Dörn, Chiragkumar Savani
  • Categorical Depth Distribution Network for Monocular 3D Object Detection (2021) - Cody Reading, Ali Harakeh, Julia Chae, Steven L. Waslander

Works Cited by This (43)

  • Unsupervised Monocular Depth Estimation with Left-Right Consistency (2017) - Clément Godard, Oisin Mac Aodha, Gabriel Brostow
  • 3D Bounding Box Estimation Using Deep Learning and Geometry (2017) - Arsalan Mousavian, Dragomir Anguelov, John P. Flynn, Jana Košecká
  • Deformable Convolutional Networks (2017) - Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei
  • Deep MANTA: A Coarse-to-Fine Many-Task Network for Joint 2D and 3D Vehicle Analysis from Monocular Image (2017) - Florian Chabot, Mohamed Chaouch, Jaonary Rabarisoa, Céline Teulière, Thierry Château
  • PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud (2019) - Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
  • Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction (2019) - Jason S. Ku, Alex D. Pon, Steven L. Waslander
  • Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving (2019) - Yan Wang, Wei‐Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger
  • From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation (2019) - Jin Han Lee, Myung Kyu Han, Dong Wook Ko, Il Hong Suh
  • You Only Look Once: Unified, Real-Time Object Detection (2016) - Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
  • Focal Loss for Dense Object Detection (2017) - Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár