Pairwise Decomposition of Image Sequences for Active Multi-view Recognition

Type: Article

Publication Date: 2016-06-01

Citations: 207

DOI: https://doi.org/10.1109/cvpr.2016.414

Abstract

A multi-view image sequence provides a much richer capacity for object recognition than a single image. However, most existing solutions to multi-view recognition adopt hand-crafted, model-based geometric methods, which do not readily embrace recent trends in deep learning. We propose to bring Convolutional Neural Networks to generic multi-view recognition, by decomposing an image sequence into a set of image pairs, classifying each pair independently, and then learning an object classifier by weighting the contribution of each pair. This allows for recognition over arbitrary camera trajectories, without requiring explicit training over the potentially infinite number of camera paths and lengths. Building these pairwise relationships then naturally extends to the next-best-view problem in an active recognition framework. To achieve this, we train a second Convolutional Neural Network to map directly from an observed image to the next viewpoint. Finally, we incorporate this into a trajectory optimisation task, whereby the best recognition confidence is sought for a given trajectory length. We present state-of-the-art results in both guided and unguided multi-view recognition on the ModelNet dataset, and show how our method can be used with depth images, greyscale images, or both.
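The pairwise decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `pair_classifier` and `pair_weight` callables, and the normalised weighted average used for aggregation are all assumptions made for illustration. In the paper both the per-pair classifier and the weighting are learned; here they are stand-in callables supplied by the caller.

```python
from itertools import combinations

import numpy as np


def decompose_into_pairs(sequence_length):
    """Return every unordered index pair (i, j) with i < j for a view sequence."""
    return list(combinations(range(sequence_length), 2))


def aggregate_pair_scores(pair_probs, pair_weights):
    """Combine per-pair class probabilities with a normalised weighted average.

    pair_probs:   array-like of shape (num_pairs, num_classes)
    pair_weights: array-like of shape (num_pairs,), non-negative
    """
    pair_probs = np.asarray(pair_probs, dtype=float)
    pair_weights = np.asarray(pair_weights, dtype=float)
    total = pair_weights.sum()
    if total <= 0:
        raise ValueError("pair weights must sum to a positive value")
    # (num_pairs,) @ (num_pairs, num_classes) -> (num_classes,)
    return (pair_weights / total) @ pair_probs


def classify_sequence(views, pair_classifier, pair_weight):
    """Classify a sequence of views from its image pairs.

    pair_classifier(a, b) -> class probabilities for one pair (hypothetical stand-in)
    pair_weight(a, b)     -> scalar contribution of that pair (hypothetical stand-in)
    """
    pairs = decompose_into_pairs(len(views))
    probs = [pair_classifier(views[i], views[j]) for i, j in pairs]
    weights = [pair_weight(views[i], views[j]) for i, j in pairs]
    scores = aggregate_pair_scores(probs, weights)
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    # Toy run: "views" are plain integers and both callables are dummies.
    label, scores = classify_sequence(
        views=[0, 1, 2, 3],
        pair_classifier=lambda a, b: [0.3, 0.7],
        pair_weight=lambda a, b: 1.0 + abs(a - b),
    )
    print(label, scores)
```

Because the sequence is reduced to pairs, the same two callables apply to a trajectory of any length: a sequence of n views yields n(n-1)/2 pairs, so no model needs to be trained per trajectory length.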

Locations

  • arXiv (Cornell University)

Similar Works

Action Title Year Authors
+ Pairwise Decomposition of Image Sequences for Active Multi-View Recognition 2016 Edward Johns
Stefan Leutenegger
Andrew J. Davison
+ Geometry-Aware Recurrent Neural Networks for Active Visual Recognition 2018 Ricson Cheng
Ziyan Wang
Katerina Fragkiadaki
+ Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space 2022 Jinghuan Shang
Srijan Das
Michael S. Ryoo
+ DVANet: Disentangling View and Action Features for Multi-View Action Recognition 2024 Nyle Siddiqui
Praveen Tirupattur
Mubarak Shah
+ Learning Canonical View Representation for 3D Shape Recognition with Arbitrary Views 2021 Xin Wei
Yifei Gong
Fudong Wang
Xing Sun
Jian Sun
+ Deep Models for Multi-View 3D Object Recognition: A Review 2024 Mona Alzahrani
Muhammad Usman
Salma Kammoun
Saeed Anwar
Tarek Helmy
+ Few-Shot Viewpoint Estimation 2019 Hung-Yu Tseng
Shalini De Mello
Jonathan Tremblay
Sifei Liu
Stan Birchfield
Ming-Hsuan Yang
Jan Kautz
+ BEVFormer: Learning Bird’s-Eye-View Representation from Multi-camera Images via Spatiotemporal Transformers 2022 Zhiqi Li
Wenhai Wang
Hongyang Li
Enze Xie
Chonghao Sima
Tong Lü
Yu Qiao
Jifeng Dai
+ DVANet: Disentangling View and Action Features for Multi-View Action Recognition 2023 Nyle Siddiqui
Praveen Tirupattur
Mubarak Shah
+ Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views 2021 Nanbo Li
Cian Eastwood
Robert B. Fisher
+ Recurrent 3D Attentional Networks for End-to-End Active Object Recognition 2016 Min Liu
Yifei Shi
Lintao Zheng
Kai Xu
Hui Huang
Dinesh Manocha
+ Recurrent 3D attentional networks for end-to-end active object recognition 2019 Min Liu
Yifei Shi
Lintao Zheng
Kai Xu
Hui Huang
Dinesh Manocha
+ Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective 2023 Thanh-Dat Truong
Khoa Luu
+ Learning multiplane images from single views with self-supervision 2021 Gustavo Sutter P. Carvalho
Diogo Luvizon
Antonio Joia Neto
André G. C. Pacheco
Otávio A. B. Penatti
+ Cross-View Action Modeling, Learning, and Recognition 2014 Jiang Wang
Xiaohan Nie
Xia Yin
Ying Wu
Song‐Chun Zhu

Works That Cite This (71)

Action Title Year Authors
+ Act Like a Radiologist: Towards Reliable Multi-View Correspondence Reasoning for Mammogram Mass Detection 2021 Yuhang Liu
Fandong Zhang
Chaoqi Chen
Siwen Wang
Yizhou Wang
Yizhou Yu
+ Robust Pooling Through the Data Mode 2022 Ayman Mukhaimar
Ruwan Tennakoon
Reza Hoseinnezhad
Chow Yin Lai
Alireza Bab‐Hadiashar
+ Shifting Perspective to See Difference: A Novel Multi-View Method for Skeleton based Action Recognition 2022 Ruijie Hou
Yanran Li
Ningyu Zhang
Yulin Zhou
Xiaosong Yang
Zhao Wang
+ Deep Learning Advances on Different 3D Data Representations: A Survey 2018 Eman Ahmed
Alexandre Saint
Abd El Rahman Shabayek
Kseniya Cherenkova
Rig Das
Gleb Gusev
Djamila Aouada
Björn Ottersten
+ Large-Scale Shape Retrieval with Sparse 3D Convolutional Neural Networks 2017 Alexandr Notchenko
Yermek Kapushev
Evgeny Burnaev
+ A survey on Deep Learning Advances on Different 3D Data Representations 2018 Eman Ahmed
Alexandre Saint
Abd El Rahman Shabayek
Kseniya Cherenkova
Rig Das
Gleb Gusev
Djamila Aouada
Björn Ottersten
+ SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection 2018 Mohsen Yavartanoo
Eu Young Kim
Kyoung Mu Lee
+ A survey of Object Classification and Detection based on 2D/3D data 2019 Xiaoke Shen
+ Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes 2020 Sameera Ramasinghe
Salman Khan
Nick Barnes
Stephen Jay Gould
+ Learning Discriminative 3D Shape Representations by View Discerning Networks 2018 Biao Leng
Cheng Zhang
Xiaocheng Zhou
Cheng Xu
Kai Xu