T-CNN: Tubelets With Convolutional Neural Networks for Object Detection From Videos

Type: Article

Publication Date: 2017-08-07

Citations: 514

DOI: https://doi.org/10.1109/tcsvt.2017.2736553

Abstract

The state-of-the-art performance for object detection has been significantly improved over the past two years. Besides the introduction of powerful deep neural networks such as GoogLeNet and VGG, novel object detection frameworks such as R-CNN and its successors, Fast R-CNN and Faster R-CNN, play an essential role in improving the state-of-the-art. Despite their effectiveness on still images, those frameworks are not specifically designed for object detection from videos. The temporal and contextual information in videos is not fully investigated or utilized. In this work, we propose a deep learning framework that incorporates temporal and contextual information from tubelets obtained in videos, which dramatically improves the baseline performance of existing still-image detection frameworks when they are applied to videos. It is called T-CNN, i.e., tubelets with convolutional neural networks. The proposed framework won the recently introduced object-detection-from-video (VID) task with provided data in the ImageNet Large-Scale Visual Recognition Challenge 2015 (ILSVRC2015).

Locations

  • IEEE Transactions on Circuits and Systems for Video Technology
  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Object Detection from Video Tubelets with Convolutional Neural Networks (2016). Kai Kang, Wanli Ouyang, Hongsheng Li, Xiaogang Wang
  • Object Detection in Videos with Tubelet Proposal Networks (2017). Kai Kang, Hongsheng Li, Tong Xiao, Wanli Ouyang, Junjie Yan, Xihui Liu, Xiaogang Wang
  • Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos (2017). Rui Hou, Chen Chen, Mubarak Shah
  • Spatio-temporal Tubelet Feature Aggregation and Object Linking in Videos (2020). Daniel Cores, V.M. Brea, Manuel Mucientes
  • Object Detection in Videos by Short and Long Range Object Linking (2018). Peng Tang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wenjun Zeng, Jingdong Wang
  • Deformable Tube Network for Action Detection in Videos (2019). Wei Li, Zehuan Yuan, Dashan Guo, Lei Huang, Xiangzhong Fang, Changhu Wang
  • Tube-CNN: Modeling temporal evolution of appearance for object detection in video (2018). Tuan-Hung Vu, Anton Osokin, Ivan Laptev
  • Few-Shot Video Object Detection (2021). Qi Fan, Chi-Keung Tang, Yu-Wing Tai
  • Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection (2022). Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal
  • TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model (2020). Bo Pang, Yizhuo Li, Yifan Zhang, Muchen Li, Cewu Lu
  • Towards Real-Time Accurate Object Detection in Both Images and Videos Based on Dual Refinement (2018). Xingyu Chen, Junzhi Yu, Shihan Kong, Zhengxing Wu, Li Wen
  • YOLOV: Making Still Image Object Detectors Great at Video Object Detection (2022). Yuheng Shi, Naiyan Wang, Xiaojie Guo
  • XS-VID: An Extremely Small Video Object Detection Dataset (2024). Jiahao Guo, Ziyang Xu, Lianjun Wu, Fei Gao, Wenyu Liu, Xinggang Wang
  • SGE NET: Video Object Detection with Squeezed GRU and Information Entropy Map (2021). Rui Su, Wenjing Huang, Haoyu Ma, Xiaowei Song, Jinglu Hu
  • RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision (2024). Shuo Wang, Chunlong Xia, Feng Lv, Y. -Zhu Shi
  • Context Matters: Refining Object Detection in Video with Recurrent Neural Networks (2016). Subarna Tripathi, Zachary C. Lipton, Serge Belongie, Truong Q. Nguyen

Works That Cite This (162)

  • Have We Ever Encountered This Before? Retrieving Out-of-Distribution Road Obstacles from Driving Scenes (2024). Youssef Shoeb, Robin Chan, Gesina Schwalbe, Azarm Nowzad, Fatma Güney, Hanno Gottschalk
  • Attention Mechanisms for Object Recognition With Event-Based Cameras (2019). Marco Cannici, Marco Ciccone, Andrea Romanoni, Matteo Matteucci
  • TDIOT: Target-Driven Inference for Deep Video Object Tracking (2021). Filiz Gürkan, Llukman Çerkezi, Ozgun Cirakman, Bilge Günsel
  • A Comprehensive Survey of Scene Graphs: Generation and Application (2021). Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen, Alex Hauptmann
  • Every Pixel Matters: Center-Aware Feature Alignment for Domain Adaptive Object Detector (2020). Cheng-Chun Hsu, Yi-Hsuan Tsai, Yen-Yu Lin, Ming-Hsuan Yang
  • Computer Vision – ECCV 2016 Workshops (2016). Gang Hua, Hervé Jégou
  • ApproxDet: Content and Contention-Aware Approximate Object Detection for Mobiles (2020). Ran Xu, Chen-lin Zhang, Pengcheng Wang, Jayoung Lee, Subrata Mitra, Somali Chaterji, Yin Li, Saurabh Bagchi
  • A Survey of Deep Learning-Based Object Detection (2019). Licheng Jiao, Fan Zhang, Fang Liu, Shuyuan Yang, Lingling Li, Zhixi Feng, Rong Qu
  • Tracking Without Bells and Whistles (2019). Philipp Bergmann, Tim Meinhardt, Laura Leal-Taixé
  • Real-World Image Datasets for Federated Learning (2019). Jiahuan Luo, Xueyang Wu, Yun Luo, Anbu Huang, Yunfeng Huang, Yang Liu, Qiang Yang