Very Deep Convolutional Networks for Large-Scale Image Recognition | 2014 | Karen Simonyan, Andrew Zisserman | 1
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | 2015 | Sergey Ioffe, Christian Szegedy | 1
Going deeper with convolutions | 2015 | Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich | 1
Deep Residual Learning for Image Recognition | 2016 | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | 1
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild | 2012 | Khurram Soomro, Amir Zamir, Mubarak Shah | 1
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition | 2016 | Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool | 1
Aggregated Residual Transformations for Deep Neural Networks | 2017 | Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He | 1
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | 2017 | Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He | 1
The “Something Something” Video Database for Learning and Evaluating Visual Common Sense | 2017 | Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, Joanna Materzyńska, Susanne Westphal, Heuna Kim, Valentin Haenel, Ingo Fruend, P.N. Yianilos, Moritz Mueller-Freitag | 1
mixup: Beyond Empirical Risk Minimization | 2017 | Hongyi Zhang, Moustapha Cissé, Yann Dauphin, David López-Paz | 1
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design | 2018 | Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun | 1
Unified Perceptual Parsing for Scene Understanding | 2018 | Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun | 1
A Short Note about Kinetics-600 | 2018 | João Carreira, Eric Noland, Andras Banki-Horvath, Chloe Hillier, Andrew Zisserman | 1
Panoptic Feature Pyramid Networks | 2019 | Alexander Kirillov, Ross Girshick, Kaiming He, Piotr Dollár | 1
Learning Spatio-Temporal Representation With Local and Global Diffusion | 2019 | Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Xinmei Tian, Tao Mei | 1
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | 2019 | Mingxing Tan, Quoc V. Le | 1
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era | 2017 | Chen Sun, Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta | 1
Non-local Neural Networks | 2018 | Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He | 1
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | 2018 | Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun | 1
A Closer Look at Spatiotemporal Convolutions for Action Recognition | 2018 | Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri | 1
MobileNetV2: Inverted Residuals and Linear Bottlenecks | 2018 | Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen | 1
SGDR: Stochastic Gradient Descent with Warm Restarts | 2016 | Ilya Loshchilov, Frank Hutter | 1
Simple Baselines for Human Pose Estimation and Tracking | 2018 | Bin Xiao, Haiping Wu, Yichen Wei | 1
Densely Connected Convolutional Networks | 2017 | Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger | 1
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset | 2017 | João Carreira, Andrew Zisserman | 1
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks | 2017 | Zhaofan Qiu, Ting Yao, Tao Mei | 1
STM: SpatioTemporal and Motion Encoding for Action Recognition | 2019 | Boyuan Jiang, Mengmeng Wang, Weihao Gan, Wei Wu, Junjie Yan | 1
Video Classification With Channel-Separated Convolutional Networks | 2019 | Du Tran, Heng Wang, Matt Feiszli, Lorenzo Torresani | 1
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition | 2019 | Chenxu Luo, Alan Yuille | 1
Cascade R-CNN: High Quality Object Detection and Instance Segmentation | 2019 | Zhaowei Cai, Nuno Vasconcelos | 1
TSM: Temporal Shift Module for Efficient Video Understanding | 2019 | Ji Lin, Chuang Gan, Song Han | 1
SlowFast Networks for Video Recognition | 2019 | Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He | 1
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features | 2019 | Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Seong Joon Oh, Youngjoon Yoo, Junsuk Choe | 1
TEINet: Towards an Efficient Architecture for Video Recognition | 2020 | Zhaoyang Liu, Donghao Luo, Yabiao Wang, Limin Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Tong Lü | 1
Deep High-Resolution Representation Learning for Visual Recognition | 2020 | Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang | 1
ResNeSt: Split-Attention Networks | 2022 | Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Haibin Lin, Zhi Zhang, Yue Sun, Tong He, Jonas Mueller, R. Manmatha | 1
Designing Network Design Spaces | 2020 | Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár | 1
X3D: Expanding Architectures for Efficient Video Recognition | 2020 | Christoph Feichtenhofer | 1
SmallBigNet: Integrating Core and Contextual Views for Video Classification | 2020 | Xianhang Li, Yali Wang, Zhipeng Zhou, Yu Qiao | 1
TEA: Temporal Excitation and Aggregation for Action Recognition | 2020 | Yan Li, Bin Ji, Xintian Shi, Jianguo Zhang, Bin Kang, Limin Wang | 1
Video Modeling With Correlation Networks | 2020 | Heng Wang, Du Tran, Lorenzo Torresani, Matt Feiszli | 1
Multi-modal Transformer for Video Retrieval | 2020 | Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid | 1
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | 2020 | Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly | 1
End-to-End Object Detection with Transformers | 2020 | Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko | 1
MotionSqueeze: Neural Motion Feature Learning for Video Understanding | 2020 | Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho | 1
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization | 2019 | Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra | 1
ConvTransformer: A Convolutional Transformer Network for Video Frame Synthesis | 2020 | Zhouyong Liu, Shun Luo, Wubin Li, Jingben Lu, Yufan Wu, Chunguo Li, Yang Luxi | 1
TransPose: Towards Explainable Human Pose Estimation by Transformer | 2020 | Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang | 1
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet | 2021 | Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan | 1
Learning Spatiotemporal Features with 3D Convolutional Networks | 2015 | Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri | 1