+
|
Hierarchical Saliency Detection
|
2013
|
Qiong Yan
Li Xu
Jianping Shi
Jiaya Jia
|
1
|
+
|
Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts
|
2014
|
Xianjie Chen
Roozbeh Mottaghi
Xiaobai Liu
Sanja Fidler
Raquel Urtasun
Alan Yuille
|
1
|
+
|
Deep Residual Learning for Image Recognition
|
2016
|
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
|
1
|
+
|
The Kinetics Human Action Video Dataset
|
2017
|
Andrew Zisserman
João Carreira
Karen Simonyan
Will Kay
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
T.C. Green
Trevor Back
|
1
|
+
|
Capsule Network Performance on Complex Data
|
2017
|
Edgar Xi
Selina Bing
Yang Jin
|
1
|
+
|
Gaussian Error Linear Units (GELUs)
|
2016
|
Dan Hendrycks
Kevin Gimpel
|
1
|
+
|
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
|
2015
|
Shaoqing Ren
Kaiming He
Ross Girshick
Jian Sun
|
1
|
+
|
SCOPS: Self-Supervised Co-Part Segmentation
|
2019
|
Wei-Chih Hung
Varun Jampani
Sifei Liu
Pavlo Molchanov
Ming-Hsuan Yang
Jan Kautz
|
1
|
+
|
CapsuleGAN: Generative Adversarial Capsule Network
|
2019
|
Ayush Jaiswal
Wael AbdAlmageed
Yue Wu
Prem Natarajan
|
1
|
+
|
Dynamic-Structured Semantic Propagation Network
|
2018
|
Xiaodan Liang
Eric P. Xing
Hongfei Zhou
|
1
|
+
|
Dynamic Routing Between Capsules
|
2017
|
Sara Sabour
Nicholas Frosst
Geoffrey E. Hinton
|
1
|
+
|
SubSpace Capsule Network
|
2020
|
Marzieh Edraki
Nazanin Rahnavard
Mubarak Shah
|
1
|
+
|
Learning Compositional Neural Information Fusion for Human Parsing
|
2019
|
Wenguan Wang
Zhijie Zhang
Siyuan Qi
Jianbing Shen
Yanwei Pang
Ling Shao
|
1
|
+
|
Hierarchical Human Parsing With Typed Part-Relation Reasoning
|
2020
|
Wenguan Wang
Hailong Zhu
Jifeng Dai
Yanwei Pang
Jianbing Shen
Ling Shao
|
1
|
+
|
Deep Grouping Model for Unified Perceptual Parsing
|
2020
|
Zhiheng Li
Wenxuan Bao
Jiayang Zheng
Chenliang Xu
|
1
|
+
|
Hierarchical Graph Capsule Network
|
2021
|
Jinyu Yang
Peilin Zhao
Yu Rong
Chaochao Yan
Chunyuan Li
Hehuan Ma
Junzhou Huang
|
1
|
+
|
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
|
2021
|
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lü
Ping Luo
Ling Shao
|
1
|
+
|
DeepViT: Towards Deeper Vision Transformer
|
2021
|
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
|
1
|
+
|
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
|
2021
|
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
Baining Guo
|
1
|
+
|
Scalable Visual Transformers with Hierarchical Pooling
|
2021
|
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
|
1
|
+
|
Conditional Positional Encodings for Vision Transformers
|
2021
|
Xiangxiang Chu
Zhi Tian
Bo Zhang
Xinlong Wang
Xiaolin Wei
Huaxia Xia
Chunhua Shen
|
1
|
+
|
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
|
2021
|
Changlin Li
Tao Tang
Guangrun Wang
Jiefeng Peng
Bing Wang
Xiaodan Liang
Xiaojun Chang
|
1
|
+
|
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
|
2021
|
Pengchuan Zhang
Xiyang Dai
Jianwei Yang
Bin Xiao
Lu Yuan
Lei Zhang
Jianfeng Gao
|
1
|
+
|
An Empirical Study of Training Self-Supervised Vision Transformers
|
2021
|
Xinlei Chen
Saining Xie
Kaiming He
|
1
|
+
|
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
|
2021
|
Zhenfang Chen
Jiayuan Mao
Jiajun Wu
Kenneth K. Wong
Joshua B. Tenenbaum
Chuang Gan
|
1
|
+
|
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
|
2021
|
Chun-Fu Richard Chen
Quanfu Fan
Rameswar Panda
|
1
|
+
|
Unsupervised Part Discovery by Unsupervised Disentanglement
|
2021
|
Sandro Braun
Patrick Esser
Björn Ommer
|
1
|
+
|
LocalViT: Bringing Locality to Vision Transformers
|
2021
|
Yawei Li
Kai Zhang
Jiezhang Cao
Radu Timofte
Luc Van Gool
|
1
|
+
|
Emerging Properties in Self-Supervised Vision Transformers
|
2021
|
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
|
1
|
+
|
Visual Grounding with Transformers
|
2022
|
Ye Du
Zehua Fu
Qingjie Liu
Yunhong Wang
|
1
|
+
|
Unsupervised Part Segmentation through Disentangling Appearance and Shape
|
2021
|
Shilong Liu
Lei Zhang
Xiao Yang
Hang Su
Jun Zhu
|
1
|
+
|
Glance-and-Gaze Vision Transformer
|
2021
|
Qihang Yu
Yingda Xia
Yutong Bai
Yongyi Lu
Alan Yuille
Wei Shen
|
1
|
+
|
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
|
2021
|
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin Fu
|
1
|
+
|
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
|
2021
|
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
|
1
|
+
|
BEiT: BERT Pre-Training of Image Transformers
|
2021
|
Hangbo Bao
Dong Li
Furu Wei
|
1
|
+
|
Training data-efficient image transformers & distillation through attention
|
2021
|
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
|
1
|
+
|
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
|
2021
|
Mingyu Ding
Xiaochen Lian
Linjie Yang
Peng Wang
Xiaojie Jin
Zhiwu Lu
Ping Luo
|
1
|
+
|
XCiT: Cross-Covariance Image Transformers
|
2021
|
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
Armand Joulin
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
|
1
|
+
|
Bottleneck Transformers for Visual Recognition
|
2021
|
Aravind Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
|
1
|
+
|
Scaling Vision Transformers
|
2022
|
Xiaohua Zhai
Alexander Kolesnikov
Neil Houlsby
Lucas Beyer
|
1
|
+
|
HOTR: End-to-End Human-Object Interaction Detection with Transformers
|
2021
|
Bumsoo Kim
Junhyun Lee
Jaewoo Kang
Eun-Sol Kim
Hyunwoo J. Kim
|
1
|
+
|
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
|
2021
|
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
|
1
|
+
|
PVT v2: Improved baselines with Pyramid Vision Transformer
|
2022
|
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lü
Ping Luo
Ling Shao
|
1
|
+
|
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
|
2021
|
Ashish Vaswani
Prajit Ramachandran
Aravind Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
|
1
|
+
|
Rethinking and Improving Relative Position Encoding for Vision Transformer
|
2021
|
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
|
1
|
+
|
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
|
2021
|
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
Joshua B. Tenenbaum
Chuang Gan
|
1
|
+
|
iBOT: Image BERT Pre-Training with Online Tokenizer
|
2021
|
Jinghao Zhou
Wei Chen
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
|
1
|
+
|
Improved Multiscale Vision Transformers for Classification and Detection
|
2021
|
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
Karttikeya Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
|
1
|
+
|
CvT: Introducing Convolutions to Vision Transformers
|
2021
|
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
|
1
|
+
|
LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference
|
2021
|
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Hervé Jégou
Matthijs Douze
|
1
|