Hongwei Xue

Common Coauthors
Commonly Cited References
Action Title Year Authors # of times referenced
+ Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers 2020 Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
3
+ Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
3
+ A Style-Based Generator Architecture for Generative Adversarial Networks 2019 Tero Karras
Samuli Laine
Timo Aila
3
+ Is Space-Time Attention All You Need for Video Understanding? 2021 Gedas Bertasius
Heng Wang
Lorenzo Torresani
2
+ Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning 2021 Zhicheng Huang
Zhaoyang Zeng
Yupan Huang
Bei Liu
Dongmei Fu
Jianlong Fu
2
+ Decoupled Weight Decay Regularization 2017 Ilya Loshchilov
Frank Hutter
2
+ HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training 2020 Linjie Li
Yen‐Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
2
+ HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips 2019 Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Šivic
2
+ Hierarchical Conditional Relation Networks for Video Question Answering 2020 Thao Minh Le
Vuong Le
Svetha Venkatesh
Truyen Tran
2
+ An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale 2020 Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
Thomas Unterthiner
Mostafa Dehghani
Matthias Minderer
Georg Heigold
Sylvain Gelly
2
+ Image Super-Resolution Via Iterative Refinement 2022 Chitwan Saharia
Jonathan Ho
William Chan
Tim Salimans
David J. Fleet
Mohammad Norouzi
2
+ TediGAN: Text-Guided Diverse Face Image Generation and Manipulation 2021 Weihao Xia
Yujiu Yang
Jing‐Hao Xue
Baoyuan Wu
2
+ End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering 2017 Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
2
+ ActBERT: Learning Global-Local Video-Text Representations 2020 Linchao Zhu
Yi Yang
2
+ Motion-Appearance Co-memory Networks for Video Question Answering 2018 Jiyang Gao
Runzhou Ge
Kan Chen
Ram Nevatia
2
+ YouTube-8M: A Large-Scale Video Classification Benchmark 2016 Sami Abu-El-Haija
Nisarg Kothari
Joonseok Lee
Apostol Natsev
George Toderici
Balakrishnan Varadarajan
Sudheendra Vijayanarasimhan
2
+ Use What You Have: Video Retrieval Using Representations From Collaborative Experts 2019 Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
2
+ VideoBERT: A Joint Model for Video and Language Representation Learning 2019 Chen Sun
Austin Myers
Carl Vondrick
Kevin Murphy
Cordelia Schmid
2
+ Unified Vision-Language Pre-Training for Image Captioning and VQA 2020 Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
2
+ Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks 2020 Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
Lei Zhang
Lijuan Wang
Houdong Hu
Dong Li
Furu Wei
2
+ End-to-End Learning of Visual Representations From Uncurated Instructional Videos 2020 Antoine Miech
Jean-Baptiste Alayrac
Lucas Smaira
Ivan Laptev
Josef Šivic
Andrew Zisserman
2
+ UNITER: UNiversal Image-TExt Representation Learning 2020 Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
2
+ The Unreasonable Effectiveness of Deep Features as a Perceptual Metric 2018 Richard Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
2
+ Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation 2021 Elad Richardson
Yuval Alaluf
Or Patashnik
Yotam Nitzan
Yaniv Azar
Stav Shapiro
Daniel Cohen‐Or
2
+ Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering 2018 Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Jay Gould
Lei Zhang
2
+ Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval 2021 Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
2
+ TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering 2017 Yunseok Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
2
+ Multi-modal Transformer for Video Retrieval 2020 Valentin Gabeur
Chen Sun
Karteek Alahari
Cordelia Schmid
2
+ Learning a Text-Video Embedding from Incomplete and Heterogeneous Data 2018 Antoine Miech
Ivan Laptev
Josef Šivic
2
+ In Defense of Grid Features for Visual Question Answering 2020 Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
2
+ Localizing Moments in Video with Natural Language 2017 Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Šivic
Trevor Darrell
Bryan Russell
2
+ SlowFast Networks for Video Recognition 2019 Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
2
+ Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification 2018 Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Murphy
2
+ Learning Spatiotemporal Features with 3D Convolutional Networks 2015 Du Tran
Lubomir Bourdev
Rob Fergus
Lorenzo Torresani
Manohar Paluri
2
+ GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium 2017 Martin Heusel
Hubert Ramsauer
Thomas Unterthiner
Bernhard Nessler
Sepp Hochreiter
2
+ A Joint Sequence Fusion Model for Video Question Answering and Retrieval 2018 Youngjae Yu
Jong-Seok Kim
Gunhee Kim
2
+ LXMERT: Learning Cross-Modality Encoder Representations from Transformers 2019 Hao Tan
Mohit Bansal
2
+ BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2018 Jacob Devlin
Ming‐Wei Chang
Kenton Lee
Kristina Toutanova
2
+ Support-set bottlenecks for video-text representation learning 2020 Mandela Patrick
Po-Yao Huang
Yuki M. Asano
Florian Metze
Alexander G. Hauptmann
João F. Henriques
Andrea Vedaldi
2
+ Cross-Modal and Hierarchical Modeling of Video and Text 2018 Bowen Zhang
Hexiang Hu
Fei Sha
2
+ Image Inpainting for Irregular Holes Using Partial Convolutions 2018 Guilin Liu
Fitsum A. Reda
Kevin J. Shih
Ting-Chun Wang
Andrew Tao
Bryan Catanzaro
1
+ SPICE: Semantic Propositional Image Caption Evaluation 2016 Peter Anderson
Basura Fernando
Mark Johnson
Stephen Jay Gould
1
+ Towards Automatic Learning of Procedures from Web Instructional Videos 2017 Luowei Zhou
Chenliang Xu
Jason J. Corso
1
+ A Note on the Inception Score 2018 Shane Barratt
Rishi Sharma
1
+ Improved Techniques for Training GANs 2016 Tim Salimans
Ian Goodfellow
Wojciech Zaremba
Vicki Cheung
Alec Radford
Xi Chen
1
+ Going deeper with convolutions 2015 Christian Szegedy
Wei Liu
Yangqing Jia
Pierre Sermanet
Scott Reed
Dragomir Anguelov
Dumitru Erhan
Vincent Vanhoucke
Andrew Rabinovich
1
+ VSE++: Improving Visual-Semantic Embeddings with Hard Negatives 2017 Fartash Faghri
David J. Fleet
Jamie Kiros
Sanja Fidler
1
+ Conditional Image Generation with PixelCNN Decoders 2016 Aäron van den Oord
Nal Kalchbrenner
Oriol Vinyals
Lasse Espeholt
Alex Graves
Koray Kavukcuoglu
1
+ Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space 2016 Anh‐Tu Nguyen
Jeff Clune
Yoshua Bengio
Alexey Dosovitskiy
Jason Yosinski
1
+ Dense-Captioning Events in Videos 2017 Ranjay Krishna
Kenji Hata
Frederic Ren
Li Fei-Fei
Juan Carlos Niebles
1