+
|
Deep Residual Learning for Image Recognition
|
2016
|
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
|
3
|
+
|
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
|
2021
|
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
Baining Guo
|
3
|
+
|
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
|
2017
|
Yash Goyal
Tejas Khot
Douglas Summers-Stay
Dhruv Batra
Devi Parikh
|
2
|
+
|
Scene Text Visual Question Answering
|
2019
|
Ali Furkan Biten
Rubèn Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
|
2
|
+
|
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
|
2020
|
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou
|
2
|
+
|
VizWiz Grand Challenge: Answering Visual Questions from Blind People
|
2018
|
Danna Gurari
Qing Li
Abigale Stangl
Anhong Guo
Chi Lin
Kristen Grauman
Jiebo Luo
Jeffrey P. Bigham
|
2
|
+
|
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
|
2022
|
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
|
2
|
+
|
LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding
|
2021
|
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
Guoxin Wang
Yijuan Lu
Dinei Florêncio
Cha Zhang
Wanxiang Che
|
2
|
+
|
End-to-End Document Recognition and Understanding with Dessurt
|
2023
|
Brian Davis
Bryan S. Morse
Brian Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
|
2
|
+
|
Towards End-to-End Unified Scene Text Detection and Layout Analysis
|
2022
|
Shangbang Long
Siyang Qin
Dmitry Panteleev
Alessandro Bissacco
Yasuhisa Fujii
Michalis Raptis
|
2
|
+
|
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
|
2022
|
Jiasen Lu
Christopher M. Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
|
2
|
+
|
Pix2seq: A Language Modeling Framework for Object Detection
|
2021
|
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
|
2
|
+
|
Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition
|
2017
|
Chee Kheng Chng
Chee Seng Chan
|
2
|
+
|
Bootstrap your own latent: A new approach to self-supervised Learning
|
2020
|
Jean-Bastien Grill
Florian Strub
Florent Altché
Corentin Tallec
Pierre H. Richemond
Elena Buchatskaya
Carl Doersch
Bernardo Ávila Pires
Zhaohan Daniel Guo
Mohammad Gheshlaghi Azar
|
2
|
+
|
Momentum Contrast for Unsupervised Visual Representation Learning
|
2020
|
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross Girshick
|
2
|
+
|
TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text
|
2021
|
Amanpreet Singh
Guan Pang
Mandy Toh
Jing Huang
Wojciech Galuba
Tal Hassner
|
2
|
+
|
Flamingo: a Visual Language Model for Few-Shot Learning
|
2022
|
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
Yana Hasson
Karel Lenc
Arthur Mensch
Katie Millican
Malcolm Reynolds
|
2
|
+
|
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
|
2019
|
Colin Raffel
Noam Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
|
2
|
+
|
Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer
|
2022
|
Yair Kittenplon
Inbal Lavi
Sharon Fogel
Yarin Bar
R. Manmatha
Pietro Perona
|
2
|
+
|
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
|
2022
|
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
|
2
|
+
|
Attention Is All You Need
|
2017
|
Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
|
2
|
+
|
SPTS: Single-Point Text Spotting
|
2022
|
Dezhi Peng
Xinyu Wang
Yuliang Liu
Jiaxin Zhang
Mingxin Huang
Songxuan Lai
Jing Li
Shenggao Zhu
Dahua Lin
Chunhua Shen
|
2
|
+
|
Towards VQA Models That Can Read
|
2019
|
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
|
2
|
+
|
Supervised Contrastive Learning
|
2020
|
Prannay Khosla
Piotr Teterwak
Chen Wang
Aaron Sarna
Yonglong Tian
Phillip Isola
Aaron Maschinot
Ce Liu
Dilip Krishnan
|
2
|
+
|
OCR-Free Document Understanding Transformer
|
2022
|
Geewook Kim
Teakgyu Hong
Moonbin Yim
Jeongyeon Nam
Jinyoung Park
Jinyeong Yim
Wonseok Hwang
Sangdoo Yun
Dongyoon Han
Seunghyun Park
|
2
|
+
|
Domain Generalization by Marginal Transfer Learning
|
2017
|
Gilles Blanchard
Aniket Anand Deshmukh
Ürün Doğan
Gyemin Lee
Clayton Scott
|
1
|
+
|
Unified Deep Supervised Domain Adaptation and Generalization
|
2017
|
Saeid Motiian
Marco Piccirilli
Donald Adjeroh
Gianfranco Doretto
|
1
|
+
|
Learning to Generalize: Meta-Learning for Domain Generalization
|
2018
|
Da Li
Yongxin Yang
Yi-Zhe Song
Timothy M. Hospedales
|
1
|
+
|
mixup: Beyond Empirical Risk Minimization
|
2017
|
Hongyi Zhang
Moustapha Cissé
Yann Dauphin
David López-Paz
|
1
|
+
|
Know What You Don't Know: Unanswerable Questions for SQuAD
|
2018
|
Pranav Rajpurkar
Robin Jia
Percy Liang
|
1
|
+
|
Synthetic Data for Text Localisation in Natural Images
|
2016
|
Ankush Gupta
Andrea Vedaldi
Andrew Zisserman
|
1
|
+
|
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
|
2017
|
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
|
1
|
+
|
Evaluation of deep convolutional nets for document image classification and retrieval
|
2015
|
Adam W. Harley
Alex Ufkes
Konstantinos G. Derpanis
|
1
|
+
|
Deeper, Broader and Artier Domain Generalization
|
2017
|
Da Li
Yongxin Yang
Yi-Zhe Song
Timothy M. Hospedales
|
1
|
+
|
Learning Deep Object Detectors from 3D Models
|
2015
|
Xingchao Peng
Baochen Sun
Karim Ali
Kate Saenko
|
1
|
+
|
PubLayNet: Largest Dataset Ever for Document Layout Analysis
|
2019
|
Xu Zhong
Jianbin Tang
Antonio Jimeno Yepes
|
1
|
+
|
Domain Generalization by Solving Jigsaw Puzzles
|
2019
|
Fabio Maria Carlucci
Antonio D'Innocente
Silvia Bucci
Barbara Caputo
Tatiana Tommasi
|
1
|
+
|
Deep Hashing Network for Unsupervised Domain Adaptation
|
2017
|
Hemanth Venkateswara
José Eusébio
Shayok Chakraborty
Sethuraman Panchanathan
|
1
|
+
|
Asymptotic Distribution of Coordinates on High Dimensional Spheres
|
2007
|
M. C. Spruill
|
1
|
+
|
Invariant Risk Minimization
|
2019
|
Martín Arjovsky
Léon Bottou
Ishaan Gulrajani
David López-Paz
|
1
|
+
|
Adversarial Domain Adaptation with Domain Mixup
|
2020
|
Minghao Xu
Jian Zhang
Bingbing Ni
Teng Li
Chengjie Wang
Qi Tian
Wenjun Zhang
|
1
|
+
|
Improve Unsupervised Domain Adaptation with Mixup Training
|
2020
|
Shen Yan
Huan Song
Nanxiang Li
Lincan Zou
Ren Liu
|
1
|
+
|
BERTScore: Evaluating Text Generation with BERT
|
2019
|
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
|
1
|
+
|
EAST: An Efficient and Accurate Scene Text Detector
|
2017
|
Xinyu Zhou
Cong Yao
He Wen
Yuzhi Wang
Shuchang Zhou
Weiran He
Jiajun Liang
|
1
|
+
|
Real-Time Scene Text Detection with Differentiable Binarization
|
2020
|
Minghui Liao
Zhaoyi Wan
Cong Yao
Kai Chen
Xiang Bai
|
1
|
+
|
Multilingual Denoising Pre-training for Neural Machine Translation
|
2020
|
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
Mike Lewis
Luke Zettlemoyer
|
1
|
+
|
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
|
2019
|
Shiori Sagawa
Pang Wei Koh
Tatsunori Hashimoto
Percy Liang
|
1
|
+
|
A Simple Framework for Contrastive Learning of Visual Representations
|
2020
|
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
|
1
|
+
|
Out-of-Distribution Generalization via Risk Extrapolation (REx)
|
2020
|
David Krueger
Ethan Caballero
Joern-Henrik Jacobsen
Amy Zhang
Jonathan Binas
Dinghuai Zhang
Rémi Le Priol
Aaron Courville
|
1
|
+
|
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
|
2019
|
Haowei He
Gao Huang
Yuan Yang
|
1
|