Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

Chen‐Yu Lee, Simon Osindero

Type: Article

Publication Date: 2016-06-01

Citations: 443

DOI: https://doi.org/10.1109/cvpr.2016.245

Abstract

We present recursive recurrent neural networks with attention modeling (R2AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction, (2) an implicitly learned character-level language model, embodied in a recurrent neural network which avoids the need to use N-grams, and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.
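As a rough illustration of point (3), a single soft-attention step can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the additive (tanh) scoring form, the function name `soft_attention`, the parameter names `W_f`, `W_h`, `v`, and all dimensions are hypothetical choices made for the example, since the abstract does not specify them.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention(features, hidden, W_f, W_h, v):
    """One soft-attention step (hypothetical additive scoring).

    features: (T, D) array of T image feature vectors
    hidden:   (H,) current recurrent state
    W_f: (D, A), W_h: (H, A), v: (A,) are assumed learnable parameters
    """
    # One scalar score per feature vector, conditioned on the recurrent state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # shape (T,)
    # Normalised, non-negative weights: differentiable, so trainable by backprop.
    weights = softmax(scores)                              # shape (T,)
    # Context vector: weighted average of the image features.
    context = weights @ features                           # shape (D,)
    return context, weights

# Toy dimensions and random values, chosen only for illustration.
rng = np.random.default_rng(0)
T, D, H, A = 6, 8, 5, 4
features = rng.normal(size=(T, D))
hidden = rng.normal(size=H)
W_f = rng.normal(size=(D, A))
W_h = rng.normal(size=(H, A))
v = rng.normal(size=A)

context, weights = soft_attention(features, hidden, W_f, W_h, v)
```

Because the weights come from a softmax rather than a hard selection, every operation is differentiable, which is what makes end-to-end training with standard backpropagation possible.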

Locations

arXiv (Cornell University)

Similar Works

Title	Year	Authors
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild	2016	Chen‐Yu Lee Simon Osindero
Towards End-to-End Text Spotting with Convolutional Recurrent Neural Networks	2017	Hui Li Peng Wang Chunhua Shen
Convolutional Character Networks	2019	Linjie Xing Zhi Tian Weilin Huang Matthew R. Scott
Visual Attention Models for Scene Text Recognition	2017	Suman K. Ghosh Ernest Valveny Andrew D. Bagdanov
A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition	2019	Lü Yang Peng Wang Hui Li Ye Gao Linjiang Zhang Chunhua Shen Yanning Zhang
FACLSTM: ConvLSTM with focused attention for scene text recognition	2020	Qingqing Wang Ye Huang Wenjing Jia Xiangjian He Michael Blumenstein Shujing Lyu Yue Lu
SAFL: A Self-Attention Scene Text Recognizer with Focal Loss	2020	Bao Hieu Tran Thanh Le-Cong Huu Manh Nguyen Duc Anh Le Thanh Hung Nguyen Phi Le Nguyen
A Holistic Representation Guided Attention Network for Scene Text Recognition	2019	Yang Lu Peng Wang Hui Li Zhen Li Yanning Zhang
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models	2021	Minghao Li Tengchao Lv Jingye Chen Lei Cui Yijuan Lu Dinei Florêncio Cha Zhang Zhoujun Li Furu Wei
Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition	2016	Qiang Guo Dan Tu Guohui Li Jun Lei
TrOCR: Transformer-Based Optical Character Recognition with Pre-trained Models	2023	Minghao Li Tengchao Lv Jingye Chen Lei Cui Yijuan Lu Dinei Florêncio Cha Zhang Zhoujun Li Furu Wei
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting	2022	Shancheng Fang Zhendong Mao Hongtao Xie Yuxin Wang Chenggang Yan Yongdong Zhang
VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition	2024	Xianfu Cheng Weixiao Zhou Xiang Li Xiaohong Chen Jian Yang Tongliang Li Zhoujun Li
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition	2020	Lei Kang Pau Riba Marçal Rusiñol Alícia Fornés Mauricio Villegas
Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition	2018	Hui Li Peng Wang Chunhua Shen Guyu Zhang

Works That Cite This (172)

Title	Year	Authors
What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis	2019	Jeonghun Baek Geewook Kim Junyeop Lee Sungrae Park Dongyoon Han Sangdoo Yun Seong Joon Oh Hwalsuk Lee
Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition	2023	Gaurav Patel Jan P. Allebach Qiang Qiu
SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition	2020	Zhi Qiao Yu Zhou Dongbao Yang Yucan Zhou Weiping Wang
From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network	2021	Yuxin Wang Hongtao Xie Shancheng Fang Jing Wang Shenggao Zhu Yongdong Zhang
On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention	2019	Junyeop Lee Sungrae Park Jeonghun Baek Seong Joon Oh Seonghyeon Kim Hwalsuk Lee
Visual Attention in Imaginative Agents	2021	Samrudhdhi B. Rangrej James J. Clark
SAFE: Scale Aware Feature Encoder for Scene Text Recognition	2019	Wei Liu Chaofeng Chen Kenneth K. Wong
Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks	2018	Mohamed Yousef Khaled F. Hussain Usama S. Mohammed
Towards Accurate Scene Text Recognition with Semantic Reasoning Networks	2020	Deli Yu Xuan Li Chengquan Zhang Junyu Han Jingtuo Liu Errui Ding
2D-CTC for Scene Text Recognition	2019	Zhaoyi Wan Fengming Xie Yibo Liu Xiang Bai Cong Yao

Works Cited by This (23)

Title	Year	Authors
Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition	2014	Max Jaderberg Karen Simonyan Andrea Vedaldi Andrew Zisserman
End-to-End Text Recognition with Hybrid HMM Maxout Models	2013	Ouais Alsharif Joëlle Pineau
Very Deep Convolutional Networks for Large-Scale Image Recognition	2014	Karen Simonyan Andrew Zisserman
Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks	2013	Ian Goodfellow Yaroslav Bulatov Julian Ibarz Sacha Arnoud Vinay Shet
Supervised mid-level features for word image representation	2015	Albert Gordo
Show and tell: A neural image caption generator	2015	Oriol Vinyals Alexander Toshev Samy Bengio Dumitru Erhan
Reading Text in the Wild with Convolutional Neural Networks	2015	Max Jaderberg Karen Simonyan Andrea Vedaldi Andrew Zisserman
Long-term recurrent convolutional networks for visual recognition and description	2015	Jeff Donahue Lisa Anne Hendricks Sergio Guadarrama Marcus Rohrbach Subhashini Venugopalan Trevor Darrell Kate Saenko
Visualizing and Understanding Recurrent Networks	2015	Andrej Karpathy Justin Johnson Li Fei-Fei
Deep learning in neural networks: An overview	2014	Jürgen Schmidhuber