Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

Type: Article

Publication Date: 2016-06-01

Citations: 443

DOI: https://doi.org/10.1109/cvpr.2016.245


Abstract

We present recursive recurrent neural networks with attention modeling (R2AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction, (2) an implicitly learned character-level language model, embodied in a recurrent neural network which avoids the need to use N-grams, and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.
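The soft-attention mechanism described in point (3) can be illustrated with a minimal NumPy sketch: the decoder state scores each image-feature column, a softmax turns the scores into weights, and the context vector is the weighted sum of features. The parameter names (`W_f`, `W_h`, `v`) and dimensions are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def soft_attention(features, h, W_f, W_h, v):
    """One soft-attention step (hypothetical additive-attention form).

    features: (T, D) image feature columns from the CNN encoder
    h:        (H,)   current decoder (RNN) hidden state
    W_f:      (D, A) projection of features into the attention space
    W_h:      (H, A) projection of the decoder state
    v:        (A,)   scoring vector
    """
    scores = np.tanh(features @ W_f + h @ W_h) @ v   # (T,) one score per column
    alpha = softmax(scores)                          # attention weights, sum to 1
    context = alpha @ features                       # (D,) weighted feature sum
    return context, alpha

# Toy usage with random parameters.
rng = np.random.default_rng(0)
T, D, H, A = 6, 4, 5, 3
context, alpha = soft_attention(
    rng.standard_normal((T, D)), rng.standard_normal(H),
    rng.standard_normal((D, A)), rng.standard_normal((H, A)),
    rng.standard_normal(A),
)
```

Because every operation here (matrix products, tanh, softmax) is differentiable, gradients flow through the attention weights to both the encoder and decoder parameters, which is what permits the end-to-end backpropagation training the abstract refers to.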

Locations

  • arXiv (Cornell University) - View - PDF

Similar Works

  • Recursive Recurrent Nets with Attention Modeling for OCR in the Wild (2016), by Chen-Yu Lee, Simon Osindero
  • Towards End-to-End Text Spotting with Convolutional Recurrent Neural Networks (2017), by Hui Li, Peng Wang, Chunhua Shen
  • Convolutional Character Networks (2019), by Linjie Xing, Zhi Tian, Weilin Huang, Matthew R. Scott
  • Visual Attention Models for Scene Text Recognition (2017), by Suman K. Ghosh, Ernest Valveny, Andrew D. Bagdanov
  • A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition (2019), by LĂŒ Yang, Peng Wang, Hui Li, Ye Gao, Linjiang Zhang, Chunhua Shen, Yanning Zhang
  • FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition (2020), by Qingqing Wang, Ye Huang, Wenjing Jia, Xiangjian He, Michael Blumenstein, Shujing Lyu, Yue Lu
  • SAFL: A Self-Attention Scene Text Recognizer with Focal Loss (2020), by Bao Hieu Tran, Thanh Le-Cong, Huu Manh Nguyen, Duc Anh Le, Thanh Hung Nguyen, Phi Le Nguyen
  • A Holistic Representation Guided Attention Network for Scene Text Recognition (2019), by Yang Lu, Peng Wang, Hui Li, Zhen Li, Yanning Zhang
  • TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models (2021), by Minghao Li, Tengchao Lv, Jingye Chen, Lei Cui, Yijuan Lu, Dinei FlorĂȘncio, Cha Zhang, Zhoujun Li, Furu Wei
  • Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition (2016), by Qiang Guo, Dan Tu, Guohui Li, Jun Lei
  • ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting (2022), by Shancheng Fang, Zhendong Mao, Hongtao Xie, Yuxin Wang, Chenggang Yan, Yongdong Zhang
  • VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition (2024), by Xianfu Cheng, Weixiao Zhou, Xiang Li, Xiaohong Chen, Jian Yang, Tongliang Li, Zhoujun Li
  • Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition (2020), by Lei Kang, Pau Riba, Marçal Rusiñol, Alícia Fornés, Mauricio Villegas
  • Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition (2018), by Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang