[toc]
- Annotation for Symbols and Abbreviations
- Survey
- Papers for Transformer based OCR
- Papers for Non-Transformer based OCR or OCR related
- Code Collection
- Dataset Collection
- Competition
- Related Awesome Resource
This repository collects current progress in transformer based optical character recognition (OCR). Contributions are welcome! We also maintain another repository that collects interesting Artificial Intelligence related materials: AI Collections.
🙌🏻 To share progress in transformer based OCR, please comment in this issue. The owner of this awesome list will check the issue regularly and update the repository as quickly as possible. Thank you for sharing and contributing.
😳 If there are any errors in this collection, please also comment in the issue, and we will correct them as quickly as possible.
- 🙌🏻 denotes 'Help Needed'
- 😳 denotes 'Sorry for error'
- ✨ denotes the work that the owner likes.
- OCR: Optical Character Recognition
- STR: Scene Text Recognition
- TLR: Text Line Recognition
- HTR: Handwritten Text Recognition
- Text VQA: Text-based Visual Question Answering
-
✨ Text Recognition in the Wild: A Survey
- ArXiv, May 7, 2020
-
Scene Text Detection and Recognition: The Deep Learning Era
- IJCV, 2020
-
A Survey of Deep Learning Approaches for OCR and Document Understanding
- ArXiv, Nov. 27, 2020
(Title-Link | Main Task | Date in Semantic Scholar)
-
Transformer-based HTR for Historical Documents
- Handwritten Text Recognition (HTR)
- ArXiv, Mar. 22, 2022
-
✨ TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
- Text Line Recognition (TLR)
- ArXiv, Sep. 21, 2021
- code: TrOCR
- Label: TrOCR
-
Visual-Semantic Transformer for Scene Text Recognition
- STR
- ArXiv, Dec. 2, 2021
-
Transformer for Handwritten Text Recognition Using Bidirectional Post-decoding
- HTR
- ICDAR, 2021
-
Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes
- TLR
- ArXiv, Oct. 12, 2021
-
- Scene Text Recognition (STR)
- Nov. 24, 2021
- Aims to improve low-resource text image recognition with the help of high-resource datasets.
- Evaluated on the Japanese dataset created for the ICDAR 2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition.
- No results reported on IIIT, SVT, ...
-
TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance
- STR
- Nov. 13, 2021
-
Rethinking Text Line Recognition Models
- TLR
- Apr. 15, 2021
-
Vision Transformer for Fast and Efficient Scene Text Recognition
- STR
- May 18, 2021
-
MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
- STR
- PR, 2021
-
Bidirectional Scene Text Recognition with a Single Decoder
- STR
- ECAI, 2020
-
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition
- TLR
- ArXiv, 2020
-
Text Recognition in Images Based on Transformer with Hierarchical Attention
- ICIP, 2019
-
NRTR: A No-Recurrence Sequence-to-Sequence Model for Scene Text Recognition
- STR
- ICDAR 2019, Jun. 4, 2019
-
A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition
- STR
- ArXiv, Apr. 4, 2019
- Also: A holistic representation guided attention network for scene text recognition
- Neurocomputing, 2020
(Title-Link | Main Task | Date in Semantic Scholar)
-
Decoupling Visual-Semantic Feature Learning for Robust Scene Text Recognition
- STR
- ArXiv, Nov. 24, 2021
-
Beyond OCR + VQA: Involving OCR into the Flow for Robust and Accurate TextVQA
- Text VQA
- MM, Oct. 17, 2021
-
STRIVE: Scene Text Replacement In Videos
- Scene text synthesis
- ArXiv, Sep. 6, 2021
-
From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network
- STR
- ArXiv, Aug. 22, 2021
-
Towards the Unseen: Iterative Text Recognition by Distilling from Errors
- STR
- ArXiv, Jul. 26, 2021
-
Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation
- STR
- ArXiv, Jul. 26, 2021
-
Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition
- STR
- ArXiv, Jul. 26, 2021
-
Dictionary-guided Scene Text Recognition
- STR
- CVPR 2021, Jun. 1, 2021
-
Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter
- Text recognizer and text spotter
- CVPR 2021, Jun. 1, 2021
-
TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text
- Text spotter and STR
- CVPR 2021, May 12, 2021
-
Primitive Representation Learning for Scene Text Recognition
- STR
- CVPR 2021, May 10, 2021
-
- STR
- CVPR 2021, Mar. 7, 2021
-
Rethinking Text Line Recognition Models
- TLR
- ArXiv, Apr. 15, 2021
-
- Handwriting text image synthesis
- ArXiv, Apr. 8, 2021
-
MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition
- Handwritten Text Recognition (HTR)
- CVPR 2021, Apr. 5, 2021
-
A Multiplexed Network for End-to-End, Multilingual OCR
- Multilingual STR
- CVPR 2021, Mar. 29, 2021
-
✨ Sequence-to-Sequence Contrastive Learning for Text Recognition
- STR
- CVPR 2021, Dec. 20, 2020
-
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
- Visual Question Answering (VQA) and OCR
- CVPR 2021, Dec. 8, 2020
-
Decoupled Attention Network for Text Recognition
- HTR
- AAAI 2020, Dec. 21, 2019
-
✨ What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
- STR
- ICCV 2019, Apr. 3, 2019
- code: clovaai/deep-text-recognition-benchmark
- Label: deep-text
-
Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition
- HTR
- ICDAR, Mar. 18, 2019
-
An Efficient End-to-End Neural Model for Handwritten Text Recognition
- HTR
- BMVC, 2018
-
Gated Convolutional Recurrent Neural Networks for Multilingual Handwriting Recognition
- HTR
- ICDAR, 2017
- Label: GCRNN
-
- STR
- TPAMI, Jul. 21, 2015
- Label: CRNN
- ✨ clovaai/deep-text-recognition-benchmark
- ✨ TrOCR
- ✨ TextSnake for Scene Text Detection
- caffe_ocr
- imgtxtenh - Tool for enhancing noisy scanned text images
-
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study
-
ICDAR 2021 Competition on Scene Video Text Spotting
-
Synthetic Chinese String Dataset
- blog introduction: https://blog.csdn.net/MONKEY3233/article/details/104194169
- caffe_ocr introduction: https://github.com/senlinuc/caffe_ocr
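- Accuracy is the proportion of strings recognized entirely correctly, measured on the validation set. "accuracy-no lexicon" is the accuracy without a lexicon; "accuracy-lexicon-minctcloss" first looks up the lexicon words whose edit distance is <= 2, then selects the word with the smallest CTC loss as the recognition result.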
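
The two metrics described for caffe_ocr can be sketched roughly as follows. This is a minimal illustration, not the caffe_ocr implementation: the function names, the `ctc_loss` callable (a stand-in for scoring a candidate word against the network output), and the fallback to the raw prediction when no lexicon word is close enough are all assumptions made for this example.

```python
from typing import Callable, Iterable, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lexicon_min_ctc_loss(prediction: str,
                         lexicon: Iterable[str],
                         ctc_loss: Callable[[str], float],
                         max_distance: int = 2) -> str:
    """Pick the lexicon word within max_distance that has the smallest loss.

    `ctc_loss` is a hypothetical callable that scores a candidate word.
    Returning the raw prediction when no candidate qualifies is an assumption.
    """
    candidates = [w for w in lexicon
                  if edit_distance(prediction, w) <= max_distance]
    if not candidates:
        return prediction
    return min(candidates, key=ctc_loss)


def string_accuracy(predictions: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of strings that match their target exactly."""
    pairs = list(zip(predictions, targets))
    if not pairs:
        return 0.0
    return sum(p == t for p, t in pairs) / len(pairs)
```

In practice `ctc_loss` would wrap the recognizer's per-frame output for the image being decoded; here any function mapping a word to a number works, for example a lookup in a dictionary of precomputed losses.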
-
Open Handwriting Recognition and Translation Evaluation (OpenHaRT)
-
Robust Reading Competition
- It has many ICDAR competition datasets and tasks.
- Leaderboard and online submission are supported.
- https://rrc.cvc.uab.es
-
IAM Dataset
-
Task:
- Handwritten Text Recognition
-
International Journal on Document Analysis and Recognition, 2002
-
The IAM-database: an English sentence database for offline handwriting recognition
-
-
SROIE Dataset
-
Tasks:
- Scanned Receipt Text Localisation,
- Scanned Receipt OCR,
- Key Information Extraction from Scanned Receipts
-
ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
-
-
Font Resource
- Google Handwritten Fonts: https://fonts.google.com/?category=Handwriting
- 1001 Handwritten Fonts: https://www.1001fonts.com/handwritten-fonts.html?page=1
- PaddlePaddle/PaddleOCR
- hwalsuklee/awesome-deep-text-detection-recognition
- Jyouhou/SceneTextPapers
- kba/awesome-ocr
- wanghaisheng/awesome-ocr
- TianzhongSong/awesome-SynthText
- ZumingHuang/awesome-ocr-resources
- whitelok/image-text-localization-recognition
- janzd/awesome-scene-text
- jackyjsy/awesome-sign-language-recognition
- Still updating ... (You are welcome to suggest more excellent resources in this issue)