Tesseract struggles with 90-degree angled text sometimes #4387

0dinD · 2025-01-31T13:21:17Z

Current Behavior

I was investigating whether Tesseract can handle mixed orientation in the text (see also: #2055), and found a specific case where it almost works, but fails in a way that makes me think there's a bug in the code. More specifically, in the example that I provide below, Tesseract seems to be reading the 90-degree text "upside-down", that is, reading the 90-degree text as though it were 270-degree text.

For example, as you can see in the output hOCR below, the textangle is correctly identified as 90 degrees, but Tesseract is reading the text "upside-down", i.e. from a 270 degree perspective. Look at words like "anbeu" ("neque" but upside-down), "luenb" ("quam" but upside-down), "wesdi" ("ipsum" but upside-down) and so on.

Command used: tesseract text-90deg.png text-90deg --psm 1 hocr

Input image:

Output hOCR:

text-90deg.hocr.txt

Tested with the current latest AppImage of Tesseract, 5.5.0

Expected Behavior

Tesseract should read all the text in the correct orientation so that there are no jumbled words in the hOCR output.

Suggested Fix

Find and fix the bug that makes Tesseract read 90-degree text as 270-degree text in this case.

tesseract -v

tesseract 5.5.0
 leptonica-1.79.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
 Found AVX512BW
 Found AVX512F
 Found AVX512VNNI
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4

Operating System

Ubuntu 22.04 Jammy

Other Operating System

No response

uname -a

No response

Compiler

No response

CPU

No response

Virtualization / Containers

No response

Other Information

No response

0dinD · 2025-01-31T14:02:58Z

Oh, wait, now I just realized that I had the orientations the wrong way around: The hOCR spec for textangle says that it's counter-clockwise, so there is no discrepancy in the example above, with regard to textangle vs the OCR output.

The main issue, then, seems to be that --psm 1 is not able to detect the correct orientation for the bottom text. The correct orientation should be 270 degrees in this case, but --psm 1 clearly chooses 90 degrees, which results in garbled output because the OCR reads the text upside-down.
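To make the counter-clockwise convention concrete, here is a minimal Python sketch that pulls `textangle` out of an hOCR `title` attribute and converts it to the equivalent clockwise angle. The helper names and the sample `title` string are made up for illustration and are not taken from the attached hOCR file; a missing `textangle` is treated as 0, as the hOCR spec describes.

```python
import re

# Matches the "textangle <number>" property inside an hOCR title attribute.
TEXTANGLE_RE = re.compile(r"\btextangle\s+(-?\d+(?:\.\d+)?)")


def textangle_ccw(title: str) -> float:
    """Return the counter-clockwise textangle in [0, 360), or 0.0 if absent."""
    match = TEXTANGLE_RE.search(title)
    return float(match.group(1)) % 360 if match else 0.0


def clockwise_equivalent(angle_ccw: float) -> float:
    """Express a counter-clockwise angle as a clockwise angle in [0, 360)."""
    return (360 - angle_ccw) % 360


if __name__ == "__main__":
    # Hypothetical title attribute, shaped like an hOCR ocr_line title.
    sample_title = "bbox 36 92 96 3110; textangle 90; x_size 20"
    ccw = textangle_ccw(sample_title)
    print(f"counter-clockwise: {ccw}, clockwise: {clockwise_equivalent(ccw)}")
```

Read this way, a `textangle` of 90 in the hOCR output corresponds to a 270-degree clockwise rotation, which is why the reported angle and the garbled words are consistent with each other.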

amitdo · 2025-02-03T10:07:11Z

Is osd.traineddata located in the same directory as eng.traineddata on your machine?

0dinD · 2025-02-03T20:34:26Z

@amitdo It looks like they are both in the same path, yes:

zerodind@machine:~/git/dev/tesseract$ ls -lah /usr/share/tesseract-ocr/4.00/tessdata/
total 18M
drwxr-xr-x 4 root root 4,0K dec 12 00:52 .
drwxr-xr-x 3 root root 4,0K feb 10  2022 ..
drwxr-xr-x 2 root root 4,0K jun 21  2022 configs
-rw-r--r-- 1 root root 4,0M sep 15  2017 eng.traineddata
-rw-r--r-- 1 root root  11M sep 15  2017 osd.traineddata
-rw-r--r-- 1 root root  572 feb  9  2022 pdf.ttf
-rw-r--r-- 1 root root 4,0M sep 15  2017 swe.traineddata
drwxr-xr-x 2 root root 4,0K jun 21  2022 tessconfigs

Let me know if there's any more information you need. For the record, in the repro case I gave above in this issue description, I was using the latest AppImage of Tesseract, version 5.5.0. Not sure if the AppImage itself contains the traineddata or if it uses the system files (which are for an older Tesseract version). Either way, I'm pretty sure this issue has existed for a long while; I initially did not use the AppImage but rather my system (Ubuntu 22.04 Jammy) version of Tesseract (version 4.1.1).
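For anyone who wants to repeat the same check without reading a directory listing, a small Python sketch along these lines could report which traineddata files are missing from a given tessdata directory. The function name is hypothetical, and the fallback path is simply the system directory from the listing above; which directory the AppImage actually reads is not established in this thread, so the path is an assumption to adjust (the directory printed by `tesseract --list-langs` is one way to find it).

```python
import os
from pathlib import Path


def missing_traineddata(tessdata_dir, names=("eng", "osd")):
    """Return the names whose <name>.traineddata file is absent from tessdata_dir."""
    directory = Path(tessdata_dir)
    return [
        name
        for name in names
        if not (directory / f"{name}.traineddata").is_file()
    ]


if __name__ == "__main__":
    # Assumption: use TESSDATA_PREFIX if set, else the system path shown above.
    tessdata = os.environ.get(
        "TESSDATA_PREFIX", "/usr/share/tesseract-ocr/4.00/tessdata"
    )
    print(f"{tessdata}: missing {missing_traineddata(tessdata)}")
```

An empty list means both eng.traineddata and osd.traineddata are present in that directory.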

amitdo · 2025-02-05T11:55:12Z

I tested it myself with tesseract 5.5.0. I get a similar result.

amitdo · 2025-02-05T12:50:46Z

I manually removed the top block:

With this image, Tesseract works well. It detects the textangle as 270 and thus the text recognition is fine.

0dinD mentioned this issue Jan 31, 2025

[Feature]: Process hOCR textangle attribute in hOCR to PDF transform ocrmypdf/OCRmyPDF#1467

Open

amitdo added the OSD Orientation and Script Detection label Feb 3, 2025

amitdo added the bug label Feb 5, 2025
