Stricter cropping #70

Open
beckstefan opened this issue Sep 24, 2020 · 2 comments
Comments

@beckstefan

A DFG requirement when scanning is to show part of the opposite page. On some pages this becomes a problem: anybaseocr-crop does not crop away the text of the opposite page, so later tools detect text/characters where they shouldn't.

Here are two examples.

[Images: cropping_1, cropping_2]

What would be a strategy to tackle this?

@bertsky
Contributor

bertsky commented Nov 9, 2020

AFAICT this processor tries to avoid textual noise via separator line detection. There are a couple of (crappy and badly documented) parameters for this (rular...), but IMHO your best shot here would be trying to increase the contrast so the binarized image shows a distinct, contiguous vertical line where the gutter/spine is.

Besides binarization settings, there is a second workflow detail that might help: If you deskew before cropping, these lines should be easier to detect.
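
To illustrate why a distinct, contiguous vertical line matters, here is a minimal toy sketch in pure Python. It is not how anybaseocr-crop is implemented; the function names `find_gutter_column` and `crop_right_of` and the `min_fill` threshold are hypothetical and exist only for this example. It treats a column as a gutter candidate when almost all of its pixels are foreground in the binarized image.

```python
def find_gutter_column(binary, min_fill=0.9):
    """Return the index of the column with the highest foreground fill ratio,
    or None if no column reaches min_fill (hypothetical threshold).

    binary: list of rows, each a list of 0/1 values (1 = foreground/ink).
    """
    height = len(binary)
    width = len(binary[0])
    best_col, best_fill = None, 0.0
    for x in range(width):
        fill = sum(row[x] for row in binary) / height
        if fill >= min_fill and fill > best_fill:
            best_col, best_fill = x, fill
    return best_col


def crop_right_of(binary, col):
    """Keep only the columns strictly to the right of col."""
    return [row[col + 1:] for row in binary]


# Toy page, 6 rows x 8 columns: columns 0-1 hold noise from the opposite page,
# column 2 is the gutter shadow, columns 3-7 hold the actual page.
page = [
    [1, 0, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1, 1, 0],
]

# Same page from a low-contrast scan: binarization leaves gaps in the gutter line.
broken = [row[:] for row in page]
broken[1][2] = 0
broken[4][2] = 0

gutter = find_gutter_column(page)
cropped = crop_right_of(page, gutter) if gutter is not None else page
```

On the intact toy page the gutter column is found and the opposite-page noise is cut off; on the `broken` variant no column passes the threshold, so nothing can be cropped. A skewed page has a similar effect, because a slanted line spreads its pixels over several columns, which is one way to see why deskewing first could help.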

@bertsky
Contributor

bertsky commented May 2, 2022

@beckstefan is this gone with the reimplementation of the cropper?

(If you could post or link to the originals, I could run it...)
