ocrd-anybaseocr-crop: TypeError: argument of type 'NoneType' is not iterable #74

Open
jbarth-ubhd opened this issue Oct 26, 2020 · 12 comments

Comments

@jbarth-ubhd

Perhaps a problem only in combination with ocrd-sbb-binarize(?)


(venv) jb@pers109:~/literatur_schoenen_wissenschaften1780a> ocrd-anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP
16:04:18.388 INFO OcrdAnybaseocrCropper - INPUT FILE 0 / P_00001
Traceback (most recent call last):
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/bin/ocrd-anybaseocr-crop", line 8, in <module>
    sys.exit(cli())
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 527, in cli
    return ocrd_cli_wrap_processor(OcrdAnybaseocrCropper, *args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/decorators/__init__.py", line 81, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
    processor.process()
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 448, in process
    feature_selector='binarized') # should also be deskewed
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/workspace.py", line 420, in image_from_page
    for feature in feature_selector.split(',') if feature) and
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/workspace.py", line 420, in <genexpr>
    for feature in feature_selector.split(',') if feature) and
TypeError: argument of type 'NoneType' is not iterable
@kba
Member

kba commented Oct 27, 2020

This is likely due to a combination of issues that have been fixed in qurator-spk/sbb_binarization#11 and OCR-D/core#633, respectively. Can you try again with sbb_binarization and core updated, please?

@jbarth-ubhd
Author

Nope:

  • core commit 1582f000184a15d4057f6b3505d033f88373152a
  • sbb_binarization commit f363063e75bcf4c28a2210d346e33dbfcaa8cca9
  • ocrd_anybaseocr commit cb82aad
16:42:21.774 INFO ocrd.task_sequence.run_tasks - Start processing task 'sbb-binarize -I OCR-D-IMG -O OCR-D-N1 -p '{"model": "/usr/local/ocrd_models/sbb/binarization/models", "operation_level": "page"}''
16:58:28.098 INFO ocrd.task_sequence.run_tasks - Finished processing task 'sbb-binarize -I OCR-D-IMG -O OCR-D-N1 -p '{"model": "/usr/local/ocrd_models/sbb/binarization/models", "operation_level": "page"}''
16:58:28.100 INFO ocrd.task_sequence.run_tasks - Start processing task 'anybaseocr-crop -I OCR-D-N1 -O OCR-D-N2 -p '{"force": true, "colSeparator": 0.04, "maxRularArea": 0.3, "minArea": 0.05, "minRularArea": 0.01, "positionBelow": 0.75, "positionLeft": 0.4, "positionRight": 0.6, "rularRatioMax": 10.0, "rularRatioMin": 3.0, "rularWidth": 0.95, "operation_level": "page"}''
Traceback (most recent call last):
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/bin/ocrd", line 8, in <module>
    sys.exit(cli())
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/ocrd/cli/process.py", line 26, in process_cli
    run_tasks(mets, log_level, page_id, tasks, overwrite)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/ocrd/task_sequence.py", line 149, in run_tasks
    raise Exception("%s exited with non-zero return value %s. STDOUT:\n%s\nSTDERR:\n%s" % (task.executable, returncode, out, err))
Exception: ocrd-anybaseocr-crop exited with non-zero return value 1. STDOUT:

STDERR:
16:58:29.882 INFO OcrdAnybaseocrCropper - INPUT FILE 0 / P_00001
Traceback (most recent call last):
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/bin/ocrd-anybaseocr-crop", line 8, in <module>
    sys.exit(cli())
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 527, in cli
    return ocrd_cli_wrap_processor(OcrdAnybaseocrCropper, *args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/decorators/__init__.py", line 81, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
    processor.process()
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 448, in process
    feature_selector='binarized') # should also be deskewed
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/workspace.py", line 420, in image_from_page
    for feature in feature_selector.split(',') if feature) and
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/workspace.py", line 420, in <genexpr>
    for feature in feature_selector.split(',') if feature) and
TypeError: argument of type 'NoneType' is not iterable

Command exited with non-zero status 1
15206.61user 1637.87system 16:28.55elapsed 1703%CPU (0avgtext+0avgdata 11868964maxresident)k
1053272inputs+1560outputs (1800major+47188701minor)pagefaults 0swaps

@jbarth-ubhd
Author

workflow:

. /usr/local/ocrd_all/venv/bin/activate
export TMPDIR=/dwork/tmp
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
( ocrd-create-mets.xml
/usr/bin/time ocrd process \
"sbb-binarize -I OCR-D-IMG -O OCR-D-N1 -P model /usr/local/ocrd_models/sbb/binarization/models" \
"anybaseocr-crop -I OCR-D-N1 -O OCR-D-N2" \
"cis-ocropy-denoise -I OCR-D-N2 -O OCR-D-N4 -P level-of-operation page" \
"cis-ocropy-deskew -I OCR-D-N4 -O OCR-D-N5 -P level-of-operation page" \
"sbb-textline-detector -I OCR-D-N5 -O OCR-D-N6 -P model /usr/local/ocrd_models/sbb/textline" \
"cis-ocropy-clip -I OCR-D-N6 -O OCR-D-N7 -P level-of-operation region" \
"cis-ocropy-deskew -I OCR-D-N7 -O OCR-D-N8 -P level-of-operation region" \
"cis-ocropy-resegment -I OCR-D-N8 -O OCR-D-N9" \
"cis-ocropy-dewarp -I OCR-D-N9 -O OCR-D-N10" \
"calamari-recognize -I OCR-D-N10 -O OCR-D-OCR -P checkpoint /usr/local/ocrd_models/calamari/GT4HistOCR/*.ckpt.json"
) >cmd.log 2>&1

@jbarth-ubhd
Author

Caveat: the workflow goes straight from N2 to N4; a second binarization should not be necessary with sbb-binarize(?)

@kba
Member

kba commented Oct 27, 2020

The line numbers in the stacktrace look suspicious. Are you sure core is up-to-date?

What's the output of

(source /dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/bin/activate; ocrd --version)

@jbarth-ubhd
Author

jbarth-ubhd commented Oct 27, 2020

I ran make all again, and now the output is:

.../ocrd_all> (source /dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/bin/activate; ocrd --version)
ocrd, version 2.18.1

I'll try it again...

@jbarth-ubhd
Author

jbarth-ubhd commented Oct 27, 2020

17:30:48.526 INFO ocrd.task_sequence.run_tasks - Start processing task 'sbb-binarize -I OCR-D-IMG -O OCR-D-N1 -p '{"model": "/usr/local/ocrd_models/sbb/binarization/models", "operation_level": "page"}''
17:46:02.399 INFO ocrd.task_sequence.run_tasks - Finished processing task 'sbb-binarize -I OCR-D-IMG -O OCR-D-N1 -p '{"model": "/usr/local/ocrd_models/sbb/binarization/models", "operation_level": "page"}''
17:46:02.401 INFO ocrd.task_sequence.run_tasks - Start processing task 'anybaseocr-crop -I OCR-D-N1 -O OCR-D-N2 -p '{"force": true, "colSeparator": 0.04, "maxRularArea": 0.3, "minArea": 0.05, "minRularArea": 0.01, "positionBelow": 0.75, "positionLeft": 0.4, "positionRight": 0.6, "rularRatioMax": 10.0, "rularRatioMin": 3.0, "rularWidth": 0.95, "operation_level": "page"}''
Traceback (most recent call last):
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/bin/ocrd", line 8, in <module>
    sys.exit(cli())
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/ocrd/cli/process.py", line 26, in process_cli
    run_tasks(mets, log_level, page_id, tasks, overwrite)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.7/site-packages/ocrd/task_sequence.py", line 149, in run_tasks
    raise Exception("%s exited with non-zero return value %s. STDOUT:\n%s\nSTDERR:\n%s" % (task.executable, returncode, out, err))
Exception: ocrd-anybaseocr-crop exited with non-zero return value 1. STDOUT:

STDERR:
17:46:03.008 INFO OcrdAnybaseocrCropper - INPUT FILE 0 / P_00001
Traceback (most recent call last):
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/bin/ocrd-anybaseocr-crop", line 8, in <module>
    sys.exit(cli())
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 527, in cli
    return ocrd_cli_wrap_processor(OcrdAnybaseocrCropper, *args, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/decorators/__init__.py", line 81, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
    processor.process()
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 448, in process
    feature_selector='binarized') # should also be deskewed
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/workspace.py", line 420, in image_from_page
    for feature in feature_selector.split(',') if feature) and
  File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.7/site-packages/ocrd/workspace.py", line 420, in <genexpr>
    for feature in feature_selector.split(',') if feature) and
TypeError: argument of type 'NoneType' is not iterable

Command exited with non-zero status 1
15405.11user 1581.59system 15:27.10elapsed 1832%CPU (0avgtext+0avgdata 11925248maxresident)k
7520inputs+1576outputs (93major+45448743minor)pagefaults 0swaps

@kba
Member

kba commented Oct 27, 2020

The fix to core to handle AlternativeImage without comments is only in 2.19.0, which isn't yet in ocrd_all. I'll send a PR later.
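
The failure mode in the tracebacks above can be reproduced in isolation. The following is only a sketch, not the code from ocrd/workspace.py: it assumes that the membership test in the generator expression is applied to the comments attribute of an AlternativeImage, which is None when the attribute is missing. The function names has_features_unsafe and has_features_safe are hypothetical and exist only for this illustration.

```python
def has_features_unsafe(comments, feature_selector):
    # Same shape as the generator expression in the traceback:
    # `feature in comments` raises TypeError when comments is None.
    return all(feature in comments
               for feature in feature_selector.split(',') if feature)


def has_features_safe(comments, feature_selector):
    # Hypothetical guard: a missing comments attribute means "no features".
    features = (comments or '').split(',')
    return all(feature in features
               for feature in feature_selector.split(',') if feature)


try:
    has_features_unsafe(None, 'binarized')
except TypeError as err:
    print('TypeError:', err)

print(has_features_safe(None, 'binarized'))         # False
print(has_features_safe('binarized', 'binarized'))  # True
```

A guard of this kind only avoids the crash. An image without comments would still not match the 'binarized' selector, so the upstream processor needs to write the attribute either way.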

But I do not understand why sbb_binarization still seems to produce AlternativeImage without comments - can you verify that this is the case? I.e. how do the pg:Page elements begin in a PAGE-XML in OCR-D-N1?
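
One way to answer that question without reading the XML by hand is to list every AlternativeImage together with its comments attribute. This is a minimal sketch using only the Python standard library; the helper name alternative_image_comments and the sample document are made up for illustration and are not taken from the reporter's workspace.

```python
import xml.etree.ElementTree as ET


def alternative_image_comments(page_xml):
    """Return (filename, comments) per AlternativeImage; comments is None if absent."""
    root = ET.fromstring(page_xml)
    results = []
    for elem in root.iter():
        # Strip the namespace so the sketch does not depend on the PAGE schema version.
        if elem.tag.rsplit('}', 1)[-1] == 'AlternativeImage':
            results.append((elem.get('filename'), elem.get('comments')))
    return results


# Made-up sample: an AlternativeImage without a comments attribute.
SAMPLE = """<pc:PcGts xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <pc:Page imageFilename="OCR-D-IMG/P_00001.tif">
    <pc:AlternativeImage filename="OCR-D-N1/P_00001.png"/>
  </pc:Page>
</pc:PcGts>"""

for filename, comments in alternative_image_comments(SAMPLE):
    print(filename, comments)
```

To check a real file, pass the contents of a PAGE-XML file from OCR-D-N1 instead of SAMPLE. Any entry whose second value is None would be consistent with the TypeError above.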

@jbarth-ubhd
Author

jbarth-ubhd commented Oct 27, 2020

did core> git pull https://github.com/OCR-D/core now ...

@kba
Member

kba commented Oct 27, 2020

did core> git pull https://github.com/OCR-D/core now ...

This is merely a workaround in core, though; the real issue remains why sbb_binarization does not produce comments.

@jbarth-ubhd
Author

still the same problem.

@bertsky
Contributor

bertsky commented Mar 23, 2021

@jbarth-ubhd, has this since been resolved?
