
ocrd_cli_wrap_processor: always do initLogging #1296

Open
wants to merge 1 commit into master

Conversation

@bertsky (Collaborator) commented Nov 20, 2024

fixes #1223 and replaces #1295

@bertsky bertsky requested a review from kba November 20, 2024 19:10
@bertsky (Collaborator, Author) commented Nov 20, 2024

(Of course, as already commented in #1223, it may still be necessary to be careful with imports on the side of the processor implementations. For example, moving import tensorflow to setup.)

Alternatively, we could still do initLogging on the module level, though not in ocrd_utils.logging but rather ocrd.decorators, which arguably will only be imported by applications that do need log handlers.

@kba (Member) left a comment:

> (Of course, as already commented in #1223, it may still be necessary to be careful with imports on the side of the processor implementations. For example, moving import tensorflow to setup.)

I am wondering how future-proof these mechanisms are. There have been various workarounds to keep tensorflow and other libraries from logging, such as our tf_disable_interactive_logs and copy-and-pasted stanzas like this one in eynollah:

import os
import sys
import warnings

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
stderr = sys.stderr
sys.stderr = open(os.devnull, "w")
import tensorflow as tf
from tensorflow.python.keras import backend as K
from tensorflow.keras.models import load_model
sys.stderr = stderr
tf.get_logger().setLevel("ERROR")
warnings.filterwarnings("ignore")

And apart from being really intrusive and inconsistent, these workarounds could break at any point.

> Alternatively, we could still do initLogging on the module level, though not in ocrd_utils.logging but rather ocrd.decorators, which arguably will only be imported by applications that do need log handlers.

Module-level in ocrd.decorators.__init__.py makes sense, and a cursory glance over how eynollah and ocrd_kraken are structured makes it seem this would work. For ocrd_calamari, there is a top-level __init__.py which does from .recognize import CalamariRecognize, i.e. tensorflow is imported before ocrd.decorators, but it is guarded by tf_disable_interactive_logs.

So, as much as I dislike module-level function calls in general, in the interest of fewer surprises in the future, I am in favor of putting initLogging in ocrd.decorators.__init__.

@bertsky (Collaborator, Author) commented Nov 21, 2024

> There have been different workarounds to prevent tensorflow and others from logging, like our tf_disable_interactive_logs

Right, but that only covers the rogue print and write statements scattered across Keras and TF (so interactive_logs is actually a misnomer within Keras itself).

> and various copy-and-pasted stanzas like in eynollah:
>
> os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

That should not be necessary if we run our initLogging before TF can do theirs, since we already set the level for tensorflow to ERROR (i.e. 3).
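The reason preconfiguring works on the Python side can be shown with the stdlib alone: logging.getLogger returns one shared Logger object per name, so a level set before the library first obtains "its" logger is already in effect when it does. (This is a sketch of the Python-logging half only; TF_CPP_MIN_LOG_LEVEL additionally controls the native C++ logging and must be set in the environment before import.)

```python
import logging

# Configure the "tensorflow" logger before the library itself touches it.
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Simulate the library obtaining its logger later (as tf.get_logger() would):
tf_logger = logging.getLogger('tensorflow')

# Same singleton object, so the level set earlier is already applied.
assert tf_logger.level == logging.ERROR
```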

> stderr = sys.stderr
> sys.stderr = open(os.devnull, "w")

Wow, that is rather extreme; it speaks to the frustration we have had with getting logging set up correctly in the Python ecosystem.

> import tensorflow as tf
> from tensorflow.python.keras import backend as K
> from tensorflow.keras.models import load_model
> sys.stderr = stderr
> tf.get_logger().setLevel("ERROR")

This should also not be necessary anymore.

> warnings.filterwarnings("ignore")

That is something we could indeed add into ocrd_cli_wrap_processor in the non-processing contexts.
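A sketch of what that could look like, assuming a flag that marks non-processing invocations such as dump-json (the wrapper name and flag below are hypothetical, not the actual ocrd.decorators API):

```python
import warnings

# Hypothetical sketch: in non-processing contexts such as dump-json, stray
# warnings on the output streams would garble the machine-readable output
# (the problem reported in #1223), so suppress them up front.
def wrap_processor_sketch(dump_json=False):
    if dump_json:
        warnings.filterwarnings("ignore")   # keep output clean for JSON consumers
        return '{"executable": "ocrd-example"}'  # placeholder for the real dump
    return None
```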


Successfully merging this pull request may close these issues.

warnings in processor output garble dump-json
2 participants