Removing root logger handlers is more frustrating than helpful #359

Open
maxwnewcomer opened this issue Jan 29, 2025 · 0 comments

Comments

@maxwnewcomer

I've been trying to set up a processing container with unstructured_ingest.v2 and have been fighting with the logger for a while. I don't work in Python too often, so reminding myself how loggers work took some ramp-up.

Since I'm trying to deploy this processing container in prod with proper observability, I've been trying to get it to log with structured output. For the longest time, the unstructured logs were not showing up properly (normal formatting, no JSON).
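For context, a structured logging setup of this kind can be sketched with the standard library alone. The `JsonFormatter` class, the `configure_json_logging` helper, and the field names are assumptions chosen to resemble the sample output later in this issue; they are not part of unstructured_ingest.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (hypothetical helper)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),  # default format: YYYY-MM-DD HH:MM:SS,mmm
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def configure_json_logging(stream=None) -> logging.Handler:
    """Attach a JSON-formatting handler to the root logger and return it."""
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.INFO)
    return handler


if __name__ == "__main__":
    configure_json_logging()
    logging.getLogger("my_app").info("container started")
```

Because the handler lives on the root logger, any library that strips root handlers at import or setup time would silently undo this configuration.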

I was extremely confused for a while until I dug into the package's config, where I found:

remove_root_handlers(logger)

which is defined as follows, with this comment:

def remove_root_handlers(logger: Logger) -> None:
    # NOTE(robinson): in some environments such as Google Colab, there is a root handler
    # that doesn't not mask secrets, meaning sensitive info such as api keys appear in logs.
    # Removing these when they exist prevents this behavior
    if logger.root.hasHandlers():
        for handler in logger.root.handlers:
            logger.root.removeHandler(handler)
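A small sketch of the effect, using only the standard library: the function body is copied from the snippet above, while `demo` and the stand-in `StreamHandler` are hypothetical and only illustrate what happens to a handler the application installed on the root logger.

```python
import logging
from logging import Logger


def remove_root_handlers(logger: Logger) -> None:
    # Same logic as the function quoted above.
    if logger.root.hasHandlers():
        for handler in logger.root.handlers:
            logger.root.removeHandler(handler)


def demo() -> tuple[bool, bool]:
    """Return whether an application handler is on the root logger before and after."""
    root = logging.getLogger()
    saved = root.handlers[:]  # remember pre-existing handlers so the demo leaves no trace
    for existing in saved:
        root.removeHandler(existing)

    app_handler = logging.StreamHandler()  # stand-in for the application's JSON handler
    root.addHandler(app_handler)
    present_before = app_handler in root.handlers

    # Any logger works here: every Logger exposes the shared root via `.root`.
    remove_root_handlers(logging.getLogger("unstructured_ingest.v2"))
    present_after = app_handler in root.handlers

    root.removeHandler(app_handler)  # no-op if it was already removed
    for existing in saved:
        root.addHandler(existing)
    return present_before, present_after


if __name__ == "__main__":
    print(demo())
```

The application's handler is gone after the call, which matches the missing JSON output described here. Separately, the loop removes items from the list it is iterating over, so when several root handlers are present some of them are skipped.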

I understand this is for secret hiding, but I believe this is the wrong way to go about it. It turns out that commenting out that line lets me log normally while keeping the secret obfuscation logic:

{"timestamp": "2025-01-29 14:24:51,415", "level": "INFO", "logger": "unstructured_ingest.v2", "message": "Created download with configs: {\"download_dir\":null}, connection configs: {\"access_config\":\"**********\"}"}
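One possible alternative to removing handlers, sketched under assumptions: redact sensitive values with a `logging.Filter` attached to the library's own logger, so that whatever handlers the application configured only ever see masked messages. `RedactSecretsFilter`, `install_redaction`, and the `api_key=` pattern are hypothetical and do not reflect how unstructured_ingest actually masks secrets.

```python
import logging
import re

# Hypothetical pattern: treats the value in any `api_key=<value>` pair as sensitive.
SECRET_PATTERN = re.compile(r"(api_key=)\S+")


class RedactSecretsFilter(logging.Filter):
    """Mask secrets in the record itself, before any handler formats it."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # apply %-style args first
        record.msg = SECRET_PATTERN.sub(r"\1**********", message)
        record.args = ()  # the message is already fully formatted
        return True  # never drop the record, only rewrite it


def install_redaction(logger_name: str) -> logging.Filter:
    """Attach the redacting filter to the named logger and return it."""
    redactor = RedactSecretsFilter()
    logging.getLogger(logger_name).addFilter(redactor)
    return redactor


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    install_redaction("demo_lib")
    logging.getLogger("demo_lib").info("connecting with api_key=%s", "abc123")
```

A filter on a logger only applies to records logged directly on that logger, not on its children, so a real implementation would need to cover each logger the library uses or attach the filter to the handlers it owns.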

I haven't tested in Google Colab, and I haven't investigated further why Google Colab would deobfuscate these values, but this was very frustrating as I was getting started with the library, and I'd love to see what could be done to make this less of a hassle for future users.
