Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicated logging messages when importing module twice #11

Closed
epogrebnyak opened this issue Aug 5, 2019 · 17 comments
Closed

Duplicated logging messages when importing module twice #11

epogrebnyak opened this issue Aug 5, 2019 · 17 comments
Labels

Comments

@epogrebnyak
Copy link
Contributor

I run example.py in IPython console in Spyder and it seems that console output gets proportional to the number of times I have run the script. On the first run in new kernel I get just single message, one per call of handout.Handout instance method.

This is what I get on the fifth run:

Iteration 0
Iteration 0
Iteration 0
Iteration 0
Iteration 0
Iteration 0
Iteration 1
Iteration 1
Iteration 1
Iteration 1
Iteration 1
Iteration 1
Iteration 2
Iteration 2
Iteration 2
Iteration 2
Iteration 2
Iteration 2
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Saved figure: output\figure-0.png
Saved figure: output\figure-0.png
Saved figure: output\figure-0.png
Saved figure: output\figure-0.png
Saved figure: output\figure-0.png
Saved figure: output\figure-0.png
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Saved figure: output\figure-1.png
Saved figure: output\figure-1.png
Saved figure: output\figure-1.png
Saved figure: output\figure-1.png
Saved figure: output\figure-1.png
Saved figure: output\figure-1.png
Saved figure: output\figure-2.png
Saved figure: output\figure-2.png
Saved figure: output\figure-2.png
Saved figure: output\figure-2.png
Saved figure: output\figure-2.png
Saved figure: output\figure-2.png
Saved figure: output\figure-3.png
Saved figure: output\figure-3.png
Saved figure: output\figure-3.png
Saved figure: output\figure-3.png
Saved figure: output\figure-3.png
Saved figure: output\figure-3.png
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html
Handout written to: output\index.html

Maybe not a bug, but it would have felt more confortable the handout.Handout instance made just a single print to the screen per call.

I run the script as runfile('D:/github/handout/example.py', wdir='D:/github/handout')

@danijar
Copy link
Owner

danijar commented Aug 5, 2019

Handout uses a separate logger instance, so you can do this:

import handout
import logging
logging.getLogger('handout').setLevel(logging.ERROR)

I'm also open to changing which messages are printed by default.

For example, I think doc.add_text() should always print so you can use it to replace print() where needed. Besides this, the path to the index.html could be printed once in the beginning to open the handout in the browser. All other messages could be hidden by default (e.g. log level info).

Do you have thoughts on this?

@danijar
Copy link
Owner

danijar commented Aug 5, 2019

Taking a second look at what you say, it does seem incorrect that the same figure is added multiple times. Have you tried just running your script as python3 script.py?

It's not entirely clear to me how Python Handout would be supposed to behave in an interactive console. It goes in a different direction than what motivated the library, i.e. to create reports in a reproducible environment. But I'd be happy to talk through how this could work and if we can extend it in a reasonable way.

@danijar danijar changed the title growing screen messages in IPython Using Handout inside IPython environment Aug 5, 2019
@epogrebnyak
Copy link
Contributor Author

epogrebnyak commented Aug 5, 2019

python example.py produces single output as inteded.

Changing the logging as suggested above did not help in IPython interactive session, each runfile() adds up to more lines of output. There is a state somewhere inside the interactive session. :)

@epogrebnyak
Copy link
Contributor Author

Besides this, the path to the index.html could be printed once in the beginning to open the handout in the browser.

Maybe a path should be printed just for show() method? It would be a clutter to see paths for add_text(). In my opinion add_text() should behave like print().

@danijar
Copy link
Owner

danijar commented Aug 5, 2019

Re amount of logging messages, I reopened #2 to track this for the next release.

Re executing from within Spyder, could you describe the setup a bit more, please? I assume runfile() imports the script and executes it inline so that its variables become available in the interactive environment? Does the generated handout look as you would expect?

There is no global state in Handout, so as long as you create a new handout.Handout() instance every time you generate a report, I'm not sure where the accumulating state comes from.

@epogrebnyak
Copy link
Contributor Author

epogrebnyak commented Aug 8, 2019

There is no global state in Handout indeed - but there is seemingly is some state in IPython logger configuration - it accumulates messages (maybe it is a desired behaviour for a REPL).

Minimal example:

import handout
doc = handout.Handout('output')
doc.add_text('Print this text')

Running once on new kernel in IPython (same output for python minimal.py):

[1]: runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
Print this text

But on next runs I get multiple logger messages, one more message per run:

[2]: runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
Reloaded modules: handout, handout.handout, handout.blocks, handout.tools
Print this text
Print this text

[3]: runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
Reloaded modules: handout, handout.handout, handout.blocks, handout.tools
Print this text
Print this text
Print this text

P.S. This is not a terrible thing, but rather a nuisance.

@epogrebnyak
Copy link
Contributor Author

Maybe I can fill a question on Stack Overflow to ask for clues.

@danijar
Copy link
Owner

danijar commented Aug 8, 2019

Thanks a lot for investigating! I'm quite surprised how the resulting handout looks like. Does it show the code lines that were executed in the REPL and in the correct order?

@epogrebnyak
Copy link
Contributor Author

epogrebnyak commented Aug 9, 2019

I should have mentioned the html output is normal (the 'handout'), the duplication of messages appears only in the IPython console.

One more line in the example:

import handout
doc = handout.Handout('output')
doc.add_text('Print this text')
doc.show() 

On the first run the console output is:

Print this text
Handout written to: output\index.html

On the second:

Print this text
Print this text
Handout written to: output\index.html
Handout written to: output\index.html

Both print() and logger messages get accumulated in the console.

@danijar
Copy link
Owner

danijar commented Aug 9, 2019

Thanks. I was actually curious how the resulting handout looks like, because there is no source file when running handout from the interactive interpreter.

Besides this, would it be fair to say that this is a problem with IPython (and we can notify them about it) or is there something we can and should do on our end?

@epogrebnyak
Copy link
Contributor Author

no source file when running handout from the interactive interpreter

There is the source when running with runfile('D:/github/handout/minimal.py', wdir='D:/github/handout') command. At least something with oriignal file content is executed.

I filled an issue with IPython, maybe someone can shed some light on it.

The issue is definitely on IPython side, but maybe we can do some tricky configuration of logger at init to prevent message duplication.

@epogrebnyak
Copy link
Contributor Author

The best of my understanding is that a logger object a session-level singleton, that exists in IPython session. Any time the script is run in the session again the getLogger('handout') refers to existing instance.

Any time there is a new call _logger.info(message) we somehow create not just a logging event, but also a steam 'handler', that gets accumulated inside _logger.handlers.

Would need a bit of more research to find out what to do about it (in principle - randomise creation of a logger name or cut out adding more handlers to the logger).

@danijar
Copy link
Owner

danijar commented Aug 9, 2019

This reminds me a bit of duplicate logging messages I've seen in TensorFlow. What do you think about getting rid of the logger and just using print() statements together with a quite=False option in the constructor?

@epogrebnyak
Copy link
Contributor Author

Let me bear with this issue for a little while - it appears the problem is not even IPython, but it is specific to Spyder. I came up to running the script in IPython session in console and the messages do not duplicate:

In [1]: %run minimal.py
Print this text

In [2]: %run minimal.py
Print this text

Most Stack Overflow answers like this one deal with a situation where there is addHandler() call in script code and it gets repeated when script is called in the same session. Also there is a solution to set logger.propagate = False, but injecting it in Handout code had no effect.

The wierd point is to detect where the handlers get added in between the calls to the script. I would like to investigate a bit, if no result, can possibly use print. Maybe wrap print in own helper:

def log(message, logger=None):
    ```Logging utility to print *message* to console.```
    if logger:
        logger.info(message)
    else:
        print(message, silent=False) 

Some people may see logging vs printing as a matter of code style, so the decision will be up to you as an author and maintainer of the project.

@epogrebnyak
Copy link
Contributor Author

epogrebnyak commented Aug 10, 2019

God, @danijar! ))) you are actually adding a new handler in __init__.py!

Maybe this code should be inside handout.py or tools.py or new logging.py - __init__.py is the last place I looked for some configuration interference code, as I searched codebase of Spyder and IPython instead. ))

https://github.com/danijar/handout/blob/master/handout/__init__.py#L1-L13

import sys
import logging

from .handout import Handout


# Set up logger.
logger = logging.getLogger('handout')
logger.setLevel(logging.INFO)
logger.propagate = False  # Global logger should not print messages again.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter('%(message)s'))
logger.addHandler(handler)

What happens in that IPython resets logger handlers after %run command, while Spyder keeps the handlers alive. The reason for this is some ambiguity in logging.shutdown() behaviour.

As a quick fix, we can do something like below for __init__py:

import sys
import logging

from .handout import Handout

# Set up logger.
logger = logging.getLogger('handout')
logger.setLevel(logging.INFO)
logger.propagate = False  # Global logger should not print messages again.
if not logger.handlers:
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter('%(message)s'))
    logger.addHandler(handler)

@epogrebnyak
Copy link
Contributor Author

epogrebnyak commented Aug 10, 2019

The fix in __init__.py really works, but I suggest making things more explicit via create_logger() function. create_logger() may be placed in tools.py, for example

import sys
import logging


def create_logger():
    logger = logging.getLogger('handout')
    logger.setLevel(logging.INFO)
    logger.propagate = False  # Global logger should not print messages again.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter('%(message)s'))
        logger.addHandler(handler)
    # The logger can be access anywhere in code by its name 
    # with logging.getLogger('handout') but we pass it explcitly
    # as function result
    return logger

later in Handout constructior:

self._logger = create_logger()

@danijar
Copy link
Owner

danijar commented Aug 10, 2019

Thanks for investigating! Yes, we shouldn't keep adding handlers :D

The reason to create it in __init__.py is to make it available once the library is imported so the user can change the logging level, format, etc. So let's include your first solution for now? If you have a better solution, just let me know.

@danijar danijar added bug and removed discussion labels Aug 10, 2019
@danijar danijar changed the title Using Handout inside IPython environment Duplicated logging messages when importing module twice Aug 10, 2019
epogrebnyak added a commit to epogrebnyak/handout that referenced this issue Aug 10, 2019
danijar added a commit that referenced this issue Aug 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants