Ensure that the millRun goroutine terminates when Close is called. #100
base: v2.0
Conversation
Currently the millRun goroutine leaks. This is very noticeable if a Logger is constructed periodically, used, and then closed. This change ensures that the millCh channel is closed if it exists.
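For context, here is a minimal, self-contained sketch of the mechanism (not the lumberjack code itself): a goroutine ranging over a channel only returns once that channel is closed, so a Logger that never closes millCh strands its millRun goroutine.

```go
package main

import (
	"fmt"
	"sync"
)

// mill stands in for lumberjack's millRun: it blocks ranging over ch and
// only returns once ch is closed.
func mill(ch <-chan struct{}, wg *sync.WaitGroup) {
	defer wg.Done()
	for range ch {
		// post-rotation compression/removal would happen here
	}
}

func main() {
	ch := make(chan struct{}, 1)
	var wg sync.WaitGroup
	wg.Add(1)
	go mill(ch, &wg)

	ch <- struct{}{} // trigger one milling pass

	// Without this close, mill never returns and its goroutine leaks.
	close(ch)
	wg.Wait()
	fmt.Println("mill goroutine terminated")
}
```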
I notice that the travis-ci build has failures. Interestingly, I had noticed this could happen while reading the tests: there is no synchronisation between the goroutine running `millRunOnce` and the tests, so a test triggers the mill but doesn't wait for the milling to complete before it checks the files. Do you have any preference on how you'd like to see the synchronisation added?
No preference, do whatever you feel is best and most straightforward. I'm not too picky. I really need to make a v3 of this thing and clean some of this stuff up.
Wow, adding synchronisation and determinism has proved to be way more complicated than I thought.
FYI, here is the comment I wrote while evaluating the race in
While fixing the tests I noticed that the original patch closed the millRun goroutine in the wrong place. I hadn't realised that the `close` method is also called internally, as part of the `rotate` method. The closing of the mill signalling channel is now done in the `Close` method.

There were a bunch of race errors detected, mostly around updating the time and the `fakeFS`; synchronisation has been added for these. All of the `time.After` calls have been removed from the tests, and the execution time has gone from 700ms to 7ms on my machine.

Two different notify channels were added to the `Logger` internals. These are only ever set in the tests, and no notification is done for any normal `Logger` use.

In order to avoid spurious errors, the `Close` method needs to wait for the `millRun` goroutine to complete; otherwise the `millRunOnce` method could still return errors (I temporarily added a panic to that method while testing). I use a wait group to wait for the goroutine to finish. However, because a `sync.WaitGroup` can't be reused, the `Logger` holds a pointer to a `sync.WaitGroup`.

This patch does introduce a change in behaviour, as evidenced by the deleted test. Effectively I was left with two choices: allow the compression of existing old log files as part of writing to a new logger (which wouldn't have rotated the files yet), or accept a race in `TestCleanupExistingBackups`, which caused intermittent test failures. I decided on determinism, as the likelihood of having old uncompressed files around that need compressing is small.
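A rough sketch of the shutdown path described above (the field names `millCh` and `wg` mirror the diff; everything else is simplified and is not the actual patch):

```go
package lumberjack

import (
	"os"
	"sync"
)

// Only the fields relevant to shutdown are shown; the real Logger has
// considerably more state than this.
type Logger struct {
	mu     sync.Mutex
	file   *os.File
	millCh chan struct{}
	wg     *sync.WaitGroup // pointer so a fresh group can be installed if milling restarts
}

// Close closes the current log file and waits for the millRun goroutine
// to exit before returning.
func (l *Logger) Close() error {
	l.mu.Lock()
	defer l.mu.Unlock()

	err := l.close()

	if l.millCh != nil {
		// Closing millCh ends the range loop in millRun; waiting on the
		// group guarantees nothing is still milling after Close returns.
		close(l.millCh)
		l.millCh = nil
		l.wg.Wait()
	}
	return err
}

// close closes the underlying file if it is open.
func (l *Logger) close() error {
	if l.file == nil {
		return nil
	}
	err := l.file.Close()
	l.file = nil
	return err
}
```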
756caf0 to 6dba581
OK, after discussion with @4a6f656c, I realised that we really do need the rotate-on-resume functionality. To make the test deterministic, I changed it to not write enough to trigger a rotate. This way we always get just one notification.
Generally looks good - some minor issues and cleanup suggestions inline.
@@ -165,7 +175,16 @@ func (l *Logger) Write(p []byte) (n int, err error) {
func (l *Logger) Close() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.close()
	if err := l.close(); err != nil {
		return err
This should not return here - if openExistingOrNew gets called, it will call l.mill and l.millCh will become non-nil. If the l.close call then fails (for example, Close fails on the underlying file descriptor), the millCh will not be closed.
Either the close error can be preserved and returned later, or you may be able to close the millCh first.
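A sketch of the second option, closing millCh before touching the file so that a failed file close cannot leave it open; illustrative only, not the code in this PR:

```go
func (l *Logger) Close() error {
	l.mu.Lock()
	defer l.mu.Unlock()

	// Shut the mill goroutine down first: even if closing the file
	// below fails, millCh has already been closed and millRun exits.
	if l.millCh != nil {
		close(l.millCh)
		l.millCh = nil
		l.wg.Wait()
	}
	return l.close()
}
```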
	case <-notify:
		// All good.
	case <-time.After(2 * time.Second):
		fmt.Println("logger notify not signalled")
Logging test failures to stdout is less than ideal - use t.Fatal instead.
@@ -814,3 +815,13 @@ func exists(path string, t testing.TB) {
	_, err := os.Stat(path)
	assertUp(err == nil, t, 1, "expected file to exist, but got error from os.Stat: %v", err)
}

func waitForNotify(notify <-chan struct{}, t testing.TB) {
	select {
This should call t.Helper so that the calls to t are more useful.
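Putting the two suggestions together (t.Helper plus t.Fatal instead of fmt.Println), the helper might end up looking roughly like this:

```go
package lumberjack

import (
	"testing"
	"time"
)

// waitForNotify blocks until notify is signalled, failing the test after
// a timeout rather than printing to stdout.
func waitForNotify(notify <-chan struct{}, t testing.TB) {
	t.Helper() // report failures at the caller's line, not in this helper
	select {
	case <-notify:
		// All good.
	case <-time.After(2 * time.Second):
		t.Fatal("logger notify not signalled")
	}
}
```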
	// It is safe to check the millCh here as we are inside the mutex lock.
	if l.millCh == nil {
		l.millCh = make(chan struct{}, 1)
		l.wg = &sync.WaitGroup{}
Nit: a sync.WaitGroup seems like overkill here - there's only ever one waiter. I'd probably just use a millShutdownCh channel instead.
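A sketch of the channel-based alternative the comment suggests; millShutdownCh and the surrounding wiring are illustrative and not part of the patch:

```go
package lumberjack

import "sync"

// Only the fields involved in mill shutdown are shown.
type Logger struct {
	mu             sync.Mutex
	millCh         chan struct{}
	millShutdownCh chan struct{} // closed by millRun when it exits
}

// mill signals the mill goroutine, starting it on first use.
// The caller must hold l.mu, so checking millCh here is safe.
func (l *Logger) mill() {
	if l.millCh == nil {
		l.millCh = make(chan struct{}, 1)
		l.millShutdownCh = make(chan struct{})
		go l.millRun()
	}
	select {
	case l.millCh <- struct{}{}:
	default:
	}
}

// millRun drains millCh until it is closed, then wakes the single waiter
// in Close by closing millShutdownCh.
func (l *Logger) millRun() {
	defer close(l.millShutdownCh)
	for range l.millCh {
		// millRunOnce: compress and remove old log files.
	}
}

// In Close, while holding l.mu:
//
//	if l.millCh != nil {
//		close(l.millCh)
//		<-l.millShutdownCh
//		l.millCh = nil
//	}
```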
	return err
}

// millRun runs in a goroutine to manage post-rotation compression and removal
// of old log files.
func (l *Logger) millRun() {
	for _ = range l.millCh {
func (l *Logger) millRun(ch <-chan struct{}) {
Unless I'm missing something, there should be no need to pass the channel as an argument - we should still be able to use l.millCh directly.
	Filename:   filename,
	MaxBackups: 1,
	MaxSize:    100, // megabytes
	notifyCompressed: notify,
This could be initialised inline, and then the test could call waitForNotify with l.notifyCompressed (which seems more readable/self-documenting).
	// we need to wait a little bit since the files get compressed on a different
	// goroutine.
	<-time.After(10 * time.Millisecond)
	waitForNotify(notify, t)
i.e. waitForNotify(l.notifyCompressed, t)
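For example, the test setup could look roughly like this (a hypothetical fragment; `filename` and the writes are placeholders):

```go
l := &Logger{
	Filename:         filename,
	MaxBackups:       1,
	MaxSize:          100, // megabytes
	notifyCompressed: make(chan struct{}, 1),
}
defer l.Close()

// ...writes that force a rotation, and hence a compression pass...

waitForNotify(l.notifyCompressed, t)
```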
	Filename:   filename,
	MaxSize:    10,
	MaxBackups: 1,
	notifyRemoved: notify,
Same as earlier - consider initialising inline and calling waitForNotify(l.notifyRemoved, t) below.
@howbazaar sorry to bother you, but when can we expect this to merge?
One more for @natefinch
Any progress on this?
This is now superseded by #211
Currently the millRun goroutine leaks. This is very noticeable if
a Logger is constructed periodically, used and then closed.
This change ensures that the millCh channel is closed if it exists.
Existing log rotation tests cover the duplicate Close calls.