Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nsqd: reset topic/channel health on successful backend write #671

Merged
merged 2 commits into from
Oct 13, 2015

Conversation

judwhite
Copy link
Contributor

  • channel and topic put reset ctx.nsqd.SetHealth
  • change NSQD SetHealth/GetError to use atomic.Value
  • nsqd_test.go: change exp to nexp in nequal output
  • relates to nsqd: retry health check #594

  • Fix data race - may have to do with setting channel.backend after messagePump has started.
  • Root cause timing between clietnMsgChan and memoryMsgChan
  • Update SetHealth to use atomic instead of mutex
  • Call SetHealth for every topic and channel call to put which uses the backend queue

@judwhite judwhite changed the title nsqd topic/channel: reset health on successful backend write [WIP] nsqd topic/channel: reset health on successful backend write Oct 13, 2015
@@ -325,6 +325,8 @@ func (c *Channel) put(m *Message) error {
c.name, err)
c.ctx.nsqd.SetHealth(err)
return err
} else if !c.ctx.nsqd.IsHealthy() {
c.ctx.nsqd.SetHealth(nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the simplest approach is to always c.ctx.nsqd.SetHealth(err) right after line 322 (before the err check). As you mentioned in the description, you didn't do this so that we didn't add a mutex to this code path.

I propose that we instead change the way health is stored so that it is just an error (and we set it via atomic.Value). This eliminates the mutex and would allow us to do make the above change.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll add this.

@judwhite judwhite force-pushed the feature/nsqd-nok branch 2 times, most recently from 247ee8e to 0c0608a Compare October 13, 2015 17:12
@@ -571,7 +573,7 @@ func (c *Channel) messagePump() {
atomic.StoreInt32(&c.bufferedCount, 1)
c.clientMsgChan <- msg
atomic.StoreInt32(&c.bufferedCount, 0)
// the client will call back to mark as in-flight w/ it's info
// the client will call back to mark as in-flight w/ its info
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hah, thank you, I noticed this while perusing

- channel and topic put reset ctx.nsqd.SetHealth
- change nsqd SetHealth/GetError to use atomic.Value; skip allocation in
  SetHealth if attempting to set an already healthy queue to healthy
- nsqd_test.go: change `exp` to `nexp` in `nequal` output
- relates to nsqio#594
@judwhite
Copy link
Contributor Author

@mreiferson RFR

@judwhite judwhite changed the title [WIP] nsqd topic/channel: reset health on successful backend write nsqd topic/channel: reset health on successful backend write Oct 13, 2015
if err != nil {
n.setFlag(flagHealthy, false)
n.errValue.Store(errStore{err: err})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it not possible to store the error directly, why the indirection?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately you can't call atomic.Value Store() with nil (doc). Even if we stored just the last value, and relied on flagHealthy, they need to be of the same concrete type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks.

@mreiferson
Copy link
Member

nice work LGTM

mreiferson added a commit that referenced this pull request Oct 13, 2015
nsqd topic/channel: reset health on successful backend write
@mreiferson mreiferson merged commit c02e25f into nsqio:master Oct 13, 2015
@judwhite judwhite deleted the feature/nsqd-nok branch October 15, 2015 15:50
@mreiferson mreiferson changed the title nsqd topic/channel: reset health on successful backend write nsqd: reset topic/channel health on successful backend write Jun 14, 2020
@mreiferson mreiferson added the bug label Jun 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants