Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reset successes when half opening #192

Merged
merged 1 commit into from
Sep 18, 2018
Merged

reset successes when half opening #192

merged 1 commit into from
Sep 18, 2018

Conversation

jpittis
Copy link
Contributor

@jpittis jpittis commented Sep 13, 2018

Seems like in #188 the calls to log_state_transition(:half_open) and @successes.reset were removed.

@sirupsen @pushrax

@sirupsen
Copy link
Contributor

These lines didn't change in that PR as far as I can see? What tests are failing? I am not sure what this is doing.

@jpittis
Copy link
Contributor Author

jpittis commented Sep 13, 2018

If you look at the current Semian master, there is no code path that resets @successes to zero.

(I'm not sure if this is the right fix / why the tests far failing. Either way, I can't see the code path in the current master that resets the success count.)

Copy link
Contributor

@pushrax pushrax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sirupsen the bug is subtle and we all missed it in #189. There was previously a call to self.half_open in CircuitBreaker, that was not delegated to @state. #189 effectively changed this call to @state.half_open!, skipping the state transition log and call to @successes.reset. The implicit delegation adds a lot of overhead to understanding what's going on here.

Change makes sense to me but there should be a regression test.

@@ -87,7 +87,7 @@ def open

def half_open
log_state_transition(:half_open)
@state.half_open
half_open!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this delegator should just be removed, and State#half_open! can then be renamed back to State#half_open.

Copy link
Contributor

@sirupsen sirupsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for the initial terse reply, I didn't quite understand the PR and opened it at a time where I didn't have the time to dig in further! I just spent some more time with it this morning. Wonderful find. How did you get here? 👍

At first, I was terrified of the consequences this might have for production. However, on further thought and testing, it seems that the bug here boils down to success_threshold essentially always being the equivalent of 1. I was afraid this would be disastrous, rendering circuits useless, however, then the tests wouldn't have passed.

With that in mind, it makes sense to me no existing test failed since we don't test the success_threshold > 1 case properly between multiple open/close cycles. I wholeheartedly agree with Justin's observation that we need to clean up the delegate. I've done this in a squash of your commit by distinguishing between (1) Am I in the half_open? state, and (2) Am I about to transition to the half_open? state. The former being a state, the latter being a decision the circuit makes. I think this source of confusion is what caused me to introduce the bug in the previous PR. Furthermore, I think that {close,half_open,open} is a tid-bit terse for transitioning between states, so I renamed these methods as well.

@jpittis hope you don't mind, but since this is a fallout from my PR I considered your PR a glorified issue of "hey I found this bug and I think this is a solution" and fixed it here and added a regression test. This test will fail properly with the previous behaviour. There was a subtle bug in the first proposed fix here, since it'd transition to the half_open state every time the condition for half_open? was met. I updated it to only transition once!

Please review @jpittis @pushrax @max611 , then let's get this out this week!

@jpittis
Copy link
Contributor Author

jpittis commented Sep 17, 2018

We found this bug because we were re-implementing a Semian circuit breaker in Go for the semian sinulator.

As you say, I meant this PR as an issue + diff.

I didn’t raise a fuss because I came to the same conclusion as you that the bug is relatively harmless.

Copy link
Member

@max611 max611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I love the use of transition_to, makes it easier to read 👀

Copy link
Contributor Author

@jpittis jpittis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. Looks like we're having issues with Semian test flakyness.

@@ -37,33 +37,33 @@ def acquire(resource = nil, &block)
result
end

def half_open?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change reverts #189, is that what you wanted?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's fine, I'll use the other helper—the other PR never shipped.

@sirupsen
Copy link
Contributor

@max611 would you be able to look into the flaky tests here + ship this to core + bump version + bump Shopify? Might be good to do in parallel when you're waiting for reviews for LRU?

@max611
Copy link
Member

max611 commented Sep 17, 2018

@sirupsen Yep will look into it! 👍

Copy link
Contributor

@pushrax pushrax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nitpicks otherwise LGTM

log_state_transition(:half_open)
@state.half_open
@state.half_open!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a slight inconsistency here now — half_open! gets a bang while open and close do not.

def test_open_close_open_cycle
resource = Semian.register(:open_close, tickets: 1, exceptions: [SomeError], error_threshold: 2, error_timeout: 5, success_threshold: 2)

half_open_future_travel = -> { Time.now + resource.circuit_breaker.error_timeout + 1 }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timecop.travel already computes the absolute time from the delta in this fancy method.

log_state_transition(:open)
@state.open
end

def half_open
def transition_to_half_open
log_state_transition(:half_open)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this line is tested.

@max611 max611 merged commit 1d6f931 into master Sep 18, 2018
@max611 max611 deleted the no-success-incr branch September 18, 2018 18:34
@pushrax
Copy link
Contributor

pushrax commented Sep 18, 2018

There were some comments left unaddressed.

@max611
Copy link
Member

max611 commented Sep 18, 2018

@pushrax my bad, missed those 🤦‍♂️ I'll create a new PR to address the comments

max611 added a commit that referenced this pull request Sep 19, 2018
@max611 max611 temporarily deployed to rubygems October 19, 2018 15:07 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants