-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reset successes when half opening #192
Conversation
These lines didn't change in that PR as far as I can see? What tests are failing? I am not sure what this is doing. |
If you look at the current Semian master, there is no code path that resets (I'm not sure if this is the right fix / why the tests far failing. Either way, I can't see the code path in the current master that resets the success count.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sirupsen the bug is subtle and we all missed it in #189. There was previously a call to self.half_open
in CircuitBreaker, that was not delegated to @state
. #189 effectively changed this call to @state.half_open!
, skipping the state transition log and call to @successes.reset
. The implicit delegation adds a lot of overhead to understanding what's going on here.
Change makes sense to me but there should be a regression test.
lib/semian/circuit_breaker.rb
Outdated
@@ -87,7 +87,7 @@ def open | |||
|
|||
def half_open | |||
log_state_transition(:half_open) | |||
@state.half_open | |||
half_open! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this delegator should just be removed, and State#half_open!
can then be renamed back to State#half_open
.
df7af9f
to
5d6ac7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize for the initial terse reply, I didn't quite understand the PR and opened it at a time where I didn't have the time to dig in further! I just spent some more time with it this morning. Wonderful find. How did you get here? 👍
At first, I was terrified of the consequences this might have for production. However, on further thought and testing, it seems that the bug here boils down to success_threshold
essentially always being the equivalent of 1
. I was afraid this would be disastrous, rendering circuits useless, however, then the tests wouldn't have passed.
With that in mind, it makes sense to me no existing test failed since we don't test the success_threshold > 1
case properly between multiple open/close cycles. I wholeheartedly agree with Justin's observation that we need to clean up the delegate. I've done this in a squash of your commit by distinguishing between (1) Am I in the half_open?
state, and (2) Am I about to transition to the half_open?
state. The former being a state, the latter being a decision the circuit makes. I think this source of confusion is what caused me to introduce the bug in the previous PR. Furthermore, I think that {close,half_open,open}
is a tid-bit terse for transitioning between states, so I renamed these methods as well.
@jpittis hope you don't mind, but since this is a fallout from my PR I considered your PR a glorified issue of "hey I found this bug and I think this is a solution" and fixed it here and added a regression test. This test will fail properly with the previous behaviour. There was a subtle bug in the first proposed fix here, since it'd transition to the half_open
state every time the condition for half_open?
was met. I updated it to only transition once!
Please review @jpittis @pushrax @max611 , then let's get this out this week!
We found this bug because we were re-implementing a Semian circuit breaker in Go for the semian sinulator. As you say, I meant this PR as an issue + diff. I didn’t raise a fuss because I came to the same conclusion as you that the bug is relatively harmless. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I love the use of transition_to
, makes it easier to read 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM. Looks like we're having issues with Semian test flakyness.
@@ -37,33 +37,33 @@ def acquire(resource = nil, &block) | |||
result | |||
end | |||
|
|||
def half_open? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change reverts #189, is that what you wanted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's fine, I'll use the other helper—the other PR never shipped.
@max611 would you be able to look into the flaky tests here + ship this to core + bump version + bump Shopify? Might be good to do in parallel when you're waiting for reviews for LRU? |
@sirupsen Yep will look into it! 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nitpicks otherwise LGTM
log_state_transition(:half_open) | ||
@state.half_open | ||
@state.half_open! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a slight inconsistency here now — half_open!
gets a bang while open
and close
do not.
def test_open_close_open_cycle | ||
resource = Semian.register(:open_close, tickets: 1, exceptions: [SomeError], error_threshold: 2, error_timeout: 5, success_threshold: 2) | ||
|
||
half_open_future_travel = -> { Time.now + resource.circuit_breaker.error_timeout + 1 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timecop.travel
already computes the absolute time from the delta in this fancy method.
log_state_transition(:open) | ||
@state.open | ||
end | ||
|
||
def half_open | ||
def transition_to_half_open | ||
log_state_transition(:half_open) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this line is tested.
5d6ac7e
to
5ba188a
Compare
There were some comments left unaddressed. |
@pushrax my bad, missed those 🤦♂️ I'll create a new PR to address the comments |
Seems like in #188 the calls to
log_state_transition(:half_open)
and@successes.reset
were removed.@sirupsen @pushrax