-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make heavy loaded optimization configurable #114
Conversation
The inequality in maybeReplaceWithLessBusyConnection was supposed to check if the least busy connection has at least 20% less inflight requests compared to heavy loaded connection. However, this check was incorrect: if alternative == nil || alternative.AvailableStreams() * 120 > c.AvailableStreams() * 100 { return c } Since "alternative" is the least busy connection, by definition it has the largest number of available streams. Therefore, alternative.AvailableStreams() > c.AvailableStreams() is always true. This commit rewrites the condition by using a number of in use streams (inflight requests). The inequality now correctly checks is the least busy connection has at least 20% less in use streams compared to the heavy loaded connection.
Scylla Go Driver has a capability to avoid sending requests to an overloaded shard, instead sending the request on a different connection (at the same node). This change makes it possible to customize the parameters used to determine when this behavior would kick in.
This PR is also available at scylladb/gocql branch: heavy-loaded-optim and at scylladb/gocql tag: heavy-loaded-optim-v1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation-wise looks good, but I would really appreciate a test to verify that the optimization indeed does what was intended.
if alternative == nil || alternative.AvailableStreams() * 120 > c.AvailableStreams() * 100 { | ||
return c | ||
} else { | ||
if alternative != nil && alternative.InUseStreams() * 100 < c.InUseStreams() * 80 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of fact that the problem with the original equation wasn't spotted when it was reviewed, I would appreciate some unit tests in order to convince us better that it indeed is fixed now.
I think we already have some scyllaConnPicker
tests in scylla.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry but isn't AvailableStreams == NumStreams - InUseStreams
?
And if so then any equation expressed in InUseStreams
can be expressed in AvailableStreams
and the other way around.
And this brings us back to the initial motivation of this patch.
You were absolutely right that the original expression was an always-true one.
However the mistake was not in the use of AvailableStreams
vs InUseStreams
but rather in the equation itself.
It was supposed to be as follows:
alternative.AvailableStreams() * 100 > c.AvailableStreams() * 120
Or in your current version:
alternative.AvailableStreams() * 80 > c.AvailableStreams() * 100 + 20 * c.NumStreams()
Don't you think that adjusting the original equation would make this patch much simpler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry but isn't AvailableStreams == NumStreams - InUseStreams?
And if so then any equation expressed in InUseStreams can be expressed in AvailableStreams and the other way around.
Yes.
The reason I changed it to use InUseStreams
instead of AvailableStreams
is that it was easier for me to understand it in terms of InUseStreams
(aka inflight requests). It also makes it easier to document and explain the configurable parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
@@ -267,6 +285,8 @@ func NewCluster(hosts ...string) *ClusterConfig { | |||
ConnectTimeout: 600 * time.Millisecond, | |||
Port: 9042, | |||
NumConns: 2, | |||
HeavyLoadedConnectionThreshold: 512, | |||
HeavyLoadedSwitchConnectionPercentage: 20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please run gofmt on this file. It will be annoying for others contributors if gofmt keeps formatting code that they didn't modify.
// least busy connection has less than 80 inflight requests. | ||
// | ||
// Default: 20% | ||
HeavyLoadedSwitchConnectionPercentage int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please validate that this parameter is within valid range when the session is being created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Some comments below.
@@ -145,3 +145,7 @@ func (s *IDGenerator) Clear(stream int) (inuse bool) { | |||
func (s *IDGenerator) Available() int { | |||
return s.NumStreams - int(atomic.LoadInt32(&s.inuseStreams)) - 1 | |||
} | |||
|
|||
func (s *IDGenerator) InUse() int { | |||
return int(atomic.LoadInt32(&s.inuseStreams)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question from a C-guy: why did you have to create all these complications with casting like this one (int
would be a int64
on 64-bit platform according to https://www.educative.io/answers/what-is-type-int-in-golang)?
I assume that the reason you need to do all these transformations is because NumStreams
is int
but for whatever unclear to me reason inuseStreams
is int32
. And this makes no sense to me at all because inuseStreams
can get a value stored in NumStreams
, and this implies that they should have the same type.
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is due to golang atomic
package that does not have API for int
.
if alternative == nil || alternative.AvailableStreams() * 120 > c.AvailableStreams() * 100 { | ||
return c | ||
} else { | ||
if alternative != nil && alternative.InUseStreams() * 100 < c.InUseStreams() * 80 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry but isn't AvailableStreams == NumStreams - InUseStreams
?
And if so then any equation expressed in InUseStreams
can be expressed in AvailableStreams
and the other way around.
And this brings us back to the initial motivation of this patch.
You were absolutely right that the original expression was an always-true one.
However the mistake was not in the use of AvailableStreams
vs InUseStreams
but rather in the equation itself.
It was supposed to be as follows:
alternative.AvailableStreams() * 100 > c.AvailableStreams() * 120
Or in your current version:
alternative.AvailableStreams() * 80 > c.AvailableStreams() * 100 + 20 * c.NumStreams()
Don't you think that adjusting the original equation would make this patch much simpler?
Any updates? |
@vladzcloudius do you think this one is still required? |
Of course it's useful. |
@avelanarius, let us know if you be able to finish PR, otherwise someone else should pick it up |
it looks like this one is stuck, let's move it to #211 |
Scylla Go Driver has a capability to avoid sending requests to an overloaded shard, instead sending the request on a different connection (at the same node).
This PR makes it possible to customize the parameters used to determine when this behavior would kick in.
Additionally, this PR fixes an incorrect inequality when comparing the number of streams.