-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Time out SUBSCRIBE query instead of forking. #1783
Conversation
sqlitecluster/SQLiteNode.cpp
Outdated
|
||
// We set this for all commits because this only gets all commits in response to SUBSCRIBE, which is done synchronously, and blocks the commit thread. | ||
// For asynchronous queries, there's nothing being blocked, so it doesn't much matter how long these take. | ||
// This is really not the correct encapsulation for this, but we can improve that later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pet peeve, but this is a TODO in a comment. Can we make a GH issue for it instead?
// This is really not the correct encapsulation for this, but we can improve that later. |
sqlitecluster/SQLiteNode.cpp
Outdated
// We set this for all commits because this only gets all commits in response to SUBSCRIBE, which is done synchronously, and blocks the commit thread. | ||
// For asynchronous queries, there's nothing being blocked, so it doesn't much matter how long these take. | ||
// This is really not the correct encapsulation for this, but we can improve that later. | ||
timeoutLimitUS = 10'000'000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the cluster gives up on the leader node after 30s, should we make this higher and closer to 30s, say 25?
If there's a reason for 10 and it's not an arbitrary number, let's add a comment
sqlitecluster/SQLite.cpp
Outdated
_timeoutLimit = STimeNow() + timeoutLimitUS; | ||
} | ||
int queryResult = SQuery(_db, "getting commits", query, result); | ||
_timeoutLimit = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be calling clearTimeout
here?
Updated testing on the latest change showing the same behavior as the original testing:
|
Details
See issue, it's pretty detailed.
Fixed Issues
Fixes https://github.com/Expensify/Expensify/issues/406697
Tests
Testing on commit e146093 (before rolling back the test hacks):
We successfully time out the query:
This kills the peer connection:
Follower reconnects, switches to searching:
Follower runs through the state machine, succeeds on second try:
Auth built and tested against this as well.
Internal Testing Reminder: when changing bedrock, please compile auth against your new changes