Skip to content

Commit

Permalink
Merge pull request #1586 from Expensify/tyler-adjsut-poll-timeouts
Browse files Browse the repository at this point in the history
Shorten timeout when no sockets to poll
  • Loading branch information
bondydaa authored Oct 4, 2023
2 parents af314fd + 2502f9c commit ac0a516
Show file tree
Hide file tree
Showing 2 changed files with 38 additions and 10 deletions.
46 changes: 36 additions & 10 deletions BedrockServer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -820,26 +820,52 @@ void BedrockServer::runCommand(unique_ptr<BedrockCommand>&& _command, bool isBlo
networkLoopCount++;
fd_map fdm;
command->prePoll(fdm);
const uint64_t now = STimeNow();
uint64_t nextActivity = 0;
S_poll(fdm, max(nextActivity, now) - now);

// Timeout is five minutes unless we're shutting down or standing down, in which case it's 5 seconds.
// Note that BedrockCommad::postPoll sets the timeout to the command's timeout if it's lower than this value anyway,
// So this only has an effect if it will be shorter than the command's timeout.
uint64_t maxWaitMs = 5 * 60 * 1'000;
// Determine how long we'll wait in `poll`.
uint64_t maxWaitUs = 0;

// The default case is to wait until the command will time out.
uint64_t now = STimeNow();
if (now < command->timeout()) {
maxWaitUs = command->timeout() - now;
} else {
// The command is already timed out. This will hit the check for core.isTimedOut(command) below.
break;
}

// There are two other special cases for the time to wait.
if (fdm.empty()) {
// If there are no sockets to poll, wait 1 second.
// Why would there be no sockets? It's because Auth::Stripe, as a rate-limiting feature, attaches sockets to requests after their made.
// This means a request can sit around with no actual socket attached to it for some length of time until it's turn to talk to Stripe comes up.
// If that happens though, and we're sitting in `poll` when it becomes our turn, we will wait the full five minute timeout of the original `poll`
// call before we time out and try again wit the newly-attached socket.
// Setting this to one second lets us try again more frequently. This is probably not the ideal way to handle this, but it works for now.
maxWaitUs = 1'000'000;
}

// Also, if we're shutting down or standing down, wait 1 second. This keeps the rest of the server from being blocked on commands that won't finish.
bool shuttingDown = false;
auto _syncNodeCopy = atomic_load(&_syncNode);
if (_shutdownState.load() != RUNNING || (_syncNodeCopy && _syncNodeCopy->getState() == SQLiteNodeState::STANDINGDOWN)) {
maxWaitMs = 5'000;
maxWaitUs = 1'000'000;
shuttingDown = true;
}

// Ok, go ahead and `poll`.
S_poll(fdm, maxWaitUs);


// The 3rd parameter to `postPoll` here is the total allowed idle time on this connection. We will kill connections that do nothing at all after 5 minutes normally,
// or after only 5 seconds when we're shutting down so that we can clean up and move along.
uint64_t ignore{0};
auto start = STimeNow();
command->postPoll(fdm, nextActivity, maxWaitMs);
command->postPoll(fdm, ignore, shuttingDown ? 5'000 : 300'000);
postPollCumulativeTime += (STimeNow() - start);
}

if (networkLoopCount) {
SINFO("Completed HTTPS request in " << networkLoopCount << " loops with " << postPollCumulativeTime << " total time in postPoll");
SINFO("Completed HTTPS request in " << networkLoopCount << " loops with " << postPollCumulativeTime << "us total time in postPoll");
}

// Get a DB handle to work on. This will automatically be returned when dbScope goes out of scope.
Expand Down
2 changes: 2 additions & 0 deletions libstuff/SHTTPSManager.h
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@ class SStandaloneHTTPSManager : public STCPManager {
void prePoll(fd_map& fdm, Transaction& transaction);

// Default timeout for HTTPS requests is 5 minutes.This can be changed on any call to postPoll.
// This is a total amount of milliseconds of idle activity since the last send on a socket before killing it.
// The purpose of this is to be able to shut down when no activity is happening.
void postPoll(fd_map& fdm, Transaction& transaction, uint64_t& nextActivity, uint64_t timeoutMS = (5 * 60 * 1000));

// Close a transaction and remove it from our internal lists.
Expand Down

0 comments on commit ac0a516

Please sign in to comment.