Skip to content

Commit

Permalink
Fix possible out-of-order/inconsistent seqno-to-time mapping
Browse files Browse the repository at this point in the history
Summary: The crash test with COERCE_CONTEXT_SWITCH=1 is showing a
failure:

```
db_stress: db/seqno_to_time_mapping.cc:480: bool rocksdb::SeqnoToTimeMapping::Append(rocksdb::SequenceNumber, uint64_t): Assertion `false' failed.
```

with `DBImpl::SetOptions()` in the call stack. This assertion and those
around it are mostly there for catching systematic problems with
recording the mappings, as small imprecisions here and there are not a
problem in production. Nevertheless, we need to fix this to maintain the
assertions for catching possible future systematic problems.

Because the seqno and time are acquired before holding the DB mutex,
there could be a race where T1 acquires latest seqno, T1 acquires latest
seqno, T2 acquires unix time, T1 acquires unix time, and entries are
not just saved out-of-order, but would represent an inconsistent (time
traveling) mapping if they were saved.

We can fix this by getting the seqno and unix times while under the
mutex. (Hopefully this is not caused by non-monotonic clock adjustments.)

Test Plan: local run blackbox_crash_test with COERCE_CONTEXT_SWITCH=1.
This is not really a production concern, and the conditions are not
really reproducible in a unit test after the fix.
  • Loading branch information
pdillinger committed Jan 7, 2025
1 parent 9b1d0c0 commit 0ff2e12
Showing 1 changed file with 31 additions and 20 deletions.
51 changes: 31 additions & 20 deletions db/db_impl/db_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -6812,17 +6812,24 @@ void DBImpl::RecordSeqnoToTimeMapping(uint64_t populate_historical_seconds) {
// for SeqnoToTimeMapping. We don't know how long it has been since the last
// sequence number was written, so we at least have a one-sided bound by
// sampling in this order.
SequenceNumber seqno = GetLatestSequenceNumber();
int64_t unix_time_signed = 0;
immutable_db_options_.clock->GetCurrentTime(&unix_time_signed)
.PermitUncheckedError(); // Ignore error
uint64_t unix_time = static_cast<uint64_t>(unix_time_signed);

// ALSO, to avoid out-of-order mappings, we need to get the seqno and times
// while holding the DB mutex. (This is really to make testing happy because
// it's fine to throw out extra close-but-not-quite-consistent mappings in
// production.)
std::vector<SuperVersionContext> sv_contexts;
if (populate_historical_seconds > 0) {
bool success = true;
{
InstrumentedMutexLock l(&mutex_);
bool success = true;
SequenceNumber seqno;
uint64_t unix_time;
{
InstrumentedMutexLock l(&mutex_);

seqno = GetLatestSequenceNumber();
int64_t unix_time_signed = 0;
immutable_db_options_.clock->GetCurrentTime(&unix_time_signed)
.PermitUncheckedError(); // Ignore error
unix_time = static_cast<uint64_t>(unix_time_signed);

if (populate_historical_seconds > 0) {
if (seqno > 1 && unix_time > populate_historical_seconds) {
// seqno=0 is reserved
SequenceNumber from_seqno = 1;
Expand All @@ -6836,7 +6843,20 @@ void DBImpl::RecordSeqnoToTimeMapping(uint64_t populate_historical_seconds) {
assert(unix_time > populate_historical_seconds);
success = false;
}
} else {
// FIXME: assert(seqno > 0);
// Always successful assuming seqno never go backwards
seqno_to_time_mapping_.Append(seqno, unix_time);
InstallSeqnoToTimeMappingInSV(&sv_contexts);
}
}

// clean up & report outside db mutex
for (SuperVersionContext& sv_context : sv_contexts) {
sv_context.Clean();
}

if (populate_historical_seconds > 0) {
if (success) {
ROCKS_LOG_INFO(
immutable_db_options_.info_log,
Expand All @@ -6851,16 +6871,7 @@ void DBImpl::RecordSeqnoToTimeMapping(uint64_t populate_historical_seconds) {
seqno, unix_time - populate_historical_seconds, unix_time);
}
} else {
InstrumentedMutexLock l(&mutex_);
// FIXME: assert(seqno > 0);
// Always successful assuming seqno never go backwards
seqno_to_time_mapping_.Append(seqno, unix_time);
InstallSeqnoToTimeMappingInSV(&sv_contexts);
}

// clean up outside db mutex
for (SuperVersionContext& sv_context : sv_contexts) {
sv_context.Clean();
assert(success);
}
}

Expand Down

0 comments on commit 0ff2e12

Please sign in to comment.