Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert live entries to init entries in lowest level of bucketlist #4492

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ThomasBrady
Copy link
Contributor

@ThomasBrady ThomasBrady commented Oct 1, 2024

Description

Resolves #193

Changes BucketOutputIterator::put to convert LIVEENTRY to INITENTRY if mKeepDeadEntries == true.

Perhaps this variable should be renamed to mIsLowestLevel or something now that it affects dead entry persistence and init entry to live entry conversion.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@ThomasBrady ThomasBrady requested review from SirTyson and removed request for SirTyson October 1, 2024 20:56
@ThomasBrady ThomasBrady changed the title Convert live entries to init entries in lowest level of bucketlist WIP: Convert live entries to init entries in lowest level of bucketlist Oct 1, 2024
@ThomasBrady ThomasBrady changed the title WIP: Convert live entries to init entries in lowest level of bucketlist Convert live entries to init entries in lowest level of bucketlist Oct 2, 2024
@ThomasBrady
Copy link
Contributor Author

@SirTyson can you PTAL?

Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Just one logic cleanup and I think a higher level test would be more helpful here vs. a low level one.

@@ -67,17 +67,36 @@ BucketOutputIterator::put(BucketEntry const& e)
++mMergeCounters.mOutputIteratorTombstoneElisions;
return;
}
std::optional<BucketEntry> maybeInitEntry;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to do this check here. If you move this to the bottom right before *mBuf = ..., you could get rid of the optional the complicated referencing later on and just say

if (shouldRewriteAsInit)
{
auto copy = e;
copy = init;
*mBuf = init;
}
else 
{
*mBuf = e;
}

@@ -399,6 +399,37 @@ TEST_CASE("merges proceed old-style despite newer shadows",
}
}

TEST_CASE("lowest level merge converts live to init", "[bucket][bucketinit]")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test would be better as a BucketList level test instead of a Bucket level test. While I'm confident your code works today, there's an assumption here that keepDeadEntries = false is always the case and only the case for the bottom level bucket. While this is true today, it's an unstated assumption that could change in the future (as in will change with my state archival stuff), so a higher level test would test that:

  1. Your code works (helpful for this PR)
  2. This assumption in the BucketList logic is upheld (useful for future PRs)

src/bucket/Bucket.h Outdated Show resolved Hide resolved
@@ -157,9 +156,31 @@ BucketOutputIterator<BucketT>::put(typename BucketT::EntryT const& e)
mBuf = std::make_unique<typename BucketT::EntryT>();
}

if constexpr (std::is_same_v<BucketT, LiveBucket>)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SirTyson is this correct? Do we want to do this conversion on other BucketT types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, the constexpr check is correct, since this only applies to LiveBucket types. We can still simplify this though. Try

// Comment about rewrite here
bool shouldRewrite = std::is_same_v<BucketT, LiveBucket> && mRewriteLiveToInitEntries && e.type() == LIVEENTRY &&
            protocolVersionStartsFrom(
                mMeta.ledgerVersion,
                BucketT::FIRST_PROTOCOL_SUPPORTING_INITENTRY_AND_METAENTRY);
if (shouldRewrite)
{
            ++mMergeCounters.mOutputIteratorLiveToInitRewrites;
            auto eCopy = e;
            eCopy.type(INITENTRY);
            *mBuf = eCopy;
}
else
{
            *mBuf = e;
}

++mMergeCounters.mOutputIteratorBufferUpdates;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need the block to be conditional on constexpr (std::is_same_v<BucketT, LiveBucket>) in order to compile, as other BucketT::EntryT have different types, no?

For example, when BucketT is a HotArchiveBucket, e.type() == LIVEENTRY is a comparison of different enumeration types (BucketEntryType to HotArchiveBucketEntryType), and eCopy.type(INITENTRY) doesn't compile as there's no HotArchiveBucketEntry::type(BucketEntryType t).

Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I was a little unclear in my last comment. A little cleanup on the interface, and I think the test case still needs improving.

@@ -306,7 +306,8 @@ BucketBase::merge(BucketManager& bucketManager, uint32_t maxProtocolVersion,
std::shared_ptr<BucketT> const& newBucket,
std::vector<std::shared_ptr<BucketT>> const& shadows,
bool keepTombstoneEntries, bool countMergeEvents,
asio::io_context& ctx, bool doFsync)
asio::io_context& ctx, bool doFsync,
bool rewriteLiveToInitEntries)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't think I was clear in my original comment. I don't think we need another bool for rewriteLiveToInitEntries. This leads to footguns, as it should always be the case (currently) that when keepTombstoneEntries == true, rewriteLiveToInitEntries == true. When I was talking about the implicit assumption, I didn't mean to add a flag, as this adds more complexity for future maintainers. Rather, we should keep the interface as is, but add a test that will trip. Sorry about that.

@@ -65,6 +65,9 @@ class BucketBase : public NonMovableOrCopyable
public:
static constexpr ProtocolVersion
FIRST_PROTOCOL_SUPPORTING_PERSISTENT_EVICTION = ProtocolVersion::V_23;
static constexpr ProtocolVersion
FIRST_PROTOCOL_SUPPORTING_INITENTRY_AND_METAENTRY =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be redefined here. It's only relevant for LiveBucket, so we should define it in LiveBucket only (which it currently is). It's static so we can use it everywhere via LiveBucket::FIRST_PROTOCOL_SUPPORTING_INITENTRY_AND_METAENTRY.

@@ -157,9 +156,31 @@ BucketOutputIterator<BucketT>::put(typename BucketT::EntryT const& e)
mBuf = std::make_unique<typename BucketT::EntryT>();
}

if constexpr (std::is_same_v<BucketT, LiveBucket>)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, the constexpr check is correct, since this only applies to LiveBucket types. We can still simplify this though. Try

// Comment about rewrite here
bool shouldRewrite = std::is_same_v<BucketT, LiveBucket> && mRewriteLiveToInitEntries && e.type() == LIVEENTRY &&
            protocolVersionStartsFrom(
                mMeta.ledgerVersion,
                BucketT::FIRST_PROTOCOL_SUPPORTING_INITENTRY_AND_METAENTRY);
if (shouldRewrite)
{
            ++mMergeCounters.mOutputIteratorLiveToInitRewrites;
            auto eCopy = e;
            eCopy.type(INITENTRY);
            *mBuf = eCopy;
}
else
{
            *mBuf = e;
}

++mMergeCounters.mOutputIteratorBufferUpdates;

@@ -521,6 +522,22 @@ TEST_CASE_VERSIONS("live bucket tombstones expire at bottom level",
REQUIRE(e0.nDead != 0);
REQUIRE(e1.nDead != 0);
REQUIRE(e2.nDead == 0);
if (protocolVersionStartsFrom(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't actually test our fix. The problem is, in old bucketlist code that has long since been fixed, live entries where written to the bottom level. The current code doesn't exhibit this behavior. What you're testing here is if the current BucketList logic works, but not that your conversion code from the buggy BucketList -> good bucketlist actually works. What we should do is populate a BucketList, but specifically populate it with buggy contents such that the bottom level has live entries (see "live bucket tombstones expire at bottom level" for populating a bucketList without addBatch). Then, add empty batched to your BucketList until all entries have coallesced in the bottom bucket (see "hot archive bucket tombstones expire at bottom level" for this example). Finally, scan the bottom bucket and make sure they're all init.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants