Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Block Author Reward During Era Transition in Polkadot and Other Chains #4923

Open
2 tasks done
ToufeeqP opened this issue Jul 2, 2024 · 11 comments
Open
2 tasks done
Labels
C2-good-first-issue A task for a first time contributor to become familiar with the Polkadot-SDK. I2-bug The node fails to follow expected behavior.

Comments

@ToufeeqP
Copy link
Contributor

ToufeeqP commented Jul 2, 2024

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

During era transitions in PoS-based chains utilizing the BABE protocol for block production, the authorship of blocks is sometimes incorrectly identified. This leads to the block author reward being misallocated. Specifically, the reward may be given to the wrong authority or not given at all for certain era-transitioning blocks.

As a result of this bug, many of the era transition blocks have either rewarded the incorrect authority or were not rewarded at all. The following list of blocks is non-exhaustive:

  • No reward blocks: [328745, 362133]
  • Incorrect reward blocks: [10016532, 10030907, 10059692, 10073976, 2088572]

Steps to reproduce

  1. Set up a PoS-based chain using Substrate with BABE as the block production authority with multiple validators.
  2. Modify the staking dynamics to change the vote weight of the existing validator set or increase the number of validators
  3. Allow the chain to progress to an era transition.
  4. Observe the authorship rewards during the blocks where the new era is enacted.
  5. Verify the author of a block by decoding the SEAL digest and check the author who was rewarded by the authorship pallet.
@ToufeeqP ToufeeqP added I10-unconfirmed Issue might be valid, but it's not yet known. I2-bug The node fails to follow expected behavior. labels Jul 2, 2024
@chungquantin
Copy link
Contributor

I think it would be clearer if you can attach a code snippet for the PoS chain runtime & node service configuration.

@burdges
Copy link

burdges commented Jul 10, 2024

This is bizarre because you'd expect the chain tracks its own block author rewards on-chain.

@ToufeeqP
Copy link
Contributor Author

I think it would be clearer if you can attach a code snippet for the PoS chain runtime & node service configuration.

This issue applies to both the Polkadot and Kusama chains as well. I have already provided some blocks on Polkadot as references where the block authoring reward was either not awarded to the author or was given to a different validator than the one who authored the block.

@bkchr
Copy link
Member

bkchr commented Jul 17, 2024

Yeah the problem here is that when a session enacts a new validator set and the call to determine the block author comes later, we will already fetch the validator from the new set. The solution is that we buffer the old set in the current block.

@bkchr bkchr added C2-good-first-issue A task for a first time contributor to become familiar with the Polkadot-SDK. and removed I10-unconfirmed Issue might be valid, but it's not yet known. labels Jul 18, 2024
@eagr
Copy link
Contributor

eagr commented Aug 26, 2024

To ensure I understand this correctly.

When allocating block rewards, the author is fetched using the authorship pallet. If the author id is not already available in storage, the pallet will try to determine the author from the block digest, the result of which is the index of the responsible validator. The index will then be used to find the author in the current validator set, which may or may not coincide with the previous session. This is why it sometimes fails to identify the correct validator to reward or at all. So the session pallet would also need to store the old validator set for rewardee lookup in this case.

Curious, why isn’t the author account id in the digest in the first place?

@bkchr
Copy link
Member

bkchr commented Aug 27, 2024

Curious, why isn’t the author account id in the digest in the first place?

Because the account id is not known to the node side. The node side only knows the session key, which is different to the account.

@eagr
Copy link
Contributor

eagr commented Sep 7, 2024

Because the account id is not known to the node side. The node side only knows the session key, which is different to the account.

Ah, I see.


It seems to me the buffer of the previous validator set is only useful for determining the author of the first block of each session, which is kinda trivial (but a bug is a bug). Do you it's worth the overhead of extra storage? @bkchr @burdges

The best I could find about determining whether a block is the first in a session is to check if authorities change is signaled in the digest. But that's specific to consensus engines. Do you think it makes sense to add a generic interface for that, similar to trait FindAuthor? Is if block_number % blocks_per_session == 0 reliable for this?

@burdges
Copy link

burdges commented Sep 7, 2024

It's not storage overhead that's problematic here. We already have two validator sets simultaniously because elections require doing so. Afaik approvals checks needs the old authorities sets anyways, likely a third validator sets, so maybe someone can tell you how to access it.

@eagr
Copy link
Contributor

eagr commented Sep 7, 2024

impl<T: Config, Inner: FindAuthor<u32>> FindAuthor<T::ValidatorId>
	for FindAccountFromAuthorIndex<T, Inner>
{
	fn find_author<'a, I>(digests: I) -> Option<T::ValidatorId>
	where
		I: 'a + IntoIterator<Item = (ConsensusEngineId, &'a [u8])>,
	{
		let i = Inner::find_author(digests)?;

		fn first_block_of_session(digests: I) -> bool {
			// guess this wouldn't work? as a slot can be without a block
			frame_system::Pallet::<T>::block_number() % SLOTS_PER_SESSION == 0

			// or
			// check if authorities change is signaled in digests
		}

		let validators = if first_block_of_session(digests) {
			OldValidators::<T>::get()
		} else {
			Validators::<T>::get()
		};
		validators.get(i as usize).cloned()
	}
}

Thinking about something like this: OldValidators will be populated with the value of Validators on each rotation. Idk anything about the multiple sets of validators, but if the Validators value is being used for author lookup, guess OldValidators should be fine too, no?

@burdges
Copy link

burdges commented Sep 7, 2024

You missunderstood my comment:

We should store the different session keys for each validator together in one structure of course, otherwise wastes lookup space. An OldValidators must already exist in approval voting, so all other session keys should live there too.

An OldValidators must already exist for equivocations slashing too, so the exact keys your interested in here, well unless we produce a history proof off-chain, but likely no.

Just use the OldValidators that already exists.

Also, our storage lacks O(1) moves, and future storages would lack them too, so Validators and OldValidators should not be a fixed storage path, but their storage prefixs should presumably be constructed dynamically based upon the session. I'm maybe wrong on this, maybe we copy them since they're only 1000 records, but more likely we keep them for 28 days.

@eagr
Copy link
Contributor

eagr commented Sep 8, 2024

What you said makes sense. If the set is already stored somewhere, for sure we shouldn't duplicate it. I've tried searching for auth, valid, set, key in Substrate, and gone through the storage items, but I couldn't find a storage that corresponds to OldValidators anywhere. So yeah, if that's the case, I'd need some pointer to how to access it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C2-good-first-issue A task for a first time contributor to become familiar with the Polkadot-SDK. I2-bug The node fails to follow expected behavior.
Projects
None yet
Development

No branches or pull requests

5 participants