Staking ledger bonding fixes (#3639)
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the rest of the code expects this never to happen, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data loss in bonded ledgers. For a more
detailed explanation of this issue, see:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may
end up getting the wrong ledger, which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage when a
controller of a stash is a stash of *another* ledger, and improves the
staking `try-runtime` checks to catch potential storage issues
preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it would
result in a "corrupt" double bonded ledger (see below).

2. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is both a stash and a controller, and the bond is corrupt:
B is responsible for two ledgers, which is not correct and will lead to
inconsistent states. Thus, this PR prevents these ledgers from mutating
(e.g. through operations like bonding extra) until the ledger is brought
back to a consistent state.
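
The two cases above can be illustrated with a toy model. This is only a sketch, not the pallet code: `Storage`, `LedgerError`, `ledger_of_stash`, `sane` and `corrupt` are hypothetical names, accounts are plain `char`s, and each ledger is reduced to the stash it records. The `Ledger` entries for the corrupt case are an assumption, since the description only lists its `Bonded` entries.

```rust
use std::collections::HashMap;

type AccountId = char;

#[derive(Debug, PartialEq)]
enum LedgerError {
    NotStash,
    NotController,
    BadState,
}

/// Toy stand-in for the two storage maps (hypothetical, for illustration only).
#[derive(Default)]
struct Storage {
    /// stash -> controller
    bonded: HashMap<AccountId, AccountId>,
    /// controller -> stash recorded inside the ledger keyed by that controller
    ledger: HashMap<AccountId, AccountId>,
}

impl Storage {
    /// Looks up a ledger by stash and refuses to return it when the ledger keyed by
    /// the bonded controller belongs to a different stash.
    fn ledger_of_stash(&self, stash: AccountId) -> Result<AccountId, LedgerError> {
        let controller = *self.bonded.get(&stash).ok_or(LedgerError::NotStash)?;
        let ledger_stash = *self.ledger.get(&controller).ok_or(LedgerError::NotController)?;
        if ledger_stash != stash {
            return Err(LedgerError::BadState);
        }
        Ok(ledger_stash)
    }
}

/// "Sane" double bonded ledgers: (A, B), (B, C), (C, D).
fn sane() -> Storage {
    Storage {
        bonded: HashMap::from([('A', 'B'), ('B', 'C'), ('C', 'D')]),
        ledger: HashMap::from([('B', 'A'), ('C', 'B'), ('D', 'C')]),
    }
}

/// "Corrupt" double bonded ledgers: (A, B), (B, B), (C, D).
/// Assumption: the single ledger keyed by B is the one whose stash is B.
fn corrupt() -> Storage {
    Storage {
        bonded: HashMap::from([('A', 'B'), ('B', 'B'), ('C', 'D')]),
        ledger: HashMap::from([('B', 'B'), ('D', 'C')]),
    }
}
```

In the sane layout every stash resolves to its own ledger. In the corrupt layout, looking up stash A lands on the ledger of B, which the sketch reports as `BadState` instead of handing back the wrong ledger.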

---

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all ledger fetches from storage go through the
`StakingLedger` API;
- Ensures that, when fetching a ledger from storage using the
`StakingLedger` API, an `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating their
state (e.g. via `bond_extra`, `set_controller`, etc.) and avoids further
data inconsistencies.
- Prevents stashes which are controllers of another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.
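
To make the first change concrete, here is a minimal sketch of a `bond` call that rejects a stash which already acts as a controller. It is a toy model under stated assumptions, not the pallet implementation: `Staking`, `BondError` and its variants are names chosen for illustration, accounts are `u64`, and a ledger is reduced to the stash it records.

```rust
use std::collections::HashMap;

type AccountId = u64;

#[derive(Debug, PartialEq)]
enum BondError {
    /// The stash already has a bond (hypothetical variant name).
    AlreadyBonded,
    /// The stash is already the controller of some ledger (hypothetical variant name).
    AlreadyPaired,
}

/// Toy stand-in for the staking storage (hypothetical, for illustration only).
#[derive(Default)]
struct Staking {
    /// stash -> controller
    bonded: HashMap<AccountId, AccountId>,
    /// controller -> stash recorded inside the ledger
    ledger: HashMap<AccountId, AccountId>,
}

impl Staking {
    /// Bonds `stash` as its own controller.
    fn bond(&mut self, stash: AccountId) -> Result<(), BondError> {
        if self.bonded.contains_key(&stash) {
            return Err(BondError::AlreadyBonded);
        }
        // The restored check: a ledger is already keyed by this account, so writing a
        // new ledger under the same key would overwrite someone else's ledger.
        if self.ledger.contains_key(&stash) {
            return Err(BondError::AlreadyPaired);
        }
        self.bonded.insert(stash, stash);
        self.ledger.insert(stash, stash);
        Ok(())
    }
}
```

With stash 1 bonded to controller 2, a call to `bond(2)` is rejected in this model. Without the check, the ledger keyed by 2 would be replaced and the ledger of stash 1 would be lost.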

Related to #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
3 people committed Apr 11, 2024
1 parent 434d07d commit b73bffc
Showing 6 changed files with 354 additions and 21 deletions.
19 changes: 19 additions & 0 deletions prdoc/pr_3639.prdoc
@@ -0,0 +1,19 @@
title: Prevents staking controllers from becoming stashes of different ledgers; Ensures that no ledger in bad state is mutated.

doc:
  - audience: Runtime User
    description: |
      This PR introduces a fix to the staking logic which prevents an existing controller from bonding as a stash of another ledger, which
      leads to staking ledger inconsistencies down the line. In addition, it adds a few (temporary) gates to prevent ledgers that are already
      in a bad state from mutating their state.

      In summary:
      * Checks if stash is already a controller when calling `Call::bond` and fails if that's the case;
      * Ensures that all ledger fetches from storage go through the `StakingLedger` API;
      * Ensures that an `Error::BadState` is returned if the ledger bonding is in a bad state. This prevents bad ledgers from mutating their
      state (e.g. via `bond_extra`, `set_controller`, etc.) and avoids further data inconsistencies.
      * Prevents stashes which are controllers of another ledger from calling `set_controller`, since that may lead to a bad state.
      * Adds further try-state runtime checks that check if there are ledgers in a bad state based on their bonded metadata.

crates:
  - name: pallet-staking
61 changes: 53 additions & 8 deletions substrate/frame/staking/src/ledger.rs
@@ -32,8 +32,8 @@
//! state consistency.
use frame_support::{
defensive,
traits::{LockableCurrency, WithdrawReasons},
defensive, ensure,
traits::{Defensive, LockableCurrency, WithdrawReasons},
};
use sp_staking::StakingAccount;
use sp_std::prelude::*;
@@ -106,18 +106,39 @@ impl<T: Config> StakingLedger<T> {
/// This getter can be called with either a controller or stash account, provided that the
/// account is properly wrapped in the respective [`StakingAccount`] variant. This is meant to
/// abstract the concept of controller/stash accounts from the caller.
///
/// Returns [`Error::BadState`] when a bond is in "bad state". A bond is in a bad state when a
/// stash has a controller which is bonding a ledger associated with another stash.
pub(crate) fn get(account: StakingAccount<T::AccountId>) -> Result<StakingLedger<T>, Error<T>> {
let controller = match account {
StakingAccount::Stash(stash) => <Bonded<T>>::get(stash).ok_or(Error::<T>::NotStash),
StakingAccount::Controller(controller) => Ok(controller),
}?;
let (stash, controller) = match account.clone() {
StakingAccount::Stash(stash) =>
(stash.clone(), <Bonded<T>>::get(&stash).ok_or(Error::<T>::NotStash)?),
StakingAccount::Controller(controller) => (
Ledger::<T>::get(&controller)
.map(|l| l.stash)
.ok_or(Error::<T>::NotController)?,
controller,
),
};

<Ledger<T>>::get(&controller)
let ledger = <Ledger<T>>::get(&controller)
.map(|mut ledger| {
ledger.controller = Some(controller.clone());
ledger
})
.ok_or(Error::<T>::NotController)
.ok_or(Error::<T>::NotController)?;

// if ledger bond is in a bad state, return error to prevent applying operations that may
// further spoil the ledger's state. A bond is in bad state when the bonded controller is
// associated with a different ledger (i.e. a ledger with a different stash).
//
// See <https://github.com/paritytech/polkadot-sdk/issues/3245> for more details.
ensure!(
Bonded::<T>::get(&stash) == Some(controller) && ledger.stash == stash,
Error::<T>::BadState
);

Ok(ledger)
}

/// Returns the reward destination of a staking ledger, stored in [`Payee`].
@@ -201,6 +222,30 @@ impl<T: Config> StakingLedger<T> {
}
}

/// Sets the ledger controller to its stash.
pub(crate) fn set_controller_to_stash(self) -> Result<(), Error<T>> {
let controller = self.controller.as_ref()
.defensive_proof("Ledger's controller field didn't exist. The controller should have been fetched using StakingLedger.")
.ok_or(Error::<T>::NotController)?;

ensure!(self.stash != *controller, Error::<T>::AlreadyPaired);

// check if the ledger's stash is a controller of another ledger.
if let Some(bonded_ledger) = Ledger::<T>::get(&self.stash) {
// there is a ledger bonded by the stash. In this case, the stash of the bonded ledger
// should be the same as the ledger's stash. Otherwise fail to prevent data
// inconsistencies. See <https://github.com/paritytech/polkadot-sdk/pull/3639> for more
// details.
ensure!(bonded_ledger.stash == self.stash, Error::<T>::BadState);
}

<Ledger<T>>::remove(&controller);
<Ledger<T>>::insert(&self.stash, &self);
<Bonded<T>>::insert(&self.stash, &self.stash);

Ok(())
}

/// Clears all data related to a staking ledger and its bond in both [`Ledger`] and [`Bonded`]
/// storage items and updates the stash staking lock.
pub(crate) fn kill(stash: &T::AccountId) -> Result<(), Error<T>> {
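
The guard in `set_controller_to_stash` from the hunk above can be modelled outside the runtime. The following is a sketch under simplifying assumptions rather than the pallet code: `Storage` and `Error` are local stand-ins, accounts are `u64`, each ledger is reduced to the stash it records, and `double_bonded` reproduces the layout that `setup_double_bonded_ledgers` builds in the mock.

```rust
use std::collections::HashMap;

type AccountId = u64;

#[derive(Debug, PartialEq)]
enum Error {
    NotStash,
    NotController,
    AlreadyPaired,
    BadState,
}

/// Toy stand-in for the two storage maps (hypothetical, for illustration only).
#[derive(Default)]
struct Storage {
    /// stash -> controller
    bonded: HashMap<AccountId, AccountId>,
    /// controller -> stash recorded inside the ledger
    ledger: HashMap<AccountId, AccountId>,
}

impl Storage {
    /// Re-keys the ledger of `stash` so that the stash becomes its own controller.
    fn set_controller_to_stash(&mut self, stash: AccountId) -> Result<(), Error> {
        let controller = *self.bonded.get(&stash).ok_or(Error::NotStash)?;
        let ledger_stash = *self.ledger.get(&controller).ok_or(Error::NotController)?;
        // The fetched ledger must belong to this stash.
        if ledger_stash != stash {
            return Err(Error::BadState);
        }
        if controller == stash {
            return Err(Error::AlreadyPaired);
        }
        // If a ledger is already keyed by the stash, the stash is a controller of some
        // ledger. Overwriting that key is only safe when it is this stash's own ledger.
        if let Some(other_stash) = self.ledger.get(&stash) {
            if *other_stash != stash {
                return Err(Error::BadState);
            }
        }
        self.ledger.remove(&controller);
        self.ledger.insert(stash, ledger_stash);
        self.bonded.insert(stash, stash);
        Ok(())
    }
}

/// Stash 1 has controller 2, stash 2 has controller 3, stash 3 has controller 4.
fn double_bonded() -> Storage {
    Storage {
        bonded: HashMap::from([(1, 2), (2, 3), (3, 4)]),
        ledger: HashMap::from([(2, 1), (3, 2), (4, 3)]),
    }
}
```

In this model, stashes 2 and 3 cannot become their own controllers while another stash's ledger is keyed by their account. Stash 1 can, and once it has done so the key 2 is free and stash 2 can follow.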
64 changes: 63 additions & 1 deletion substrate/frame/staking/src/mock.rs
@@ -342,6 +342,11 @@ where
pub(crate) type StakingCall = crate::Call<Test>;
pub(crate) type TestCall = <Test as frame_system::Config>::RuntimeCall;

parameter_types! {
// if true, skips the try-state for the test running.
pub static SkipTryStateCheck: bool = false;
}

pub struct ExtBuilder {
nominate: bool,
validator_count: u32,
@@ -451,6 +456,10 @@ impl ExtBuilder {
self.balance_factor = factor;
self
}
pub fn try_state(self, enable: bool) -> Self {
SkipTryStateCheck::set(!enable);
self
}
fn build(self) -> sp_io::TestExternalities {
sp_tracing::try_init_simple();
let mut storage = frame_system::GenesisConfig::<Test>::default().build_storage().unwrap();
@@ -579,7 +588,9 @@ impl ExtBuilder {
let mut ext = self.build();
ext.execute_with(test);
ext.execute_with(|| {
Staking::do_try_state(System::block_number()).unwrap();
if !SkipTryStateCheck::get() {
Staking::do_try_state(System::block_number()).unwrap();
}
});
}
}
@@ -800,6 +811,57 @@ pub(crate) fn bond_controller_stash(controller: AccountId, stash: AccountId) ->
Ok(())
}

pub(crate) fn setup_double_bonded_ledgers() {
assert_ok!(Staking::bond(RuntimeOrigin::signed(1), 10, RewardDestination::Staked));
assert_ok!(Staking::bond(RuntimeOrigin::signed(2), 20, RewardDestination::Staked));
assert_ok!(Staking::bond(RuntimeOrigin::signed(3), 20, RewardDestination::Staked));
// not relevant to the test case, but ensures try-runtime checks pass.
[1, 2, 3]
.iter()
.for_each(|s| Payee::<Test>::insert(s, RewardDestination::Staked));

// we want to test the case where a controller can also be a stash of another ledger.
// for that, we change the controller/stash bonding so that:
// * 2 becomes controller of 1.
// * 3 becomes controller of 2.
// * 4 becomes controller of 3.
let ledger_1 = Ledger::<Test>::get(1).unwrap();
let ledger_2 = Ledger::<Test>::get(2).unwrap();
let ledger_3 = Ledger::<Test>::get(3).unwrap();

// 4 becomes controller of 3.
Bonded::<Test>::mutate(3, |controller| *controller = Some(4));
Ledger::<Test>::insert(4, ledger_3);

// 3 becomes controller of 2.
Bonded::<Test>::mutate(2, |controller| *controller = Some(3));
Ledger::<Test>::insert(3, ledger_2);

// 2 becomes controller of 1
Bonded::<Test>::mutate(1, |controller| *controller = Some(2));
Ledger::<Test>::insert(2, ledger_1);
// 1 is not controller anymore.
Ledger::<Test>::remove(1);

// checks. now we have:
// * 3 ledgers
assert_eq!(Ledger::<Test>::iter().count(), 3);
// * stash 1 has controller 2.
assert_eq!(Bonded::<Test>::get(1), Some(2));
assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(1)), Some(2));
assert_eq!(Ledger::<Test>::get(2).unwrap().stash, 1);

// * stash 2 has controller 3.
assert_eq!(Bonded::<Test>::get(2), Some(3));
assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(2)), Some(3));
assert_eq!(Ledger::<Test>::get(3).unwrap().stash, 2);

// * stash 3 has controller 4.
assert_eq!(Bonded::<Test>::get(3), Some(4));
assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(3)), Some(4));
assert_eq!(Ledger::<Test>::get(4).unwrap().stash, 3);
}

#[macro_export]
macro_rules! assert_session_era {
($session:expr, $era:expr) => {
98 changes: 95 additions & 3 deletions substrate/frame/staking/src/pallet/impls.rs
@@ -162,7 +162,8 @@ impl<T: Config> Pallet<T> {
let controller = Self::bonded(&validator_stash).ok_or_else(|| {
Error::<T>::NotStash.with_weight(T::WeightInfo::payout_stakers_alive_staked(0))
})?;
let ledger = <Ledger<T>>::get(&controller).ok_or(Error::<T>::NotController)?;

let ledger = Self::ledger(StakingAccount::Controller(controller))?;
let page = EraInfo::<T>::get_next_claimable_page(era, &validator_stash, &ledger)
.ok_or_else(|| {
Error::<T>::AlreadyClaimed
@@ -1709,7 +1710,7 @@ impl<T: Config> StakingInterface for Pallet<T> {
) -> Result<bool, DispatchError> {
let ctrl = Self::bonded(&who).ok_or(Error::<T>::NotStash)?;
Self::withdraw_unbonded(RawOrigin::Signed(ctrl.clone()).into(), num_slashing_spans)
.map(|_| !Ledger::<T>::contains_key(&ctrl))
.map(|_| !StakingLedger::<T>::is_bonded(StakingAccount::Controller(ctrl)))
.map_err(|with_post| with_post.error)
}

@@ -1817,13 +1818,91 @@ impl<T: Config> Pallet<T> {
"VoterList contains non-staker"
);

Self::check_bonded_consistency()?;
Self::check_payees()?;
Self::check_nominators()?;
Self::check_exposures()?;
Self::check_paged_exposures()?;
Self::check_ledgers()?;
Self::check_count()
}

/// Invariants:
/// * A controller should not be associated with more than one ledger.
/// * A bonded (stash, controller) pair should have only one associated ledger. I.e. if the
/// ledger is bonded by stash, the controller account must not bond a different ledger.
/// * A bonded (stash, controller) pair must have an associated ledger.
/// NOTE: these checks result in warnings only. Once
/// <https://github.com/paritytech/polkadot-sdk/issues/3245> is resolved, turn warns into check
/// failures.
fn check_bonded_consistency() -> Result<(), TryRuntimeError> {
use sp_std::collections::btree_set::BTreeSet;

let mut count_controller_double = 0;
let mut count_double = 0;
let mut count_none = 0;
// sanity check to ensure that each controller in Bonded storage is associated with only one
// ledger.
let mut controllers = BTreeSet::new();

for (stash, controller) in <Bonded<T>>::iter() {
if !controllers.insert(controller.clone()) {
count_controller_double += 1;
}

match (<Ledger<T>>::get(&stash), <Ledger<T>>::get(&controller)) {
(Some(_), Some(_)) =>
// if stash == controller, it means that the ledger has migrated to
// post-controller. If no migration happened, we expect that the (stash,
// controller) pair has only one associated ledger.
if stash != controller {
count_double += 1;
},
(None, None) => {
count_none += 1;
},
_ => {},
};
}

if count_controller_double != 0 {
log!(
warn,
"a controller is associated with more than one ledger ({} occurrences)",
count_controller_double
);
};

if count_double != 0 {
log!(warn, "single tuple of (stash, controller) pair bonds more than one ledger ({} occurrences)", count_double);
}

if count_none != 0 {
log!(warn, "inconsistent bonded state: (stash, controller) pair missing associated ledger ({} occurrences)", count_none);
}

Ok(())
}

/// Invariants:
/// * A bonded ledger should always have an assigned `Payee`.
/// * The number of entries in `Payee` and of bonded staking ledgers *must* match.
/// * The stash account in the ledger must match that of the bonded account.
fn check_payees() -> Result<(), TryRuntimeError> {
ensure!(
(Ledger::<T>::iter().count() == Payee::<T>::iter().count()) &&
(Ledger::<T>::iter().count() == Bonded::<T>::iter().count()),
"number of entries in payee storage items does not match the number of bonded ledgers",
);

Ok(())
}

/// Invariants:
/// * Number of voters in `VoterList` match that of the number of Nominators and Validators in
/// the system (validator is both voter and target).
/// * Number of targets in `TargetList` matches the number of validators in the system.
/// * Current validator count is bounded by the election provider's max winners.
fn check_count() -> Result<(), TryRuntimeError> {
ensure!(
<T as Config>::VoterList::count() ==
@@ -1842,15 +1921,22 @@ impl<T: Config> Pallet<T> {
Ok(())
}

/// Invariants:
/// * `ledger.controller` is not stored in the storage (but populated at retrieval).
/// * Stake consistency: ledger.total == ledger.active + sum(ledger.unlocking).
/// * The controller keying the ledger and the ledger stash match the state of the `Bonded`
/// storage.
fn check_ledgers() -> Result<(), TryRuntimeError> {
Bonded::<T>::iter()
.map(|(_, ctrl)| Self::ensure_ledger_consistent(ctrl))
.collect::<Result<Vec<_>, _>>()?;
Ok(())
}

/// Invariants:
/// * For each era exposed validator, check if the exposure total is sane (exposure.total =
/// exposure.own + sum of exposure.others).
fn check_exposures() -> Result<(), TryRuntimeError> {
// a check per validator to ensure the exposure struct is always sane.
let era = Self::active_era().unwrap().index;
ErasStakers::<T>::iter_prefix_values(era)
.map(|expo| {
@@ -1868,6 +1954,10 @@ impl<T: Config> Pallet<T> {
.collect::<Result<(), TryRuntimeError>>()
}

/// Invariants:
/// * For each paged era exposed validator, check if the exposure total is sane (exposure.total
/// = exposure.own + exposure.own).
/// * Paged exposures metadata (`ErasStakersOverview`) matches the paged exposures state.
fn check_paged_exposures() -> Result<(), TryRuntimeError> {
use sp_staking::PagedExposureMetadata;
use sp_std::collections::btree_map::BTreeMap;
@@ -1932,6 +2022,8 @@ impl<T: Config> Pallet<T> {
.collect::<Result<(), TryRuntimeError>>()
}

/// Invariants:
/// * Checks that each nominator has its entire stake correctly distributed.
fn check_nominators() -> Result<(), TryRuntimeError> {
// a check per nominator to ensure their entire stake is correctly distributed. Will only
// kick-in if the nomination was submitted before the current era.
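
The counting logic of `check_bonded_consistency` in the diff above can be tried out on plain maps. This is a sketch only: `Report` and the free function below are hypothetical stand-ins, accounts are `u64`, ledgers are reduced to the stash they record, and the counters are returned instead of being logged as warnings.

```rust
use std::collections::{BTreeMap, BTreeSet};

type AccountId = u64;

/// Counters mirroring the three warnings (hypothetical struct, for illustration only).
#[derive(Debug, Default, PartialEq)]
struct Report {
    /// A controller appears in more than one bonded pair.
    controller_double: u32,
    /// Both the stash and the controller of a pair key a ledger.
    double: u32,
    /// Neither the stash nor the controller of a pair keys a ledger.
    none: u32,
}

/// `bonded` maps stash -> controller, `ledger` maps controller -> ledger stash.
fn check_bonded_consistency(
    bonded: &BTreeMap<AccountId, AccountId>,
    ledger: &BTreeMap<AccountId, AccountId>,
) -> Report {
    let mut report = Report::default();
    let mut controllers = BTreeSet::new();

    for (stash, controller) in bonded {
        // `insert` returns false when the controller was already seen.
        if !controllers.insert(*controller) {
            report.controller_double += 1;
        }
        match (ledger.get(stash), ledger.get(controller)) {
            // stash == controller means the stash is its own controller, which is fine.
            (Some(_), Some(_)) if stash != controller => report.double += 1,
            (None, None) => report.none += 1,
            _ => {},
        }
    }
    report
}
```

Fed with the "corrupt" layout from the description (pairs (1, 2), (2, 2), (3, 4)), the sketch counts one doubly used controller. A bonded pair with no ledger on either key shows up in the `none` counter.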