Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Circuit breaker remove solver #2705

Merged
merged 12 commits into from
May 31, 2024
Merged

Circuit breaker remove solver #2705

merged 12 commits into from
May 31, 2024

Conversation

sunce86
Copy link
Contributor

@sunce86 sunce86 commented May 8, 2024

Description

Related to #2667

POC implementation for using "Roles" safe module to grant special role to an EOA to sign and execute "removeSolver" function on behalf of the gpv2_authenticator manager/owner safe.

Need to add tests to see if this actually works.

Changes

  • Added Roles smart contract
  • Added EOA account as configuration
  • Implemented remove_solver function

How to test

todo

@sunce86 sunce86 added the E:7.1 Ext. solvers operating driver See https://github.com/cowprotocol/pm/issues/57 for details label May 8, 2024
@sunce86 sunce86 self-assigned this May 8, 2024
@sunce86 sunce86 requested a review from a team as a code owner May 8, 2024 17:29
Copy link
Contributor

@fleupold fleupold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's please make sure the circuit breaker is an opt-in functionality on the autopilot (ie. it's sufficiently factored as a separate background task which can be left out e.g. in local deployments).

@m-lord-renkse m-lord-renkse changed the title DRAFT: Circuit breaker remove solver Circuit breaker remove solver May 14, 2024
@m-lord-renkse m-lord-renkse requested a review from MartinquaXD May 14, 2024 09:50
Copy link
Contributor

@MartinquaXD MartinquaXD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a CircuitBreaker struct from the get go that should store the state required to remove solvers (e.g. EOA).
Ideally this struct should be able to handle all the logic around removing solvers in a background task. If for some reason (maybe it makes the most sense to decide some things in the autopilot during the competition) that struct could expose the remove_sovler function which the autopilot could then call if needed.
But overall I think we should try to leave the autopilot untouched if possible because for proper decentralization it should be possible to ban solvers solely based on on-chain data.

crates/autopilot/src/infra/blockchain/contracts.rs Outdated Show resolved Hide resolved
@sunce86
Copy link
Contributor Author

sunce86 commented May 14, 2024

So, how I understand, the flow is following:

  1. Settlement appears onchain -> settlement event is saved to our database
  2. OnSettlementEventUpdater is triggered. It builds a domain object that will look like something like this: https://github.com/cowprotocol/services/blob/settlement-auction-struct-new/crates/autopilot/src/domain/settlement/transaction.rs#L13-L40
  3. OnSettlementEventUpdater now needs to do 2 things: 1. update db based on this object, 2. generate violations and decide to call circuit breaker if needed
  4. circuit breaker exposes removeSolver fn and once called, emits a transaction that removes solver in a fire and forget manner.

So, bottom line, circuit breaker's smallest responsibility unit is to execute remove solver and nothing else.
@fleupold @MartinquaXD what would be your suggestion to change in this flow? I understand we want this functionality isolated as much as possible but we decided to reuse EventHandler for events, domain::settlement::transaction::Tx for decoding settlement calldata and fetching auction data, so how should we make the circuit breaker more independent?

Edit: posting the question since afais based on the PR changes, @Mateo-mro understood that circuit breaker needs to observe onchain settlement on it's own and be completely disconnected from the EventHandler.

@MartinquaXD
Copy link
Contributor

MartinquaXD commented May 14, 2024

so how should we make the circuit breaker more independent?

Not sure if my gut makes sense but it seems strange to fit all this logic intoOnSettlementEventUpdater. I guess the name is vague enough that technically it could also cover circuit breaking logic but it seems to me that it could end up nicer if all the rules and logic to decide if a settlement should cause a solver to get disconnected should live inside a single unit.
So far the OnSettlementEventUpdater was more like a non-judgmental observer, I think.
However, I see that monitoring events in the circuit breaker itself might not be the best idea in terms of code reuse so maybe a slightly alternative idea could be to let the OnEventSettlementUpdater monitor and parse the settlement events and the pass reasonably formed structs into the CircuitBreaker which then decides whether or not to remove the solver.

Another thing that irks me with tying the CircuitBreaker to the autopilot is that the circuit breaker would ideally also work when the autopilot is having issues or re-indexes stuff. With the OnEventSettlementUpdater calling the shots we would remove offending solvers again when we re-index settlements, no? I know that this doesn't happen often but this actually just happened a few weeks ago. (the idea mentioned above would also suffer from this issue)

@sunce86
Copy link
Contributor Author

sunce86 commented May 14, 2024

Notes from the call:

  1. Circuit breaker to be basically a copy paste of OnSettlementEventUpdater with regards how it listens for new events and how it is triggered (see self.settlement_updater.schedule_update() calls)
  2. Needs to read latest events (tx hash). Taking just the last event from database is not good as it can happen that it is already handled or it can happen that both staging and prod settle tx in the same block so we might miss. I am ok with having an in-memory cache of handled events, and when autopilot is restarted start from the latest event and build cache going forward.
  3. Having a tx hash, build domain::settlement::transaction::Tx. Generate violations.
  4. Decide whether to act on violations
  5. Call removeSolver function in async function in a fire and forget manner (high maxFee, nothing fancy).

@Mateo-mro can you then work on these points (except (3) as I am still working on it). You can skip this step and just mock an intention to removeSolver (for example, call it for every event for now).

@m-lord-renkse m-lord-renkse force-pushed the safe-contract branch 3 times, most recently from 38ca09f to e0b47c8 Compare May 14, 2024 14:54
@sunce86 sunce86 removed their assignment May 14, 2024
@@ -359,6 +365,7 @@ impl std::fmt::Display for Arguments {
max_settlement_transaction_wait
)?;
writeln!(f, "s3: {:?}", s3)?;
writeln!(f, "circuit breaker: {:?}", circuit_breaker)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunce86 @MartinquaXD can we display the authenticator_eoa here? or should we hide it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EOA is actually a private key so it should be hidden.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice let's please KMS for this.

Comment on lines 347 to 355
registry.append(&mut events);
(registry.len() > Self::MAX_REGISTRY_SIZE).then(|| {
(0..(registry.len() - Self::MAX_REGISTRY_SIZE)).for_each(|_| {
registry.pop_first();
})
});
}
Copy link
Contributor

@m-lord-renkse m-lord-renkse May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with @sunce86 to do it by fetching 4 events from the database and keeping in memory the processed ones, but we are open to change it of course.

Copy link
Contributor

@m-lord-renkse m-lord-renkse May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's more, I know this piece of code could be difficult to read (I am a .then() and .for_each() lover), I am open to change it.

impl FromStr for CircuitBreaker {
type Err = anyhow::Error;

fn from_str(s: &str) -> Result<Self, Self::Err> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think providing a single string that configures 2 different things is a bit weird. Instead you could pass these things separately and make them both optional. Validating the input would then basically just be throwing an error if exactly 1 thing is set.
Also it could be that we require more arguments for the circuit breaker and adding more and more stuff to that string will become messy pretty quickly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MartinquaXD This comment is on non-existing code (I deleted it).

crates/autopilot/src/infra/blockchain/contracts.rs Outdated Show resolved Hide resolved

#[allow(dead_code)]
pub struct CircuitBreaker {
eth: infra::blockchain::circuit_breaker::CircuitBreaker,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this called eth?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MartinquaXD This comment is on non-existing code (I deleted it).

crates/autopilot/src/infra/solvers/authenticator.rs Outdated Show resolved Hide resolved
crates/autopilot/src/infra/solvers/authenticator.rs Outdated Show resolved Hide resolved
.unwrap()
};

let settlement = contracts::GPv2Settlement::at(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the decision to use a custom address or a deployed address should be done at single point in the backend. For example it doesn't make a lot of sense to have 1 component use one address for the settlement contract and another component another address. Overall I think whoever instantiates the Authenticator should be responsible for picking the correct address.

Also the settlement contract doesn't really matter here. We only need the authenticator contract.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested this to @Mateo-mro so that Authenticator relies less on intermediary objects built between the autopilot startup and Authenticator constructor being called -> to be more independent of the rest of the code.

For example it doesn't make a lot of sense to have 1 component use one address for the settlement contract and another component another address.

I believe this is also irrelevant for the Authenticator as a component that we envision running as a separate process etc. Choosing the right address would be the responsibility of Authenticator.

.expect("authenticator address"),
);

let authenticator_role = contracts::Roles::at(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also why do we effectively overriding the authenticator address but not the roles address? IMO another indicator that these dependencies should be injected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. IMO all contracts that the Authenticator depends on need to be received via Addresses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but in that case shouldn't we be passing Contracts instead of Addresses? does it also make sense for the addresses which are specific for the circuit breaker? like the role or authenticator eoa?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to make the component as independent as possible, it should receive raw configuration params so it can independently build whatever it needs to work properly.

Copy link
Contributor Author

@sunce86 sunce86 May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With that in mind, Contracts is an internal autopilot component and shouldn't be forwarded to Authenticator. The reason being once we wish to move Authenticator out of the autopilot there would be no need to refactor constructor to not work with Contracts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I think that making an object depending on another object within the same crate is totally ok. In my mind, having a struct which instantiates the contracts and then those contracts are use somewhere else (in circuit breaker) would make sense 🤔

If you want to make the module as independent as possible, then we can have two builders (or new) for it, once with raw address and the other with the contract, so if we move it out of autopilot it won't require any change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to bikeshed on this too much, since it's not essential for delivery of feature (and it also won't be too problematic to refactor this if needed). That said, I am ok with both versions.

crates/autopilot/src/infra/solvers/authenticator.rs Outdated Show resolved Hide resolved
crates/autopilot/src/infra/solvers/authenticator.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@MartinquaXD MartinquaXD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing. The new authenticator module currently lives in the infra/solvers which seems odd. That directory holds all the code needed to talk to the solvers. /infra/blockchain seems more appropriate since it's used to send transactions to the blockchain. 🤔

@m-lord-renkse
Copy link
Contributor

Merging since I got two approvals (counting also @sunce86 's approval here: #2705 (review))

@m-lord-renkse m-lord-renkse merged commit debfa39 into main May 31, 2024
10 checks passed
@m-lord-renkse m-lord-renkse deleted the safe-contract branch May 31, 2024 07:37
@github-actions github-actions bot locked and limited conversation to collaborators May 31, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
E:7.1 Ext. solvers operating driver See https://github.com/cowprotocol/pm/issues/57 for details
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants