Notifies solver if their solution simulates successfully at least once #2077

Merged

Conversation


marcovc commented Nov 23, 2023

Description

As a solver, I am interested in automatically identifying "bad" tokens so that they can be automatically censored. A "bad" token is one that has some logic that makes it fail most of the time. When I receive a simulation failure, I run an online learning procedure to try to identify them. However, some (most?) simulation failures happen just due to volatility, in which case this procedure should be skipped, otherwise there can be false positives.

Changes

This PR adds a boolean to the notification sent to solvers, stating whether the driver was able to simulate the solver's solution at least once. If so, this is a very strong indicator that the solution was rejected due to volatility.
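
For illustration, here is a minimal sketch of how a solver might consume such a flag; every name below is hypothetical, not the actual services API:

    // Hypothetical solver-side handler; names are illustrative only.
    fn on_simulation_failed(solution_id: u64, succeeded_at_least_once: bool) {
        if succeeded_at_least_once {
            // The driver simulated this solution successfully at least once,
            // so the later failure is most likely volatility: skip the
            // bad-token learning step to avoid false positives.
            return;
        }
        // Otherwise, feed the failure into the bad-token detection procedure.
        learn_bad_tokens(solution_id);
    }

    fn learn_bad_tokens(_solution_id: u64) {
        // Placeholder for the solver's online learning procedure.
    }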

I considered other options to avoid this PR:

  1. Send the liquidity fetch block number in the instance.json. Since the notification comes with a block number, comparing the notification block number with the liquidity fetch block number could indicate that it simulated at least once. But this would still require a PR.
  2. Same as the previous, but fetch the liquidity block number from the solver competition endpoint. However, I recall that that endpoint was (is?) significantly behind the notification.
  3. Use Tenderly myself. This seems like a waste of resources just for doing something the driver is already doing anyway.

How to test

I did not test it since I forgot how to. In fact I would be very surprised if I changed things in the right places. Anyway, if I failed to do so, consider the PR as an elaborate feature request :)

Thanks!

marcovc requested a review from a team as a code owner November 23, 2023 20:24

github-actions bot commented Nov 23, 2023

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅


marcovc commented Nov 23, 2023

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Nov 23, 2023

marcovc commented Nov 24, 2023

I just realized that the information I need is actually being logged internally here:

} else {
    tracing::debug!(
        name = solver.name,
        ?error_at_latest_block,
        "simulation only failed on the latest block but not on the block the \
         auction started",
    );
}
So I guess the changes suggested in this PR are not what they should be :(


sunce86 left a comment


Good direction.

The important part is that we are missing the implementation in the solvers crate: populating the SolverRejectionReason with the new data.

The code you referred to in the solver crate is for the legacy driver and is not relevant to this PR.

while let Some(block) = stream.next().await {
    if let Err(err) = self.simulate_settlement(&settlement).await {
        observe::winner_voided(block, &err);
        *score_ref = None;
        *self.settlement.lock().unwrap() = None;
        if let Some(id) = settlement.notify_id() {
-           notify::simulation_failed(&self.solver, auction.id(), id, &err);
+           notify::simulation_failed(&self.solver, auction.id(), id, &err, simulated_once);
Contributor


No need for simulated_once, just call
notify::simulation_failed(&self.solver, auction.id(), id, &err, true)

Contributor Author


I guess I am misinterpreting the code then. So if the condition on (new) line 195 is true in the first iteration of the loop, then it would report that it simulated successfully at least once, when in fact it hasn't, right? What am I missing?

Contributor


The first ever simulation happens when the solution is encoded, so a successful simulation is a prerequisite for a solution to be encoded.

This whole code block is not even executed if the first simulation failed. So, if the simulation fails here, it means it is the second, third, ... attempt.
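
In other words, roughly (a self-contained sketch of the flow, with every name stubbed out; this is not the actual driver code):

    // Sketch of the control flow described above; all names are stand-ins.
    struct Settlement;

    fn encode_solution() -> Result<Settlement, ()> {
        // Encoding itself simulates the solution, so an Err here means the
        // very first simulation failed and the loop below never runs.
        Ok(Settlement)
    }

    fn simulate(_s: &Settlement) -> Result<(), ()> {
        Ok(())
    }

    fn notify_simulation_failed(succeeded_at_least_once: bool) {
        println!("simulation failed (succeeded before: {succeeded_at_least_once})");
    }

    fn main() {
        let Ok(settlement) = encode_solution() else { return };
        // From here on, any simulation failure is at least the second
        // attempt, so the flag can simply be hard-coded to `true`.
        for _block in 0..3 {
            if simulate(&settlement).is_err() {
                notify_simulation_failed(true);
            }
        }
    }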

Contributor Author


Got you, thanks!

-    SimulationFailed(eth::BlockNo, Transaction),
+    /// Failed simulation during competition. The last parameter is true
+    /// if it has simulated successfully at least once.
+    SimulationFailed(eth::BlockNo, Transaction, bool),
Contributor


nit: Would suggest this to be something like
    SimulationFailed(eth::BlockNo, Transaction, SucceededAtLeastOnce),
where one would define this type as
    type SucceededAtLeastOnce = bool;
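
Spelled out as a self-contained snippet (BlockNo and Transaction are stubbed here only so the example compiles on its own):

    /// `true` if the solution simulated successfully at least once.
    type SucceededAtLeastOnce = bool;

    struct BlockNo(u64);
    struct Transaction;

    enum Notification {
        SimulationFailed(BlockNo, Transaction, SucceededAtLeastOnce),
    }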

@@ -103,7 +104,7 @@ pub enum Kind {
     Timeout,
     EmptySolution,
     DuplicatedSolutionId,
-    SimulationFailed(BlockNo, Tx),
+    SimulationFailed(BlockNo, Tx, bool),
Contributor


Same here, substitute the bool with the more readable type.


marcovc commented Nov 24, 2023

Thanks for the help. I tried to follow your directions, but I am a bit confused, as there seem to be at least two different versions of the code in the same repo.

I was wondering whether a better name than SimulatesSuccessfullyAtLeastOnce could be SimulatesSuccessfullyWhenEncoded or SimulatesSuccessfullyAtLiquidityBlock, since, as you explained, if it passes encoding then the flag is true.

Anyone, please let me know what the next changes should be :)


sunce86 commented Nov 27, 2023

The next step would be to also update the solvers side here: https://github.com/cowprotocol/services/blob/main/crates/solvers/src/api/routes/notify/dto/notification.rs#L106
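
For orientation, the change there would be along these lines; this is a sketch only, and the exact enum shape and field names in the solvers crate may differ:

    use serde::Deserialize;

    // Sketch of the notification DTO gaining the flag; the surrounding
    // variants are taken from the diff above, the field names are assumed.
    #[derive(Deserialize)]
    #[serde(rename_all = "camelCase")]
    enum Kind {
        Timeout,
        EmptySolution,
        DuplicatedSolutionId,
        SimulationFailed {
            block: u64,
            succeeded_at_least_once: bool,
        },
    }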


marcovc commented Nov 27, 2023

I think I did here. No?

@fleupold

The change looks overall good to me. It would be good to run it end to end to see if you receive the expected notifications. We are currently in the process of creating a playground to facilitate running the whole system easily. In the meantime you can:

  1. Launch the autopilot in shadow mode: cargo run --bin autopilot -- --skip-event-sync true --node-url <YOUR_ETH_RPC> --shadow https://api.cow.fi/mainnet --native-price-estimators Baseline --enable-colocation true --drivers "mysolver|http://localhost:11088/mysolver"
  2. Launch a driver: cargo run -p driver -- --config driver.config.toml --ethrpc <YOUR_ETH_RPC>
  3. Launch a solver engine pointing at your endpoint: cargo run -p solvers -- legacy --config ./crates/solvers/config/example.legacy.toml

For correct configuration files, please look at the example .toml files.

@sunce86 do you have any remaining comments?


sunce86 left a comment


Lg


marcovc commented Dec 1, 2023

Thanks for this walkthrough. I have some questions:

  1. Launch the autopilot in shadow mode: cargo run --bin autopilot -- --skip-event-sync true --node-url <YOUR_ETH_RPC> --shadow https://api.cow.fi/mainnet --native-price-estimators Baseline --enable-colocation true --drivers "mysolver|http://localhost:11088/mysolver"

What is "mysolver" here? I thought this was my solver, but after running it I think it is what you call "solver engine" in step 3 (which is not my solver engine)? If so, what should I set "mysolver" to? It can't be the "solver-name" inside the toml, since these have spaces.

  2. Launch a driver: cargo run -p driver -- --config driver.config.toml --ethrpc <YOUR_ETH_RPC>

I can't find any driver.config.toml in any directory of the services repo. Am I missing it?

  3. Launch a solver engine pointing at your endpoint: cargo run -p solvers -- legacy --config ./crates/solvers/config/example.legacy.toml

This is clear.


fleupold commented Dec 1, 2023

What is "mysolver" here?

It's a placeholder for the name of the solver engine you are spawning in step 3. This one connects to your Quasilabs API. The colocated setup looks roughly like this:

autopilot -> driver -> solver engine -> legacy solver (Quasilabs)

You can call mysolver whatever you want, but you need to make sure the autopilot connects to http://driver_ip:driver_port/solver_name and that the driver contains a solver_name entry forwarding traffic to http://solver_engine_ip:solver_engine_port.
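
For illustration, such an entry might look roughly like this in the driver's TOML config (field names guessed from the example files, so double-check against the repo):

    # Hypothetical sketch; verify the schema against the example .toml files.
    [[solver]]
    name = "mysolver"                  # the solver_name the autopilot addresses
    endpoint = "http://localhost:7872" # where your solver engine listens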

I can't find any driver.config.toml in any directory of the services repo. Am I missing it?


marcovc commented Dec 1, 2023

@fleupold Thanks for clarifying, I was able to run the full stack. Well, sort of: it seems the solver engine makes too many requests per second:

0: RPC error: Error { code: ServerError(429), message: "Your app has exceeded its compute units per second capacity. If you have retries enabled, you can safely ignore this message. If not, check out https://docs.alchemy.com/reference/throughput", data: None }

Also I was wondering, since I don't have access to my solver's private key, will the (local) driver/solver engine be able to simulate the solutions I'm going to send?


marcovc commented Dec 1, 2023

I've exhausted both my Infura and my Alchemy usage for the day after using the driver for 5 minutes :)

Another question: the time_limit I am getting for solving is on the order of 2 seconds. I found the "--solve-deadline" parameter in the component of step 1 (the autopilot?), but that is set to 15.

Anyway, although it is very valuable to get to know the new stack, maybe this is not the place to ask for help. Since it seems it is not so easy to set up, I guess I'll open another issue or contact you on Telegram.


fleupold commented Dec 4, 2023

cc @mfw78, who is working on making it much easier to run the whole system locally end to end.


marcovc commented Dec 4, 2023

Do you think we can merge this in the meantime?


fleupold commented Dec 4, 2023

Sure, would you mind serving traffic in staging this week then to test this change?


marcovc commented Dec 5, 2023

I just ran

cargo clippy --locked --workspace --all-features --all-targets -- -D warnings

and it seems there are no changes to commit (?)

My cargo version seems pretty recent:
cargo 1.74.0 (ecb9851af 2023-10-18)


fleupold commented Dec 6, 2023

You can see what it complains about in https://github.com/cowprotocol/services/actions/runs/7100150555/job/19362557587?pr=2077 and you should be able to reproduce it locally with these tool versions:

cargo 1.74.0 (ecb9851af 2023-10-18)
rustfmt 1.7.0-nightly (e9013ac 2023-12-05)
clippy 0.1.74 (79e9716 2023-11-13)

cargo +nightly fmt --all -- --check

fleupold enabled auto-merge (squash) December 6, 2023 10:44
fleupold merged commit 3b6403f into cowprotocol:main Dec 6, 2023
8 checks passed
github-actions bot locked and limited conversation to collaborators Dec 6, 2023