Skip to content

Commit

Permalink
Migrate bad token detection to driver (#3156)
Browse files Browse the repository at this point in the history
# Description
Historically all solvers were maintained in-house and were not very
sophisticated. That meant they didn't support more exotic tokens (e.g.
fee on transfer). These tokens regularly caused alerts because they
couldn't be handled correctly.
That's why we introduced automatic bad token detection. Since none of
the solvers supported these tokens at the time bad token detection was
added on a protocol level (i.e. orderbook / autopilot).

Since then the situation changed quite a bit. There are now a lot of
external solvers and some of them claim to support these tokens now. In
order to allow users to trade these tokens the bad token detection
should no longer happen on the protocol level but rather each solver
should determine on it's own whether a token is supported.

To ease the transition we move the current `trace_callMany` based
implementation in the driver.

# Changes
* adjusted bad token detection in `shared` to strip out logic to find
token owners since we can assume in the driver that the user has the
required balances for the trade
* implemented `bad_tokens` module with top level `Detector` struct that
wraps multiple strategies
* did not use any fancy abstractions yet to see what's actually needed
and what works
* hooked up all the config logic to instantiate the new components
* 1 shared instance for simulation based bad token detection since that
can be shared across all solvers
* 1 extra instance per solver to accommodate strategies specific to each
solver (e.g. heuristic based detection is planned as a follow up)
* updated the e2e tests to use the new component (although it doesn't do
anything because `anvil` doesn't support `trace_callMany` - but at least
it doesn't cause any issues either 🤷 )

### Performance Considerations

Possibly the majority of solvers will filter orders with the simulation
detector so I created 1 instance for all to share. It uses a `DashMap`
under the hood to reduce contention when many solver access it at the
same time.
It also uses `RequestSharing` to avoid doing duplicate work.
Before doing any heavy operations we first check the cache if the needed
data is already there.

## How to test
`anvil` doesn't support `trace_callMany` so I'll have to implement a
forked e2e test. This will happen in a separate PR to reduce the scope
of this one.

---------

Co-authored-by: MartinquaXD <[email protected]>
  • Loading branch information
m-lord-renkse and MartinquaXD authored Dec 20, 2024
1 parent 796e7bd commit 1451574
Show file tree
Hide file tree
Showing 21 changed files with 500 additions and 41 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ bigdecimal = "0.3"
cached = { version = "0.49.3", default-features = false }
chrono = { version = "0.4.38", default-features = false }
clap = { version = "4.5.6", features = ["derive", "env"] }
dashmap = "6.1.0"
derivative = "2.2.0"
derive_more = { version = "1.0.0", features = ["full"] }
ethcontract = { version = "0.25.7", default-features = false, features = ["aws-kms"] }
Expand Down
8 changes: 4 additions & 4 deletions crates/autopilot/src/run.rs
Original file line number Diff line number Diff line change
Expand Up @@ -274,16 +274,16 @@ pub async fn run(args: Arguments) {

let trace_call_detector = args.tracing_node_url.as_ref().map(|tracing_node_url| {
CachingDetector::new(
Box::new(TraceCallDetector {
web3: shared::ethrpc::web3(
Box::new(TraceCallDetector::new(
shared::ethrpc::web3(
&args.shared.ethrpc,
&http_factory,
tracing_node_url,
"trace",
),
eth.contracts().settlement().address(),
finder,
settlement_contract: eth.contracts().settlement().address(),
}),
)),
args.shared.token_quality_cache_expiry,
args.shared.token_quality_cache_prefetch_time,
)
Expand Down
1 change: 1 addition & 0 deletions crates/driver/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ axum = { workspace = true }
bigdecimal = { workspace = true }
chrono = { workspace = true, features = ["clock"], default-features = false }
cow-amm = { path = "../cow-amm" }
dashmap = { workspace = true }
derive_more = { workspace = true }
ethabi = "18.0"
ethereum-types = { workspace = true }
Expand Down
2 changes: 1 addition & 1 deletion crates/driver/src/domain/competition/auction.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ pub struct Auction {
/// See the [`Self::id`] method.
id: Option<Id>,
/// See the [`Self::orders`] method.
orders: Vec<competition::Order>,
pub(crate) orders: Vec<competition::Order>,
/// The tokens that are used in the orders of this auction.
tokens: Tokens,
gas_price: eth::GasPrice,
Expand Down
77 changes: 77 additions & 0 deletions crates/driver/src/domain/competition/bad_tokens/cache.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
use {
crate::domain::{competition::bad_tokens::Quality, eth},
dashmap::DashMap,
std::{
sync::Arc,
time::{Duration, Instant},
},
};

/// Cache keeping track of whether or not a token is considered supported or
/// not. Internally reference counted for cheap clones and easy sharing.
/// Stores a map instead of a set to not recompute the quality of good tokens
/// over and over.
/// Evicts cached value after a configurable period of time.
#[derive(Clone)]
pub struct Cache(Arc<Inner>);

struct Inner {
cache: DashMap<eth::TokenAddress, CacheEntry>,
/// entries older than this get ignored and evicted
max_age: Duration,
}

struct CacheEntry {
/// when the decision on the token quality was made
last_updated: Instant,
/// whether the token is supported or not
quality: Quality,
}

impl Cache {
/// Creates a new instance which evicts cached values after a period of
/// time.
pub fn new(max_age: Duration) -> Self {
Self(Arc::new(Inner {
max_age,
cache: DashMap::default(),
}))
}

/// Updates whether or not a token should be considered supported.
pub fn update_quality(&self, token: eth::TokenAddress, quality: Quality, now: Instant) {
self.0
.cache
.entry(token)
.and_modify(|value| {
if quality == Quality::Unsupported
|| now.duration_since(value.last_updated) > self.0.max_age
{
// Only update the value if the cached value is outdated by now or
// if the new value is "Unsupported". This means on conflicting updates
// we err on the conservative side and assume a token is unsupported.
value.quality = quality;
}
value.last_updated = now;
})
.or_insert_with(|| CacheEntry {
quality,
last_updated: now,
});
}

pub fn evict_outdated_entries(&self) {
let now = Instant::now();
self.0
.cache
.retain(|_, value| now.duration_since(value.last_updated) < self.0.max_age);
}

/// Returns the quality of the token if the cached value has not expired
/// yet.
pub fn get_quality(&self, token: &eth::TokenAddress, now: Instant) -> Option<Quality> {
let token = self.0.cache.get(token)?;
let still_valid = now.duration_since(token.last_updated) > self.0.max_age;
still_valid.then_some(token.quality)
}
}
11 changes: 11 additions & 0 deletions crates/driver/src/domain/competition/bad_tokens/metrics.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
use {super::Quality, crate::domain::eth};

#[derive(Default)]
pub struct Detector;

impl Detector {
pub fn get_quality(&self, _token: eth::TokenAddress) -> Option<Quality> {
// TODO implement a reasonable heuristic
None
}
}
134 changes: 134 additions & 0 deletions crates/driver/src/domain/competition/bad_tokens/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
use {
crate::domain::{competition::Auction, eth},
futures::future::join_all,
itertools::{Either, Itertools},
std::{collections::HashMap, fmt, time::Instant},
};

pub mod cache;
pub mod metrics;
pub mod simulation;

#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum Quality {
/// Solver is likely to produce working solutions when computing
/// routes for this token.
Supported,
/// Solver will likely produce failing solutions when computing
/// routes for this token. This can have many reasons:
/// * fees on transfer
/// * token enforces max transfer amount
/// * trader is deny listed
/// * bugs in the solidity compiler make it incompatible with the settlement
/// contract - see <https://github.com/cowprotocol/services/pull/781>
/// * probably tons of other reasons
Unsupported,
}

#[derive(Default)]
pub struct Detector {
/// manually configured list of supported and unsupported tokens. Only
/// tokens that get detected incorrectly by the automatic detectors get
/// listed here and therefore have a higher precedence.
hardcoded: HashMap<eth::TokenAddress, Quality>,
simulation_detector: Option<simulation::Detector>,
metrics: Option<metrics::Detector>,
}

impl Detector {
/// Hardcodes tokens as (un)supported based on the provided config. This has
/// the highest priority when looking up a token's quality.
pub fn new(config: HashMap<eth::TokenAddress, Quality>) -> Self {
Self {
hardcoded: config,
..Default::default()
}
}

/// Enables detection of unsupported tokens via simulation based detection
/// methods.
pub fn with_simulation_detector(&mut self, detector: simulation::Detector) -> &mut Self {
self.simulation_detector = Some(detector);
self
}

/// Enables detection of unsupported tokens based on heuristics.
pub fn with_heuristic_detector(&mut self) -> &mut Self {
self.metrics = Some(metrics::Detector);
self
}

/// Removes all unsupported orders from the auction.
pub async fn filter_unsupported_orders_in_auction(&self, mut auction: Auction) -> Auction {
let now = Instant::now();

let token_quality_checks = auction.orders.into_iter().map(|order| async move {
let sell = self.get_token_quality(order.sell.token, now);
let buy = self.get_token_quality(order.buy.token, now);
match (sell, buy) {
// both tokens supported => keep order
(Some(Quality::Supported), Some(Quality::Supported)) => Either::Left(order),
// at least 1 token unsupported => drop order
(Some(Quality::Unsupported), _) | (_, Some(Quality::Unsupported)) => {
Either::Right(order.uid)
}
// sell token quality is unknown => keep order if token is supported
(None, _) => {
let Some(detector) = &self.simulation_detector else {
// we can't determine quality => assume order is good
return Either::Left(order);
};
let quality = detector.determine_sell_token_quality(&order, now).await;
match quality {
Some(Quality::Supported) => Either::Left(order),
_ => Either::Right(order.uid),
}
}
// buy token quality is unknown => keep order (because we can't
// determine quality and assume it's good)
(_, None) => Either::Left(order),
}
});
let (supported_orders, removed_uids): (Vec<_>, Vec<_>) = join_all(token_quality_checks)
.await
.into_iter()
.partition_map(std::convert::identity);

auction.orders = supported_orders;
if !removed_uids.is_empty() {
tracing::debug!(orders = ?removed_uids, "ignored orders with unsupported tokens");
}

if let Some(detector) = &self.simulation_detector {
detector.evict_outdated_entries();
}

auction
}

fn get_token_quality(&self, token: eth::TokenAddress, now: Instant) -> Option<Quality> {
if let Some(quality) = self.hardcoded.get(&token) {
return Some(*quality);
}

if let Some(detector) = &self.simulation_detector {
if let Some(quality) = detector.get_quality(&token, now) {
return Some(quality);
}
}

if let Some(metrics) = &self.metrics {
return metrics.get_quality(token);
}

None
}
}

impl fmt::Debug for Detector {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("Detector")
.field("hardcoded", &self.hardcoded)
.finish()
}
}
Loading

0 comments on commit 1451574

Please sign in to comment.