Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop reference to Self in GossipVerifier::new #3369

Open
tnull opened this issue Oct 15, 2024 · 13 comments · May be fixed by #3432
Open

Drop reference to Self in GossipVerifier::new #3369

tnull opened this issue Oct 15, 2024 · 13 comments · May be fixed by #3432
Milestone

Comments

@tnull
Copy link
Contributor

tnull commented Oct 15, 2024

In GossipVerifier::new we currently force an argument of type signature of gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Self, L>>, which is not only terrible in it's own right, but also enforces the type signature of everything touching P2PGossipSync at compile time.

In an LDK Node context this would mean bubbling up generics all the way to Node, which we cannot and won't do, and it heavily conflicts with the ability to choose the GossipSync (P2P vs RGS) at runtime. As a consequence, this prohibits LDK Node (and I assume other users too) from using the GossipVerifier.

We really need to drop this circular type dependency, which means in turn dropping the U: UtxoLookup generic from P2PGossipSync and everything touching it and replace it with dynamic dispatching it at runtime, i.e., Arc<dyn UtxoLookup + Send + Sync> (although we possibly might get away with dropping the additional bounds here, we'll see).

@TheBlueMatt
Copy link
Collaborator

but also enforces the type signature of everything touching P2PGossipSync at compile time.

I mean this is generally true of Rust? If you have non-dynamic dispatch you have to specify the types of the things you're calling.

We really need to drop this circular type dependency,

I'm not really sure how, sadly. We do need things to be somewhat circular here, and short of intermediating responses through an event interface driven by the BP (or PeerManager) I'm not really sure how to do that.

replace it with dynamic dispatching it at runtime

There's no need to do that in LDK itself, any downstream application that doesn't want to deal with this can just write the types out as dynamic dispatch and they should be fine?

@tnull
Copy link
Contributor Author

tnull commented Oct 16, 2024

I mean this is generally true of Rust? If you have non-dynamic dispatch you have to specify the types of the things you're calling.

Right, but it makes these (circular) types really a pain to work with.

I'm not really sure how, sadly. We do need things to be somewhat circular here, and short of intermediating responses through an event interface driven by the BP (or PeerManager) I'm not really sure how to do that.

Right, the add_utxo_lookup method is an unfortunate necessity already, but we could at least drop the circular dependency in the generics.

There's no need to do that in LDK itself, any downstream application that doesn't want to deal with this can just write the types out as dynamic dispatch and they should be fine?

Mhh, I kind of disagree, at least in the LDK Node setting where we need to be able to switch between a P2PGossipSync and RapidGossipSync variants of lightning-background-processor::GossipSync at runtime, while also allowing some None value depending on the chain source (i.e., we only want to verify gossip for bitcoind). When I leave the type signature dyn UtxoLookup + Send + Sync, the compiler will shout at me at the point where I actually try to construct the GossipVerifier and hand it the concrete P2PGossipSync type:

72  |                         Arc::clone(gossip_sync),
    |                         ---------- ^^^^^^^^^^^ expected `GossipVerifier<RuntimeSpawner, ..., ...>`, found `Arc<dyn UtxoLookup + Send + Sync>`
    |                         |
    |                         arguments to this function are incorrect
    |
    = note: expected reference `&Arc<lightning::routing::gossip::P2PGossipSync<Arc<lightning::routing::gossip::NetworkGraph<_>>, GossipVerifier<RuntimeSpawner, Arc<RpcClient>, _>, _>>`
               found reference `&Arc<P2PGossipSync<Arc<NetworkGraph<Arc<FilesystemLogger>>>, Arc<dyn UtxoLookup + Send + Sync>, Arc<FilesystemLogger>>>`

After banging my head against the typechecker for a few hours and running into more issues of similar nature I concluded we should just drop the generic, which should make it much easier to deal with (or even possible) .

@TheBlueMatt
Copy link
Collaborator

When I leave the type signature dyn UtxoLookup + Send + Sync, the compiler will shout at me at the point where I actually try to construct the GossipVerifier and hand it the concrete P2PGossipSync type:

Can you provide a reproducer? That shouldn't be the case, AFAIU (though you may have to do an explicit cast when building the Arc).

@tnull
Copy link
Contributor Author

tnull commented Oct 16, 2024

Can you provide a reproducer? That shouldn't be the case, AFAIU (though you may have to do an explicit cast when building the Arc).

Sure. I did multiple attempts from different angles, but pushed the most-recent attempt in the last commit here: https://github.com/tnull/ldk-node/tree/2024-10-add-bitcoind-support-gossipverify

@tnull
Copy link
Contributor Author

tnull commented Oct 29, 2024

@TheBlueMatt Did you get a chance to look at the reproducer? Somewhat independently from that, would you be fine with going ahead with the approach outlined above? @vincenzopalazzo had previously indicated he'd be interested in picking this up for the next release.

@TheBlueMatt
Copy link
Collaborator

With this diff upstream:

diff --git a/lightning-block-sync/src/gossip.rs b/lightning-block-sync/src/gossip.rs
index f075cf7b2..083156baa 100644
--- a/lightning-block-sync/src/gossip.rs
+++ b/lightning-block-sync/src/gossip.rs
@@ -144,7 +144,7 @@ pub struct GossipVerifier<
 {
        source: Blocks,
        peer_manager_wake: Arc<dyn Fn() + Send + Sync>,
-       gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Self, L>>,
+       gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Arc<Self>, L>>,
        spawn: S,
        block_cache: Arc<Mutex<VecDeque<(u32, Block)>>>,
 }
@@ -162,7 +162,7 @@ where
        /// This is expected to be given to a [`P2PGossipSync`] (initially constructed with `None` for
        /// the UTXO lookup) via [`P2PGossipSync::add_utxo_lookup`].
        pub fn new<APM: Deref + Send + Sync + Clone + 'static>(
-               source: Blocks, spawn: S, gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Self, L>>,
+               source: Blocks, spawn: S, gossiper: Arc<P2PGossipSync<Arc<NetworkGraph<L>>, Arc<Self>, L>>,
                peer_manager: APM,
        ) -> Self
        where

this diff on top of your branch works:

diff --git a/src/types.rs b/src/types.rs
index 9fae37e..b021b30 100644
--- a/src/types.rs
+++ b/src/types.rs
@@ -89,7 +89,7 @@ pub(crate) type Scorer = ProbabilisticScorer<Arc<Graph>, Arc<FilesystemLogger>>;

 pub(crate) type Graph = gossip::NetworkGraph<Arc<FilesystemLogger>>;

-pub(crate) type UtxoLookup = dyn lightning::routing::utxo::UtxoLookup + Send + Sync;
+pub(crate) type UtxoLookup = lightning_block_sync::gossip::GossipVerifier<crate::gossip::RuntimeSpawner, Arc<lightning_block_sync::rpc::RpcClient>, Arc<FilesystemLogger>>;

 pub(crate) type P2PGossipSync =
        lightning::routing::gossip::P2PGossipSync<Arc<Graph>, Arc<UtxoLookup>, Arc<FilesystemLogger>>;

@tnull
Copy link
Contributor Author

tnull commented Oct 30, 2024

With this diff upstream:
..
this diff on top of your branch works:
..

Nice, thanks for looking into it, that's a great first step!

However note that this would still require us to concretize the UtxoLookup type at compile time. While this would be fine right now where we our only gossip verifying chain source is the bitcoind RPC backend, we might need to keep this dynamic so that we can change it at runtime depending on the configured chain source, e.g., if we'd also want to add a REST variant, or possibly for BIP157/158, although the latter might be too inefficient anyways?

I guess at that point we could write a concrete adaptor implementing all the necessary traits and switching it internally, but given that LDK Node is likely not the only user struggling with this, why not just go the additional mile, simplify the generics, and use dynamic dispatch instead?

@TheBlueMatt
Copy link
Collaborator

Honestly maybe we should just bite the bullet and move the outbound message queue into NetworkGraph from P2PGossipSync and we can remove all the passing around of a reference to the message sender. Its pretty bad code layout but it would remove all the type insanity here.

@tnull
Copy link
Contributor Author

tnull commented Nov 15, 2024

Honestly maybe we should just bite the bullet and move the outbound message queue into NetworkGraph from P2PGossipSync and we can remove all the passing around of a reference to the message sender.

Ugh, that would be highly confusing tbh. If we need to do such a move, maybe it should be to a new QueueManager or whatever object that all can reference?

That said, maybe I should open a PR with the Arc<Self> diff to ensure we ship something that allows LDK Node to verify gossip after the next LDK release, as I imagine this refactor could turn out to be a larger endeavor?

@TheBlueMatt
Copy link
Collaborator

I'm not quite sure why that would be confusing? It wouldn't change the public API at all (aside from removing the need for the reference back to the P2PGossip entirely).

@tnull
Copy link
Contributor Author

tnull commented Nov 15, 2024

I'm not quite sure why that would be confusing? It wouldn't change the public API at all (aside from removing the need for the reference back to the P2PGossip entirely).

To me NetworkGraph is our data model for the graph, so storing internal message queues in there seems unrelated? I think we moved away from the network graph being just a data model in the node counters PR, but storing message queues seems like an entirely different affair altogether?

@TheBlueMatt
Copy link
Collaborator

To me NetworkGraph is our data model for the graph, so storing internal message queues in there seems unrelated? ...but storing message queues seems like an entirely different affair altogether?

Yes, indeed, its pretty gross for that reason, but at least its not publicly no longer "just a data model". Its not ideal but in terms of cleaning up the public API its a huge win...

I think we moved away from the network graph being just a data model in the node counters PR

Huh? The counters are just a part of the data model. Which is a weird data model, but its just a part of the data model.

@tnull tnull linked a pull request Nov 28, 2024 that will close this issue
@tnull
Copy link
Contributor Author

tnull commented Nov 28, 2024

With this diff upstream:

Now opened #3432 with this diff to make sure we ship a solution with the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants