Skip to content

Randomize user_channel_id for inbound channels #1790

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

tnull
Copy link
Contributor

@tnull tnull commented Oct 21, 2022

Previously, all inbound channels defaulted to a user_channel_id of 0, which didn't allow for them being discerned on that basis. Here, we simply randomize the identifier to fix this and enable the use of user_channel_id as a true identifier for channels (assuming an equally reasonable value is chosen for outbound channels and given upon create_channel()).

@codecov-commenter
Copy link

codecov-commenter commented Oct 21, 2022

Codecov Report

Base: 90.77% // Head: 91.67% // Increases project coverage by +0.90% 🎉

Coverage data is based on head (a2616a9) compared to base (505102d).
Patch coverage: 67.90% of modified lines in pull request are covered.

❗ Current head a2616a9 differs from pull request most recent head d458fa8. Consider uploading reports for the commit d458fa8 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1790      +/-   ##
==========================================
+ Coverage   90.77%   91.67%   +0.90%     
==========================================
  Files          87       89       +2     
  Lines       47595    55343    +7748     
  Branches    47595    55343    +7748     
==========================================
+ Hits        43204    50737    +7533     
- Misses       4391     4606     +215     
Impacted Files Coverage Δ
lightning/src/ln/channelmanager.rs 88.39% <51.92%> (+2.98%) ⬆️
lightning/src/util/events.rs 38.66% <90.00%> (+1.04%) ⬆️
lightning/src/ln/channel.rs 90.35% <100.00%> (+1.64%) ⬆️
lightning/src/ln/functional_test_utils.rs 93.46% <100.00%> (ø)
lightning/src/util/ser.rs 93.64% <100.00%> (+1.97%) ⬆️
lightning/src/util/ser_macros.rs 89.09% <100.00%> (+0.28%) ⬆️
lightning/src/chain/mod.rs 66.66% <0.00%> (-1.52%) ⬇️
lightning/src/ln/monitor_tests.rs 99.44% <0.00%> (-0.12%) ⬇️
lightning/src/lib.rs 100.00% <0.00%> (ø)
lightning/src/ln/reorg_tests.rs 100.00% <0.00%> (ø)
... and 25 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@TheBlueMatt
Copy link
Collaborator

Hmm, this is a bit awkward, given we require the user to pass an ID for outbound channel, but it gets picked at random for inbound ones? We run some risk of colliding, even if its not super high. Ideally we'd increment rather than randomize, and keep track of the last one for outbounds, if we want to do this. Do note that users can always set their own incrementing IDs if they do manual channel acceptance.

@TheBlueMatt
Copy link
Collaborator

Oh, no, I guess incrementing is inherintly race-y, we can't do that. Ugh, I guess we can randomize, but I feel really bad doing something that users may rely on (randomization being unique always) and then having it randomly fail. If its okay with your use-case it'd be nice to just have you rely on the manual acceptance, rather than relying on upstream.

@tnull
Copy link
Contributor Author

tnull commented Oct 21, 2022

Hm, but are we really worried about a collision in an 64-bit identifier space for a non security critical feature? Especially since currently the default behavior to have a collision in ~50% of cases? Also, correct me if I'm wrong, but I couldn't find any part of the code where we would rely on the 0 magic value, and hopefully no one else does, too?

So I'd argue randomization is just a plain improvement over the status quo, even though you are correct, there is a negligible chance of collisions. That said, if we were to have a null default value, this should probably be an Option<u64> rather than having a 0 magic value.

@TheBlueMatt
Copy link
Collaborator

Hm, but are we really worried about a collision in an 64-bit identifier space for a non security critical feature?

I would definitely call it "security critical", having users get confused between different channels definitely sounds like a potentially critical issue. That said, maybe we don't need to care? Mentally, my model is always (a) 32-bit -> dont use, (b) 64-bit -> fine for counters, even if a counterparty can cause you to increment it at a high rate, which they can here, (c) 128-bit -> fine if you dont want to care about collisions, (d) 256-bit -> just do it. But, in this case, 64-bit random numbers - if a counterparty is generating random inbound channels to try to cause collision, after 100million channels you still only have a ~0.02-0.03% chance of collisions. Its not impossible, but very very low, maybe sufficient that it will never happen in prod anywhere.

So I'd argue randomization is just a plain improvement over the status quo, even though you are correct, there is a negligible chance of collisions.

I think this is the wrong way of thinking about it - if there is a low-but-possible-edge-case of collisions, we'd rather cause collisions to be the "norm" so that users either handle it or avoid it via manual acceptance. Super rare bugs that could cause issues are worse than making it the "norm" where devs will see it during testing.

@tnull
Copy link
Contributor Author

tnull commented Oct 21, 2022

But, in this case, 64-bit random numbers - if a counterparty is generating random inbound channels to try to cause collision, after 100million channels you still only have a ~0.02-0.03% chance of collisions. Its not impossible, but very very low, maybe sufficient that it will never happen in prod anywhere.

Right, and it's not as if channel creation is a high-frequency action for which we blast through 100million events.

I think this is the wrong way of thinking about it - if there is a low-but-possible-edge-case of collisions, we'd rather cause collisions to be the "norm" so that users either handle it or avoid it via manual acceptance. Super rare bugs that could cause issues are worse than making it the "norm" where devs will see it during testing.

It's not as if we force users to supply their own identifiers, we simply notify them in the docs that the identifiers are all 0.
I'd argue the likelihood of a developer not reading the docs and just running into a bug in production because all inbound having the same identifier is much, much higher that having and actual collision.

@TheBlueMatt
Copy link
Collaborator

TheBlueMatt commented Oct 21, 2022

Right, and it's not as if channel creation is a high-frequency action for which we blast through 100million events.

If there's an attack with duplicate IDs, it absolutely is - a node can send open_channel messages really fast :)

I'd argue the likelihood of a developer not reading the docs and just running into a bug in production because all inbound having the same identifier is much, much higher that having and actual collision.

I don't understand this - if a user relies on the IDs being unique, they won't just hit it in prod, they'll hit it in their third day of testing, at the latest. Collisions you'll never hit in testing.

@tnull
Copy link
Contributor Author

tnull commented Oct 21, 2022

I don't understand this - if a user relies on the IDs being unique, they won't just hit it in prod, they'll hit it in their third day of testing, at the latest.

That's quite optimistic. To me that sounds like the kind of bug that could easily slip through eventually. Also we still could have a note there explaining the risk and that users should roll their own IDs if possible, just that the default would be just a bit saner.

@TheBlueMatt
Copy link
Collaborator

Oh? Getting a second inbound channel from the LSP seems like something that any dev working with an LSP would test?

In any case, maybe all of this just means our user_channel_id abstraction makes no sense. We had a similar one for payments but ended up ripping it out entirely (and eventually, basically, replacing it with PaymentId). I wonder if we shouldn't try to do something similar here - rip out the fields and have some LDK-provided 32-byte value, or an LDK-provided counter, or...?

@G8XSU
Copy link
Contributor

G8XSU commented Oct 21, 2022

I would feel much more comfortable here if its something normally used as unique identifier in high scale systems, for example something like uuid which is 128-bit and regularly used as key in database systems at very high scale.

@tnull
Copy link
Contributor Author

tnull commented Oct 24, 2022

Alright, so why not simply switch the user_channel_id to a u128 and randomize it? This would allow users to fit a UUID in there if the wanted, and to quote Matt:

(c) 128-bit -> fine if you dont want to care about collisions

@TheBlueMatt
Copy link
Collaborator

I'm fine with that. Sadly its not "trivially backwards compatible" because TLV reads must read the full expected byte count, so we'll need to write a separate "high bits" TLV.

@tnull
Copy link
Contributor Author

tnull commented Oct 25, 2022

Sadly its not "trivially backwards compatible" because TLV reads must read the full expected byte count, so we'll need to write a separate "high bits" TLV.

Yeah, figured so too, which is why there is no mention of "trivially backwards compatible" in above post anymore 😁

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch 2 times, most recently from 43403bf to 1150480 Compare October 25, 2022 09:33
@TheBlueMatt
Copy link
Collaborator

Ah, I was responding to the email/initial copy, which was edited out from under me :)

@valentinewallace
Copy link
Contributor

I think this fixes fuzz CI:

diff --git a/fuzz/src/full_stack.rs b/fuzz/src/full_stack.rs
index 7edba558..322b1480 100644
--- a/fuzz/src/full_stack.rs
+++ b/fuzz/src/full_stack.rs
@@ -404,7 +404,7 @@ pub fn do_test(data: &[u8], logger: &Arc<dyn Logger>) {
        // Adding new calls to `KeysInterface::get_secure_random_bytes` during startup can change all the
        // keys subsequently generated in this test. Rather than regenerating all the messages manually,
        // it's easier to just increment the counter here so the keys don't change.
-       keys_manager.counter.fetch_sub(2, Ordering::AcqRel);
+       keys_manager.counter.fetch_sub(3, Ordering::AcqRel);
        let our_id = PublicKey::from_secret_key(&Secp256k1::signing_only(), &keys_manager.get_node_secret(Recipient::Node).unwrap());
        let network_graph = Arc::new(NetworkGraph::new(genesis_block(network).block_hash(), Arc::clone(&logger)));
        let gossip_sync = Arc::new(P2PGossipSync::new(Arc::clone(&network_graph), None, Arc::clone(&logger)));

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from 1150480 to eacf4ef Compare October 26, 2022 15:48
@tnull
Copy link
Contributor Author

tnull commented Oct 26, 2022

I think this fixes fuzz CI:
...

Thanks, I should start to remember that. 🙏

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after squash

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from eacf4ef to d26e4b5 Compare October 26, 2022 16:49
@tnull
Copy link
Contributor Author

tnull commented Oct 26, 2022

Squashed commits.

@TheBlueMatt
Copy link
Collaborator

LGTM, feel free to squash.

@@ -463,6 +463,7 @@ macro_rules! impl_writeable_primitive {
}
}

impl_writeable_primitive!(u128, 16);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, so we should remove this - note that you broke backwards compat on the ChannelDetails serialization. It'd be very nice to be able to avoid breaking out the macro for this, though...Maybe we define a new macro read type that's, like, custom_adapter and has a conversion method? Ugh...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMK if you want me to take a look at this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on it, will give an update ASAP. Still not sure if it won't be easier to break the macro though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline I explored a number of approaches, e.g., utilizing a custom adapter in conjunction with handing through a decode_custom_tlv function. They seemed to be almost working on the decoding end (but don't really), and the encoding end is even trickier. Open for any suggestions how to move forward on this, otherwise I now broke the macro and now do custom de/ser as of 3a7bd26.

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from 80699d9 to d06e17b Compare November 8, 2022 09:05
@tnull
Copy link
Contributor Author

tnull commented Nov 8, 2022

Rebased on main after #1743 was merged.

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from d06e17b to 5093eba Compare November 8, 2022 09:25
@tnull tnull added this to the 0.0.113 milestone Nov 8, 2022
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thought about it more, I don't think we should try to shove the whole split-int thing into the broader impl_writeable_tlv_based macro, but we I think there's at least one option for cleaning this up below.

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from 3a7bd26 to 8899a83 Compare November 15, 2022 13:58
We introduce a new macro that inits and reads tlv fields and DRY up
`impl_writeable_tlv_based` and other macros.
Previously, all inbound channels defaulted to a `user_channel_id` of 0,
which didn't allow for them being discerned on that basis. Here, we
simply randomize the identifier to fix this and enable the use of
`user_channel_id` as a true identifier for channels (assuming an equally
reasonable value is chosen for outbound channels and given upon
`create_channel()`).
@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from 8899a83 to a2616a9 Compare November 15, 2022 14:10
@TheBlueMatt
Copy link
Collaborator

There are still a handful of incorrect docs in events.rs that still says user_channel_id will be 0 for inbound channels. Otherwise this looks basically good to me.

@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from 7371a52 to d458fa8 Compare November 15, 2022 19:14
@tnull
Copy link
Contributor Author

tnull commented Nov 15, 2022

There are still a handful of incorrect docs in events.rs that still says user_channel_id will be 0 for inbound channels. Otherwise this looks basically good to me.

Whoops, updated the docs.

@TheBlueMatt
Copy link
Collaborator

Feel free to squash, IMO.

We increase the `user_channel_id` type from `u64` to `u128`. In order to
maintain backwards compatibility, we have to de-/serialize it as two
separate `u64`s in `Event` as well as in the `Channel` itself.
@tnull tnull force-pushed the 2022-10-inbound-user-channel-id-randomization branch from d458fa8 to dc3ff54 Compare November 15, 2022 19:41
@tnull
Copy link
Contributor Author

tnull commented Nov 15, 2022

Squashed without further changes.

@@ -632,13 +632,13 @@ pub enum Event {
/// The `user_channel_id` value passed in to [`ChannelManager::create_channel`] for outbound
/// channels, or to [`ChannelManager::accept_inbound_channel`] for inbound channels if
/// [`UserConfig::manually_accept_inbound_channels`] config flag is set to true. Otherwise
/// `user_channel_id` will be 0 for an inbound channel.
/// `user_channel_id` will be randomized for an inbound channel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but could say that the version it starts being randomized in

Copy link
Contributor Author

@tnull tnull Nov 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will make sure to include it in a follow-up, probably when having a look at #1800!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #1855.

@TheBlueMatt
Copy link
Collaborator

Gonna merge, will let @tnull tackle #1790 (comment) in a followup if desired.

@TheBlueMatt TheBlueMatt merged commit 8d8ee55 into lightningdevkit:main Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants