Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(iroh-net): Add a Watchable struct for use in the Endpoint API #2806

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

flub
Copy link
Contributor

@flub flub commented Oct 15, 2024

Description

This implements a Watchable struct and a Watcher which provides access to the watched value in several ways, including some streams.

Breaking Changes

Not yet, adopting it on the API will break things.

Notes & open questions

  • Currently I think this is racy, see the TODO items in the code.
  • At the moment this emits unused compiler warnings.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

matheus23 and others added 2 commits October 1, 2024 17:41
Copy link

github-actions bot commented Oct 15, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2806/docs/iroh/

Last updated: 2024-11-21T13:41:33Z

@mepi262
Copy link

mepi262 commented Nov 8, 2024

@flub
Any update on this pull request?

@flub
Copy link
Contributor Author

flub commented Nov 8, 2024

@mepi262 the current implementation is racy and needs fixing.

What makes you interested in this?

@mepi262
Copy link

mepi262 commented Nov 9, 2024

The reason in which I'm interested is followings.

  • I'm very much looking forward to the iroh v1.0.0.
  • This pull request is related to the issue, which is contained milestone v1.0.0
  • I'm worry about whether this pull request is merged, because this pull request has not been active since last month.
    • People forget memory
    • Merging pull request is difficult when time has gone, because of risk of conflict.

I wish you good health, good luck and success.

Comment on lines 62 to 72
// TODO(flub): Pretty sure this can miss a write because the epoch and wakers are
// separate:
// - thread 1 runs poll_next
// - thread 2 runs set
// 1. thread 1: load epoch
// 2. thread 2: lock value, replace value, unlock value
// 3. thread 2: store epoch
// 4. thread 2: lock wakers, drain wakers, unlock wakers
// 5. thread 1: lock wakers, install waker, unlock wakers
//
// I believe the epoch and wakers need to be stored in the same RwLock.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're completely right about the wakers possibly missing a wakeup once a value is available.
I've tested this implementation of Watchable with loom, and can confirm that it gets stuck waiting for watcher.initialized() in some cases, even though there's a watchable.set() call.
(Interestingly, reproducing it practically requires loom usage, since it's very timing-sensitive and doesn't reproduce easily with running the same test case in a tight loop a couple thousand times.)

You're incorrect about the fact that the epoch and wakers needing to be stored on the same RwLock.

Similar to this example, after writing the waker, you can re-check the epoch to see if there was a write in-between.

Adding this second check, it passes loom.

@matheus23
Copy link
Contributor

Made some changes:

  • Added loom to watchable.rs (conditionally on the iroh_loom cfg)
  • Added a loom-based test that was failing in exactly the way @flub predicted
  • Fixed the race condition
  • Changed wakers from using a RwLock to using a Mutex, because we were only using RwLock::write
  • Removed Either by making initialized simpler

For anyone who's interested in running the loom test, you get some fun output like this:

LOOM_LOG=trace RUSTFLAGS="--cfg iroh_loom" cargo test --package iroh-net --lib -- util::watchable::tests::test_initialized_always_resolves --exact --nocapture

@matheus23
Copy link
Contributor

matheus23 commented Nov 21, 2024

Should we (well, I) maybe address #2860 (comment) in here too?

What's the relationship between Endpoint::conn_type_stream and Connection::remote_address? It seems like remote_address should be deprecated and replaced with a method that returns a ConnectionType based on the current state of the connection.

Copy link
Contributor Author

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Any reason this is still draft?

Comment on lines +25 to +29
#[derive(thiserror::Error, Debug)]
pub enum Error {
#[error("Watch lost connection to underlying Watchable, it was dropped")]
WatchableClosed,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any point to having this as an enum instead of struct WatchableClosed?

I assume it would be extensibility, but then it should be marked as non-exhaustive? Doing that also makes using it harder though. So my vote is to make this a struct WatchableClosed for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah thought about that, too. Thanks for chiming in. I'll call it Disconnected similar to what it's called in the watchable crate.

Comment on lines +36 to +37
/// Note that the `Option` is only there to allow initialization.
/// Once initialized the value can never be cleared again.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment now belongs in the State struct?

if let Some(value) = state.value.clone() {
// Once initialized our Option is never set back to None, but nevertheless
// this code is safer without relying on that invariant.
return Poll::Ready((epoch, value));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to our installed waker now? We might get a spurious wakeup I guess? Which is probably the easiest way to go about this rather than trying to remove our waker again?

impl<T: Clone + Eq> Watcher<T> {
/// Returns the currently held value.
///
/// Returns `None` if the value was not set yet.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns `None` if the value was not set yet.
/// Returns `None` if the value was not yet initialised.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You keep spelling "initialized" wrong 😜

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, it's so hard!

Ok(shared.get())
}

/// Returns a future completing once the value is initialized.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns a future completing once the value is initialized.
/// Returns a future completing once the value is initialized.
///
/// The future will complete immediately if the value was already initialized.

/// used to operate on the most recent value. If the stream is not yet initialized the
/// first item of the stream will not be readily available.
///
/// Note however that only the last item is stored. If the stream is not polled when an
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Note however that only the last item is stored. If the stream is not polled when an
/// Note however, that only the last item is stored. If the stream is not polled when an

WatchNextFut { watcher: self }
}

/// Returns a stream which will yield an items for the most recent value.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns a stream which will yield an items for the most recent value.
/// Returns a stream which will yield items containing the most recent value.

WatchNextStream { watcher: self }
}

/// Returns a stream which will yield an item for changes to the watched value.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns a stream which will yield an item for changes to the watched value.
/// Returns a stream which will yield items for changes to the watched value.

}

#[derive(Debug)]
#[repr(transparent)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? I thought this was a thing only needed for C APIs.


#[derive(Debug)]
#[repr(transparent)]
pub struct WatchNextFut<'a, T> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a doc comment

@flub
Copy link
Contributor Author

flub commented Nov 22, 2024

Should we (well, I) maybe address #2860 (comment) in here too?

What's the relationship between Endpoint::conn_type_stream and Connection::remote_address? It seems like remote_address should be deprecated and replaced with a method that returns a ConnectionType based on the current state of the connection.

I think this PR just introduces the watchable, follow-up PRs can start using it.

@matheus23
Copy link
Contributor

Looks good! Any reason this is still draft?

I think this PR just introduces the watchable, follow-up PRs can start using it.

Ah okay. Well my plan was to learn from the refactor of making iroh-net use this new watchable.

And I'm already finding things. E.g. do we want the Watchable to have shutdown functionality.

So, I'm still learning!
In any case, I'll split up the PR into watchable introduction & the refactor anyways.

@flub
Copy link
Contributor Author

flub commented Nov 22, 2024

Looks good! Any reason this is still draft?

I think this PR just introduces the watchable, follow-up PRs can start using it.

Ah okay. Well my plan was to learn from the refactor of making iroh-net use this new watchable.

And I'm already finding things. E.g. do we want the Watchable to have shutdown functionality.

So, I'm still learning! In any case, I'll split up the PR into watchable introduction & the refactor anyways.

Makes sense as well. No big deal if you'd like to do this all in one place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

3 participants