Skip to content

Commit

Permalink
simplify SourceID Hash (#14800)
Browse files Browse the repository at this point in the history
### What does this PR try to resolve?

Despite being interned `SourceId::Eq` is not a `ptr::eq`. Which in turn
is because `SourceId`s concept of identity is a complete mess. The code
relies on having to IDs that are `Eq` but do not have the same values
for their fields.

As one measure of this `SourceId` has an `impl Hash` which does
something different from `fn full_hash` and `fn stable_hash`. Separately
`SourceIdInner` has a different implementation. Similar levels of
complexity exist for `Eq`. Every one of these `impl`s was added due to a
real bug/issue we've had that needs to stay fixed. Not all of witch are
reproducible enough to have made it into our test suite.

I [have some
ideas](#14665 (comment))
for how to reorganize the types so that this is easier to reason about
and faster. But given the history and the complexity I want to move
extremely carefully.

### How should we test and review this PR?

The test pass, and it's a one line change, but this still needs careful
review.

### Additional information

r? @ehuss I remember you and Alex working very hard to track down most
of these bugs.
  • Loading branch information
weihanglo authored Dec 23, 2024
2 parents cfa27f2 + 9c19032 commit 0c157e0
Showing 1 changed file with 10 additions and 9 deletions.
19 changes: 10 additions & 9 deletions src/cargo/core/source_id.rs
Original file line number Diff line number Diff line change
Expand Up @@ -532,7 +532,10 @@ impl SourceId {
url == CRATES_IO_INDEX || url == CRATES_IO_HTTP_INDEX || is_overridden_crates_io_url(url)
}

/// Hashes `self`.
/// Hashes `self` to be used in the name of some Cargo folders, so shouldn't vary.
///
/// For git and url, `as_str` gives the serialisation of a url (which has a spec) and so
/// insulates against possible changes in how the url crate does hashing.
///
/// For paths, remove the workspace prefix so the same source will give the
/// same hash in different locations, helping reproducible builds.
Expand All @@ -550,7 +553,11 @@ impl SourceId {
return;
}
}
self.hash(into)
self.inner.kind.hash(into);
match self.inner.kind {
SourceKind::Git(_) => (&self).inner.canonical_url.hash(into),
_ => (&self).inner.url.as_str().hash(into),
}
}

pub fn full_eq(self, other: SourceId) -> bool {
Expand Down Expand Up @@ -665,16 +672,10 @@ impl fmt::Display for SourceId {
}
}

/// The hash of `SourceId` is used in the name of some Cargo folders, so shouldn't
/// vary. `as_str` gives the serialisation of a url (which has a spec) and so
/// insulates against possible changes in how the url crate does hashing.
impl Hash for SourceId {
fn hash<S: hash::Hasher>(&self, into: &mut S) {
self.inner.kind.hash(into);
match self.inner.kind {
SourceKind::Git(_) => self.inner.canonical_url.hash(into),
_ => self.inner.url.as_str().hash(into),
}
self.inner.canonical_url.hash(into);
}
}

Expand Down

0 comments on commit 0c157e0

Please sign in to comment.