-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a deterministic FxHashMap
wrapper
#63713
Comments
Hi @bjorn3 ! |
I dont think somebody else is working on this, so go for it. This wrapper should be put in the |
Hi @bjorn3 do you have name in mind for this wrapper? Also, is it OK if I take some time on this or is this an urgent requirement? |
Not really.
You can probably take some time. |
Semi-related, I'd also like to see an "accumulative" |
…=Mark-Simulacrum data_structures: Add deterministic FxHashMap and FxHashSet wrappers StableMap A wrapper for FxHashMap that allows to `insert`, `remove`, `get`, `get_mut` and convert a hashmap into a sorted vector using the method `into_sorted_vector` but no iteration support. StableSet A wrapper for FxHashSet that allows to `insert`, `remove`, `get` and convert a hashset into a sorted vector using the method `into_sorted_vector` but no iteration support. Addresses issue rust-lang#63713
I was told once that |
Cc @eddyb; I think it was them who told me that (but I might misremember) |
That seems to be correct: https://docs.rs/rustc-hash/1.0.1/src/rustc_hash/lib.rs.html#59-61 |
But then, what is the purpose of the new |
If an entry is added to a |
Yes I think it can. |
+1
Yeah, I guess so. It is just confusing and annoying that adding a single extra element to a |
Fair. Even for the same domain (set of keys), if the sequence of inserts/removes is different, iteration order might differ. |
It would probably still be nice to disallow misusing a hashmap without performance losses, regardless of what that would mean theoretically. |
Sure. And |
How about using LinkedHashMap or IndexMap with |
Note that |
See #47833 (comment) and #65043 for examples of issues caused by relying on the platform-dependent iteration order. |
#64131 added |
Since this issue has gone stale, I can attempt to pick it up. Is that okay, @ibraheemdev? |
@hameerabbasi I have some work done locally, but I think the better approach is to add an internal lint as @Mark-Simulacrum mentioned. If I haven't opened a PR in a week or feel free to pick this up :) |
This came up in #91943 |
Can someone post a summary of what needs to be done? Just change all instances of Or is a lint the preferred way forward? I'll have time to work on this starting 23rd, but I realise it's the end of the year and no one may be available to mentor. |
Miri was also bitten by this, see rust-lang/miri#1937. |
@rustbot claim |
@hameerabbasi, I'd like to see some progress with this. I can't spend a huge amount of time on it -- but if that's not a problem, I'd be happy to do the mentoring. |
@michaelwoerister If you could post a summary of what to do and what files to touch that'd be great! |
OK, I'll try to do that some time next week. |
OK, so the main goal here is to implement a map and a set type that satisfies the Currently HashMap, HashSet, BTreeMap, and BTreeSet all implement HashStable -- however, this actually isn't sound because they expose an observable ordering of their elements but that order is ignored by both their Eq and their HashStable implementations. The consequence is that the query system assumes that a value has not changed (because it still has the same fingerprint provided by HashStable), even though re-running the query would indeed produce a different result. In the worst case this can lead to miscompilations (most of which should be caught by fingerprint verification at this point though). StableMap and StableSet were introduced to remedy this but they don't quite fit the bill yet. They do provide an So, my overall plan of action would be:
These steps can later be repeated for |
Shouldn't the BTree collections be fine, since their iteration order is given by |
I think this part of the comment addresses why
As an add-on: I believe I will have time for this this weekend. I'll see how far I can get. |
Hm, but that same concern could also apply to |
If I understand the problem correctly, the correct solution seems to be excluding certain fields from comparison completely. Perhaps the |
Normally, yes. But in the context of incremental compilation, we'd like to have a stronger form of stability, spanning across multiple compilation sessions. For things that implement Overall the situation there is quite unsatisfactory, especially around "pointer-like" things like It's fair to say that there might be a place for both: a Map/Set type that just takes care of regular iteration order determinism (like BTreeMap) and one that is also suited for having a proper |
I've begun work on this in #94188. |
Mirroring the comment on the sibling issue. I see some work happened in #102698, how is it relevant to this issue. Can this be closed? |
In my opinion, this particular issue can be closed. The deterministic wrappers exist in the form of UnordMap and UnordSet. They haven't been deployed everywhere yet -- but widely enough to say with some confidence that they do the job (together with FxIndexMap and FxIndexSet and the indeterminism lint). The other issue being linked needs to be checked separately though. In #106977 and #108312 I've found that most uses of FxHashMap can be replaced pretty easily with UnordMap (or FxIndexMap if that is a better fit). |
@michaelwoerister thanks for the feedback! Will close this one, please let me know if there are opposing opinions |
It should not provide iteration support, but only insert/remove/get/get_mut and convertion to a sorted vec. We should then use it where possible
This could prevent accidentially causing hashmap related non determinism in most cases.
Originally posted by @bjorn3 in #58281 (comment)
The text was updated successfully, but these errors were encountered: