Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Neater hashing interface #4524

Merged
merged 26 commits into from
Dec 18, 2024
Merged

Neater hashing interface #4524

merged 26 commits into from
Dec 18, 2024

Conversation

widlarizer
Copy link
Collaborator

@widlarizer widlarizer commented Aug 6, 2024

Please read docs/source/yosys_internals/hashing.rst for what the interface is now.

We want to be able to plug-and-play hash functions to improve hashlib structure collision rates and hashing speed. Currently, in some ways, hashes are incorrectly handled: for deep structure hashing, each substructure constructs a hash from scratch, and these hashes are then combined with addition, XOR, sometimes xorshifted to make this less iffy, but overall, this is risky, as it may degrade various hash functions to varying degrees. It seems to me that the correct combination of hashes is to have a hash state that is mutated with each datum hashed in sequence. That's what this PR does.

  • check performance impact since this PR isn't NFC as it changes how structures are hashed
  • clean up things left over from experiments with inheriting from Hashable
  • fix pyosys
  • finish the plugin compatibility part of the doc

@widlarizer
Copy link
Collaborator Author

Cool, I got hit in the face with a downright gcc bug

@widlarizer
Copy link
Collaborator Author

With ibex and jpeg synthesized with ORFS, I'm seeing a 1% performance regression with this PR. This is probably because we're actually using the seed function more directly, with less xorshifting involved. I wonder if a quick swap of the hashing function would change the result. However, I'm also seeing a 1% memory usage improvement with jpeg, which is pretty interesting

@widlarizer
Copy link
Collaborator Author

Passes like extract_fa and opt_dff take 5-10% longer with this change. This is definitely a problem

@povik
Copy link
Member

povik commented Oct 29, 2024

Leaving a note so I don't forget it: We should provide a way for plugins to be compatible with both pre and post this change. I am thinking a define to advertise the new API.

@widlarizer widlarizer force-pushed the emil/hashlib-interface branch from cf5585e to 5584e74 Compare November 14, 2024 13:44
Copy link
Member

@jix jix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the many comments I left, I really like the new hashing interface and think it's a big improvement over the previous one.

I found a bunch of nits, made a few suggestions on how to potentially improve some of the hash_eat implementations, am slightly confused by the current choice of state update in the used hasher implementation and found one instance of unconditionally recomputing an already cached hash.

edit: Oh, and to clarify, I'm fine with deferring my suggestions that would need some more benchmarking to a follow up PR and merge this once everything that's a clear issue right now or a clear benefit to change is addressed.

kernel/drivertools.h Outdated Show resolved Hide resolved
kernel/drivertools.h Show resolved Hide resolved
kernel/drivertools.h Outdated Show resolved Hide resolved
public:
void hash32(uint32_t i) {
state = djb2_xor(i, state);
state = mkhash_xorshift(fudge ^ state);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting to only see a djb hash update in here, but it is followed up by a xorshift. What's the reason for doing both? Without benchmarking I would be guessing that performing a djb hash update and a xorshift update is slower than either alone without necessarily having better overall runtime behavior than the better behaved one of both.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an intermediate state, I found it necessary for lowering collisions and it doesn't cost so much as to cause a measurable regression against main on the ASIC designs I used to check. Without it, DJB2 doesn't have an avalanche property. Also see this observation basically about patterns like hash = mkhash(xorshift(mkhash(a, b)), c)

kernel/hashlib.h Outdated Show resolved Hide resolved
#undef YOSYS_NO_IDS_REFCNT

// the global id string cache
struct RTLIL::IdString
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly moving this turns the diff into a mess. Are there any functional changes to this besides updating the hashing implementation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that rtlil.h both creates per-type partial specializations for hash_top_ops defined in hashlib.h as well as uses data structures defined in hashlib.h. No functional changes other than hash_top_ops and hash_eat.

inline Hasher hash_eat(Hasher h) const {
// TODO hash size
for (auto b : *this)
h.eat(b);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assuming my suggestion of hashing strings in chunks is implemented, the vector of State enums could also be hashed as if it were a string. If this is using a string representation the same is true. I'm assuming this would cause issues when we have two equal consts, one represented as states one as string. I still have to take a closer look at how the new string representation for constants is implemented to say more on this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a string represented as string vs as bit vector to have equal hashes, so we have to go bit by bit. To hash a contiguous chunk, for string represented Const, we'd need to first construct a new bit vector to run it on that. So it doesn't sound like it pays off

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I meant with "I'm assuming this would cause issues when we have two equal consts, one represented as states one as string". I have some ideas on how to compute a hash that processes more than a bit at a time while still agreeing between the two representations. That would be quite a bit more complex though, so probably only worth it if hashing consts is a significant overall cost. Do you happen to have any data on that specifically? In any case, this would be something we can do in a later PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I misread that. This is an interesting challenge. I haven't noticed hashing Consts as a hotspot so far

kernel/rtlil.h Outdated Show resolved Hide resolved
return hash_ops<uintptr_t>::hash_eat((uintptr_t) a, h);
} else if constexpr (std::is_same_v<T, std::string>) {
for (auto c : a)
h.hash32(c);
Copy link
Member

@jix jix Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I wrote for the c string hash_eat below above also applies here, but I initially missed that std::string is handled here

kernel/drivertools.h Outdated Show resolved Hide resolved

inline unsigned int T::hash() const {
Hasher h;
return (unsigned int)hash_eat(h).yield();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, if I compile my plugin against v0.47 or earlier I won't have Hasher

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced this with an interface versioning macro solution

@@ -0,0 +1,153 @@
Hashing and associative data structures in Yosys
------------------------------------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a note that I read this file


DJB2 lacks these properties. Instead, since Yosys hashes large numbers of data
structures composed of incrementing integer IDs, Yosys abuses the predictability
of DJB2 to get lower hash collisions, with regular nature of the hashes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it to get lower hash collisions or to get better locality?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash collisions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this come from observations, or something Claire mentioned was intention? I know some of the primitives were used in a way to get better locality

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from my observations when counting hash collisions in hashlib per std::source_location and clear correlation with extra runtime overhead in some opt and extract_fa passes where the collisions were happening

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, cpython also gives reducing collisions as a reason for using small integer values directly as hash value: https://github.com/python/cpython/blob/7538e7f5696408fa0aa02fce8a413a7dfac76a04/Objects/dictobject.c#L289

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the resource. I'm perfectly fine with hashing an int trivially, but retaining patterns in the output when hashing a combination of them sketches me out. cpython:

>>> bin(hash((1, 2)))
'-0b11000101000100010101000101001111011111101000000110000010111101'
>>> bin(hash((1, 3)))
'-0b1001111111110101001100100011001110110010011111111100101100100'

But for SigBit we get this:

>>> print(djb2_add(1, 2))
35
>>> print(djb2_add(1, 3))
36
>>> print(djb2_add(2, 3))
69

@widlarizer widlarizer force-pushed the emil/hashlib-interface branch from bd783c3 to e028510 Compare November 20, 2024 16:06
@widlarizer
Copy link
Collaborator Author

Thanks @jix and @povik for your reviews of the large amount of code change. I don't think I've left anything unaddressed. Since we're nearing the end of the release cycle, we'll have to target the next one

@widlarizer widlarizer force-pushed the emil/hashlib-interface branch from e028510 to e2f1cf6 Compare November 26, 2024 09:56
@widlarizer widlarizer force-pushed the emil/hashlib-interface branch from e2f1cf6 to 026e9da Compare December 18, 2024 14:09
@widlarizer widlarizer merged commit f6e435f into main Dec 18, 2024
44 checks passed
@povik
Copy link
Member

povik commented Dec 31, 2024

This change hid hash_ptr_ops from the Yosys namespace causing an avoidable API breakage. See povik/yosys-slang@a2e91d4. We can walk that back before the release.

@povik
Copy link
Member

povik commented Dec 31, 2024

See povik/yosys-slang@a2e91d4.

This patch isn’t enough to be compatible post the API change. I am not sure how to declare the dicts in a way which works on both sides of the API change without resorting to preprocessor logic.

@widlarizer
Copy link
Collaborator Author

I could globally add using hashlib but that would defeat the whole point of a namespace I guess. Just beware that prior to this PR, YS_HASHING_VERSION was undefined, not set to 0

@povik
Copy link
Member

povik commented Jan 1, 2025

Just beware that prior to this PR, YS_HASHING_VERSION was undefined, not set to 0

I assume you are mentioning this because I've written #if YS_HASHING_VERSION <= 0. That should be fine, I've read the C preprocessor is specified to interpret undefined macros as zeroes in expressions.

AIUI to make the below, which has worked before the PR, to also work post the hashing change

	Yosys::pool<const ast::Statement *, Yosys::hash_ptr_ops> loops;

we need to (1) put using hashlib::hash_ptr_ops; back into the Yosys namespace and (2) adapt the pool and other templates to accept a hash_ops-like type, not the hash_top_ops type

@widlarizer
Copy link
Collaborator Author

I guess I could rename hash_ops to hash_eat_ops and hash_top_ops to hash_ops. It wouldn't hurt boilerplate very much within the project. I would then provide using within the Yosys namespace for neither. Plugin devs who need it could - conditionally on YS_HASHING_VERSION - raise hash_ops out of Yosys::hashlib into Yosys like this. This way we retain the distinction between top level object hashing and component hashing and still retain encapsulation

@povik
Copy link
Member

povik commented Jan 2, 2025

I’m not clear on how that helps. It would still be an error to put hash_ptr_ops for a template argument of pool or dict, right?

@widlarizer
Copy link
Collaborator Author

Right. I don't know of a way to fix that then

@widlarizer
Copy link
Collaborator Author

adapt the pool and other templates to accept a hash_ops-like type, not the hash_top_ops type

This would be in conflict with the purpose of this whole PR, if I'm understanding your suggestion correctly

@povik
Copy link
Member

povik commented Jan 2, 2025

Can't we make the template take an OPS argument with a hash_into method?

@widlarizer
Copy link
Collaborator Author

Since the declaration of dict is currently template<typename K, typename T, typename OPS = hash_top_ops<K>> class dict; I can't deny a theoretical possibility of resolving this but I don't personally see any way of overriding an internally used top-like interface by passing in a template parameter of a different interface

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants