-
-
Notifications
You must be signed in to change notification settings - Fork 31.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-111545: Add Py_HashDouble() function #113115
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Add :c:func:`Py_HashDouble` function to hash a C double number. Patch by | ||
Victor Stinner. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -84,17 +84,20 @@ static Py_ssize_t hashstats[Py_HASH_STATS_MAX + 1] = {0}; | |
*/ | ||
|
||
Py_hash_t | ||
_Py_HashDouble(PyObject *inst, double v) | ||
Py_HashDouble(double v) | ||
{ | ||
int e, sign; | ||
double m; | ||
Py_uhash_t x, y; | ||
|
||
if (!Py_IS_FINITE(v)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if remove this and keep only There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If prefer to have a deterministic behavior and always return the same hash value (0) if value is NaN. There are legit use cases to treat NaN as hash value 0. With the following change, only check for, diff --git a/Python/pyhash.c b/Python/pyhash.c
index f64edde4043..23aa2dac7cc 100644
--- a/Python/pyhash.c
+++ b/Python/pyhash.c
@@ -90,14 +90,8 @@ Py_HashDouble(double v)
double m;
Py_uhash_t x, y;
- if (!Py_IS_FINITE(v)) {
- if (Py_IS_INFINITY(v)) {
- return (v > 0 ? _PyHASH_INF : -_PyHASH_INF);
- }
- else {
- assert(Py_IS_NAN(v));
- return 0;
- }
+ if (Py_IS_INFINITY(v)) {
+ return (v > 0 ? _PyHASH_INF : -_PyHASH_INF);
}
m = frexp(v, &e); With the following change, Py_HashDouble() returns diff --git a/Python/pyhash.c b/Python/pyhash.c
index f64edde4043..a853d6dad99 100644
--- a/Python/pyhash.c
+++ b/Python/pyhash.c
@@ -91,13 +91,8 @@ Py_HashDouble(double v)
Py_uhash_t x, y;
if (!Py_IS_FINITE(v)) {
- if (Py_IS_INFINITY(v)) {
- return (v > 0 ? _PyHASH_INF : -_PyHASH_INF);
- }
- else {
- assert(Py_IS_NAN(v));
- return 0;
- }
+ // v can be NaN
+ return (v > 0 ? _PyHASH_INF : -_PyHASH_INF);
}
m = frexp(v, &e); There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My first attempt (first patch in my comment) leads to a hang if you pass NaN. Why do you want to avoid There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Recipe of What's New in Python 3.13: Py_hash_t hash_double(PyObject *obj, double value)
{
if (!Py_IS_NAN(value)) {
return Py_HashDouble(value);
}
else {
return Py_HashPointer(obj);
}
} Using this recipe and the current implementation, there are 3 code paths:
I don't think that it's a big deal to add 1 or 2 tests per float point number. I care more about the API, having a deterministic behavior for the 3 cases. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I want to avoid any promises about NaN. It should be recommended to not use this function for NaN. |
||
if (Py_IS_INFINITY(v)) | ||
return v > 0 ? _PyHASH_INF : -_PyHASH_INF; | ||
else | ||
return _Py_HashPointer(inst); | ||
if (Py_IS_INFINITY(v)) { | ||
return (v > 0 ? _PyHASH_INF : -_PyHASH_INF); | ||
} | ||
else { | ||
assert(Py_IS_NAN(v)); | ||
return 0; | ||
} | ||
} | ||
|
||
m = frexp(v, &e); | ||
|
@@ -129,6 +132,19 @@ _Py_HashDouble(PyObject *inst, double v) | |
return (Py_hash_t)x; | ||
} | ||
|
||
Py_hash_t | ||
_Py_HashDouble(PyObject *obj, double value) | ||
{ | ||
assert(obj != NULL); | ||
|
||
if (!Py_IS_NAN(value)) { | ||
return Py_HashDouble(value); | ||
} | ||
else { | ||
return Py_HashPointer(obj); | ||
} | ||
} | ||
|
||
Py_hash_t | ||
Py_HashPointer(const void *ptr) | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It exposes too much implementation details why already exposed in different place. Why not simply say that it is equivalent to hash() of Python float object if it is not a NaN? And if it is a NaN, you should use other value to avoid collisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you talking about the note, or describing the 3 cases and return values? I can just remove the note. My idea is to suggest using Py_IS_NAN() to treate NaN differently. But I'm not sure which implementation to suggest.
@zooba says that if you have a Python object, just call
PyObject_Hash(obj)
on it 😁There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About describing all 3 cases. It should already be described in other place (documentation for
sys.hash_info
orfloat
orhash()
), and if it is not described in details, than it is not necessary for users. You should only document that for non-NaN values it returns the same result as for hash() for Python float object.