-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rounding-off comparison with B-spline's bounds #116
base: master
Are you sure you want to change the base?
Conversation
Using Can you give some more details about how you ran into this and why you need it? |
Its the classic float-comparison error that prompted me to push this change while using
Now clearly I see you make a valid point that isclose is arbitrary but the other option I thought was |
Are you using the default knots and bounds? Are you able to share the data that's triggering this? |
Yes the knots and bounds used are both default. However |
Looking at the code, it appears that what we're doing is comparing Can you try adding print(inner_knots)
print(knot_quantiles) to |
Its actually not returning a smaller value, you can see that the knot reported in the output above is just greater than the bound. Nonetheless, I will paste these values shortly if I can reproduce it. |
Ah, no, I missed that, but it's just an issue with the numpy array printing rounding off the output. Floats do follow rules, you don't just randomly get However, this means that my suggestion of np.set_printoptions(precision=20)
print(inner_knots)
print(knot_quantiles) ? |
Sorry I can't seem to get the data back again. Floats returning wrong results won't happen always for you, I think it depends on what values you are dealing with (binary conversion has to kick in).
I guess I will wait till I can reproduce the issue again and get back |
That's because these two values got rounded to be |
numpy/numpy#10373 gives an example of a case where this could happen. I'm thinking the right solution here is to clip the auto-generated knots to fall inside the lower/upper bounds. |
I doubt if clipping or any absolute comparison methods would help (otherwise we won't get an incorrect percentile at first place). I think at the end of the day it boils down to having approximate comparison somewhere. |
We probably get the incorrect percentile because of some rounding error when the You seem to be under the impression that floating point math is all about vague clouds of points and impossible to predict results. This is a common and simplification, but it's not true :-). Each floating point value represents a specific real number. When you perform some operation like multiplication or addition (like when computing a weighted average) then there is always an exact mathematical answer which is also a real number... it's just that it might not be one of the real numbers that can be written down as a floating point number, so the operation instead picks some nearby number that is representable. But then that is represented exactly. Comparisons are performed exactly on the given values. Etc. If we used something like |
Just to mention that I have seen the same exception a few times. My guess was that there are a few nulps differences in the floating point computations. AFAIR it doesn't happen with "nice" numbers but it happened with some simulated sin/cos data. What I did as workaround was to set the boundary points so that they are 1e-15 (or something like that) outside of the data interval defined by min and max. |
b07ba3f
to
48fd2e4
Compare
fyi: This is not an issue in formulaic, which allows extrapolation of the basis splines beyond the boundary knots when so configured. It's likely that this patch would work fine, but I would have to look more closely at the code to see. I also note that patsy's implementation has some weird behaviour on the boundaries anyway (I haven't looked into whether it is worth fixing). Let me know if you are still interested in getting this merged into patsy! |
4f8a70c
to
691eb4e
Compare
This one's a minor fix imposing approximate comparison in floats.