Disable Vern6 and Vern9 tests on Haswell CPUs #101

Closed

Conversation

devmotion
Member

This PR fixes #97 by disabling the two failing tests of Vern6 and Vern9 on Haswell CPUs only (compare previous builds https://travis-ci.org/JuliaDiffEq/DelayDiffEq.jl/jobs/482509932 with https://travis-ci.org/JuliaDiffEq/DelayDiffEq.jl/jobs/491089819).

Tests fail due to SciML/OrdinaryDiffEq.jl#656.

@ChrisRackauckas
Member

Is this just a too low tolerance thing? Why would Haswell matter?

@devmotion
Member Author

Actually I don't know. It's difficult to debug this issue since I cannot reproduce the failures locally. As mentioned in #97, the errors are CPU specific and so I assume they are caused by floating point issues and muladd, similar to what I observed in #73.

I think it's helpful to get rid of these test errors since they could mask other errors such as in https://travis-ci.org/JuliaDiffEq/DelayDiffEq.jl/jobs/487827113.

@ChrisRackauckas
Member

How far off is it on Haswell: small or large? If small, we should check for Haswell and then just increase the tolerance a bit. That could be because of @muladd not applying on some Haswell chips. If it's large, that's worrisome.
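
To illustrate why a fused multiply-add could change results at all, here is a minimal sketch using only Base Julia. The operands are made up for illustration and are not taken from the failing tests; they are chosen so that the exact product is not representable in Float64. As far as I know, `@muladd` rewrites expressions into `muladd` calls, and `muladd` is allowed to use either form depending on the hardware.

```julia
# Illustrative operands (an assumption, not values from the tests):
# the exact product a*b = 1 - eps()^2 is not representable in Float64.
a = 1.0 + eps()
b = 1.0 - eps()
c = -1.0

unfused = a * b + c     # the product is rounded to 1.0 before the addition
fused   = fma(a, b, c)  # the exact value a*b + c is rounded only once

println("unfused = ", unfused)
println("fused   = ", fused)
# muladd may return either of the two results above, depending on the CPU.
println("muladd  = ", muladd(a, b, c))
```

A difference of this size in a single operation is tiny, but an adaptive solver can pick slightly different step sizes from it, which would be consistent with the differing time points seen only on some CPUs.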

@devmotion
Member Author

I did some debugging here: https://travis-ci.org/JuliaDiffEq/DelayDiffEq.jl/jobs/491106802

For Vern6 the printed values of the following time points (out of 60 time points) are different:

julia> t1[t1 .!== t2]
6-element Array{Float64,1}:
 5.59595
 6.77066
 7.54998
 8.1625 
 9.09498
 9.86356

julia> t2[t1 .!== t2]
6-element Array{Float64,1}:
 5.59609
 6.76899
 7.55189
 8.16518
 9.10246
 9.87058

With the following u values:

julia> u1[u1 .!== u2]
6-element Array{Float64,1}:
  0.223364  
 -0.0157241 
 -0.109774  
 -0.0916953 
  0.00505008
  0.0513813 

julia> u2[u1 .!== u2]
6-element Array{Float64,1}:
  0.223349  
 -0.0153893 
 -0.109834  
 -0.0914809 
  0.00577223
  0.0515117 

Interestingly the printed values at 6.0, 7.0, and 10.0 are the same.

@devmotion
Member Author

I get similar results for Vern9 (out of 27 time points):

julia> t1[t1 .!== t2]
2-element Array{Float64,1}:
 6.83966
 7.6147 

julia> t2[t1 .!== t2]
2-element Array{Float64,1}:
 6.83979
 7.55545

julia> u1[u1 .!== u2]
2-element Array{Float64,1}:
 -0.0292791
 -0.111386 

julia> u2[u1 .!== u2]
2-element Array{Float64,1}:
 -0.0293037
 -0.109977 

The printed values at time points 7.0, 8.0, 9.0, and 10.0 are the same.

By increasing the relative tolerances to 1e-3 for t and u with Vern6 and u with Vern9 and 1e-2 for t with Vern9 I can achieve approximate equality.
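
As a rough sketch of that kind of check, the snippet below applies `isapprox` with an explicit `rtol` to the two Vern9 time points printed above. This is only the differing subset, using the rounded printed values, so it is an approximation of the real test, which compares the full arrays.

```julia
using LinearAlgebra  # array isapprox compares norms

# Rounded Vern9 time points printed above (differing subset only).
t1 = [6.83966, 7.6147]
t2 = [6.83979, 7.55545]

# For arrays, isapprox checks norm(t1 - t2) <= rtol * max(norm(t1), norm(t2))
# when atol is left at its default of zero.
strict = isapprox(t1, t2; rtol = 1e-3)
loose  = isapprox(t1, t2; rtol = 1e-2)

println("rtol = 1e-3: ", strict)
println("rtol = 1e-2: ", loose)
```

Because the comparison is norm-based rather than elementwise, a single entry that is off by a large relative amount can still pass when the rest of the vector is large.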

@ChrisRackauckas
Member

Yeah seems like it could just be floating point accumulation when @muladd isn't done and the tolerances are high. Tests still fail though.

@devmotion
Member Author

The test failures are caused by SciML/OrdinaryDiffEq.jl#656. The logs show that the "lazy interpolants" tests pass on Haswell now.

@devmotion
Member Author

I think a better fix might be to test for approximate equality of the solution at fixed time points, e.g., t = 0:0.1:10. I did some tests here https://travis-ci.org/JuliaDiffEq/DelayDiffEq.jl/jobs/491214135 and it seems that with Haswell CPUs norm(first.(sol1(ts)) - sol2(ts)) is around 5.431765878617846e-9 and 3.1559293119007873e-13 for Vern6 and Vern9, respectively. Of course, a disadvantage of that approach would be that it's not clear anymore that both implementations are equivalent.
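
A minimal sketch of what such a fixed-grid comparison could look like. The two functions are hypothetical stand-ins for the solution interpolants, not actual DelayDiffEq solutions, and the tolerance `tol` is an assumed value rather than one proposed in this thread.

```julia
using LinearAlgebra  # for norm

# Hypothetical stand-ins for the two solution interpolants (assumption):
# sol2 differs from sol1 by a small perturbation.
sol1(t) = sin(t)
sol2(t) = sin(t) + 1e-10 * cos(t)

ts = 0:0.1:10                       # fixed time points, independent of solver steps
err = norm(sol1.(ts) - sol2.(ts))   # 2-norm of the difference on the grid
tol = 1e-8                          # assumed tolerance, for illustration only

println("error on grid: ", err)
println("within tolerance: ", err < tol)
```

Evaluating on a fixed grid avoids comparing the adaptive time points themselves, which is where the CPU-specific differences show up most clearly.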

@devmotion closed this Feb 13, 2019
@devmotion deleted the haswell branch February 13, 2019 21:46
Successfully merging this pull request may close these issues.

Test failures of lazy interpolants