Tests 66, 68, 69 fail on development branch on one machine but pass on another #170

Open
jerett-cc opened this issue Sep 6, 2024 · 3 comments

@jerett-cc

On the most recent version of the development branch, tests 66, 68, and 69 fail on my school machine but pass on my laptop.

At school, I am compiling on x86_64 Intel with OpenMPI 4.1.4.
On my laptop, I am compiling on a similar architecture, but with a different Intel CPU and OpenMPI 4.1.2.

The diffs are:

66 shallow_water/verification-paraboloid_1d-erk33-l7.release:

----------------
##9       #:2   <== 1345.774540174449
##9       #:2   ==> 1345.846277059402
@ Absolute error = 7.1736884953e-2, Relative error = 5.3305277230e-5
----------------
##10      #:2   <== 0.0001164820398633047
##10      #:2   ==> 0.0001147281980001955
@ Absolute error = 1.7538418631e-6, Relative error = 1.5286929401e-2

68 shallow_water/verification-smooth_vortex-erk33-l6.release:

----------------
##10      #:2   <== 0.03571394823661699
##10      #:2   ==> 0.03737613586854281
@ Absolute error = 1.6621876319e-3, Relative error = 4.6541693484e-2
----------------
##11      #:2   <== 0.0006325612013505061
##11      #:2   ==> 0.0007562265440589786
@ Absolute error = 1.2366534271e-4, Relative error = 1.9549941167e-1
----------------
##12      #:2   <== 0.003420776846038435
##12      #:2   ==> 0.003488469908911931
@ Absolute error = 6.7693062873e-5, Relative error = 1.9788798253e-2

69 shallow_water/verification-steady_incline-erk33-l9.release:

----------------
##9       #:2   <== 1.000593578808362
##9       #:2   ==> 1.000320814661278
@ Absolute error = 2.7276414708e-4, Relative error = 2.7267666841e-4
----------------
##10      #:2   <== 2.388278346583212e-14
##10      #:2   ==> 0.002619689632507278
@ Absolute error = 2.6196896325e-3, Relative error = 1.0968946045e+11
----------------
##11      #:2   <== 4.287451614926996e-15
##11      #:2   ==> 0.0002049750897826185
@ Absolute error = 2.0497508978e-4, Relative error = 4.7808140636e+10
----------------
##12      #:2   <== 5.452329602107318e-15
##12      #:2   ==> 0.0006221684713152953
@ Absolute error = 6.2216847131e-4, Relative error = 1.1411057598e+11

I can post more of the output files; I am not sure which information you need most. Almost all of the diffs stem from t differing slightly. Let me know if other details about the machines would be relevant.
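
For anyone reading the diffs above, here is a minimal sketch of how the two error figures on each `@` line can be reproduced. It assumes the relative error is the absolute difference divided by the smaller of the two magnitudes, which is consistent with the numbers printed above but is not taken from the test suite itself. The function name `numdiff_style_errors` is made up for illustration.

```python
def numdiff_style_errors(old: float, new: float) -> tuple[float, float]:
    """Return (absolute error, relative error) for one pair of values.

    Assumption: the relative error divides by the smaller magnitude,
    which matches the figures printed in the diffs above.
    """
    abs_err = abs(new - old)
    if abs_err == 0.0:
        return 0.0, 0.0
    smaller = min(abs(old), abs(new))
    # If one value is exactly zero the ratio is undefined; report infinity.
    rel_err = abs_err / smaller if smaller > 0.0 else float("inf")
    return abs_err, rel_err


# Values copied from test 69, entry ##10 in the diff above.
abs_err, rel_err = numdiff_style_errors(2.388278346583212e-14,
                                        0.002619689632507278)
print(f"Absolute error = {abs_err:.10e}, Relative error = {rel_err:.10e}")
```

This also shows why the relative errors in test 69 are of order 1e+11: one side of the comparison is at round-off level (about 1e-14), so dividing by it inflates the ratio. For those entries the absolute error is the more informative number.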

@tamiko
Member

tamiko commented Sep 9, 2024

@jerett-cc I am a bit worried about this one in your last comparison:

##11      #:2   <== 4.287451614926996e-15
##11      #:2   ==> 0.0002049750897826185
@ Absolute error = 2.0497508978e-4, Relative error = 4.7808140636e+10

These values should pretty much be zero and they aren't. Can you post the detailed.log file of the deal.II version that you compile against? Most importantly, do you compile with avx256 or avx512 support?

@jerett-cc
Author

jerett-cc commented Sep 10, 2024

@tamiko Yes, I can.
detailed.log

As for AVX support, I believe so, but I am not knowledgeable enough to say for sure, nor which variant. I do know that I compile with -march=native.
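
Since -march=native lets the compiler target whatever the build machine's CPU supports, one rough way to see which AVX levels are in play is to read the CPU feature flags. The sketch below is Linux-specific and assumes an x86 `/proc/cpuinfo` with a `flags` line; the helper name `simd_levels` is hypothetical. It only reports what the CPU advertises, not how deal.II was configured, so detailed.log remains the authoritative source.

```python
from pathlib import Path


def simd_levels(cpuinfo_text: str) -> list[str]:
    """Return which of avx, avx2, avx512f appear in the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return [name for name in ("avx", "avx2", "avx512f")
                    if name in flags]
    # No 'flags' line found (for example on non-x86 systems).
    return []


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(simd_levels(cpuinfo.read_text()))
    else:
        print("no /proc/cpuinfo on this system")
```

Running it on both machines and comparing the output would show whether one of them supports avx512f and the other only avx2.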

@tamiko
Member

tamiko commented Sep 12, 2024

@jerett-cc You are compiling with avx2 on this machine. Let me investigate: on other machines we have seen weird behavior (miscompilation) with some gcc versions close to gcc-12.

@tamiko tamiko added the bug label Oct 15, 2024