Parallel: Shake Error after couple minutes. Serial: No error. #9
Comments
In my experience, Shake errors are always caused by errors in your inputs; not getting them with one setup just means you are (un)lucky. I checked your files and see that you have hot atoms in your run. Check those and fix the parameter problems with them.
There is no error in the input. What you looked at is f0f004.log, which contains the log of the MPI run that crashed. The hot atoms occur ONLY when run in parallel. Hence I don't believe that this error has anything in common with what you describe.
There are 2865 warnings in your log file, starting with atom 5128. Run in parallel, writing the dcd at 10 steps per frame, and look at what is exploding. Then you can say whether there is a parameter issue or not.
Obviously, there must be warnings in the log file, or it would not have crashed. Look into singletest (the non-parallel run) and check out f0f004.log: there is no shake error, and not a single warning until I killed the simulation at step 150000. Why should I get hot atoms in the parallel run, and nothing in the serial version? If this is expected, then we should add a test to make sure that all our parallel versions die, and all our serial versions don't die, when restarted from f0f002.re. I will reopen this bug a final time, so I can check it out later when I have time. Feel free to close and I will forget about it.
I just ran the test with a differently compiled version of qdyn5p (master) and also got no errors (I sent you the file by mail), so it is not the difference between serial and parallel. I will leave this open for you, but I think there is nothing for me to fix.
OK, I have an addition to this: running Qdyn5p (master) on Tintin also gave me one segmentation fault and one shake failure, with a setup that should be working and that has completed ~1000 runs on both Abisko and Triolith. I will have to test the reproducibility of those errors and will look into the issue again.
When running f0f004.inp, Qdyn5p ends with a Shake Error after a couple of minutes.
When running with qdyn5 (the serial version), there is no shake error.
I am not sure if this is a bug, or if this is very unlucky, or even if this is expected.
Files and more information can be found in:
https://www.dropbox.com/sh/5oboetmc2mz2yyp/AADIVOCtw6WH3vigghBg4M4xa?dl=0
password: Bug9