-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing for portblas portfft #495
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Romain's comments, LGTM.
We'll look into ways to reduce the time it takes to run the tests. We may need to cache the SYCL kernels across the tests. |
The build times are comparable to MKL backend. MKL tests run in < 5 minutes. CI failed with a timeout because both portblas and portfft tests take > 4 hours to run. Are you saying that driver compile time is the cause? I can add a regex to select a subset of the tests to run, with a target of running for 5 minutes. What do you recommend? |
Yes the JIT compilation may be the issue, we'll need to confirm. I think in the current situation we would need to disable too many tests so I would suggest to wait until we know more. Running all the tests under 5 minutes seems very optimistic for portFFT or portBLAS though. |
I changed it so we build all the unit tests for portfft/portblas but only run a handful of tests. Everything passes and port* finishes before MKL. It is not what we want for the long term, but we can enable more as we get faster machines and hopefully find a way to reduce the JIT time. Are you OK with committing what is there now? |
An indirectly related question: I noticed that previous workflows were run on my forked repo. Do you think a condition like this can avoid the runs on forked repos? |
That will work, but you have to put the condition in every job for every workflow. I don't think there is a file-level way to do it. An alternative is to disable github actions for your fork. It is in settings/actions/general, then look near the top. |
I see. I hope GitHub can provide a more maintainable way for this. At the same time, we may need to communicate this with contributors, but this is out of scope of this PR. |
@rscohn2 from our local testing we have confirmed that using AOT compilation of the kernels speeds up the testing a lot. This is best to use if we know we will run on a specific device. For the portFFT backend you might as well directly set the target to We'll try to improve the documentation for this. We need to find a meaningful default behavior while still letting the user tune for more specific use-cases. |
The github runner has 2 cores/4 threads, but I think we will have more cores available soon. For the 8 minute test, were they parallel or serial? @dnhsieh-intel prefers serial tests because the logs are easier to read. |
We just ran |
Right, that makes sense actually portFFT is using spec constants. Let's revert the target to |
With AOT, portblas takes 1.5 hours to compile and 5 minutes to run. Runners only have 2 cores, but we are getting runners with more cores so parallel build will be faster. 2nd longest is mkl blas, which takes 1 hour with most time spent in compile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor comment, looks good to me otherwise, thanks
Co-authored-by: Romain Biessy <[email protected]>
Co-authored-by: Romain Biessy <[email protected]>
Co-authored-by: Romain Biessy <[email protected]>
…get (oneapi-src#511) * The portFFT backends targets Nvidia by default * In some situations, this causes a failure whilst compiling (CUDA might not be installed) * This checks the target is supported before use This solves the issue in PR oneapi-src#495
Description
Add Unit tests for portfft, portblas to CI