Status of AVX-512? #65
Comments
Hey. No AVX-512 instructions yet (I think my underlying SIMD bindings still don't support them, and even if they did, I don't have hardware to test them on), but this crate should compile and work on the latest Rust.
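For anyone who wants to check whether their own machine could exercise AVX-512 code paths at all, here is a minimal sketch using only the standard library's runtime feature detection. It does not use this crate's API; the function name `has_avx512f` is made up for illustration, and it only tests the AVX-512 Foundation subset (`avx512f`).

```rust
/// Hypothetical helper: reports whether the CPU running this program
/// advertises the AVX-512 Foundation instruction subset.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx512f() -> bool {
    // Runtime check provided by the standard library; safe to call on any x86 CPU.
    std::arch::is_x86_feature_detected!("avx512f")
}

/// Fallback for non-x86 targets, where AVX-512 cannot exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx512f() -> bool {
    false
}
```

Calling `has_avx512f()` from `main` and printing the result is enough to tell whether a given machine or cloud VM could be used for testing.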
Hi! Thanks for the answer. I would like to ask what the advantages are of writing vectorized instructions explicitly versus relying on compiler auto-vectorization. I understand that it can be useful in complex cases where the compiler fails to vectorize, but if I have a simple program, does using this kind of crate improve performance?
Always benchmark your programs both before and after optimization. Explicit SIMD is often faster, sometimes the same, and sometimes quite a bit slower than what the compiler can come up with. You're going up against hundreds of PhDs with access to detailed documentation on the processor you're compiling for, so you're going to whiff sometimes. LLVM can handle a lot of situations well, but if you're doing anything more complex than simple mapping/reduction, explicit SIMD will help you hit your performance targets more consistently. Explicit SIMD is also guaranteed to stay vectorized: LLVM takes performance seriously and fixes regressions that do come up, but it can be hard to detect when a program is running 5% slower because autovectorization stopped working in some subroutine.
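To make the "benchmark both" advice concrete, here is a rough sketch of the kind of comparison meant: the same element-wise addition written as a plain loop (which LLVM may auto-vectorize in release builds) and with explicit SSE intrinsics from `std::arch`. This uses the raw standard-library intrinsics, not this crate's API, and the names `add_scalar`, `add_explicit`, and `time_it` are hypothetical. The timing helper is deliberately crude; a proper benchmark harness would be a better choice for real measurements.

```rust
use std::time::Instant;

/// Plain loop; the compiler is free to auto-vectorize this when optimizing.
fn add_scalar(a: &[f32], b: &[f32], out: &mut [f32]) {
    for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
        *o = x + y;
    }
}

/// Explicit SSE version: four f32 lanes per iteration, then a scalar tail.
#[cfg(target_arch = "x86_64")]
fn add_explicit(a: &[f32], b: &[f32], out: &mut [f32]) {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};
    let n = a.len().min(b.len()).min(out.len());
    let vec_end = n - n % 4;
    let mut i = 0;
    while i < vec_end {
        // SAFETY: i + 4 <= vec_end <= n, so every access is in bounds.
        // The unaligned load/store variants are used, and SSE is part of
        // the x86_64 baseline, so no runtime feature check is needed.
        unsafe {
            let va = _mm_loadu_ps(a.as_ptr().add(i));
            let vb = _mm_loadu_ps(b.as_ptr().add(i));
            _mm_storeu_ps(out.as_mut_ptr().add(i), _mm_add_ps(va, vb));
        }
        i += 4;
    }
    // Handle the remaining 0..=3 elements one at a time.
    for j in vec_end..n {
        out[j] = a[j] + b[j];
    }
}

/// On other architectures, fall back to the plain loop.
#[cfg(not(target_arch = "x86_64"))]
fn add_explicit(a: &[f32], b: &[f32], out: &mut [f32]) {
    add_scalar(a, b, out);
}

/// Crude timer: runs `f` `reps` times and returns the elapsed seconds.
fn time_it(reps: usize, mut f: impl FnMut()) -> f64 {
    let start = Instant::now();
    for _ in 0..reps {
        f();
    }
    start.elapsed().as_secs_f64()
}
```

Run both versions through `time_it` on the same inputs in a release build, check that the outputs match, and compare the timings. Passing the inputs and outputs through `std::hint::black_box` helps keep the optimizer from discarding the work being measured. For a mapping this simple, the two versions may well come out about the same.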
I'd be happy to provide a cloud VM with AVX-512 if that will re-ignite your interest in this crate :)
Hello, I want to check whether this crate works with AVX-512 instructions. Also, would you recommend using it, given its inactivity?
Are there other crates? This one seems very good to me.
Thanks