AVX2: Use mulcache for base multiplication #477

Open
hanno-becker opened this issue Dec 3, 2024 · 0 comments · May be fixed by #624
Labels
enhancement, x86_64

Comments

@hanno-becker
Contributor

The AVX2 backend should use a mulcache like the C and AArch64 backends. This essentially means moving

/* TODO: This could be precomputed in the mulcache */
vmovdqa (%r9),%ymm0
vmovdqa 32(%r9),%ymm1
vpmullw %ymm0,%ymm10,%ymm2
vpmullw %ymm0,%ymm12,%ymm3
vpmulhw %ymm1,%ymm10,%ymm10
vpmulhw %ymm1,%ymm12,%ymm12
vpmulhw %ymm8,%ymm2,%ymm2
vpmulhw %ymm8,%ymm3,%ymm3
vpsubw %ymm2,%ymm10,%ymm10 # rb0d0
vpsubw %ymm3,%ymm12,%ymm12 # rb1d1
vpaddw %ymm5,%ymm9,%ymm9
vpaddw %ymm7,%ymm11,%ymm11
vpsubw %ymm13,%ymm10,%ymm13
vpsubw %ymm12,%ymm6,%ymm6
into a new poly_mulcache_compute_native routine.
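
For readers less familiar with the idea, here is a minimal scalar C sketch of what a mulcache buys in the base multiplication. It is not the AVX2 code and not the project's implementation: the function names (`mulcache_compute`, `basemul_plain`, `basemul_cached`, `mulcache_selfcheck`) and the test values are hypothetical, and it assumes the usual ML-KEM parameters (q = 3329, Montgomery factor 2^16). The point is only that the product `b[1] * zeta` depends on one operand and the twiddle factor, so it can be computed once and reused.

```c
#include <assert.h>
#include <stdint.h>

#define Q 3329       /* ML-KEM modulus */
#define QINV (-3327) /* q^-1 mod 2^16 */

/* Montgomery reduction: for |a| < Q * 2^15, returns r = a * 2^-16 (mod Q).
 * Relies on wrapping narrowing casts and arithmetic right shift, as
 * provided by common compilers. */
int16_t montgomery_reduce(int32_t a)
{
  int16_t t = (int16_t)((int16_t)a * QINV);
  return (int16_t)((a - (int32_t)t * Q) >> 16);
}

/* Montgomery multiplication of two 16-bit coefficients. */
int16_t fqmul(int16_t a, int16_t b)
{
  return montgomery_reduce((int32_t)a * b);
}

/* Hypothetical cache entry: the product b1 * zeta, computed once per b. */
int16_t mulcache_compute(int16_t b1, int16_t zeta)
{
  return fqmul(b1, zeta);
}

/* Product in Z_q[X]/(X^2 - zeta) without a cache: three fqmuls for r[0]. */
void basemul_plain(int16_t r[2], const int16_t a[2], const int16_t b[2],
                   int16_t zeta)
{
  r[0] = (int16_t)(fqmul(fqmul(a[1], b[1]), zeta) + fqmul(a[0], b[0]));
  r[1] = (int16_t)(fqmul(a[0], b[1]) + fqmul(a[1], b[0]));
}

/* Same product using the cached b1 * zeta: two fqmuls for r[0]. */
void basemul_cached(int16_t r[2], const int16_t a[2], const int16_t b[2],
                    int16_t b1_zeta)
{
  r[0] = (int16_t)(fqmul(a[1], b1_zeta) + fqmul(a[0], b[0]));
  r[1] = (int16_t)(fqmul(a[0], b[1]) + fqmul(a[1], b[0]));
}

/* Canonical representative in [0, Q). */
int canon(int16_t x)
{
  int r = x % Q;
  return r < 0 ? r + Q : r;
}

/* Checks that both variants agree modulo Q on a small grid of inputs.
 * The sample values are arbitrary, not taken from a real twiddle table. */
void mulcache_selfcheck(void)
{
  static const int16_t v[] = {-3328, -1665, -1, 0, 1, 17, 1664, 3328};
  static const int16_t z[] = {-1044, 17, 1493, 3328};
  const int nv = (int)(sizeof v / sizeof v[0]);
  const int nz = (int)(sizeof z / sizeof z[0]);
  for (int i = 0; i < nv; i++)
    for (int j = 0; j < nv; j++)
      for (int k = 0; k < nv; k++)
        for (int l = 0; l < nv; l++)
          for (int m = 0; m < nz; m++) {
            int16_t a[2] = {v[i], v[j]}, b[2] = {v[k], v[l]};
            int16_t r0[2], r1[2];
            basemul_plain(r0, a, b, z[m]);
            basemul_cached(r1, a, b, mulcache_compute(b[1], z[m]));
            assert(canon(r0[0]) == canon(r1[0]));
            assert(r0[1] == r1[1]);
          }
}
```

The two variants can return different representatives for `r[0]`, which is why the check compares them modulo q. In a matrix-vector product the same `b` is multiplied against several `a` polynomials, so the cached value is reused across those calls.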

@hanno-becker added the enhancement and x86_64 labels on Dec 3, 2024
@dkostic linked a pull request on Jan 8, 2025 that will close this issue