Added more FMA functions with tests #89

Closed
wants to merge 12 commits into from


JishinMaster
Collaborator

This pull request is the last split of the original #83.

I have added fmsub_ps, fmnadd_ps, fmnsub_ps, and the associated tests.
I had to lower the values in the input vector used by the test functions and use an epsilon of 0.0001f, because these functions are less precise than plain mul and add and seem to diverge for large float values. (I tested under qemu, so it might be worth checking on real hardware.)
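
For readers following the thread, here is a minimal scalar sketch of what such a test compares against. The helper names (`ref_fmsub`, `ref_fnmadd`, `ref_fnmsub`, `nearly_equal`) are hypothetical and are not taken from the pull request. The sign conventions are those of Intel's `_mm_fmsub_ps`, `_mm_fnmadd_ps` and `_mm_fnmsub_ps`; that the functions named above map onto them is an assumption.

```c
/* Hypothetical per-lane references built from a separate multiply and
   add/subtract. Sign conventions follow Intel's _mm_fmsub_ps,
   _mm_fnmadd_ps and _mm_fnmsub_ps; mapping them to the names used in
   this pull request is an assumption. */
static inline float ref_fmsub(float a, float b, float c)
{
    return (a * b) - c;
}

static inline float ref_fnmadd(float a, float b, float c)
{
    return -(a * b) + c;
}

static inline float ref_fnmsub(float a, float b, float c)
{
    return -(a * b) - c;
}

/* Absolute-tolerance comparison, as a test with a fixed epsilon would use. */
static inline int nearly_equal(float x, float y, float eps)
{
    float d = x - y;
    if (d < 0.0f)
        d = -d;
    return d <= eps;
}
```

A fused operation rounds once while mul followed by add rounds twice, so the two can differ in the last bit. One possible reason for the divergence on large inputs (not confirmed in this thread) is that adjacent floats near 2048 are already about 0.00024 apart, so a one-bit difference there exceeds a fixed epsilon of 0.0001f.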

@jserv jserv requested a review from marktwtn July 26, 2020 14:38
Member

@jserv jserv left a comment

Please rebase onto the latest master branch.

sse2neon.h (4 review threads, outdated and resolved)
@marktwtn marktwtn requested a review from jserv August 2, 2020 00:52
@jserv
Member

jserv commented Aug 2, 2020

The header <immintrin.h> is used for AVX, AVX2, and FMA instructions. We need to figure out how to handle it. Non-SSE intrinsics are considered additional work for SSE2NEON, as stated in #82.

@JishinMaster
Collaborator Author

Can we use something like "ifdef USE_FMA" and enable it only for those who want it?
Or use a completely separate "sse2neon_optional.h" file? That way, we could also implement some MMX instructions for older programs that have not migrated to SSE2.
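
As an illustration of the first option only, an opt-in guard might look roughly like the sketch below. `USE_FMA` is the macro name floated above. `OPTIONAL_FMA_ENABLED`, `opt_fmsub` and `fma_section_enabled` are hypothetical names invented for this example and are not part of sse2neon.

```c
/* Hypothetical opt-in section: compiled only when the user builds with
   -DUSE_FMA (or defines USE_FMA before including the header). */
#ifdef USE_FMA
#define OPTIONAL_FMA_ENABLED 1

/* Scalar stand-in for an optional FMA-style helper: a * b - c. */
static inline float opt_fmsub(float a, float b, float c)
{
    return (a * b) - c;
}
#else
#define OPTIONAL_FMA_ENABLED 0
#endif

/* Lets calling code check at run time whether the section was built in. */
static inline int fma_section_enabled(void)
{
    return OPTIONAL_FMA_ENABLED;
}
```

Projects that never define `USE_FMA` would not see the optional helpers at all, which keeps the default header limited to SSE intrinsics.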

@jserv jserv force-pushed the master branch 3 times, most recently from eb8e6ef to c902b5e Compare June 4, 2021 18:33
@jserv
Member

jserv commented Jun 4, 2021

Let's close this pull request in favor of #82.

@jserv jserv closed this Jun 4, 2021

3 participants