Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added corrections to re-enable reciprocal test in math_brute_force suite for relaxed math mode #2221

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

shajder
Copy link
Contributor

@shajder shajder commented Jan 10, 2025

fixes #2145

As suggested by @svenvh reciprocal has different precision requirements than divide. This PR introduces special path for reciprocal for binar_float_operator to test reciprocal with relaxed math. If this PR will get approvals, invalidate PR #2162

@svenvh svenvh self-requested a review January 10, 2025 08:26
@bashbaug
Copy link
Contributor

Possibly dumb question: why are we introducing a special path through binary_operator_float to test reciprocal? Reciprocal is a unary operation, not a binary operation...

@shajder
Copy link
Contributor Author

shajder commented Jan 10, 2025

Possibly dumb question: why are we introducing a special path through binary_operator_float to test reciprocal? Reciprocal is a unary operation, not a binary operation...

Yep, good point. I went that way just because I wasn't able to choose a better spot. There is no unary float operator, right? To create one I would have to duplicate a lot of code. Any other clues?

@svenvh
Copy link
Member

svenvh commented Jan 13, 2025

Possibly dumb question: why are we introducing a special path through binary_operator_float to test reciprocal? Reciprocal is a unary operation, not a binary operation...

Yep, good point. I went that way just because I wasn't able to choose a better spot. There is no unary float operator, right? To create one I would have to duplicate a lot of code. Any other clues?

Treating reciprocal as a binary operator with its first argument being the constant 1 seems the least worst solution for now. I'm in the process of reducing code duplication in the math bruteforce suite, so eventually we may be able to move it into its own place. But for now I'm okay with @shajder's chosen solution.

-reciprocal activated for relaxed float math
-reciprocal test added for fp16 and fp64
INFINITY,
FTZ_OFF,
RELAXED_ON,
binaryOperatorF },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I'm fine with this as-is, but one side effect of testing reciprocal as part of binaryOperatorF is that we seem to be testing more than we need to. As one extreme example, if I only test the fp16 reciprocal, I'm seeing 128K kernel enqueues of 32K work-items.... even though there are only 64K possible fp16 values to take a reciprocal of. If I test a true fp16 unary function (e.g. asin) instead, I only see two 32K kernel enqueues, which covers all 64K fp16 values.

For a fp32 reciprocal, we're testing a total of 4B work-items (256K kernel enqueues of 16K work-items), but the data is generated randomly, so we're not guaranteed to have full coverage of all possible fp32 inputs (although we will certainly have fairly good coverage). I think if we properly considered reciprocal to be a "unary function" instead, we could guarantee full coverage of all fp16 and fp32 values. We will still need random sampling for fp64 testing, but that's unavoidable.

PS: With two input buffers for the "binary operator", we're also transferring twice as much input data as we need to. For some of the devices I'm testing it takes more time to transfer the data than it does to execute the reciprocal kernel, so the extra transfers are significant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Re-enable reciprocal test in math_brute_force suite
4 participants