-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCLomatic] Enable migration of CUBLASLT: EPILOGUE_BIAS, EPILOGUE_GELU, EPILOGUE_GELU_AUX, EPILOGUE_GELU_AUX_BIAS #2460
base: SYCLomatic
Are you sure you want to change the base?
Conversation
} else if (compute_desc->_epilogue == epilogue_t::bias) { | ||
matmul_ops.append_binary(::dnnl::algorithm::binary_add, bias_md); | ||
} else if (compute_desc-> epilogue == epilogue::gelu_aux_bias) { | ||
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CUBLASLT_EPILOGUE_GELU_AUX_BIAS
needs to output thebias
result toCUBLASLT_MATMUL_DESC_EPILOGUE_AUX_POINTER
. But here, your implementation does not.- Please write an E2E test to validate your implementation. I tried something similar before but encountered a runtime error or an incorrect result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Please help to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refine
0a657ab
to
c5b19af
Compare
} else if (compute_desc-> epilogue == epilogue::gelu_aux_bias) { | ||
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); | ||
matmul_ops.append_binary(::dnnl::algorithm::binary_add, bias_md); | ||
compute_desc->set_attribute(matmul_desc_t::attribute::epilogue_aux_pointer, bias_mem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to copy the data in bias_mem
to _epilogue_aux_pointer
. Meanwhile, please also consider the data layout (row_major/col_major) and the leading dimension (CUBLASLT_MATMUL_DESC_EPILOGUE_AUX_LD
).
And according to the cublaslt doc, CUBLASLT_MATMUL_DESC_BIAS_POINTER
is also related to this epilogue. Please check the usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I directly added matrix memcpy to copy epilogue_aux_pointer contents from bias_mem pointer, using the epilogue_aux_ld is and new_ldc (bias ld). This usage is for matrix memcpy, and will try to see if tests are ok .
Please suggest in meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refine
Also could you please prepare an E2E test first? like https://github.com/oneapi-src/SYCLomatic-test/blob/SYCLomatic/features/feature_case/cublasLt/matmul.cu . |
6b14fa2
to
a2dc3f5
Compare
} else if (compute_desc->_epilogue == epilogue_t::gelu_aux_bias) { | ||
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); | ||
matmul_ops.append_binary(::dnnl::algorithm::binary_add, bias_md); | ||
dpct::blas::matrix_mem_copy(matmul_desc_t::attribute::epilogue_aux_pointer, bias_mem, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matmul_desc_t::attribute::epilogue_aux_pointer is a enum, not a pointer. bias_mem is a pointer of dnnl::memory, not a pointer to data.
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); | ||
} else if (compute_desc->_epilogue == epilogue_t::gelu_aux) { | ||
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); | ||
dpct::blas::matrix_mem_copy(compute_desc->_epilogue_aux_pointer, new_c, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forget one issue. This dpct::blas::matrix_mem_copy
does not return sycl::event. So we cannot control the dependency chain when using non-USM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another issue is new_c is the bias, not the result of (A*B+C). I did not find there is a method to get that immediate result when using oneDNN post-op.
Maybe we need to use a separate oneDNN elewise primitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes correct, for the second point, I see it is raised in JIRA. Currently I could not find a way to extract immediate result of A*B+C for the aux memcpy.
For the first case, do I have to force use USM always then ? What do you suggest ?
if (compute_desc->_epilogue == epilogue_t::relu) { | ||
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_relu, 0.f, 0.f); | ||
} else if (compute_desc->_epilogue == epilogue_t::gelu) { | ||
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_erf, 0.f, 0.f); | |
matmul_ops.append_eltwise(::dnnl::algorithm::eltwise_gelu_tanh, 0.f, 0.f); |
compute_desc->_epilogue_aux_ld, new_ldc, m, n, | ||
sizeof(size_t) , dpct::device_to_device, queue); | ||
} else if (compute_desc->_epilogue == epilogue_t::bias) { | ||
matmul_ops.append_binary(::dnnl::algorithm::binary_add, bias_md); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bias_md represents the C in the D=A*B+C, not the extra bias in epilogue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to create new bias descriptor. A*B+C+bias.?
@zhiweij1 @zhimingwang36