[AMDGPU] iree-hal-hip-di SIGSEGV for llama 8b fp8 model #19809
Comments
I built in cmake debug mode without tracy on top of master, and ran iree-compile and iree-run-module without tracy. I got a type issue, since numpy does not support the bf16 type.
Given that, if I want to pass a bf16 input, how can I create it? One way is to read the f32.npy and then save it as a pytorch .pt file, since pytorch supports the bf16 type. But does iree support input in .pt format?
you can write the data to binary files and pass those in: numpy does not support bf16 (without a fork), but some implementations are starting to use that - we could make our numpy loader use
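A minimal sketch of that binary route, assuming the f32 data lives in an npy file (filenames are illustrative; this truncates rather than rounds to nearest):

```python
import numpy as np

# bf16 is the upper 16 bits of an IEEE float32 bit pattern, so we can
# produce it by reinterpreting the f32 bits and shifting - no bf16 dtype
# support in numpy required.
data_f32 = np.load("input_f32.npy")                  # illustrative filename
bf16_bits = (data_f32.view(np.uint32) >> 16).astype(np.uint16)
bf16_bits.tofile("input_bf16.bin")                   # raw bytes, no header
```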
@benvanik I wrote a script numpy2TorchBf16Bin.py to convert f32.npy to torch bf16 and then write it into a .bin file. When run with iree-run-module, it said only .npy is supported.
When you pass binary data, you need to tell the runtime how to interpret that data, for example by spelling out the shape and element type on the input flag.
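Presumably along these lines; a sketch with an illustrative shape, function, and file names (not taken from this issue):

```python
import numpy as np

# Write a tiny i64 token buffer as literal bytes; the shape/dtype prefix on
# --input then tells iree-run-module how to interpret them, e.g.:
#
#   iree-run-module --module=model.vmfb --function=prefill \
#       --input=1x4xi64=@tokens.bin
#
# Without the 1x4xi64 prefix the runtime cannot know the shape or element
# type of the raw bytes.
np.asarray([[1, 2, 3, 4]], dtype=np.int64).tofile("tokens.bin")
```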
don't turn things that should be i64 into bf16 - I'm guessing your tokens aren't bf16 values?
I tried casting only the cs_f16 to bf16 and left everything else the same, because when I print from the raw npy (in numpy2TorchBf16Bin.py), everything else is int. @dan-garvey I also doubt whether all the inputs should be bf16, and whether the sizes are the same.
these are some basic errors with passing in the wrong values - this issue has bounced between asserts in tracy, bfloat16 numpy support, and missized inputs - it'd be good to break these down and isolate things so we can actually make some progress. All are real issues, but together they're too hard to track.
The originally reported seg fault (exposed in the Llama3.1_8b_f16_tp8 model) has been resolved by runtime fix 1bf7249; the fix has been verified in both cases.
The tracy issue is now fixed. With the corrected SizexDtype input just generated by Dan (on SharkMI300, /sharedfile/prefill/), the INVALID_ARGUMENT issue is also fixed.
@AmosLewis I think the inputs might still be incorrect. Also, I'm not sure #19564 is related: it was producing a similar error, but due to incorrect input. The mentioned runtime commit fixed a secondary issue #19564 (comment).
I think the bin files have significant size overhead; multiple single-value files are over 1kb.
they are invalid if so - they are supposed to be the literal data - a 4 byte value should be 4 bytes on disk.
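A quick sanity check of that expectation:

```python
import os
import numpy as np

# A single int32 written as literal data must be exactly 4 bytes on disk;
# anything larger means the file carries metadata (e.g. from torch.save).
np.asarray([42], dtype=np.int32).tofile("one_value.bin")
assert os.path.getsize("one_value.bin") == 4
```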
@AmosLewis have you successfully used any ".bin" files produced via torch.save?
yeah, it's certainly not one value. But if a bin file is meant to be just a sequence of literal values, that should be pretty easy to produce. In the meantime @AmosLewis you can just load these using the torch api and then pass them via the iree python api; none of them are bf16, so the numpy intermediary won't be a problem until output.
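A rough sketch of that torch-to-IREE path; the vmfb path, HIP driver choice, and prefill entry-point name are assumptions for illustration:

```python
import torch
from iree import runtime as ireert

# Load the saved tensors with torch and hop through numpy - fine here,
# since none of these inputs are bf16.
tokens = torch.load("tokens.pt").numpy()       # illustrative filenames
seq_lens = torch.load("seq_lens.pt").numpy()

# Module path, driver, and entry-point name are assumptions, not from the issue.
module = ireert.load_vm_flatbuffer_file("llama_8b_fp8.vmfb", driver="hip")
result = module.prefill(tokens, seq_lens)
```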
No. This was my first time using them, with iree-run-module.
Torch.save adds a bunch of metadata and does not generate bin files as expected for iree. We can use numpy like the generate_data.py in sharktank for everything but the fp8 input; for that we can view as uint8, then convert to numpy and then to a bin file, I think. I was planning to test this today.
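A hedged sketch of that fp8-as-uint8 route (tensor source and filenames are illustrative):

```python
import torch

# numpy has no fp8 dtype, so reinterpret the fp8 tensor's bytes as uint8
# first; .view(dtype) changes no data, only the type tag.
t = torch.load("input_fp8.pt")                 # illustrative fp8 tensor
t.view(torch.uint8).numpy().tofile("input_fp8.bin")
```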
Yeah don't mix file types. .bin is just binary data, no magic, no metadata, nothing framework-specific. At the base level that is all IREE sees at the boundaries anyways - buffers of data.
With the new .bin inputs Dan created in pr nod-ai/shark-ai#885, iree-run-module runs successfully and the tracy file is generated. But now we have a numeric issue; I will file a new issue #19859 for the numerics separately.
What happened?
Follow-up of #19785.
When running iree-run-module under tracy for the llama 8b float8 model on an amd gpu, I got a seg fault.
gdb bt:
Steps to reproduce your issue
Here is the input mlir: llama_8b_fp8.mlir
The input mlir is generated with shark-ai: https://github.com/nod-ai/shark-ai/commits/users/dan_garvey/fp8_staging
What component(s) does this issue relate to?
Runtime
Version information
4215100
Additional context
SharkMI300X
Tracy-related bug solution: #19826