nvComp: Unknown output data type #235

Open
imbalu007 opened this issue Nov 26, 2024 · 3 comments

imbalu007 commented Nov 26, 2024

The following snippet fails in the encoding step:

from nvidia import nvcomp
import torch

codec = nvcomp.Codec(algorithm="ANS", data_type="f16")
data_gpu = torch.randn(4096*4096, dtype=torch.float16).cuda()
nvarr_d = nvcomp.as_array(data_gpu)
com_arr = codec.encode(nvarr_d)

It fails in the encode step with this error:
ValueError: Unknown output data type

nvcomp version: 4.1.0
nvcomp cuda version: 12050
OS: Ubuntu

@JanuszL JanuszL added the nvCOMP label Nov 26, 2024
@wjd6910502

It looks like the data_type parameter is not supported; if you remove it, the call works.
I hit the same problem in decode:

Traceback (most recent call last):
  File "/code/sglang_v6/test_nbf16_new.py", line 40, in
    decoded_array = codec1.decode(encoded_array, data_type='f4')
ValueError: Unknown output data type

@imbalu007
Author

Without data_type, the input is treated as int8 (the default). I am not sure whether that affects the compression ratio or throughput for fp16 arrays.

baliika commented Dec 5, 2024

Thank you @imbalu007 for raising this point. Our documentation had a typo, which we corrected with the new patch release of nvCOMP (4.1.1).

2-byte floats should be referred to as <f2, i.e., a 2-byte little-endian float in the notation of the Python array-protocol type strings. Note that indicating endianness is also required. With this change, your example becomes:

from nvidia import nvcomp
import torch

codec = nvcomp.Codec(algorithm="ANS", data_type="<f2")
data_gpu = torch.randn(4096*4096, dtype=torch.float16).cuda()
nvarr_d = nvcomp.as_array(data_gpu)
com_arr = codec.encode(nvarr_d)

We also created a table where we match nvCOMP types to array-protocol type strings for more clarity while using the Python API:
https://docs.nvidia.com/cuda/nvcomp/py_api.html#data-type-association
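
For readers unfamiliar with array-protocol type strings, here is a minimal sketch of how such a string breaks down. It is not part of nvCOMP: `TypeStr` and `parse_typestr` are hypothetical helpers written only for illustration, and how nvCOMP validates the string internally is not shown here. The point is that the trailing number is a size in bytes, so a 16-bit float is written with a 2, and that the byte-order character comes first.

```python
from typing import NamedTuple


class TypeStr(NamedTuple):
    """Parts of an array-protocol type string (hypothetical helper type)."""
    byte_order: str  # '<' little-endian, '>' big-endian, '|' not applicable
    kind: str        # e.g. 'f' float, 'i' signed int, 'u' unsigned int
    itemsize: int    # element size in BYTES, not bits


def parse_typestr(typestr: str) -> TypeStr:
    """Split a type string such as '<f2' into byte order, kind and byte size.

    Illustrative only: this mirrors the array-protocol convention and makes
    no claim about how nvCOMP itself checks the data_type argument.
    """
    # Require an explicit byte-order prefix, as the comment above describes.
    if len(typestr) < 3 or typestr[0] not in "<>|":
        raise ValueError(f"{typestr!r}: missing byte-order prefix ('<', '>' or '|')")
    kind, size = typestr[1], typestr[2:]
    if not kind.isalpha() or not size.isdigit():
        raise ValueError(f"{typestr!r}: expected a kind letter followed by a byte count")
    return TypeStr(typestr[0], kind, int(size))


half = parse_typestr("<f2")
print(half, "->", half.itemsize * 8, "bits per element")

# The string from the original report has no byte-order prefix, and its
# number would be read as a byte count (16 bytes) rather than 16 bits.
try:
    parse_typestr("f16")
except ValueError as err:
    print("rejected:", err)
```

With this reading, torch.float16 corresponds to "<f2" on a little-endian machine, which matches the corrected example above.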

Please let us know if you have other questions!
