```python
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]

pipe = pipeline(
    "text-generation",
    model="ISTA-DASLab/Qwen2-72B-AQLM-PV-1bit-1x16",
    trust_remote_code=True,
    device_map="auto",
)
pipe(messages)
```
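The `TORCH_CUDA_ARCH_LIST` warning in the log below can be avoided by pinning the target CUDA architectures before the AQLM kernels are compiled, i.e. before the pipeline is constructed. A minimal sketch, assuming an A100-class GPU (the `"8.0"` value is an assumption; adjust it for your card):

```python
import os

# Sketch: pin the CUDA architectures the AQLM extension is compiled for.
# "8.0" targets A100-class GPUs (assumption -- pick the compute capability
# of your own card). Setting this before model loading silences the
# TORCH_CUDA_ARCH_LIST UserWarning raised by torch.utils.cpp_extension.
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "8.0")
```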
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning:
The secret HF_TOKEN does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
warnings.warn(
config.json: 100%
959/959 [00:00<00:00, 24.6kB/s]
model.safetensors.index.json: 100%
171k/171k [00:00<00:00, 2.36MB/s]
Downloading shards: 100%
5/5 [09:16<00:00, 104.07s/it]
model-00001-of-00005.safetensors: 100%
4.99G/4.99G [01:59<00:00, 42.2MB/s]
model-00002-of-00005.safetensors: 100%
4.99G/4.99G [01:59<00:00, 41.2MB/s]
model-00003-of-00005.safetensors: 100%
4.99G/4.99G [02:00<00:00, 42.7MB/s]
model-00004-of-00005.safetensors: 100%
4.99G/4.99G [01:58<00:00, 42.4MB/s]
model-00005-of-00005.safetensors: 100%
3.17G/3.17G [01:15<00:00, 42.2MB/s]
Loading checkpoint shards: 100%
5/5 [00:52<00:00, 6.42s/it]
generation_config.json: 100%
242/242 [00:00<00:00, 13.7kB/s]
WARNING:accelerate.big_modeling:Some parameters are on the meta device because they were offloaded to the disk and cpu.
tokenizer_config.json: 100%
1.29k/1.29k [00:00<00:00, 74.3kB/s]
vocab.json: 100%
2.78M/2.78M [00:00<00:00, 8.50MB/s]
merges.txt: 100%
1.67M/1.67M [00:00<00:00, 12.5MB/s]
tokenizer.json: 100%
7.03M/7.03M [00:00<00:00, 21.2MB/s]
Device set to use cuda:0
/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py:1965: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
/usr/local/lib/python3.10/dist-packages/aqlm/inference_kernels/cuda_kernel.py:20: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("aqlm::code1x16_matmat")
/usr/local/lib/python3.10/dist-packages/aqlm/inference_kernels/cuda_kernel.py:33: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("aqlm::code1x16_matmat_dequant")
/usr/local/lib/python3.10/dist-packages/aqlm/inference_kernels/cuda_kernel.py:48: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("aqlm::code1x16_matmat_dequant_transposed")
/usr/local/lib/python3.10/dist-packages/aqlm/inference_kernels/cuda_kernel.py:62: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("aqlm::code2x8_matmat")
/usr/local/lib/python3.10/dist-packages/aqlm/inference_kernels/cuda_kernel.py:75: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("aqlm::code2x8_matmat_dequant")
/usr/local/lib/python3.10/dist-packages/aqlm/inference_kernels/cuda_kernel.py:88: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("aqlm::code2x8_matmat_dequant_transposed")
[{'generated_text': [{'role': 'user', 'content': 'Who are you?'},
{'role': 'assistant',
'content': 'I am Qwen, a large language model created by Alibaba Cloud. I am here to assist you'}]}]
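The pipeline returns the whole chat transcript, not just the new text. A small helper (a sketch; the `output` literal below is copied from the run above, and `last_assistant_reply` is a hypothetical name) can pull out only the assistant's reply:

```python
# Output shape copied from the pipeline run above.
output = [{"generated_text": [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant",
     "content": "I am Qwen, a large language model created by Alibaba Cloud. "
                "I am here to assist you"},
]}]

def last_assistant_reply(result):
    """Return the content of the final assistant turn in a chat-style
    text-generation pipeline result, or None if there is no assistant turn."""
    conversation = result[0]["generated_text"]
    for turn in reversed(conversation):
        if turn["role"] == "assistant":
            return turn["content"]
    return None

reply = last_assistant_reply(output)
```

Note that the reply above ends mid-sentence, which is consistent with the pipeline's default generation cap; passing a larger budget per call, e.g. `pipe(messages, max_new_tokens=256)`, requests a complete answer.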