V100 run video understanding #29
Comments
We've implemented support for eager attention. Could you please test the following code and let me know if you encounter any issues? @gehong-coder

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "rhymes-ai/Aria",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    attn_implementation="eager",
)
```
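If it helps, here is a quick, unofficial sanity check that eager attention was actually selected for both the language model and the vision tower. It only assumes that the attention submodules follow the usual transformers naming convention of ending in `self_attn`:

```python
# Confirm the attention implementation recorded in the config.
print(model.config._attn_implementation)  # expected: "eager"

# Hypothetical check: list the classes of the instantiated attention modules.
# Flash-attention variants should not appear here when running on a V100.
attn_classes = {
    module.__class__.__name__
    for name, module in model.named_modules()
    if name.endswith("self_attn")
}
print(attn_classes)
```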
Hello, this problem occurs after I use the above settings. It seems that setting `attn_implementation="eager"` here does not actually make the model use eager attention internally (the stack points at File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 467, in forward). So I went into modeling_idefics2 and changed line 442, `self.self_attn = IDEFICS_VISION_ATTENTION_CLASSES[config._attn_implementation]`, replacing `config._attn_implementation` with `"eager"`.
@gehong-coder Is your local model updated to the latest rhymes-ai/Aria repo? We updated it yesterday.
I have updated the model, but the error still appears. Is it because grouped_gemm is not installed?
Eager attention is not working, and I am not able to run the model on V100s. Could you please help with this feature?
@gehong-coder I can't reproduce this error on my local machine. Could you provide some minimal code to reproduce this bug? And what is the version of your `transformers`?
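A quick way to report the relevant versions (assuming the question refers to the `transformers` install; the torch, accelerate, and grouped_gemm versions are likely useful here as well):

```python
import importlib.metadata as md

# Print the versions most relevant to this thread; grouped_gemm may not be installed.
for pkg in ("transformers", "torch", "accelerate", "grouped_gemm"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")
```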
Python 3.10. This is my code (abridged):

```python
def load_model():
    ...

model, processor = load_model()

def create_image_gallery(images, columns=3, spacing=20, bg_color=(200, 200, 200)):
    ...

def get_placeholders_for_videos(frames: List, timestamps=[]):
    ...

def infer(contents):
    ...

frames, frame_timestamps = load_video(
    "/mnt/nfs/bj4-v100-1/data1/hong.ge/workspace/data/test_data/test_caption/video/飞机.mp4",
    num_frames=128,
)
```
@gehong-coder The recommended way to load the latest Aria is to load it from the official online repo.
@aria-hacker
@gehong-coder In most cases, you should not edit the code inside transformers if you don't understand its whole context. I looked into it, and the modification you made was done in the wrong way, which is what caused the error. The attention mask is built based on the attention implementation name recorded in the config. You directly swapped the attention class, but the configuration still reports the original implementation, so the mask is prepared in the wrong shape. You should only modify the config for the vision encoder and the model; that is how we pass attn_implementation in the latest code.
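For reference, a rough sketch of that idea, assuming the Aria config exposes a nested `vision_config` (the attribute name is an assumption here); passing `attn_implementation="eager"` directly to `from_pretrained`, as in the snippet earlier in this thread, remains the supported path:

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("rhymes-ai/Aria", trust_remote_code=True)

# Propagate eager attention to both the language model and the vision encoder,
# instead of editing modeling_idefics2.py inside site-packages.
config._attn_implementation = "eager"
if hasattr(config, "vision_config"):  # assumed attribute name for the vision tower config
    config.vision_config._attn_implementation = "eager"

model = AutoModelForCausalLM.from_pretrained(
    "rhymes-ai/Aria",
    config=config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
```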
V100 cannot use flash attention, so I switched to eager to compute attention:
`self.self_attn = IDEFICS_VISION_ATTENTION_CLASSES["eager"]`
but the following error occurred:
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 630, in forward
encoder_outputs = self.encoder(
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 555, in forward
layer_outputs = encoder_layer(
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 467, in forward
hidden_states, attn_weights = self.self_attn(
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/mnt/nfs/bj4-v100-1/data1/hong.ge/miniconda3/envs/aria/lib/python3.10/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 245, in forward
raise ValueError(
ValueError: Attention mask should be of size (128, 1, 1225, 1225), but is torch.Size([128, 1225])
```
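For anyone hitting the same traceback: the error comes from the shape of the attention mask, not from the attention math itself. The eager path expects a 4-D additive mask, while the mask actually handed in is the 2-D padding mask prepared for the flash-attention path, because only the attention class was swapped and the config still reports the old implementation. A small illustration of the two shapes (not from this thread, and assuming a recent transformers that ships `_prepare_4d_attention_mask`):

```python
import torch
from transformers.modeling_attn_mask_utils import _prepare_4d_attention_mask

# 2-D padding mask, the shape reported in the error: (batch=128 frames, seq_len=1225 patches).
padding_mask = torch.ones(128, 1225, dtype=torch.long)

# Expanded 4-D additive mask, the shape eager attention expects: (128, 1, 1225, 1225).
four_d_mask = _prepare_4d_attention_mask(padding_mask, dtype=torch.bfloat16)
print(padding_mask.shape, four_d_mask.shape)
```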