make benchmark_throughput static support single image input #718
base: habana_main
2ae39d5 to 820b5dd
@kdamaszk Can you also help review this PR?
@michalkuligowski @kzawora-intel please review. This PR modifies the native vLLM benchmark. Is that OK?
Signed-off-by: yan ma <[email protected]>
except Exception as e:
    print(f"Failed to download image from {mm_data_url}: {e}")
    raw_image = None
mm_data = {"image": raw_image}
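For readers following the hunk, the download-with-fallback pattern it shows can be sketched in isolation. This is a minimal, standard-library-only sketch and not the PR's actual code: `load_image_bytes`, `build_mm_data`, and the `fetch` parameter are hypothetical names introduced here, and the sketch returns raw bytes rather than a decoded image object.

```python
from typing import Callable, Dict, Optional
from urllib.request import urlopen


def _default_fetch(url: str) -> bytes:
    # Plain HTTP(S) download using only the standard library.
    with urlopen(url, timeout=10) as resp:
        return resp.read()


def load_image_bytes(
    url: str, fetch: Callable[[str], bytes] = _default_fetch
) -> Optional[bytes]:
    """Return the downloaded bytes, or None if the download fails."""
    try:
        return fetch(url)
    except Exception as e:
        # Mirror the hunk: report the failure and fall back to None.
        print(f"Failed to download image from {url}: {e}")
        return None


def build_mm_data(
    url: str, fetch: Callable[[str], bytes] = _default_fetch
) -> Dict[str, Optional[bytes]]:
    # Same shape as the hunk: the "image" entry is None on failure.
    return {"image": load_image_bytes(url, fetch)}
```

Passing `fetch` explicitly makes the failure path easy to exercise without network access. Note that with this pattern a failed download still yields `{"image": None}`, so the caller has to decide whether to skip the multimodal path in that case.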
Hi, multimodal data can already be benchmarked through the sample_requests method by passing a JSON file and leaving the 'dataset' parameter unset. Isn't that sufficient for the required scenarios?
Can you provide a detailed command for the test method you describe? Looking through the code, I think you mean sharegpt4v_instruct_gpt4-vision_cap100k.json, which works as a real-dataset test. In this PR, we would like to attach an image so that the multimodal path can be tested with FIXED input and output lengths. The full command looks like python benchmarks/benchmark_throughput.py --model=meta-llama/Llama-3.2-11B-Vision-Instruct --max_model_len=4096 --input-len=1024 --output-len=2048 --num-prompts=1000 --max-num-seqs=64 --mm-data
.
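To make the fixed-length idea concrete, here is a minimal sketch of synthetic requests that all share one image. It is an illustration under stated assumptions, not the benchmark's implementation: `FixedRequest`, `make_fixed_requests`, and `filler_token_id` are hypothetical names, and the prompt is just a repeated filler token.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FixedRequest:
    # Hypothetical container; field names are illustrative only.
    prompt_token_ids: List[int]
    expected_output_len: int
    multi_modal_data: Optional[Dict[str, Any]] = None


def make_fixed_requests(
    num_prompts: int,
    input_len: int,
    output_len: int,
    image: Any = None,
    filler_token_id: int = 0,
) -> List[FixedRequest]:
    """Build num_prompts requests with identical input and output lengths."""
    # Every request references the same image object, if one was given.
    mm_data = {"image": image} if image is not None else None
    return [
        FixedRequest(
            prompt_token_ids=[filler_token_id] * input_len,
            expected_output_len=output_len,
            multi_modal_data=mm_data,
        )
        for _ in range(num_prompts)
    ]
```

With the values from the command above this would be `make_fixed_requests(1000, 1024, 2048, image=raw_image)`. Unlike a real dataset, every request has the same text length, so throughput numbers are comparable across runs; the image may still add model-specific tokens on top of `input_len`.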
No description provided.