[Bug]: 10% perf drop on mixtral8x22b due to commit b62fba85ac03326e9f466d8d37e91ae1b14a6511 #305
Comments
Please re-check perf with #301.
Unfortunately, performance has not improved; the numbers are identical before and after applying the patch. It seems that dummy creation and list extension are not the root cause of the performance drop. Instead, the issue appears to stem from the changes to the dummy metadata, which alter the code path taken by subsequent calls.
Please share traces or steps for reproduction.
Before the change, the output throughput is about 2500 tokens/s; after the change, it drops to about 2200 tokens/s. Script fragments:
#!/bin/bash
export DOCKER_IMAGE=${DOCKER_IMAGES:-vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest}
print_help(){
if [[ "$1" == "-h" || "$1" == "--help" ]]; then
if [ ! "${HABANA_VISIBLE_DEVICES}" == "all" ]; then
container_existing=$(docker ps -a --filter "name=^/${CONTAINER_NAME}$" --format '{{.Names}}')
if [[ "$1" == "1" ]] || [[ -z "$container_existing" ]]; then
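As a quick sanity check on the reported numbers, the relative drop can be computed directly from the two throughput figures quoted in this comment. This is only arithmetic on those approximate values; the function name is illustrative and not part of the project.

```python
def relative_drop(before: float, after: float) -> float:
    """Fraction of throughput lost when going from `before` to `after`."""
    return (before - after) / before


# Approximate throughput figures quoted in the comment (tokens/s).
drop = relative_drop(2500.0, 2200.0)
print(f"{drop:.1%}")  # prints "12.0%"
```

With these rounded figures the drop comes out slightly above the 10% mentioned in the issue title, which is consistent given that both throughput values are approximate.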
Your current environment
🐛 Describe the bug
seq_group_metadata_list.extend(
self.create_dummy_seq_group_metadata(0, 0, is_prompt)
for _ in range(batch_size_padding))
This piece of code creates dummy metadata inside a loop, and we observe a 10% perf drop. Is this code change intentional?
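To make the question concrete, here is a minimal sketch contrasting per-slot dummy creation with creating one dummy and reusing it for every padding slot. Everything in it is an assumption for illustration: `DummySeqGroupMetadata`, the standalone `create_dummy_seq_group_metadata`, and both padding helpers are hypothetical stand-ins, not the project's real classes or signatures. The call counter only shows how many objects each variant constructs; it says nothing about actual device-side performance.

```python
from dataclasses import dataclass
from itertools import repeat
from typing import List


@dataclass
class DummySeqGroupMetadata:
    """Hypothetical stand-in for the real sequence-group metadata object."""
    group_id: int
    seq_len: int
    is_prompt: bool


constructed = 0  # counts how many dummy objects have been built


def create_dummy_seq_group_metadata(group_id: int, seq_len: int,
                                    is_prompt: bool) -> DummySeqGroupMetadata:
    """Hypothetical factory; the real method lives on the model runner."""
    global constructed
    constructed += 1
    return DummySeqGroupMetadata(group_id, seq_len, is_prompt)


def pad_per_item(batch: List[DummySeqGroupMetadata], padding: int,
                 is_prompt: bool) -> List[DummySeqGroupMetadata]:
    """Pattern from the issue: one new dummy object per padding slot."""
    batch.extend(create_dummy_seq_group_metadata(0, 0, is_prompt)
                 for _ in range(padding))
    return batch


def pad_shared(batch: List[DummySeqGroupMetadata], padding: int,
               is_prompt: bool) -> List[DummySeqGroupMetadata]:
    """Alternative: build one dummy and reference it in every padding slot."""
    if padding > 0:
        dummy = create_dummy_seq_group_metadata(0, 0, is_prompt)
        batch.extend(repeat(dummy, padding))
    return batch


per_item = pad_per_item([], 4, True)
per_item_calls = constructed

constructed = 0
shared = pad_shared([], 4, True)
shared_calls = constructed

print(per_item_calls, shared_calls)
```

Sharing one dummy is only safe if nothing downstream mutates the padding entries. Also, as noted in the comments, avoiding repeated creation did not recover the lost throughput, so object construction alone may not explain the regression.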