
[Bug]: 10% perf drop on mixtral8x22b due to commit b62fba85ac03326e9f466d8d37e91ae1b14a6511 #305

hlin99 opened this issue Sep 20, 2024 · 4 comments
hlin99 commented Sep 20, 2024

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

```python
seq_group_metadata_list.extend(
    self.create_dummy_seq_group_metadata(0, 0, is_prompt)
    for _ in range(batch_size_padding))
```

This piece of code introduces dummy metadata creation in a loop, and we observe a 10% perf drop. Is this code change intentional?
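
For context, a minimal sketch contrasting the per-slot creation above with a variant that creates the dummy once and reuses it; the reuse variant is an assumed optimization, not something the codebase is confirmed to support:

```python
# Pattern from the commit in question: a new dummy seq-group-metadata
# object is constructed for every padding slot.
seq_group_metadata_list.extend(
    self.create_dummy_seq_group_metadata(0, 0, is_prompt)
    for _ in range(batch_size_padding))

# Hypothetical cheaper variant: construct the dummy once and repeat the
# same reference. Safe only if downstream code never mutates the padding
# entries (an assumption; the issue does not confirm this).
dummy = self.create_dummy_seq_group_metadata(0, 0, is_prompt)
seq_group_metadata_list.extend([dummy] * batch_size_padding)
```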

iboiko-habana commented
Please re-check perf with #301

hlin99 commented Sep 23, 2024

Unfortunately, performance has not improved, and the data looks identical before and after applying the patch. It seems that the dummy creation and list extension are not the root cause of the performance drop. Instead, the issue appears to stem from the changes to the dummy metadata itself, which alter the subsequent call path.
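
One way to sanity-check this is to time the dummy-creation path in isolation. Below is a rough, hypothetical micro-benchmark; the `runner` object and the argument signature are assumed from the snippet earlier in the thread, and this is not vLLM's own profiling tooling:

```python
import time

def time_dummy_creation(runner, batch_size_padding, is_prompt, iters=1000):
    """Rough micro-benchmark of dummy seq-group-metadata creation.

    `runner` is assumed to expose create_dummy_seq_group_metadata with
    the argument pattern used in the snippet above; this is a sketch,
    not a definitive measurement harness.
    """
    start = time.perf_counter()
    for _ in range(iters):
        # Recreate the padding list the same way the commit does.
        dummies = [
            runner.create_dummy_seq_group_metadata(0, 0, is_prompt)
            for _ in range(batch_size_padding)
        ]
        assert len(dummies) == batch_size_padding
    elapsed = time.perf_counter() - start
    per_iter_us = elapsed / iters * 1e6
    print(f"{iters} iterations: {elapsed:.3f}s total, "
          f"{per_iter_us:.1f} us per padded batch of {batch_size_padding}")
```

If the per-batch cost here is negligible compared with the observed per-step slowdown, that supports the claim that the regression comes from the downstream call path rather than from object creation.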

iboiko-habana commented
Please share traces or steps for reproduction

hlin99 commented Sep 23, 2024

1. Below is my docker configuration with the vLLM environment setup (see the script at the end of this comment).
2. Then, in the docker environment, go to vllm/benchmarks.
3. Run the benchmark command:

```
python benchmark_throughput.py --backend vllm --dataset ./ShareGPT_V3_unfiltered_cleaned_split.json --tensor-parallel-size 8 --model mistralai/Mixtral-8x22B-Instruct-v0.1 --device hpu --dtype bfloat16 --gpu-memory-utilization 0.7 --max-num-batched-tokens 262144
```

Before the change, the output throughput is about 2500 tokens/s; after the change it drops to about 2200 tokens/s (roughly a 12% regression).


```bash
#!/bin/bash

export DOCKER_IMAGE=${DOCKER_IMAGE:-vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest}
export CONTAINER_NAME=${CONTAINER_NAME:-vllm-server-mixtral-8x22b}
export DATA_DIR=${DATA_DIR:-/data0}
export SSH_PORT=${SSH_PORT:-3022}
export HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES:-all}
export HF_TOKEN=${HF_TOKEN}

print_help() {
    echo "Usage: $0 [options]"
    echo "This script creates and sets up the docker container for $CONTAINER_NAME."
    echo "Enter the container bash shell if no option is specified."
    echo
    echo "Options:"
    echo "  -h, --help  Show this help message and exit."
    echo "   1          Create and set up the base container and exit."
    echo "   2          Set up the container based on setup.sh and exit."
    echo "   0          Stop the container."
    echo "  -1          Stop and remove the container."
}

if [[ "$1" == "-h" || "$1" == "--help" ]]; then
print_help
exit 0
fi

if [ ! "${HABANA_VISIBLE_DEVICES}" == "all" ]; then
index_module_data=$(hl-smi --query-aip=index,module_id --format=csv)
echo "$index_module_data"
declare -A index_module_map
while IFS=", " read -r index module_id; do
index_module_map[$index]=$module_id
done <<< "$(echo "$index_module_data" | tail -n +2)"
indices=(${HABANA_VISIBLE_DEVICES//,/ })
module_ids=()
for index in "${indices[@]}"; do
module_ids+=(${index_module_map[$index]})
done
visible_modules=$( IFS=,; echo "${module_ids[*]}")
echo HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}
echo HABANA_VISIBLE_MODULES=${visible_modules}
else
visible_modules="0,1,2,3,4,5,6,7"
fi

container_existing=$(docker ps -a --filter "name=^/${CONTAINER_NAME}$" --format '{{.Names}}')
container_running=$(docker ps --filter "name=^/${CONTAINER_NAME}$" --format '{{.Names}}')

if [[ "$1" == "1" ]] || [[ -z "$container_existing" ]]; then
    if [ ! -z "$container_existing" ]; then
        echo "Error: Container ${CONTAINER_NAME} exists. Remove the existing container first."
        exit 1
    fi
    docker run --runtime=habana --name ${CONTAINER_NAME} -td \
        -e HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES} \
        -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
        --cap-add=sys_nice --net=host --ipc=host \
        --env http_proxy=${http_proxy} \
        --env https_proxy=${https_proxy} \
        --env no_proxy=${no_proxy} \
        --env HF_HOME=${DATA_DIR}/huggingface \
        --env DATA_DIR=${DATA_DIR} \
        --env WORKSPACE_ROOT=/workspace \
        --env HABANA_VISIBLE_MODULES=${visible_modules} \
        --env "HUGGING_FACE_HUB_TOKEN=${HF_TOKEN}" \
        --env PT_HPU_ENABLE_LAZY_COLLECTIVES=true \
        --env PT_HPUGRAPH_DISABLE_TENSOR_CACHE=1 \
        --env VLLM_GRAPH_RESERVED_MEM=0.6 \
        --env VLLM_GRAPH_PROMPT_RATIO=0 \
        --env VLLM_DECODE_BLOCK_BUCKET_MAX=2048 \
        --env VLLM_PROMPT_BS_BUCKET_STEP=128 \
        --env VLLM_PROMPT_BS_BUCKET_MAX=256 \
        --volume $(pwd):/workspace \
        --volume ${DATA_DIR}:${DATA_DIR} \
        ${DOCKER_IMAGE} bash
fi
```
