Add PR test workflow and check-in more testcases (#1208)
* refactor test case

* refactor test case

* refactor testcase

* fix cuda allocate

* fix cuda-prefix in pr run

* Update daily_ete_test.yml

* change internlm2 model coverage to 20b in testcase

* change internlm2 model coverage to 20b in testcase

* change internlm2 model coverage to 20b in testcase

* fix mp blocked by allocate cuda

* add kvint8 and w4a16 chat cover

* modify timeout for each step

* fix lint

* update prompt and pr trigger

* update runner config

* Update daily_ete_test.yml

* change job name
zhulinJulia24 authored Mar 1, 2024
1 parent cc06bba commit 0430349
Showing 40 changed files with 1,846 additions and 1,480 deletions.
68 changes: 49 additions & 19 deletions .github/workflows/daily_ete_test.yml
@@ -3,7 +3,7 @@ name: daily_ete_test
on:
workflow_dispatch:
schedule:
- cron: '00 23 * * *'
- cron: '00 18 * * *'

env:
HOST_PIP_CACHE_DIR: /nvme/github-actions/pip-cache
@@ -13,7 +13,7 @@ env:
jobs:
test_functions:
runs-on: [self-hosted, linux-a100]
timeout-minutes: 240
timeout-minutes: 420
env:
REPORT_DIR: /nvme/qa_test_models/test-reports
container:
@@ -68,36 +68,66 @@ jobs:
run: |
python3 -m pip list
lmdeploy check_env
- name: Test lmdeploy - quantization
- name: Test lmdeploy - quantization w4a16
continue-on-error: true
run: |
pytest autotest -m '(quantization or quantization_w8a8) and not Baichuan2_7B_Chat and not Baichuan2_13B_Chat' -n 8 --alluredir=allure-results --clean-alluredir
pytest autotest/tools/quantization/test_quantization_w4a16.py -m 'not pr_test' -n 8 --alluredir=allure-results --clean-alluredir
- name: Test lmdeploy - quantization kv int8
continue-on-error: true
run: |
pytest autotest/tools/quantization/test_quantization_kvint8.py -n 8 --alluredir=allure-results
- name: Test lmdeploy - quantization w8a8
continue-on-error: true
run: |
pytest autotest/tools/quantization/test_quantization_w8a8.py -n 8 --alluredir=allure-results
- name: Test lmdeploy - quantization kv int8 and w4a16
continue-on-error: true
run: |
pytest autotest/tools/quantization/test_quantization_kvint8_w4a16.py -n 8 --alluredir=allure-results
- name: Test lmdeploy - convert
continue-on-error: true
run: |
pytest autotest -m 'convert and not Baichuan2_7B_Chat and not Baichuan2_13B_Chat' -n 6 --alluredir=allure-results
- name: Test lmdeploy - pipeline
pytest autotest/tools/convert -m 'not pr_test' -n 6 --alluredir=allure-results --dist loadgroup
- name: Test lmdeploy - interface turbomind case
continue-on-error: true
timeout-minutes: 60
run: pytest autotest -m '(pipeline_chat) and not Baichuan2_7B_Chat and not Baichuan2_13B_Chat' --alluredir=allure-results
- name: Test lmdeploy - restful
timeout-minutes: 20
run: |
pytest autotest/interface/pipeline/test_pipeline_turbomind_func.py -m 'not pr_test' --alluredir=allure-results
- name: Test lmdeploy - pipeline turbomind
continue-on-error: true
run: pytest autotest -m restful_api --alluredir=allure-results
- name: Test lmdeploy - chat
timeout-minutes: 45
run: pytest autotest/tools/pipeline/test_pipeline_chat_turbomind.py -m 'not pr_test' --alluredir=allure-results
- name: Test lmdeploy - pipeline torch
continue-on-error: true
timeout-minutes: 75
run: pytest autotest/tools/pipeline/test_pipeline_chat_pytorch.py -m 'not pr_test' --alluredir=allure-results
- name: Test lmdeploy - restful turbomind
continue-on-error: true
timeout-minutes: 60
run: pytest autotest/tools/restful/test_restful_chat_turbomind.py -m 'not pr_test' --alluredir=allure-results
- name: Test lmdeploy - restful torch
continue-on-error: true
timeout-minutes: 80
run: pytest autotest/tools/restful/test_restful_chat_pytorch.py -m 'not pr_test' --alluredir=allure-results
- name: Test lmdeploy - chat workspace
continue-on-error: true
timeout-minutes: 30
run: |
pytest autotest -m '(command_chat or command_chat_hf or command_chat_pytorch) and not Baichuan2_7B_Chat and not Baichuan2_13B_Chat' -n 4 --alluredir=allure-results
- name: Downgrade transformers
run: python3 -m pip install transformers==4.33.0
- name: Test lmdeploy - run Baichuan
pytest autotest/tools/chat/test_command_chat_workspace.py -m 'not pr_test' -n 4 --alluredir=allure-results
- name: Test lmdeploy - chat hf turbomind
continue-on-error: true
timeout-minutes: 50
timeout-minutes: 45
run: |
pytest autotest -m '(Baichuan2_7B_Chat or Baichuan2_13B_Chat) and not pipeline_chat_pytorch' --alluredir=allure-results
- name: Test lmdeploy - rerun fail cases
pytest autotest/tools/chat/test_command_chat_hf_turbomind.py -m 'not pr_test' -n 4 --alluredir=allure-results
- name: Test lmdeploy - chat hf torch
continue-on-error: true
timeout-minutes: 60
run: |
pytest autotest/tools/chat/test_command_chat_hf_pytorch.py -m 'not pr_test' -n 4 --alluredir=allure-results
- name: Test lmdeploy - rerun all fail cases
timeout-minutes: 60
run: |
pytest autotest --alluredir=allure-results --lf
pytest autotest --lf --alluredir=allure-results
- name: Generate reports
if: always()
run: |
99 changes: 99 additions & 0 deletions .github/workflows/pr_ete_test.yml
@@ -0,0 +1,99 @@
name: pr_ete_test

on:
pull_request:
paths:
- ".github/workflows/pr_ete_test.yml"
- "cmake/**"
- "src/**"
- "autotest/**"
- "3rdparty/**"
- "lmdeploy/**"
- "requirements/**"
- "requirements.txt"
- "CMakeLists.txt"
- "setup.py"
workflow_dispatch:


env:
HOST_PIP_CACHE_DIR: /nvme/github-actions/pip-cache
HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai


jobs:
pr_functions_test:
runs-on: [self-hosted, linux-a100-pr]
timeout-minutes: 120
env:
REPORT_DIR: /nvme/qa_test_models/test-reports
container:
image: nvcr.io/nvidia/tritonserver:22.12-py3
options: "--gpus=all --ipc=host --user root -e PIP_CACHE_DIR=/root/.cache/pip"
volumes:
- /nvme/share_data/github-actions/pip-cache:/root/.cache/pip
- /nvme/share_data/github-actions/packages:/root/packages
- /nvme/qa_test_models:/nvme/qa_test_models
- /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro
steps:
- name: Setup systems
run: |
rm /etc/apt/sources.list.d/cuda*.list
apt-get update && apt-get install -y --no-install-recommends rapidjson-dev \
libgoogle-glog-dev libgl1 openjdk-8-jre-headless
dpkg -i /root/packages/allure_2.24.1-1_all.deb
rm -rf /var/lib/apt/lists/*
- name: Clone repository
uses: actions/checkout@v2
- name: Install pytorch
run: |
python3 -m pip cache dir
python3 -m pip install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu118
- name: Build lmdeploy
run: |
python3 -m pip install cmake
python3 -m pip install -r requirements/build.txt
mkdir build
cd build
cmake .. \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_EXPORT_COMPILE_COMMANDS=1 \
-DCMAKE_INSTALL_PREFIX=/opt/tritonserver \
-DBUILD_PY_FFI=ON \
-DBUILD_MULTI_GPU=ON \
-DCMAKE_CUDA_FLAGS="-lineinfo" \
-DUSE_NVTX=ON \
-DSM=80 \
-DCMAKE_CUDA_ARCHITECTURES=80 \
-DBUILD_TEST=OFF
make -j$(nproc) && make install
- name: Install lmdeploy
run: |
python3 -m pip install packaging protobuf transformers_stream_generator transformers datasets
# manually install flash attn
# the install package is from https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.6/flash_attn-2.3.6+cu118torch2.0cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
python3 -m pip install /root/packages/flash_attn-2.3.6+cu118torch2.1cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
python3 -m pip install -r requirements.txt -r requirements/test.txt
python3 -m pip install .
- name: Check env
run: |
python3 -m pip list
lmdeploy check_env
- name: Test lmdeploy
timeout-minutes: 120
run: CUDA_VISIBLE_DEVICES=5,6 pytest autotest -m pr_test --alluredir=allure-results --clean-alluredir
- name: Generate reports
if: always()
run: |
export date_today="$(date +'%Y%m%d-%H%M%S')"
export report_dir="$REPORT_DIR/$date_today"
echo "Save report to $ALLURE_DIR"
allure generate -c -o $report_dir
- name: Clear workfile
if: always()
run: |
export workdir=$(pwd)
cd ..
rm -rf $workdir
mkdir $workdir
chmod -R 777 $workdir
76 changes: 36 additions & 40 deletions autotest/README.md
@@ -2,6 +2,16 @@

We provide an autotest case set for regression testing.

## Preparation before testing

To improve the efficiency of test execution, we download the hf model files in advance to a fixed path so that the test cases can use them directly. The path where the model files are stored is defined by the `model_path` parameter in the `autotest/config.yaml` file.

Since the test cases convert the hf models with the convert tool, the storage path for the converted models is defined by the `dst_path` parameter in `autotest/config.yaml`.

The `autotest/config.yaml` file also defines the supported model table and the corresponding model categories (the `model_map` parameter), as well as the log storage path `log_path` used while the test cases run.

If you want to create a test environment, you need to prepare the above content and modify `config.yaml` as needed.
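
The exact schema of `autotest/config.yaml` is not reproduced in this commit, so the snippet below is only a minimal sketch with placeholder paths and an assumed `model_map` shape; adjust it to match the real file when setting up an environment.

```yaml
# Minimal illustrative sketch of autotest/config.yaml -- all values are placeholders.
model_path: /nvme/qa_test_models                 # directory holding the pre-downloaded hf models
dst_path: /nvme/qa_test_models/autotest_model    # output directory for converted (workspace) models
log_path: /nvme/qa_test_models/autotest_log      # logs written while the test cases run
model_map:                                       # assumed shape: model name -> model category
  internlm2-chat-20b: internlm2
  Qwen-14B-Chat: qwen
```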

## How to run testcases

Install required dependencies using the following command line:
@@ -10,10 +10,12 @@ Install required dependencies using the following command line:
python3 -m pip install -r requirements/test.txt
```

Run pytest command line with case filtering through -m flag. eg: `-m internlm_chat_7b` Filter cases related to internlm_chat_7b. The corresponding results will be stored in the `allure-results` directory.
Run the pytest command line with case filtering through the `-m` flag or by folder name, e.g. `-m convert` filters the cases related to convert, and `autotest/tools/convert` runs the cases in that folder. The corresponding results will be stored in the `allure-results` directory.

```bash
pytest autotest -m internlm_chat_7b --clean-alluredir --alluredir=allure-results
pytest autotest -m convert --clean-alluredir --alluredir=allure-results
pytest autotest/tools/convert --clean-alluredir --alluredir=allure-results

```

If you need to generate reports and display report features, you need to install allure according to the [install documentation of allure](https://allurereport.org/docs/gettingstarted-installation/#install-via-the-system-package-manager-for-linux). You can also install it directly using the following command:
@@ -32,53 +32,44 @@ allure generate -c -o allure-reports
allure open ./allure-reports
```

## Preparation before testing

To improve the efficiency of test case execution, we have downloaded the hf model files to a specific path in advance for easy use in test cases. The path where the model files are stored is defined in the `autotest/config.yaml` file with parameter `model_path`.

Since the test cases involve converting the hf model using convert, the converted model storage path is defined in the `autotest/config.yaml` file parameter `dst_path`.

The `autotest/config.yaml` file also defines the supported model table and corresponding model categories, such as the `model_map` parameter, as well as the log storage path `log_path` used during test case execution.

If you want to create a test environment, you need to prepare the above content and modify the config.yaml file as needed.

## Test case functionality coverage

The test cases cover the following functionalities:
The test cases include the following case models:

tools model - related to the tutorials; these cases are basic

![image](https://github.com/InternLM/lmdeploy/assets/145004780/85d6a2d3-cc4f-459c-8dc1-22c17b69954f)
interface model - interface function cases of the pipeline, restful api and triton server api

The relationship between functionalities and test cases is as follows:

| Function | Test Case File |
| :---------------------: | :-------------------------------: |
| w4a16 quantization | test_order1_quantization_w4 |
| w8a8 quantization | test_order1_quantization_w8a8 |
| convert | test_order2_convert |
| pipeline chat | test_order3_pipeline_chat |
| pipeline chat - pytorch | test_order3_pipeline_chat_pytorch |
| restful_api chat | test_order3_restful_chat |
| command chat - cli | test_order3_command_chat |
| command chat - hf | test_order3_command_chat_hf |
| command chat - pytorch | test_order3_command_chat_pytorch |

The modules and models currently covered by the test cases are listed below:

| Models | w4a16 quantization | w8a8 quantization | kvint8 quantization | convert | pipeline chat | pipeline chat - pytorch | restful_api chat | command chat - cli | command chat - hf | command chat - pytorch |
| :------------------------------------------------------------------------: | :----------------: | :---------------: | :-----------------: | :-----: | :-----------: | :---------------------: | :--------------: | :----------------: | :---------------: | :--------------------: |
| [internlm2_chat_7b](https://huggingface.co/internlm/internlm2-chat-7b) | No | No | No | Yes | Yes | Yes | No | Yes | Yes | Yes |
| [internlm2_chat_20b](https://huggingface.co/internlm/internlm2-chat-20b) | Yes | Yes | No | Yes | Yes | No | Yes | Yes | Yes | Yes |
| [internlm_chat_7b](https://huggingface.co/internlm/internlm-chat-7b) | No | No | No | Yes | Yes | Yes | Yes | Yes | Yes | No |
| [internlm_chat_20b](https://huggingface.co/internlm/internlm-chat-20b) | Yes | No | No | Yes | Yes | No | No | Yes | Yes | No |
| [llama2_chat_7b_w4](https://huggingface.co/lmdeploy/llama2-chat-7b-w4) | No | No | No | Yes | Yes | No | No | Yes | Yes | No |
| [Qwen_7B_Chat](https://huggingface.co/Qwen/Qwen-7B-Chat) | Yes | No | No | Yes | Yes | No | No | Yes | Yes | No |
| [Qwen_14B_Chat](https://huggingface.co/Qwen/Qwen-14B-Chat) | Yes | No | No | Yes | Yes | No | No | Yes | Yes | No |
| [Baichuan2_7B_Chat](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat) | Yes | No | No | Yes | Yes | No | No | Yes | Yes | No |
| [llama_2_7b_chat](https://huggingface.co/meta-llama/Llama-2-7b-chat) | Yes | No | No | Yes | Yes | No | No | Yes | Yes | No |
| case model | Function | Test Case File |
| :--------: | :------------------------------: | :--------------------------------------------------: |
| tools | quantization - w4a16 | tools/quantization/test_quantization_w4a16.py |
| tools | quantization - w8a8 | tools/quantization/test_quantization_w8a8.py |
| tools | quantization - kv int8 | tools/quantization/test_quantization_kvint8.py |
| tools | quantization - kv int8 and w4a16 | tools/quantization/test_quantization_kvint8_w4a16.py |
| tools | convert | tools/convert/test_convert.py |
| tools | pipeline chat - turbomind | tools/pipeline/test_pipeline_chat_turbomind.py |
| tools | pipeline chat - pytorch | tools/pipeline/test_pipeline_chat_pytorch.py |
| tools      | restful_api chat - turbomind     | tools/restful/test_restful_chat_turbomind.py          |
| tools      | restful_api chat - pytorch       | tools/restful/test_restful_chat_pytorch.py            |
| tools | command chat - workspace | tools/chat/test_command_chat_workspace.py |
| tools | command chat - hf turbomind | tools/chat/test_command_chat_hf_turbomind.py |
| tools | command chat - hf pytorch | tools/chat/test_command_chat_hf_pytorch.py |
| interface  | pipeline interface - turbomind   | interface/pipeline/test_pipeline_turbomind_func.py    |

The models currently covered by the turbomind and pytorch backends are listed in `autotest/config.yaml` under the `turbomind_model` and `pytorch_model` keys.
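
As an illustration (the model names below are examples rather than the actual coverage lists), these two keys could look like the following; extending backend coverage amounts to appending a model name to the relevant list:

```yaml
# Illustrative sketch only -- not the repository's actual coverage lists.
turbomind_model:
  - internlm2-chat-20b
  - Qwen-14B-Chat
pytorch_model:
  - internlm2-chat-20b
```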

## How to add a testcase

you need to confirm that the corresponding model is ready <a href="##Preparation before testing">Jump to prepare Section</a>, then you can copy the existing case in the corresponding function test file. Please modify case mark, case story, case name and parameters if need.
If you want to add a new model to the tools test cases, you should prepare the model on your machine (<a href="#preparation-before-testing">jump to the preparation section</a>) and then add it to `autotest/config.yaml`.

## How to add a chatcase template
