Add Baseline for SGLang Benchmark Test #602

Merged
merged 48 commits into from
Dec 4, 2024
Changes from 10 commits
3c21be0
Add benchmark using sglang server,
stbaione Nov 22, 2024
31398a5
Fix import path in `shortfin_benchmark_test`,
stbaione Nov 22, 2024
4d0323f
Merge branch 'main' into sgl-benchmark-add-baseline
stbaione Nov 22, 2024
fc78284
Change `ci-sglang-benchmark/integration` to use `mi300x-4`,
stbaione Nov 22, 2024
0909e8f
Fix github runner label
stbaione Nov 22, 2024
d7cc539
Add installation steps, since test does require some functionality fr…
stbaione Nov 22, 2024
cf16e54
Fix typo in model names
stbaione Nov 22, 2024
86058b8
Add container name,
stbaione Nov 22, 2024
acbedb0
Temporarily remove `--rm` to try and obtain container logs after failure
stbaione Nov 22, 2024
34c8410
Remove quotes around HF_TOKEN
stbaione Nov 22, 2024
0d5574d
Try using env var for HF_SECRET
stbaione Nov 23, 2024
c9f4d33
Move secrets.HF_TOKEN back to command
stbaione Nov 25, 2024
4fa094c
Add temporary command to see if HF_TOKEN is being set properly
stbaione Nov 25, 2024
c33ef75
Add back command to rm container once stopped
stbaione Nov 25, 2024
6986765
Merge branch 'main' of https://github.com/nod-ai/shark-ai into users/…
stbaione Nov 25, 2024
fea2655
Allow for full e2e verification
stbaione Nov 25, 2024
3641445
Update hash for pip cache in benchmark and integration tests
stbaione Nov 25, 2024
d82d9df
Remove version pinning for `iree-base-compiler` and `iree-base-runtime`
stbaione Nov 25, 2024
e843281
Add `--pre` to iree installations in SGLang tests
stbaione Nov 26, 2024
7fe76d2
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Nov 26, 2024
ea65936
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Dec 2, 2024
01da13c
Slightly lower threshold in integration tests, to allow still valid, …
stbaione Dec 2, 2024
ed37ef1
Fix `publish_dir` in `Deploy to Github Pages` step
stbaione Dec 2, 2024
d1e434f
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Dec 2, 2024
d29c7bb
Remove unneeded deps for SGLang benchmark,
stbaione Dec 2, 2024
4529e09
Comment out `needs` line for CI validation
stbaione Dec 2, 2024
09e0fb6
Remove temporary disablements,
stbaione Dec 2, 2024
422729f
Remove `Get Current Date` step in shortfin benchmark job
stbaione Dec 2, 2024
969b608
Add `README` description to top of CI file
stbaione Dec 2, 2024
f67e399
Add job to merge html reports from both benchmark jobs and upload to …
stbaione Dec 3, 2024
6909edc
Fix upload/download paths
stbaione Dec 3, 2024
aa35176
Split download into two steps
stbaione Dec 3, 2024
9578acc
Ensure all html files are in same dir
stbaione Dec 3, 2024
b9b9ea5
Remove PR trigger
stbaione Dec 3, 2024
526194f
Remove `sharktank` installation from SGLang benchmark,
stbaione Dec 3, 2024
57babdf
Use hf to download tokenizer in `sglang_benchmark_test`
stbaione Dec 3, 2024
8f7f0fb
Small cleanup of sglang ci deps section
stbaione Dec 3, 2024
305c4b0
Make shortfin/sglang benchmark such that they still run sequentially,…
stbaione Dec 3, 2024
d78ab73
Remove dep on shortfin benchmark in sgl benchmark,
stbaione Dec 3, 2024
29a8221
Make sure `merge_and_upload_reports` waits for prior jobs to finish
stbaione Dec 3, 2024
35960e0
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Dec 3, 2024
7efecb8
Move code checkout to first step in `benchmark_sglang`
stbaione Dec 3, 2024
d28bf01
Merge branch 'users/stbaione/sgl-benchmark-add-baseline' of https://g…
stbaione Dec 3, 2024
1e90573
Remove PR trigger
stbaione Dec 3, 2024
3f6564f
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Dec 4, 2024
b1ec485
Repin `iree-base-compiler` and `iree-base-runtime` due to `abort` issue
stbaione Dec 4, 2024
56d3a5c
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Dec 4, 2024
5d406c8
Merge branch 'main' into users/stbaione/sgl-benchmark-add-baseline
stbaione Dec 4, 2024
107 changes: 72 additions & 35 deletions .github/workflows/ci-sglang-benchmark.yml
@@ -4,6 +4,18 @@
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

# =================================== README ===================================
# The `benchmark_sglang` job in this CI is mostly dependent on code outside
# of the `shark-ai` repo itself. By including it here, we are able to maintain
# an apples-to-apples comparison between shortfin and SGLang performance in a
# centralized location, as we place more effort in shortfin LLM performance, and
# WHILE WE WORK TOWARDS A BETTER ALTERNATIVE.

# We should not be generally repeating this pattern, and should never repeat
# this pattern outside of specifically benchmarking shortfin apps against
# external projects, as part of an organized and clearly defined effort.
# ==============================================================================

name: SGLang Llama Benchmarking Tests

on:
@@ -35,10 +47,6 @@ jobs:
env:
PIP_CACHE_DIR: "${{ github.workspace }}/.pip-cache"
steps:
- name: Get Current Date
id: date
run: echo "::set-output name=date::$(date +'%Y-%m-%d')"

- name: "Setting up Python"
id: setup_python
uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
@@ -53,7 +61,7 @@ jobs:
id: cache-pip
with:
path: ${{ env.PIP_CACHE_DIR }}
key: pip-${{ steps.setup_python.outputs.python-version }}-${{ hashFiles('*requirements*.txt','shortfin/requirements*.txt','sharktank/requirements*.txt') }}
key: pip-${{ matrix.version }}-${{ hashFiles('*requirements*.txt','shortfin/requirements*.txt','sharktank/requirements*.txt') }}

- name: Install pip deps
run: |
@@ -78,15 +86,13 @@ jobs:
run: pip install "git+https://github.com/nod-ai/sglang.git#subdirectory=python"

- name: Run Shortfin Benchmark Tests
run: pytest -v app_tests/benchmark_tests/llm/sglang_benchmarks/shortfin_benchmark_test.py --log-cli-level=INFO --html=out/llm/shortfin/index.html
run: pytest -v app_tests/benchmark_tests/llm/sglang_benchmarks/shortfin_benchmark_test.py --log-cli-level=INFO --html=shortfin_index.html --self-contained-html

- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e # v4.0.0
- name: Upload pytest report
uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882
with:
github_token: ${{ secrets.SHARK_PLATFORM_GH_TOKEN }}
publish_dir: ./out/llm/shortfin
destination_dir: ./llm/sgl_benchmark/shortfin
keep_files: true
name: shortfin_benchmark
path: shortfin_index.html

benchmark_sglang:
if: ${{ github.repository_owner == 'nod-ai' || github.event_name != 'schedule' }}
@@ -103,10 +109,6 @@ jobs:
env:
PIP_CACHE_DIR: "${{ github.workspace }}/.pip-cache"
steps:
- name: Get Current Date
id: date
run: echo "::set-output name=date::$(date +'%Y-%m-%d')"

- name: "Setting up Python"
id: setup_python
uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
@@ -121,26 +123,14 @@ jobs:
id: cache-pip
with:
path: ${{ env.PIP_CACHE_DIR }}
key: pip-${{ steps.setup_python.outputs.python-version }}-${{ hashFiles('*requirements*.txt','shortfin/requirements*.txt','sharktank/requirements*.txt') }}
key: pip-${{ matrix.version }}-${{ hashFiles('*requirements*.txt','sharktank/requirements*.txt') }}

- name: Install pip deps
run: |
python -m pip install --no-compile --upgrade pip
# Note: We install in three steps in order to satisfy requirements
# from non default locations first. Installing the PyTorch CPU
# wheels saves multiple minutes and a lot of bandwidth on runner setup.
pip install --no-compile -r pytorch-cpu-requirements.txt
pip install --no-compile -f https://iree.dev/pip-release-links.html --src deps \
-e "git+https://github.com/iree-org/iree-turbine.git#egg=iree-turbine"
pip install --no-compile -r requirements.txt -e sharktank/ shortfin/

# Try with the latest nightly releases, not what iree-turbine pins.
# We could also pin to a known working or stable version.
# This should eventually stabilize. Do the best we can for now.
pip install -f https://iree.dev/pip-release-links.html --upgrade --pre \
iree-base-compiler \
iree-base-runtime \
"numpy<2.0"
# Note: Only sharktank is required to use `hf_datasets` script
# for downloading model weights.
pip install --no-compile -r requirements.txt -e sharktank/

- name: Install SGLang
run: pip install "git+https://github.com/nod-ai/sglang.git#subdirectory=python"
@@ -186,18 +176,65 @@ jobs:

- name: Run SGLang Benchmark Tests
run: |
pytest -v app_tests/benchmark_tests/llm/sglang_benchmarks/sglang_benchmark_test.py --port 30000 --log-cli-level=INFO --html=out/llm/sglang/index.html
pytest -v app_tests/benchmark_tests/llm/sglang_benchmarks/sglang_benchmark_test.py --port 30000 --log-cli-level=INFO --html=sglang_index.html --self-contained-html

- name: Stop sglang-server
run: docker stop sglang-server || true # Stop container if it's running

# Deleting image after run due to large disk space requirement (83 GB)
- name: Cleanup SGLang Image
run: docker image rm lmsysorg/sglang:v0.3.5.post1-rocm620

- name: Upload pytest report
uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882
with:
name: sglang_benchmark
path: sglang_index.html

merge_and_upload_reports:
if: ${{ github.repository_owner == 'nod-ai' || github.event_name != 'schedule' }}
name: "Merge and upload benchmark reports"
needs: [benchmark_shortfin, benchmark_sglang]
strategy:
matrix:
version: [3.11]
fail-fast: false
runs-on: ubuntu-24.04
defaults:
run:
shell: bash
steps:
- name: "Setting up Python"
id: setup_python
uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
with:
python-version: ${{matrix.version}}

- name: Install pytest-html-merger
run: pip install pytest-html-merger

- name: Download shortfin report
uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16
with:
name: shortfin_benchmark
path: reports

- name: Download sglang report
uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16
with:
name: sglang_benchmark
path: reports

- name: Create merged report directory
run: mkdir merged_reports

- name: Merge html reports
run: pytest_html_merger -i reports -o merged_reports/index.html

- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e # v4.0.0
with:
github_token: ${{ secrets.SHARK_PLATFORM_GH_TOKEN }}
publish_dir: ./out/llm/sglang
destination_dir: ./llm/sgl_benchmark/sglang
publish_dir: merged_reports
destination_dir: ./llm/sglang
keep_files: true
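
The `needs: [benchmark_shortfin, benchmark_sglang]` line in the diff above makes `merge_and_upload_reports` wait for both benchmark jobs before it downloads their artifacts. As a minimal sketch of that ordering, the Python below models the three jobs as a dependency graph and groups them into waves. It is an illustration only, not how GitHub Actions schedules jobs internally: the `NEEDS` table and the `execution_waves` helper are hypothetical names, and treating the two benchmark jobs as independent of each other is an assumption, since the collapsed parts of the workflow are not shown here.

```python
from graphlib import TopologicalSorter

# Job name -> list of jobs it `needs`.
# Only the dependency visible in the diff is modeled. Treating the two
# benchmark jobs as independent of each other is an assumption.
NEEDS = {
    "benchmark_shortfin": [],
    "benchmark_sglang": [],
    "merge_and_upload_reports": ["benchmark_shortfin", "benchmark_sglang"],
}


def execution_waves(needs):
    """Group jobs into waves; each job in a wave has all of its `needs` done."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(NEEDS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Under these assumptions the two benchmark jobs land in the first wave and the merge job in the second. The sketch does not model job failures or `if:` conditions.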