[New Feature] Add the serving support #279
base: main
Conversation
gpu-memory-utilization: 0.9
max-model-len: 32768
max-num-seqs: 256
port: 4567
Is `port` also a parameter that vllm needs?
Yes. The parameters under the `vllm` field are kept consistent with the native `vllm serve` command line.
max-model-len: 32768
max-num-seqs: 256
port: 4567
action-args:
What is the difference between `action-args` and the fields above? Which fields go here?
For example: `vllm serve /models/Qwen2.5-7B-Instruct --tensor-parallel-size=1 --gpu-memory-utilization=0.9 --max-model-len=32768 --max-num-seqs=256 --port=4567 --trust-remote-code --enable-chunked-prefill`.
The two flags `--trust-remote-code` and `--enable-chunked-prefill` are preset behaviors that take no value on the command line, so they belong in `action-args`.
This PR will also add documentation describing the deployment parameter options to lower the onboarding cost for developers.
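For reference, a minimal sketch of how that command line could map onto the config, assuming the field names shown in this diff (the `model` key and the exact nesting of `action-args` are illustrative, not necessarily the final schema):

```yaml
vllm:
  model: /models/Qwen2.5-7B-Instruct   # assumed key for the model path
  tensor-parallel-size: 1
  gpu-memory-utilization: 0.9
  max-model-len: 32768
  max-num-seqs: 256
  port: 4567
  action-args:                         # value-less flags passed through verbatim
    - trust-remote-code
    - enable-chunked-prefill
```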
    else:
        run_local_command(f"bash {host_run_script_file}", dryrun)

    def run(self, with_test=False, dryrun=False):
Does `with_test` mean whether to run in the background?
Yes.
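For context, a rough sketch of the kind of dispatch such a flag might control (the function and helper names here are illustrative assumptions, not the code in this PR):

```python
import subprocess

def launch(script: str, with_test: bool = False, dryrun: bool = False):
    cmd = f"bash {script}"
    if dryrun:
        print(cmd)                      # only print what would run
        return None
    if with_test:
        # detach and return immediately so a test can run against the service
        return subprocess.Popen(cmd, shell=True)
    # block in the foreground until the command finishes
    return subprocess.run(cmd, shell=True, check=True)
```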
"localhost", | ||
available_addr, | ||
available_port, | ||
1, |
Does local mode only support a single node with a single GPU?
This PR only supports a single GPU; multi-GPU support is in progress.
pip install modelscope
modelscope download --model Qwen/Qwen2.5-7B-Instruct --local_dir /models/
The model preparation flow should get a small subheading, same as the sections above.
ok
flagscale/serve/README.md
python run.py --config-path examples/qwen/ --config-name config action=run
A follow-up doc could explain how to swap in a different model and data, i.e. how users can deploy their own models with this tool.
OK, this PR will add that.
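As a rough sketch of what that doc might cover: swapping in your own model would presumably mean downloading it locally and pointing the config at it (the path and keys below are illustrative assumptions, not the final documented schema):

```yaml
# examples/my_model/config.yaml  (hypothetical path)
vllm:
  model: /models/MyOwnModel        # local path to your downloaded checkpoint
  tensor-parallel-size: 1
  port: 4567
```

The service would then be launched with the same `python run.py --config-path ... action=run` entry point shown above.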
flagscale/serve/run_simple_vllm.py
logger.info("Standard Output:") | ||
logger.info(stdout) | ||
# logger.info(stdout.decode()) | ||
logger.info("Standard Error:") | ||
logger.info(stderr) | ||
# logger.info(stderr.decode()) | ||
|
This could be simplified a bit.
ok
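For instance, a minimal way to tighten this up, assuming `stdout` and `stderr` are already decoded strings at this point:

```python
for name, stream in (("Standard Output", stdout), ("Standard Error", stderr)):
    logger.info("%s:\n%s", name, stream)
```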
New Feature
FlagScale now supports model serving.
Description
This pull request introduces support for deploying large models with FlagScale, leveraging the Ray framework for efficient orchestration and scalability. Currently, this implementation supports the Qwen model, enabling users to easily deploy and manage large-scale machine learning services.
Future key features include:
More details in Serve.
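Once the service is running, it should be reachable like a standard vLLM OpenAI-compatible endpoint; a quick sanity check might look like this (the host, port, and model path follow the example config above and are assumptions about the deployment):

```python
import requests

resp = requests.post(
    "http://localhost:4567/v1/chat/completions",
    json={
        "model": "/models/Qwen2.5-7B-Instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```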