[Feat] SGLang SRT commands in one go, async input for openai server #212

kcz358 · 2024-08-27T05:35:50Z

This PR lets the sglang srt model evaluate llava in a single command. A separate command to set up the backend server is no longer needed. An example command:

# After this update, there is no need for an extra command to set up the backend server;
# the server is initialized in the init process.

# Launch the lmms-eval srt_api model.
# Positional arguments: checkpoint path, task list, modality, tensor-parallel size.
CKPT_PATH=$1
TASK=$2
MODALITY=$3
TP_SIZE=$4
echo "$TASK"
# Replace commas in the task list with underscores for use in the log suffix
TASK_SUFFIX="${TASK//,/_}"
echo "$TASK_SUFFIX"

python3 -m lmms_eval \
    --model srt_api \
    --model_args "modality=$MODALITY,model_version=$CKPT_PATH,tp=$TP_SIZE,host=127.0.0.1,port=30000,timeout=600" \
    --tasks "$TASK" \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix "$TASK_SUFFIX" \
    --output_path ./logs/
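
To illustrate what "the server will be initialized in the init process" could look like, here is a minimal Python sketch. It is not the code from this PR: `BackendServer` and `wait_for_port` are hypothetical names, and the command passed in is whatever starts your backend. The only idea shown is starting a subprocess during init and polling host/port until it accepts connections or the timeout expires (the defaults mirror the host=127.0.0.1, port=30000, timeout=600 values in the command above).

```python
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 600.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Server is not accepting connections yet; retry after a short pause.
            time.sleep(interval)
    return False


class BackendServer:
    """Hypothetical wrapper: launch a server command in __init__, stop it in close()."""

    def __init__(self, cmd, host="127.0.0.1", port=30000, timeout=600.0):
        # `cmd` is an argument list for the backend launch command (assumption:
        # the caller knows how to start the server; nothing is hard-coded here).
        self.process = subprocess.Popen(cmd)
        if not wait_for_port(host, port, timeout):
            self.close()
            raise TimeoutError(f"server on {host}:{port} not ready after {timeout}s")

    def close(self):
        # Terminate the child process if it is still running.
        if self.process.poll() is None:
            self.process.terminate()
            try:
                self.process.wait(timeout=10)
            except subprocess.TimeoutExpired:
                self.process.kill()
```

With a wrapper like this, the evaluation model can create the server in its own constructor and call `close()` when the run finishes, so the user only issues one command.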

The PR also submits requests to the sglang server asynchronously so that they can be processed in batches. Running MME reaches around 3 it/s with num_processes=48, compared with around 1.5 s/it when using a single batch.
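
The general pattern behind the async submission can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: `run_concurrently` and `fake_send` are hypothetical names, `fake_send` stands in for the real request to the server, and treating `num_processes` as the cap on in-flight requests is a guess about what that parameter means.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_concurrently(
    prompts: Sequence[str],
    send: Callable[[str], Awaitable[str]],
    num_processes: int = 48,
) -> List[str]:
    """Send all prompts concurrently, with at most `num_processes` in flight."""
    semaphore = asyncio.Semaphore(num_processes)

    async def bounded(prompt: str) -> str:
        # The semaphore limits how many requests are outstanding at once.
        async with semaphore:
            return await send(prompt)

    # gather() returns results in the same order as the input prompts.
    return list(await asyncio.gather(*(bounded(p) for p in prompts)))


async def fake_send(prompt: str) -> str:
    """Stand-in for a real server call (hypothetical): waits briefly, then echoes."""
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    results = asyncio.run(run_concurrently(["a", "b", "c"], fake_send, num_processes=2))
    print(results)
```

Because many requests are outstanding at once, the server is free to group them into batches, which is where the throughput gain over sending one request at a time would come from.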

kcz358 added 4 commits August 27, 2024 03:31

Add sglang launch server in srt init process

cbc8599

Add async support for openai server to enable faster speed

0d02bad

Update srt commands

1dfea2e

Update srt run_examples

f3c73b8

Luodian approved these changes Aug 27, 2024

Luodian merged commit 0d7ffcc into main Aug 27, 2024
2 checks passed
