From 6e7ba8df9365d4d9b2b366a2b1d7ff0f844e6bbb Mon Sep 17 00:00:00 2001
From: Shangming Cai <caishangming@linux.alibaba.com>
Date: Mon, 16 Dec 2024 15:54:05 +0800
Subject: [PATCH] [Doc] Update state of mooncake transfer engine integration
 with vLLM.

Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
---
 README.md                               |  1 +
 doc/en/vllm-integration-v0.2-nightly.md | 11 +++++------
 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index bdd0ff6..a6e8c4c 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,7 @@ This repository also hosts its technical report and the open sourced traces.
 
 <h2 id="updates">🔄 Updates</h2>
 
+ - **Dec 16, 2024**: vLLM officially supports Mooncake Transfer Engine for disaggregated prefilling and KV cache transfer.
  - **Nov 28, 2024**: We open sourced the Transfer Engine, the central component of Mooncake. We also provide two demonstrations of Transfer Engine: a P2P Store and vLLM integration.
  - **July 9, 2024**: We open sourced the trace as a <a href="https://github.com/kvcache-ai/Mooncake/blob/main/mooncake_trace.jsonl" target="_blank">jsonl file</a>!.
  - **June 27, 2024**: We present a series of Chinese blogs with more discussions on <a href="https://zhuanlan.zhihu.com/p/705754254">zhihu 1</a>, <a href="https://zhuanlan.zhihu.com/p/705910725">2</a>, <a href="https://zhuanlan.zhihu.com/p/706204757">3</a>, <a href="https://zhuanlan.zhihu.com/p/707997501">4</a>.
diff --git a/doc/en/vllm-integration-v0.2-nightly.md b/doc/en/vllm-integration-v0.2-nightly.md
index f9b982b..4a6717c 100644
--- a/doc/en/vllm-integration-v0.2-nightly.md
+++ b/doc/en/vllm-integration-v0.2-nightly.md
@@ -1,24 +1,23 @@
 # vLLM Disaggregated Prefill/Decode Demo
 
 ## Overview
-This is the nightly version of mooncake-transfer-engine integration with the vLLM project based on [PR 10502](https://github.com/vllm-project/vllm/pull/10502) (vllm version: v0.6.4.post1/main) to accelerate KVCache transfer for inter-node disaggregated Prefill/Decode scenario. We have run some experiments to obtain some [preview benchmark results](vllm-benchmark-results-v0.2.md). More benchmark results will be released in due time.
+This document describes the latest integration of mooncake-transfer-engine with the vLLM project, based on [PR 10502](https://github.com/vllm-project/vllm/pull/10502) and [PR 10884](https://github.com/vllm-project/vllm/pull/10884) (vllm version: v0.6.4.post1/main), which accelerates KVCache transfer in inter-node disaggregated Prefill/Decode scenarios. We have run some experiments to obtain [preview benchmark results](vllm-benchmark-results-v0.2.md). More benchmark results will be released in due time.
 
-**_Please note that this is not a fully ready version and will be modified anytime based on feedback from the vLLM community._**
+**_Please note that this is still an experimental version and may be modified at any time based on feedback from the vLLM community._**
 
 ## Installation
 ### Prerequisite
 Please install the Mooncake Transfer Engine according to the [instructions](build.md) first.
 
-### Install an experimental version of vLLM
-#### 1. Clone vLLM from an indicated repo
+### Install the latest version of vLLM
+#### 1. Clone vLLM from the official repo
 ```bash
-git clone git@github.com:kvcache-ai/vllm.git
+git clone git@github.com:vllm-project/vllm.git
 ```
 #### 2. Build
 ##### 2.1 Build from source (Include C++ and CUDA code)
 ```bash
 cd vllm
-git checkout upstream-mooncake-integration
 pip3 uninstall vllm -y
 pip3 install -e .
 ```