Merge pull request kvcache-ai#40 from ShangmingCai/update_integration_doc

[Doc] Update the integration state of Mooncake Transfer Engine with vLLM.
ShangmingCai · Dec 16, 2024 · 481fea6
2 parents c8c4ce0 + 6e7ba8d

Showing 2 changed files with 6 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -10,6 +10,7 @@ This repository also hosts its technical report and the open sourced traces.
 
 <h2 id="updates">🔄 Updates</h2>
 
+ - **Dec 16, 2024**: vLLM officially supports Mooncake Transfer Engine for disaggregated prefilling and KV cache transfer.
  - **Nov 28, 2024**: We open sourced the Transfer Engine, the central component of Mooncake. We also provide two demonstrations of Transfer Engine: a P2P Store and vLLM integration.
  - **July 9, 2024**: We open sourced the trace as a <a href="https://github.com/kvcache-ai/Mooncake/blob/main/mooncake_trace.jsonl" target="_blank">jsonl file</a>!.
  - **June 27, 2024**: We present a series of Chinese blogs with more discussions on <a href="https://zhuanlan.zhihu.com/p/705754254">zhihu 1</a>, <a href="https://zhuanlan.zhihu.com/p/705910725">2</a>, <a href="https://zhuanlan.zhihu.com/p/706204757">3</a>, <a href="https://zhuanlan.zhihu.com/p/707997501">4</a>.

diff --git a/doc/en/vllm-integration-v0.2-nightly.md b/doc/en/vllm-integration-v0.2-nightly.md
@@ -1,24 +1,23 @@
 # vLLM Disaggregated Prefill/Decode Demo
 
 ## Overview
-This is the nightly version of mooncake-transfer-engine integration with the vLLM project based on [PR 10502](https://github.com/vllm-project/vllm/pull/10502) (vllm version: v0.6.4.post1/main) to accelerate KVCache transfer for inter-node disaggregated Prefill/Decode scenario. We have run some experiments to obtain some [preview benchmark results](vllm-benchmark-results-v0.2.md). More benchmark results will be released in due time.
+This is the latest version of mooncake-transfer-engine integration doc with the vLLM project based on [PR 10502](https://github.com/vllm-project/vllm/pull/10502) and [PR 10884](https://github.com/vllm-project/vllm/pull/10884) (vllm version: v0.6.4.post1/main) to accelerate KVCache transfer for inter-node disaggregated Prefill/Decode scenario. We have run some experiments to obtain some [preview benchmark results](vllm-benchmark-results-v0.2.md). More benchmark results will be released in due time.
 
-**_Please note that this is not a fully ready version and will be modified anytime based on feedback from the vLLM community._**
+**_Please note that this is still an experimental version and will be modified anytime based on feedback from the vLLM community._**
 
 ## Installation
 ### Prerequisite
 Please install the Mooncake Transfer Engine according to the [instructions](build.md) first.
 
-### Install an experimental version of vLLM
-#### 1. Clone vLLM from an indicated repo
+### Install the latest version of vLLM
+#### 1. Clone vLLM from official repo
 ```bash
-git clone git@github.com:kvcache-ai/vllm.git
+git clone git@github.com:vllm-project/vllm.git
 ```
 #### 2. Build
 ##### 2.1 Build from source (Include C++ and CUDA code)
 ```bash
 cd vllm
-git checkout upstream-mooncake-integration
 pip3 uninstall vllm -y
 pip3 install -e .
 ```