diff --git a/docs/generative-ai/llm-serving.md b/docs/generative-ai/llm-serving.md
index 54dc5d6b..cd9241c4 100644
--- a/docs/generative-ai/llm-serving.md
+++ b/docs/generative-ai/llm-serving.md
@@ -55,7 +55,9 @@ On top of the Caikit+TGIS or TGIS built-in runtimes, the following custom runtim
 
 #### Standalone Inference Servers
 
-- [vLLM](https://github.com/rh-aiservices-bu/llm-on-openshift/blob/main/llm-servers/vllm/README.md){:target="_blank"}: how to deploy vLLM, the "Easy, fast, and cheap LLM serving for everyone".
+- vLLM: how to deploy vLLM, the "Easy, fast, and cheap LLM serving for everyone".
+    - on [GPU](https://github.com/rh-aiservices-bu/llm-on-openshift/blob/main/llm-servers/vllm/gpu/README.md){:target="_blank"}
+    - on [CPU](https://github.com/rh-aiservices-bu/llm-on-openshift/blob/main/llm-servers/vllm/cpu/README.md){:target="_blank"}
 - [Hugging Face TGI](https://github.com/rh-aiservices-bu/llm-on-openshift/blob/main/llm-servers/hf_tgi/README.md){:target="_blank"}: how to deploy the Text Generation Inference server from Hugging Face.
 - [Caikit-TGIS-Serving](https://github.com/opendatahub-io/caikit-tgis-serving){:target="_blank"}: how to deploy the Caikit-TGIS-Serving stack, from OpenDataHub.
 