Refine embedding microservice document (#1199)
* Update embedding microservice document

Signed-off-by: lvliang-intel <[email protected]>
lvliang-intel authored Jan 24, 2025
1 parent 90b53b7 commit 67a5fca
Showing 4 changed files with 347 additions and 0 deletions.
12 changes: 12 additions & 0 deletions comps/embeddings/src/README.md
@@ -13,3 +13,15 @@ Key Features:
**Customizable**: Supports configuration and customization to meet specific use case requirements, including different embedding models and preprocessing techniques.

Users are able to configure and build embedding-related services according to their actual needs.

## Embeddings Microservice with TEI

For details, please refer to [readme](./README_tei.md).

## Embeddings Microservice with Prediction Guard

For details, please refer to this [readme](./README_predictionguard.md).

## Embeddings Microservice with Multimodal

For details, please refer to this [readme](./README_bridgetower.md).
106 changes: 106 additions & 0 deletions comps/embeddings/src/README_bridgetower.md
@@ -0,0 +1,106 @@
# Multimodal Embeddings Microservice

The Multimodal Embedding Microservice is designed to efficiently convert text-and-image pairs into vectorized embeddings, facilitating seamless integration into various machine learning and data processing workflows. This service utilizes advanced algorithms to generate high-quality embeddings that capture the joint semantic essence of each text-image pair, making it ideal for applications in multimodal data processing, information retrieval, and similar fields.

Key Features:

**High Performance**: Optimized for quick and reliable conversion of textual data and image inputs into vector embeddings.

**Scalability**: Built to handle high volumes of requests simultaneously, ensuring robust performance even under heavy loads.

**Ease of Integration**: Provides a simple and intuitive API, allowing for straightforward integration into existing systems and workflows.

**Customizable**: Supports configuration and customization to meet specific use case requirements, including different embedding models and preprocessing techniques.

Users are able to configure and build embedding-related services according to their actual needs.

## 📦 1. Start Microservice

### 🔹 1.1 Build Docker Image

#### Build BridgeTower Multimodal Embedding Service

- For Gaudi HPU:

```bash
cd ../../../
docker build -t opea/embedding-multimodal-bridgetower-hpu:latest --build-arg EMBEDDER_PORT=$EMBEDDER_PORT --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/bridgetower/src/Dockerfile.intel_hpu .
```

- For Xeon CPU:

```bash
cd ../../../
docker build -t opea/embedding-multimodal-bridgetower:latest --build-arg EMBEDDER_PORT=$EMBEDDER_PORT --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/bridgetower/src/Dockerfile .
```

#### Build Embedding Microservice Docker Image

```bash
cd ../../../
docker build -t opea/embedding:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/src/Dockerfile .
```

### 🔹 1.2 Run Docker with Docker Compose

```bash
export your_mmei_port=8080
export EMBEDDER_PORT=$your_mmei_port
export ip_address=$(hostname -I | awk '{print $1}')  # host IP reachable by the containers; adjust if needed
export MMEI_EMBEDDING_ENDPOINT="http://$ip_address:$your_mmei_port"
export your_embedding_port_microservice=6600
export MM_EMBEDDING_PORT_MICROSERVICE=$your_embedding_port_microservice
cd comps/embeddings/deployment/docker_compose/
```

- For Gaudi HPU:

```bash
docker compose up multimodal-bridgetower-embedding-gaudi-serving multimodal-bridgetower-embedding-gaudi-server -d
```

- For Xeon CPU:

```bash
docker compose up multimodal-bridgetower-embedding-serving multimodal-bridgetower-embedding-server -d
```
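
Either way, it is worth confirming that both containers came up before moving on; a quick optional check from the same directory:

```bash
# Optional: list the compose services and make sure both show a running state.
docker compose ps
```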

## 📦 2. Consume Embedding Service

Once the service is running, you can start using the API to generate embeddings for text and image pairs.

### 🔹 2.1 Check Service Status

Verify that the embedding service is running properly by checking its health status with this command:

```bash
curl http://localhost:6600/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 🔹 2.2 Use the Embedding Service API

You can now make API requests to generate embeddings. The service supports both single text embeddings and joint text-image embeddings.

**Compute a Joint Embedding of an Image-Text Pair**

To compute an embedding for a text and image pair, use the following API request:

```bash
curl -X POST http://0.0.0.0:6600/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}'
```

In this example, the input is a text and an image URL. The service will return a vectorized embedding that represents both the text and image.
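
To sanity-check the output without printing the full vector, you can pipe the response through `jq`. The sketch below assumes the multimodal route returns the vector in an `embedding` field; the exact field name is an assumption and may differ by release, so adjust the filter to the response you actually receive.

```bash
# Illustrative only: print the length of the returned vector instead of the raw floats.
# The ".embedding" path is an assumption about the response schema.
curl -s -X POST http://0.0.0.0:6600/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}' \
  | jq '.embedding | length'
```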

**Compute an Embedding of a Text Input**

To generate an embedding for just a text input, use this request:

```bash
curl -X POST http://0.0.0.0:6600/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"text" : "This is some sample text."}'
```

This request will return an embedding representing the semantic meaning of the input text.
100 changes: 100 additions & 0 deletions comps/embeddings/src/README_predictionguard.md
@@ -0,0 +1,100 @@
# Embedding Microservice with Prediction Guard

[Prediction Guard](https://docs.predictionguard.com) allows you to utilize hosted open-access LLMs, LVMs, and embedding functionality with seamlessly integrated safeguards. In addition to providing scalable access to open models, Prediction Guard allows you to configure factual consistency checks, toxicity filters, PII filters, and prompt injection blocking. Join the [Prediction Guard Discord channel](https://discord.gg/TFHgnhAFKd) and request an API key to get started.

This embedding microservice is designed to efficiently convert text into vectorized embeddings using the [BridgeTower model](https://huggingface.co/BridgeTower/bridgetower-large-itm-mlm-itc), making it ideal for both RAG and semantic search applications.

**Note** - The BridgeTower model implemented in Prediction Guard can embed text, images, or text + images (jointly). For now, this service only embeds text; a follow-on contribution will enable the multimodal functionality.

## 📦 1. Start Microservice with `docker run`

### 🔹 1.1 Set the Prediction Guard API Key

Before starting the service, ensure the following environment variable is set:

```bash
export PREDICTIONGUARD_API_KEY=${your_predictionguard_api_key}
```

### 🔹 1.2 Build Docker Image

To build the Docker image for the embedding service, run the following command:

```bash
cd ../../../
docker build -t opea/embedding:latest -f comps/embeddings/src/Dockerfile .
```

### 🔹 1.3 Start Service

Run the Docker container in detached mode with the following command:

```bash
docker run -d --name="embedding-predictionguard" -p 6000:6000 -e PREDICTIONGUARD_API_KEY=$PREDICTIONGUARD_API_KEY opea/embedding:latest
```
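
If the container does not respond as expected, the standard Docker commands below (nothing specific to this image) help confirm it started and surface any errors in its logs:

```bash
# Confirm the container is up and peek at its most recent log output.
docker ps --filter "name=embedding-predictionguard"
docker logs --tail 20 embedding-predictionguard
```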

## 📦 2. Start Microservice with docker compose

You can also deploy the Prediction Guard embedding service using Docker Compose for easier management of multi-container setups.

🔹 Steps:

1. Set environment variables:

```bash
export PG_EMBEDDING_MODEL_NAME="bridgetower-large-itm-mlm-itc"
export EMBEDDER_PORT=6000
export PREDICTIONGUARD_API_KEY=${your_predictionguard_api_key}
```

2. Navigate to the Docker Compose directory:

```bash
cd comps/embeddings/deployment/docker_compose/
```

3. Start the services:

```bash
docker compose up pg-embedding-server -d
```
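
To follow the startup output and catch configuration issues early (for example, a missing API key), you can tail the service logs:

```bash
# Stream logs from the Prediction Guard embedding service; press Ctrl+C to stop following.
docker compose logs -f pg-embedding-server
```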

## 📦 3. Consume Embedding Service

### 🔹 3.1 Check Service Status

Verify the embedding service is running:

```bash
curl http://localhost:6000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 🔹 3.2 Use the Embedding Service API

The API is compatible with the [OpenAI API](https://platform.openai.com/docs/api-reference/embeddings).

1. Single Text Input

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":"Hello, world!"}' \
-H 'Content-Type: application/json'
```

2. Multiple Text Inputs with Parameters

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":["Hello, world!","How are you?"], "dimensions":100}' \
-H 'Content-Type: application/json'
```
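
Because the route is OpenAI-compatible, each vector is expected under `data[*].embedding`. A hedged sketch for inspecting the shape of the response with `jq` (adjust the paths if your deployment returns a different structure):

```bash
# Illustrative: report how many embeddings came back and the length of the first vector.
curl -s http://localhost:6000/v1/embeddings \
  -X POST \
  -d '{"input":["Hello, world!","How are you?"]}' \
  -H 'Content-Type: application/json' \
  | jq '{count: (.data | length), dim: (.data[0].embedding | length)}'
```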

## ✨ Additional Notes

- Prediction Guard Features: Prediction Guard comes with built-in safeguards such as factual consistency checks, toxicity filters, PII detection, and prompt injection protection, ensuring safe use of the service.
- Multimodal Support: While the service currently only supports text embeddings, we plan to extend this functionality to support images and joint text-image embeddings in future releases.
- Scalability: The microservice can easily scale to handle large volumes of requests for embedding generation, making it suitable for large-scale semantic search and RAG applications.
129 changes: 129 additions & 0 deletions comps/embeddings/src/README_tei.md
@@ -0,0 +1,129 @@
# 🌟 Embedding Microservice with TEI

This guide walks you through starting, deploying, and consuming the **TEI-based Embeddings Microservice**. 🚀

---

## 📦 1. Start Microservice with `docker run`

### 🔹 1.1 Start Embedding Service with TEI

1. **Start the TEI service**:
Replace `your_port` and `model` with desired values to start the service.

```bash
your_port=8090
model="BAAI/bge-large-en-v1.5"
docker run -p $your_port:80 -v ./data:/data --name tei-embedding-serving \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always \
ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model
```

2. **Test the TEI service**:
Run the following command to check if the service is up and running.

```bash
curl localhost:$your_port/v1/embeddings \
-X POST \
-d '{"input":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
```
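
On the first run, TEI may still be downloading the model when you send the test request. A hedged readiness loop, assuming the standard TEI `/health` route (returns 200 once the model is loaded):

```bash
# Poll TEI until it reports healthy, then run the embedding request above.
until curl -sf localhost:$your_port/health > /dev/null; do
  echo "waiting for TEI to become ready..."
  sleep 2
done
```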

### 🔹 1.2 Build Docker Image and Run Docker with CLI

1. Build the Docker image for the embedding microservice:

```bash
cd ../../../
docker build -t opea/embedding:latest \
--build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy \
-f comps/embeddings/src/Dockerfile .
```

2. Run the embedding microservice and connect it to the TEI service:

```bash
docker run -d --name="embedding-tei-server" \
-p 6000:5000 \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy \
--ipc=host \
-e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT \
-e EMBEDDING_COMPONENT_NAME="OPEA_TEI_EMBEDDING" \
opea/embedding:latest
```

## 📦 2. Start Microservice with docker compose

Deploy both the TEI Embedding Service and the Embedding Microservice using Docker Compose.

🔹 Steps:

1. Set environment variables:

```bash
export host_ip=${your_ip_address}
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export TEI_EMBEDDER_PORT=8090
export EMBEDDER_PORT=6000
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:${TEI_EMBEDDER_PORT}"
```

2. Navigate to the Docker Compose directory:

```bash
cd comps/embeddings/deployment/docker_compose/
```

3. Start the services:

```bash
docker compose up tei-embedding-serving tei-embedding-server -d
```

## 📦 3. Consume Embedding Service

### 🔹 3.1 Check Service Status

Verify the embedding service is running:

```bash
curl http://localhost:6000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 🔹 3.2 Use the Embedding Service API

The API is compatible with the [OpenAI API](https://platform.openai.com/docs/api-reference/embeddings).

1. Single Text Input

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":"Hello, world!"}' \
-H 'Content-Type: application/json'
```

2. Multiple Text Inputs with Parameters

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":["Hello, world!","How are you?"], "dimensions":100}' \
-H 'Content-Type: application/json'
```
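
If you pass the `dimensions` parameter, you can check whether it took effect by looking at the length of the returned vector. This sketch assumes the backend honors the OpenAI-style `dimensions` field; whether it does depends on the model and TEI version.

```bash
# Illustrative: request a 100-dimensional embedding and print the actual vector length.
curl -s http://localhost:6000/v1/embeddings \
  -X POST \
  -d '{"input":"Hello, world!", "dimensions":100}' \
  -H 'Content-Type: application/json' \
  | jq '.data[0].embedding | length'
```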

## ✨ Tips for Better Understanding

1. Port Mapping:
Ensure the ports are correctly mapped to avoid conflicts with other services.

2. Model Selection:
Choose a model appropriate for your use case, like `BAAI/bge-large-en-v1.5` or `BAAI/bge-base-en-v1.5`.

3. Environment Variables:
Use `http_proxy` and `https_proxy` for proxy setup if necessary.

4. Data Volume:
The `-v ./data:/data` flag ensures the data directory is correctly mounted.
