Refine embedding microservice document (#1199)
* Update embedding microservice document

Signed-off-by: lvliang-intel <[email protected]>
lvliang-intel authored Jan 24, 2025
1 parent 90b53b7 commit 67a5fca
Showing 4 changed files with 347 additions and 0 deletions.
12 changes: 12 additions & 0 deletions comps/embeddings/src/README.md
@@ -13,3 +13,15 @@ Key Features:
**Customizable**: Supports configuration and customization to meet specific use case requirements, including different embedding models and preprocessing techniques.

Users are able to configure and build embedding-related services according to their actual needs.

## Embeddings Microservice with TEI

For details, please refer to [readme](./README_tei.md).

## Embeddings Microservice with Prediction Guard

For details, please refer to this [readme](./README_predictionguard.md).

## Embeddings Microservice with Multimodal

For details, please refer to this [readme](./README_bridgetower.md).
106 changes: 106 additions & 0 deletions comps/embeddings/src/README_bridgetower.md
@@ -0,0 +1,106 @@
# Multimodal Embeddings Microservice

The Multimodal Embedding Microservice is designed to efficiently convert text-and-image pairs into vectorized embeddings, facilitating seamless integration into various machine learning and data processing workflows. This service utilizes advanced algorithms to generate high-quality embeddings that capture the joint semantic essence of each text-image pair, making it ideal for applications in multimodal data processing, information retrieval, and similar fields.

Key Features:

**High Performance**: Optimized for quick and reliable conversion of textual data and image inputs into vector embeddings.

**Scalability**: Built to handle high volumes of requests simultaneously, ensuring robust performance even under heavy loads.

**Ease of Integration**: Provides a simple and intuitive API, allowing for straightforward integration into existing systems and workflows.

**Customizable**: Supports configuration and customization to meet specific use case requirements, including different embedding models and preprocessing techniques.

Users are able to configure and build embedding-related services according to their actual needs.

## 📦 1. Start Microservice

### 🔹 1.1 Build Docker Image

#### Build BridgeTower Multimodal Embedding Service

- For Gaudi HPU:

```bash
cd ../../../
docker build -t opea/embedding-multimodal-bridgetower-hpu:latest --build-arg EMBEDDER_PORT=$EMBEDDER_PORT --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/bridgetower/src/Dockerfile.intel_hpu .
```

- For Xeon CPU:

```bash
cd ../../../
docker build -t opea/embedding-multimodal-bridgetower:latest --build-arg EMBEDDER_PORT=$EMBEDDER_PORT --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/bridgetower/src/Dockerfile .
```

#### Build Embedding Microservice Docker Image

```bash
cd ../../../
docker build -t opea/embedding:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/src/Dockerfile .
```

### 🔹 1.2 Run Docker with Docker Compose

```bash
export your_mmei_port=8080
export EMBEDDER_PORT=$your_mmei_port
export ip_address=$(hostname -I | awk '{print $1}')  # host IP reachable by the containers; adjust if needed
export MMEI_EMBEDDING_ENDPOINT="http://$ip_address:$your_mmei_port"
export your_embedding_port_microservice=6600
export MM_EMBEDDING_PORT_MICROSERVICE=$your_embedding_port_microservice
cd comps/embeddings/deployment/docker_compose/
```

- For Gaudi HPU:

```bash
docker compose up multimodal-bridgetower-embedding-gaudi-serving multimodal-bridgetower-embedding-gaudi-server -d
```

- For Xeon CPU:

```bash
docker compose up multimodal-bridgetower-embedding-serving multimodal-bridgetower-embedding-server -d
```
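
Either way, it is worth confirming that both containers came up before moving on; a quick optional check from the same directory:

```bash
# Optional: list the compose services and make sure both show a running state.
docker compose ps
```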

## 📦 2. Consume Embedding Service

Once the service is running, you can start using the API to generate embeddings for text and image pairs.

### 🔹 2.1 Check Service Status

Verify that the embedding service is running properly by checking its health status with this command:

```bash
curl http://localhost:6600/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 🔹 2.2 Use the Embedding Service API

You can now make API requests to generate embeddings. The service supports both single text embeddings and joint text-image embeddings.

**Compute a Joint Embedding of an Image-Text Pair**

To compute an embedding for a text and image pair, use the following API request:

```bash
curl -X POST http://0.0.0.0:6600/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}'
```

In this example, the input is a text and an image URL. The service will return a vectorized embedding that represents both the text and image.
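
To sanity-check the output without printing the full vector, you can pipe the response through `jq`. The sketch below assumes the multimodal route returns the vector in an `embedding` field; the exact field name is an assumption and may differ by release, so adjust the filter to the response you actually receive.

```bash
# Illustrative only: print the length of the returned vector instead of the raw floats.
# The ".embedding" path is an assumption about the response schema.
curl -s -X POST http://0.0.0.0:6600/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}' \
  | jq '.embedding | length'
```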

**Compute an Embedding of a Text Input**

To generate an embedding for just a text input, use this request:

```bash
curl -X POST http://0.0.0.0:6600/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"text" : "This is some sample text."}'
```

This request will return an embedding representing the semantic meaning of the input text.
100 changes: 100 additions & 0 deletions comps/embeddings/src/README_predictionguard.md
@@ -0,0 +1,100 @@
# Embedding Microservice with Prediction Guard

[Prediction Guard](https://docs.predictionguard.com) allows you to utilize hosted open-access LLMs, LVMs, and embedding functionality with seamlessly integrated safeguards. In addition to providing scalable access to open models, Prediction Guard allows you to configure factual consistency checks, toxicity filters, PII filters, and prompt injection blocking. Join the [Prediction Guard Discord channel](https://discord.gg/TFHgnhAFKd) and request an API key to get started.

This embedding microservice is designed to efficiently convert text into vectorized embeddings using the [BridgeTower model](https://huggingface.co/BridgeTower/bridgetower-large-itm-mlm-itc), making it ideal for both RAG and semantic search applications.

**Note** - The BridgeTower model implemented in Prediction Guard can embed text, images, or text + images (jointly). For now, this service only embeds text; a follow-on contribution will enable the multimodal functionality.

## 📦 1. Start Microservice with `docker run`

### 🔹 1.1 Set the Prediction Guard API Key

Before starting the service, ensure the following environment variable is set:

```bash
export PREDICTIONGUARD_API_KEY=${your_predictionguard_api_key}
```

### 🔹 1.2 Build Docker Image

To build the Docker image for the embedding service, run the following command:

```bash
cd ../../../
docker build -t opea/embedding:latest -f comps/embeddings/src/Dockerfile .
```

### 🔹 1.3 Start Service

Run the Docker container in detached mode with the following command:

```bash
docker run -d --name="embedding-predictionguard" -p 6000:6000 -e PREDICTIONGUARD_API_KEY=$PREDICTIONGUARD_API_KEY opea/embedding:latest
```
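
If the container does not respond as expected, the standard Docker commands below (nothing specific to this image) help confirm it started and surface any errors in its logs:

```bash
# Confirm the container is up and peek at its most recent log output.
docker ps --filter "name=embedding-predictionguard"
docker logs --tail 20 embedding-predictionguard
```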

## 📦 2. Start Microservice with docker compose

You can also deploy the Prediction Guard embedding service using Docker Compose for easier management of multi-container setups.

🔹 Steps:

1. Set environment variables:

```bash
export PG_EMBEDDING_MODEL_NAME="bridgetower-large-itm-mlm-itc"
export EMBEDDER_PORT=6000
export PREDICTIONGUARD_API_KEY=${your_predictionguard_api_key}
```

2. Navigate to the Docker Compose directory:

```bash
cd comps/embeddings/deployment/docker_compose/
```

3. Start the services:

```bash
docker compose up pg-embedding-server -d
```
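
To follow the startup output and catch configuration issues early (for example, a missing API key), you can tail the service logs:

```bash
# Stream logs from the Prediction Guard embedding service; press Ctrl+C to stop following.
docker compose logs -f pg-embedding-server
```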

## 📦 3. Consume Embedding Service

### 🔹 3.1 Check Service Status

Verify the embedding service is running:

```bash
curl http://localhost:6000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 🔹 3.2 Use the Embedding Service API

The API is compatible with the [OpenAI API](https://platform.openai.com/docs/api-reference/embeddings).

1. Single Text Input

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":"Hello, world!"}' \
-H 'Content-Type: application/json'
```

2. Multiple Text Inputs with Parameters

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":["Hello, world!","How are you?"], "dimensions":100}' \
-H 'Content-Type: application/json'
```
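
Because the route is OpenAI-compatible, each vector is expected under `data[*].embedding`. A hedged sketch for inspecting the shape of the response with `jq` (adjust the paths if your deployment returns a different structure):

```bash
# Illustrative: report how many embeddings came back and the length of the first vector.
curl -s http://localhost:6000/v1/embeddings \
  -X POST \
  -d '{"input":["Hello, world!","How are you?"]}' \
  -H 'Content-Type: application/json' \
  | jq '{count: (.data | length), dim: (.data[0].embedding | length)}'
```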

## ✨ Additional Notes

- Prediction Guard Features: Prediction Guard comes with built-in safeguards such as factual consistency checks, toxicity filters, PII detection, and prompt injection protection, ensuring safe use of the service.
- Multimodal Support: While the service currently only supports text embeddings, we plan to extend this functionality to support images and joint text-image embeddings in future releases.
- Scalability: The microservice can easily scale to handle large volumes of requests for embedding generation, making it suitable for large-scale semantic search and RAG applications.
129 changes: 129 additions & 0 deletions comps/embeddings/src/README_tei.md
@@ -0,0 +1,129 @@
# 🌟 Embedding Microservice with TEI

This guide walks you through starting, deploying, and consuming the **TEI-based Embeddings Microservice**. 🚀

---

## 📦 1. Start Microservice with `docker run`

### 🔹 1.1 Start Embedding Service with TEI

1. **Start the TEI service**:
Replace `your_port` and `model` with desired values to start the service.

```bash
your_port=8090
model="BAAI/bge-large-en-v1.5"
docker run -p $your_port:80 -v ./data:/data --name tei-embedding-serving \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always \
ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model
```

2. **Test the TEI service**:
Run the following command to check if the service is up and running.

```bash
curl localhost:$your_port/v1/embeddings \
-X POST \
-d '{"input":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
```
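
On the first run, TEI may still be downloading the model when you send the test request. A hedged readiness loop, assuming the standard TEI `/health` route (returns 200 once the model is loaded):

```bash
# Poll TEI until it reports healthy, then run the embedding request above.
until curl -sf localhost:$your_port/health > /dev/null; do
  echo "waiting for TEI to become ready..."
  sleep 2
done
```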

### 🔹 1.2 Build Docker Image and Run Docker with CLI

1. Build the Docker image for the embedding microservice:

```bash
cd ../../../
docker build -t opea/embedding:latest \
--build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy \
-f comps/embeddings/src/Dockerfile .
```

2. Run the embedding microservice and connect it to the TEI service:

```bash
docker run -d --name="embedding-tei-server" \
-p 6000:5000 \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy \
--ipc=host \
-e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT \
-e EMBEDDING_COMPONENT_NAME="OPEA_TEI_EMBEDDING" \
opea/embedding:latest
```

## 📦 2. Start Microservice with docker compose

Deploy both the TEI Embedding Service and the Embedding Microservice using Docker Compose.

🔹 Steps:

1. Set environment variables:

```bash
export host_ip=${your_ip_address}
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export TEI_EMBEDDER_PORT=8090
export EMBEDDER_PORT=6000
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:${TEI_EMBEDDER_PORT}"
```

2. Navigate to the Docker Compose directory:

```bash
cd comps/embeddings/deployment/docker_compose/
```

3. Start the services:

```bash
docker compose up tei-embedding-serving tei-embedding-server -d
```

## 📦 3. Consume Embedding Service

### 🔹 3.1 Check Service Status

Verify the embedding service is running:

```bash
curl http://localhost:6000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

### 🔹 3.2 Use the Embedding Service API

The API is compatible with the [OpenAI API](https://platform.openai.com/docs/api-reference/embeddings).

1. Single Text Input

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":"Hello, world!"}' \
-H 'Content-Type: application/json'
```

2. Multiple Text Inputs with Parameters

```bash
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":["Hello, world!","How are you?"], "dimensions":100}' \
-H 'Content-Type: application/json'
```
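
If you pass the `dimensions` parameter, you can check whether it took effect by looking at the length of the returned vector. This sketch assumes the backend honors the OpenAI-style `dimensions` field; whether it does depends on the model and TEI version.

```bash
# Illustrative: request a 100-dimensional embedding and print the actual vector length.
curl -s http://localhost:6000/v1/embeddings \
  -X POST \
  -d '{"input":"Hello, world!", "dimensions":100}' \
  -H 'Content-Type: application/json' \
  | jq '.data[0].embedding | length'
```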

## ✨ Tips for Better Understanding

1. Port Mapping:
Ensure the ports are correctly mapped to avoid conflicts with other services.

2. Model Selection:
Choose a model appropriate for your use case, like `BAAI/bge-large-en-v1.5` or `BAAI/bge-base-en-v1.5`.

3. Environment Variables:
Use `http_proxy` and `https_proxy` for proxy setup if necessary.

4. Data Volume:
The `-v ./data:/data` flag ensures the data directory is correctly mounted.
