Text-to-image generation using PhotoMaker and OpenVINO

PhotoMaker is an efficient personalized text-to-image generation method that encodes an arbitrary number of input ID images into a stack ID embedding in order to preserve ID information. This embedding serves as a unified ID representation: it can comprehensively capture the characteristics of a single input ID, and it can also accommodate the characteristics of different IDs for subsequent integration, which opens the way to more intriguing and practically valuable applications. Users can input one or a few face photos, along with a text prompt, to receive a customized photo or painting (no training required!). Additionally, the model can be adapted to any base model built on SDXL or used in conjunction with other LoRA modules. More details about PhotoMaker can be found in the technical report.
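To make the idea of a stack ID embedding concrete, the toy sketch below stacks a variable number of per-image vectors into one N x D structure. It is only an illustration of the stacking step: the function name `stack_id_embeddings` and the placeholder vectors are hypothetical, and the real PhotoMaker encoder and fusion logic are not reproduced here.

```python
from typing import List, Sequence


def stack_id_embeddings(per_image: Sequence[Sequence[float]]) -> List[List[float]]:
    """Stack N per-image ID vectors of width D into N rows of width D.

    Hypothetical helper: it assumes each input photo has already been
    encoded into a fixed-width vector by some image encoder.
    """
    if not per_image:
        raise ValueError("at least one ID image embedding is required")
    width = len(per_image[0])
    if any(len(vec) != width for vec in per_image):
        raise ValueError("all embeddings must have the same width")
    # Copy each row so the stacked result does not alias the inputs.
    return [list(vec) for vec in per_image]


# Placeholder vectors standing in for the encoder outputs of three photos.
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
stacked = stack_id_embeddings(embeddings)
print(len(stacked), len(stacked[0]))
```

The number of rows follows the number of input photos, so one photo or several go through the same code path.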

The notebook provides a simple interface for interacting with the model through text instructions. In this demonstration, the user provides input instructions and the model generates an image.

The image below shows an example of a generated image.

output

Notebook Contents

The tutorial consists of the following steps:

  • PhotoMaker pipeline introduction
  • Prerequisites
  • Load original pipeline and prepare models for conversion
  • Convert models to OpenVINO Intermediate Representation (IR) format
  • Prepare inference pipeline
  • Run text-to-image generation with OpenVINO
  • Interactive Demo
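The conversion and inference steps listed above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the helper names (`ir_path`, `convert_if_missing`, `compile_ir`), the `models` directory, and the `unet` name are assumptions for illustration, and it assumes the `openvino` package is installed and that the submodel is a PyTorch module with a suitable example input.

```python
from pathlib import Path


def ir_path(model_dir, name):
    """Return the .xml path for an IR model (hypothetical naming scheme)."""
    return Path(model_dir) / f"{name}.xml"


def convert_if_missing(torch_module, example_input, xml_path):
    """Convert a PyTorch module to OpenVINO IR once and reuse the file later."""
    import openvino as ov  # imported lazily so the path helper works without it

    xml_path = Path(xml_path)
    if not xml_path.exists():
        xml_path.parent.mkdir(parents=True, exist_ok=True)
        # Trace the module with the example input and build an OpenVINO model.
        ov_model = ov.convert_model(torch_module, example_input=example_input)
        # Write the .xml file (and its .bin weights file) to disk.
        ov.save_model(ov_model, str(xml_path))
    return xml_path


def compile_ir(xml_path, device="CPU"):
    """Load an IR file and compile it for the chosen device."""
    import openvino as ov

    return ov.Core().compile_model(str(xml_path), device)


# Assumed layout: one IR file per pipeline submodel.
print(ir_path("models", "unet"))
```

Caching the converted files means the original pipeline only has to be loaded the first time; later runs can compile the saved IR directly.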

Installation Instructions

This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.