Zero-shot Image Classification with SigLIP

Zero-shot image classification is a computer vision task in which a model assigns an image to one of several candidate classes without having been trained on those classes.

zero-shot-pipeline

In this tutorial, you will use the SigLIP model to perform zero-shot image classification.
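To make the idea concrete, here is a minimal sketch of the scoring step only; it is not the notebook's code. It assumes that an image encoder and a text encoder have already mapped the image and the candidate label prompts into a shared embedding space. The toy three-dimensional vectors, the label prompts, and the helper names `cosine_similarity` and `zero_shot_classify` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the best-matching label and the score of every label.

    label_embeddings maps each candidate label prompt to its text embedding.
    No label-specific training happens here: the candidate labels can be
    swapped freely at inference time.
    """
    scores = {
        label: cosine_similarity(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hand-written toy embeddings (assumption: a real model would produce these).
label_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

best_label, scores = zero_shot_classify(image_embedding, label_embeddings)
print(best_label)
```

Because the class set is just a dictionary of text embeddings, adding or removing a class requires no retraining, which is what makes the approach zero-shot.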

Notebook Contents

This tutorial demonstrates how to perform zero-shot image classification using the open-source SigLIP model. The SigLIP model was proposed in the Sigmoid Loss for Language Image Pre-Training paper. SigLIP replaces the softmax-based contrastive loss used in CLIP (Contrastive Language–Image Pre-training) with a simple pairwise sigmoid loss, which treats every image-text pair as an independent binary classification problem. According to the paper, this improves zero-shot classification accuracy on ImageNet.
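The pairwise sigmoid loss can be sketched in a few lines of plain Python. This is an illustrative sketch and not the authors' implementation: it assumes the embeddings are already L2-normalized, and it uses fixed `temperature` and `bias` values for illustration, whereas in training both are learnable parameters. The function names are hypothetical.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_sigmoid_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Sigmoid loss over all image-text pairs in a batch.

    Pair (i, j) is labeled +1 when the image and text belong together
    (i == j) and -1 otherwise. Each pair contributes an independent binary
    term, so no softmax normalization across the batch is needed.
    """
    n = len(image_embs)
    total = 0.0
    for i, image in enumerate(image_embs):
        for j, text in enumerate(text_embs):
            similarity = sum(a * b for a, b in zip(image, text))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    return -total / n


# Toy batch of two pairs with unit-length embeddings (illustrative values).
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = pairwise_sigmoid_loss(images, aligned_texts)
swapped_loss = pairwise_sigmoid_loss(images, swapped_texts)
print(aligned_loss < swapped_loss)
```

The loss is lower when each image is paired with its matching text than when the texts are swapped, which is the behavior training exploits.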

siglip-performance-comparison

*image source

You can find more information about this model in the research paper, the GitHub repository, and the Hugging Face model page.

The notebook contains the following steps:

  1. Instantiate model.
  2. Run PyTorch model inference.
  3. Convert the model to OpenVINO Intermediate Representation (IR) format.
  4. Run OpenVINO model.
  5. Apply post-training quantization using NNCF:
    1. Prepare dataset.
    2. Quantize model.
    3. Run quantized OpenVINO model.
    4. Compare file size.
    5. Compare inference time of the FP16 IR and quantized models.
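The last two comparison steps can be approximated with standard-library helpers. The sketch below is not taken from the notebook: `file_size_kb` and `median_latency` are hypothetical helper names, and `infer` stands for any zero-argument callable that runs one inference, for example a wrapper around a compiled model.

```python
import statistics
import time
from pathlib import Path


def file_size_kb(path):
    """Size of a file on disk in kibibytes."""
    return Path(path).stat().st_size / 1024


def median_latency(infer, runs=10):
    """Median wall-clock time in seconds of `infer()` over several runs.

    The median is less sensitive than the mean to one-off slow runs such as
    the first call, which often includes warm-up work.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compression_rate(original_size_kb, quantized_size_kb):
    """How many times smaller the quantized file is than the original."""
    return original_size_kb / quantized_size_kb
```

Under these assumptions, calling `file_size_kb` on the weight files of the FP16 and quantized models and `median_latency` on an inference call for each gives the two numbers to compare.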

The image below shows the zero-shot image classification results that the SigLIP model produces in this notebook.

image

Installation Instructions

If you have not installed all required dependencies, follow the Installation Guide.