QualiCLIP

Quality-Aware Image-Text Alignment for Opinion-Unaware Image Quality Assessment


🔥🔥🔥 [2025/01/15] QualiCLIP is now included in the IQA-PyTorch library

This is the official repository of the paper "Quality-Aware Image-Text Alignment for Opinion-Unaware Image Quality Assessment".

Note

If you are interested in IQA, take a look at our new dataset with UHD images and our self-supervised NR-IQA model

Overview

Abstract

No-Reference Image Quality Assessment (NR-IQA) focuses on designing methods to measure image quality in alignment with human perception when a high-quality reference image is unavailable. Most state-of-the-art NR-IQA approaches are opinion-aware, i.e. they require human annotations for training. This dependency limits their scalability and broad applicability. To overcome this limitation, we propose QualiCLIP (Quality-aware CLIP), a CLIP-based self-supervised opinion-unaware approach that does not require human opinions. In particular, we introduce a quality-aware image-text alignment strategy to make CLIP generate quality-aware image representations. Starting from pristine images, we synthetically degrade them with increasing levels of intensity. Then, we train CLIP to rank these degraded images based on their similarity to quality-related antonym text prompts. At the same time, we force CLIP to generate consistent representations for images with similar content and the same level of degradation. Our experiments show that the proposed method improves over existing opinion-unaware approaches across multiple datasets with diverse distortion types. Moreover, despite not requiring human annotations, QualiCLIP achieves excellent performance against supervised opinion-aware methods in cross-dataset experiments, thus demonstrating remarkable generalization capabilities.

Overview of the proposed quality-aware image-text alignment strategy. Starting from a pair of random overlapping crops of a pristine image, we synthetically degrade them with $L$ increasing levels of intensity, resulting in $L$ pairs. Then, given two quality-related antonym prompts $T_p$ and $T_n$, we fine-tune CLIP's image encoder with three margin ranking losses ($L_{cons}$, $L_{pos}$, $L_{neg}$) based on the similarity between the prompts and the degraded crops. Specifically, $L_{cons}$ forces CLIP to generate consistent representations for the crops belonging to each pair, since they exhibit similar content and the same degree of distortion. At the same time, $L_{pos}$ (or $L_{neg}$) makes the similarity between the prompt $T_p$ (or $T_n$) and the increasingly degraded versions of the crops correlate inversely (or directly) with the intensity of the distortion.
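
For intuition, the sketch below shows one possible way to write these three objectives with PyTorch's margin ranking loss. It is only an illustration: the margin value, the aggregation of the two crops, and the exact form of $L_{cons}$ (written here as a simple consistency term) are assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def quality_ranking_losses(sim_p, sim_n, margin=0.1):
    # sim_p, sim_n: (L, 2) tensors with the cosine similarity between the
    # positive/negative prompt and the two crops at each of the L degradation
    # levels (row 0 = least degraded). The margin value is an assumption.

    # Consistency: the two crops of a pair share content and degradation level,
    # so their prompt similarities should agree (sketched here as an L1 term).
    l_cons = (sim_p[:, 0] - sim_p[:, 1]).abs().mean() + \
             (sim_n[:, 0] - sim_n[:, 1]).abs().mean()

    # Aggregate the two crops of each pair before ranking across levels.
    sp = sim_p.mean(dim=1)  # similarity to the positive prompt T_p per level
    sn = sim_n.mean(dim=1)  # similarity to the negative prompt T_n per level

    # L_pos: similarity to T_p should decrease as the degradation intensifies.
    # L_neg: similarity to T_n should increase as the degradation intensifies.
    target = torch.ones(sp.numel() - 1)
    l_pos = F.margin_ranking_loss(sp[:-1], sp[1:], target, margin=margin)
    l_neg = F.margin_ranking_loss(sn[1:], sn[:-1], target, margin=margin)

    return l_cons + l_pos + l_neg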

Citation

@article{agnolucci2024qualityaware,
      title={Quality-Aware Image-Text Alignment for Opinion-Unaware Image Quality Assessment}, 
      author={Agnolucci, Lorenzo and Galteri, Leonardo and Bertini, Marco},
      journal={arXiv preprint arXiv:2403.11176},
      year={2024}
}

Usage

Minimal Working Example

Thanks to torch.hub, you can use our model for inference without the need to clone our repo or install any specific dependencies. QualiCLIP outputs a quality score in the range [0, 1], where higher is better.

import torch
import torchvision.transforms as transforms
from PIL import Image

# Set the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the model
model = torch.hub.load(repo_or_dir="miccunifi/QualiCLIP", source="github", model="QualiCLIP")
model.eval().to(device)

# Define the preprocessing pipeline
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711]),
])

# Load the image
img_path = "<path_to_your_image>"
img = Image.open(img_path).convert("RGB")

# Preprocess the image
img = preprocess(img).unsqueeze(0).to(device)

# Compute the quality score
with torch.no_grad(), torch.cuda.amp.autocast():
    score = model(img)

print(f"Image quality score: {score.item()}")

Getting Started

Installation

We recommend using the Anaconda package manager to avoid dependency/reproducibility problems. For Linux systems, you can find a conda installation guide here.

  1. Clone the repository
git clone https://github.com/miccunifi/QualiCLIP
  2. Install Python dependencies
conda create -n QualiCLIP -y python=3.10
conda activate QualiCLIP
cd QualiCLIP
chmod +x install_requirements.sh
./install_requirements.sh

Single Image Inference

To get the quality score of a single image, run the following command:
python single_image_inference.py --img_path assets/01.png
--img_path                  Path to the image to be evaluated

QualiCLIP outputs a quality score in the range [0, 1], where higher is better.
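
If you want to score several images at once, a simple loop over the torch.hub model from the Minimal Working Example works as well. The sketch below is only an illustration and assumes a folder of PNG images; it reuses the preprocessing pipeline shown above.

import torch
import torchvision.transforms as transforms
from PIL import Image
from pathlib import Path

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Same torch.hub entry point as in the Minimal Working Example
model = torch.hub.load(repo_or_dir="miccunifi/QualiCLIP", source="github", model="QualiCLIP")
model.eval().to(device)

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.48145466, 0.4578275, 0.40821073],
                         std=[0.26862954, 0.26130258, 0.27577711]),
])

# "assets" is just an example folder; point it at your own images
for img_path in sorted(Path("assets").glob("*.png")):
    img = preprocess(Image.open(img_path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        score = model(img)
    print(f"{img_path.name}: {score.item():.4f}")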

To be released

  • Pre-trained model
  • Testing code
  • Training code

Authors

  • Lorenzo Agnolucci
  • Leonardo Galteri
  • Marco Bertini

LICENSE

All material is made available under Creative Commons BY-NC 4.0. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicate any changes that you've made.