diegocaumont/semantic-search

Image Embeddings and Tagging with AltCLIP and CLIPViT

This project provides tools for generating image embeddings and tags using the AltCLIP and CLIPViT models. It includes scripts for embedding images, assigning tags, and searching for similar images based on their embeddings.

Description

The project aims to process images by generating embeddings using the AltCLIP and CLIPViT models. It also classifies images using a custom VisionModel to generate tags. The embeddings and tags can be used for image similarity search and tagging purposes.

Required Models

Place the classification model files in a folder named 'classify-model' at the root of the project folder.

Features

  • Generate image embeddings using AltCLIP and CLIPViT models.
  • Classify images to generate tags using a custom VisionModel.
  • Multithreading support for processing images concurrently.
  • API endpoints for embedding text and images.
  • Web interface for searching images based on text queries or image uploads.
  • Tag-based navigation and filtering.
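The multithreaded processing listed above can be sketched roughly as follows. This is a minimal illustration and not the code in embed_altClip_class_threading.py: `list_images`, `process_all`, the `embed_fn` callback, and the set of file extensions are all hypothetical names and assumptions introduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed set of image formats; the real scripts may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def list_images(image_dir):
    """Return image paths directly under image_dir, sorted for a stable order."""
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )


def process_all(paths, embed_fn, max_workers=4):
    """Apply embed_fn to every path concurrently; map file name -> result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = list(pool.map(embed_fn, paths))
    return {Path(p).name: r for p, r in zip(paths, results)}
```

Here `embed_fn` stands in for whatever function loads one image and returns its embedding and tags. Threads help most when that function spends its time on file I/O or inside native model code.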

Installation

  1. Run the setup_demo.sh file to install the dependencies:

    ./setup_demo.sh

    This script will:

    • Verify Python and required packages are installed
    • Generate embeddings for images
    • Configure the XAMPP htdocs directory
    • Start the API server
    • Copy necessary files to the demo subdirectory
  2. If the script fails or you prefer manual installation:

    a. Ensure Python 3.6 or higher is installed.

    b. Install required Python packages:

    pip install -r requirements.txt

    c. Start the API server:

    python altclip/api_embeddings_altclip.py

    d. Generate embeddings:

    cd altclip
    python embed_altClip_class_threading.py --model_dir classify-model

    e. Manually copy the following files to your XAMPP htdocs directory:

    • altclip/index_tags.html
    • altclip/search.php
    • altclip/get_tags.php
    • data/ (entire directory)
  3. Access the demo by navigating to http://localhost/your_demo_directory/index_tags.html in your web browser.
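Before a manual installation, it can help to confirm the Python version and packages yourself. The sketch below is one way to do that and is not taken from setup_demo.sh. The import names are assumptions derived from the Dependencies list, so treat requirements.txt as the authoritative source.

```python
import importlib.util
import sys

# Import names assumed from the Dependencies list (Pillow imports as PIL).
REQUIRED_MODULES = ["torch", "transformers", "PIL", "tqdm", "flask"]


def python_ok(min_version=(3, 6), current=None):
    """Return True if the interpreter version is at least min_version."""
    current = current or sys.version_info[:2]
    return tuple(current) >= tuple(min_version)


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level modules that cannot be found on this interpreter."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing modules:", missing_modules() or "none")
```

If any modules are reported missing, running `pip install -r requirements.txt` as in step 2b should install them.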

Usage

Generating Image Embeddings

To generate embeddings and tags for images, run the embedding scripts:

  • Using AltCLIP

    python altclip/embed_altClip_class_threading.py
  • Using CLIPViT

    python clipvit/embed_clipViT_class_threading.py

These scripts process images in the data/images directory and save the embeddings and tags to data/image_embeddings.json.
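Once data/image_embeddings.json exists, a similarity search over it could look like the sketch below. The file's actual schema is not documented here, so this assumes a hypothetical layout in which each file name maps to an object with an "embedding" list and a "tags" list. Adjust the field names to match the real output.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query, records, k=5, tag=None):
    """Rank records by similarity to query, optionally keeping only one tag."""
    scored = [
        (name, cosine_similarity(query, rec["embedding"]))
        for name, rec in records.items()
        if tag is None or tag in rec.get("tags", [])
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def load_records(path="data/image_embeddings.json"):
    """Load the generated file; the schema described above is an assumption."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `top_matches(query_vector, load_records(), k=10, tag="beach")` would return up to ten `(file name, score)` pairs among images tagged "beach", where `query_vector` is an embedding produced by the same model as the stored ones.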

Running the API Server

Start the API server to handle embedding requests:

  • AltCLIP API

    python altclip/api_embeddings_altclip.py
  • CLIPViT API

    python clipvit/api_embeddings_clipvit.py

The API runs on http://localhost:5000 by default.
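A client call to the running server might be sketched as below, using only the standard library. The endpoint path `/embed_text` and the `text` JSON field are hypothetical placeholders, since the routes are not documented here. Check api_embeddings_altclip.py or api_embeddings_clipvit.py for the real ones.

```python
import json
import urllib.request

API_BASE = "http://localhost:5000"  # default address stated above


def build_text_request(text, endpoint="/embed_text", base=API_BASE):
    """Build a JSON POST request; endpoint and field name are hypothetical."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `fetch_json(build_text_request("a red car"))` would return whatever JSON the endpoint produces. Building the request separately from sending it makes the payload easy to inspect without a live server.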

Search Interface

A web interface is provided to search for images based on text queries or image uploads.

  1. Change to the model's directory

    Navigate to the altclip or clipvit directory:

    cd altclip  # or `cd clipvit`
  2. Start the PHP server

    php -S localhost:8000
  3. Access the search interface

    Open your web browser and navigate to http://localhost:8000/index_tags.html.

Project Structure

  • altclip/

    • embed_altClip_class.py: Script for generating embeddings using AltCLIP.
    • embed_altClip_class_threading.py: Multithreaded version of the embedding script.
    • api_embeddings_altclip.py: API server for AltCLIP embeddings.
    • index_tags.html: Web interface for searching images.
    • search.php: Backend script for handling search requests.
    • get_tags.php: Script to fetch available tags.
    • classify-model/: Contains the custom VisionModel and configuration files.
  • clipvit/

    • embed_clipViT_class.py: Script for generating embeddings using CLIPViT.
    • embed_clipViT_class_threading.py: Multithreaded version of the embedding script.
    • api_embeddings_clipvit.py: API server for CLIPViT embeddings.
    • index_tags.html: Web interface for searching images.
    • classify-model/: Contains the custom VisionModel and configuration files.
  • data/

    • images/: Directory containing images to process.
    • image_embeddings.json: Generated embeddings and tags.
    • input_text.json: Sample text inputs for embedding.
  • logs/

    • generate_embeddings.log: Logs from embedding scripts.

Dependencies

  • Python 3.x
  • PyTorch
  • Transformers
  • Pillow
  • Tqdm
  • Flask (for API servers)
  • PHP (for the web interface)
  • Web browser (for accessing the search interface)

About

img2img and txt2img
