A sophisticated photo management system that enables semantic search, facial recognition, and temporal organization of your photo collection. The system processes images to extract embeddings using CLIP, detects and clusters faces using face_recognition, and stores everything in a LanceDB database for efficient retrieval.
With this system, you can store and search through your family photos locally! It makes NO API calls and requires NO internet connectivity. Your personal photos stay private.
- Semantic Image Search: Using CLIP embeddings for natural language photo search
- Face Detection & Recognition: Automatic face detection and clustering of people across photos
- EXIF Data Processing: Extraction of timestamp and location data from images
- Multi-format Support: Handles various image formats including JPEG, PNG, HEIC, and RAW files
- Vector Search: Efficient text-to-image similarity search using LanceDB and the open-source OpenAI CLIP model
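To make the text-to-image search idea concrete, here is a minimal sketch of ranking images by similarity to a query vector. It is not the project's implementation: the real system delegates this to LanceDB with CLIP embeddings, and the source does not say which distance metric it uses. Cosine similarity is assumed here, and `cosine_similarity`, `rank_images`, and the toy 3-dimensional vectors are hypothetical stand-ins.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(
    query_vec: Sequence[float],
    image_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (image_path, score) pairs, most similar first."""
    scored = [
        (path, cosine_similarity(query_vec, vec))
        for path, vec in image_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional stand-ins for the 512-dimensional CLIP vectors (assumed data).
toy_index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "birthday.jpg": [0.1, 0.9, 0.1],
    "hike.jpg": [0.6, 0.0, 0.8],
}
toy_query = [1.0, 0.0, 0.1]  # pretend this is the embedding of a text query

results = rank_images(toy_query, toy_index, top_k=2)
for path, score in results:
    print(f"{path}: {score:.3f}")
```

A vector database does the same ranking, but with an index, so it does not have to compare the query against every stored vector.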
- main.py: Core orchestration script for processing images and managing the database
- get_emb.py: CLIP model integration for semantic embeddings
- get_exif.py: EXIF data extraction utilities
- proc_imgs.py: Face detection and processing pipeline
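As a rough illustration of how face encodings can be grouped into people, the sketch below assigns each encoding to the first group whose representative lies within a distance threshold. This is a simple greedy stand-in, not the clustering used in proc_imgs.py (the source does not name the algorithm). The function `group_faces` and the 2-dimensional toy encodings are hypothetical. The 0.6 threshold mirrors the default tolerance of `face_recognition.compare_faces`, but whether the project uses that value is an assumption.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two encodings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_faces(
    encodings: Sequence[Sequence[float]], tolerance: float = 0.6
) -> List[int]:
    """Greedy grouping: returns one group label per input encoding.

    Each encoding joins the first group whose representative (the first
    encoding seen for that group) is within `tolerance`; otherwise it
    starts a new group.
    """
    representatives: List[Sequence[float]] = []
    labels: List[int] = []
    for enc in encodings:
        for group_id, rep in enumerate(representatives):
            if euclidean(enc, rep) <= tolerance:
                labels.append(group_id)
                break
        else:
            # No existing group was close enough, so open a new one.
            representatives.append(enc)
            labels.append(len(representatives) - 1)
    return labels


# Toy 2-dimensional encodings (real face encodings have 128 dimensions).
toy_faces = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]]
labels = group_faces(toy_faces)
print(labels)
```

Greedy grouping depends on input order and can split one person into several groups, which is one reason a merge operation for identities is useful.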
- Python 3.8+
- PyTorch
- transformers
- face_recognition
- LanceDB
- PIL/Pillow
- pyheif
- scikit-learn
- numpy

For the full list of dependencies, see requirements.txt.
If you're starting from scratch, take a look at setup.sh, which provides all of the commands needed to get up and running from a fresh Ubuntu 24.04 environment. Please note that Ubuntu versions prior to 23.04 will NOT WORK! It also does not work on Windows, sadly :/ (but you CAN use WSL).
pip install -r requirements.txt
- Place your images in a directory
- Run the processing pipeline:
python main_load.py
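Before processing, a pipeline like this needs to find the image files in the directory. The following is a minimal sketch of that step and is not taken from the project's code. The function `find_images` and the extension set are assumptions. In particular, the RAW extensions listed are common examples, not a confirmed list of what the project supports.

```python
from pathlib import Path
from typing import FrozenSet, List

# Assumed extension set: JPEG, PNG, HEIC, plus a few common RAW formats.
SUPPORTED_EXTENSIONS: FrozenSet[str] = frozenset(
    {".jpg", ".jpeg", ".png", ".heic", ".cr2", ".nef", ".arw", ".dng"}
)


def find_images(
    root: Path, extensions: FrozenSet[str] = SUPPORTED_EXTENSIONS
) -> List[Path]:
    """Recursively collect files under `root` whose suffix is supported.

    Matching is case-insensitive, so IMG_0001.JPG and img.jpg both count.
    """
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


if __name__ == "__main__":
    # Hypothetical usage: list images under the current directory.
    for image_path in find_images(Path(".")):
        print(image_path)
```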
The system provides REST API endpoints for:
- Semantic image search using natural language queries
- Face-based photo search
- Temporal search and filtering
- Individual photo retrieval
- Person management (naming, merging identities)
See the OpenAPI specification for detailed endpoint documentation.
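Merging identities is the least obvious of these operations, so here is a small in-memory sketch of what a merge could involve: every image that references the dropped person is re-pointed at the kept person, and the dropped record is removed. This is an illustration only. The function `merge_people`, the plain dictionaries standing in for database tables, and the string IDs are all assumptions, and the actual endpoints are defined in the OpenAPI specification.

```python
from typing import Dict, List


def merge_people(
    people: Dict[str, str],
    image_people: Dict[str, List[str]],
    keep_id: str,
    drop_id: str,
) -> None:
    """Merge `drop_id` into `keep_id`, modifying both mappings in place.

    people:       people_id -> name
    image_people: image_id  -> list of people_ids present in that image
    """
    if keep_id not in people or drop_id not in people:
        raise KeyError("both keep_id and drop_id must exist")
    if keep_id == drop_id:
        return
    for image_id, ids in image_people.items():
        if drop_id not in ids:
            continue
        merged: List[str] = []
        for pid in ids:
            pid = keep_id if pid == drop_id else pid
            if pid not in merged:  # avoid listing the same person twice
                merged.append(pid)
        image_people[image_id] = merged
    del people[drop_id]


# Toy data (assumed): "p2" turns out to be the same person as "p1".
people = {"p1": "Alice", "p2": "Unknown", "p3": "Bob"}
image_people = {"img-a": ["p1", "p2"], "img-b": ["p2", "p3"], "img-c": ["p3"]}
merge_people(people, image_people, keep_id="p1", drop_id="p2")
print(people)
print(image_people)
```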
- people_id: Unique identifier for each person
- name: Person's name (can be updated via API)

- image_id: Unique identifier for each image
- vector: CLIP embedding vector (512 dimensions)
- image_path: Path to the original image
- people_ids: List of people present in the image
- date: Timestamp from EXIF data
- location: Geographic location (if available in EXIF)
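The fields above can be pictured as two record types. The sketch below models them with dataclasses purely for illustration. It is not the project's LanceDB schema, and the class names, the Python types (string IDs, a string for location), and the helper `parse_exif_datetime` are assumptions. The one detail taken from the EXIF standard is the timestamp layout `YYYY:MM:DD HH:MM:SS`.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

CLIP_DIM = 512  # embedding size stated in the field list


@dataclass
class Person:
    people_id: str
    name: Optional[str] = None  # unnamed until set via the API


@dataclass
class ImageRecord:
    image_id: str
    vector: List[float]
    image_path: str
    people_ids: List[str] = field(default_factory=list)
    date: Optional[datetime] = None
    location: Optional[str] = None  # assumed representation

    def __post_init__(self) -> None:
        # Catch embeddings of the wrong size early.
        if len(self.vector) != CLIP_DIM:
            raise ValueError(
                f"expected {CLIP_DIM} dimensions, got {len(self.vector)}"
            )


def parse_exif_datetime(raw: str) -> datetime:
    """Parse an EXIF-style timestamp such as '2023:07:14 10:32:05'."""
    return datetime.strptime(raw, "%Y:%m:%d %H:%M:%S")


record = ImageRecord(
    image_id="img-1",
    vector=[0.0] * CLIP_DIM,
    image_path="photos/a.jpg",
    date=parse_exif_datetime("2023:07:14 10:32:05"),
)
print(record.date, record.people_ids)
```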