
Image Captioning Application Overview:

This project showcases the Salesforce BLIP (Bootstrapping Language-Image Pre-training) model for image captioning, combined with sentiment analysis, using Hugging Face's Transformers library. The chatbot analyzes the sentiment of text input and generates descriptive captions for uploaded images, supporting both conditional and unconditional captioning.
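As a minimal sketch of how conditional and unconditional captioning work with BLIP through the Transformers library (the Salesforce/blip-image-captioning-base checkpoint and the example image path are assumptions; the app may use a different checkpoint):

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Load the BLIP base captioning checkpoint from the Hugging Face Hub
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg").convert("RGB")  # hypothetical local image

# Conditional captioning: a text prompt steers the generated caption
inputs = processor(image, "a photograph of", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))

# Unconditional captioning: no prompt, the model describes the image freely
inputs = processor(image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```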

[Screenshot: conditional image captioning (input processing window)]

Tools Used:

- Python: Backend development and machine learning.
- Flask: Web framework for handling backend requests.
- Streamlit: Tool for deploying and serving the web application.
- Hugging Face Transformers: Library for NLP and image captioning.
- HTML, CSS, JavaScript: Frontend development for user interaction.

Setup:

Install Dependencies:

pip install transformers torch torchvision flask streamlit pillow

Run the Application:

Backend:

python app.py

Frontend (Streamlit):

streamlit run frontend.py

Accessing the Application:

Open your browser and navigate to http://localhost:8501 to interact with the chatbot and image captioning interface.
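For reference, the split works as follows: the Flask backend exposes an HTTP endpoint that accepts an image and returns a caption, while the Streamlit frontend renders the interface. A minimal sketch of such a backend is shown below; the /caption route and the "image" form field are assumptions, not the repository's actual API:

```python
from flask import Flask, request, jsonify
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

app = Flask(__name__)

# Load the BLIP model once at startup so every request reuses it
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

@app.route("/caption", methods=["POST"])  # hypothetical endpoint name
def caption():
    # Expect the image under the form field "image" (an assumption)
    image = Image.open(request.files["image"]).convert("RGB")
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return jsonify({"caption": processor.decode(out[0], skip_special_tokens=True)})

if __name__ == "__main__":
    app.run(port=5000)
```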

File Structure:

- app.py: Flask backend for processing requests.
- frontend.py: Streamlit frontend for the user interface.
- requirements.txt: List of Python dependencies.

Usage:

Sentiment Analysis:

Enter text in the chat interface and receive sentiment analysis results.
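A minimal sketch of how the sentiment side could be implemented with the Transformers pipeline API (the pipeline's default checkpoint is an assumption; the app may pin a specific model):

```python
from transformers import pipeline

# Using the pipeline's default sentiment checkpoint is an assumption;
# the app may load a specific model instead.
sentiment = pipeline("sentiment-analysis")

result = sentiment("I love how quickly this app captions my photos!")[0]
print(result["label"], round(result["score"], 3))  # e.g. POSITIVE 0.999
```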

Image Captioning:

Upload an image to receive an automated caption generated by the BLIP model. Images can be added through several input options, including webcam capture and system files; see the sketch after the screenshots below.

[Screenshot: conditional image captioning result (result generation window)]

[Screenshot: available image input options (webcam, system images, and more)]
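As a sketch of how a Streamlit frontend such as frontend.py could offer both upload and webcam input (the widget labels and layout are assumptions, not the repository's actual code):

```python
import streamlit as st
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

@st.cache_resource  # cache the model across Streamlit reruns
def load_blip():
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
    return processor, model

st.title("Image Captioning with BLIP")

# Two input paths: upload from disk or capture from the webcam
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
snapshot = st.camera_input("...or take a photo")

source = uploaded if uploaded is not None else snapshot
if source is not None:
    image = Image.open(source).convert("RGB")
    st.image(image, caption="Input image")
    processor, model = load_blip()
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    st.write("Caption:", processor.decode(out[0], skip_special_tokens=True))
```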

Deployment:

Deployed using Gradio for easy web deployment and sharing (Streamlit/Flask or other cloud platforms can also be used). The deployed web application link can be found in the repository description for testing and implementation purposes.
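For illustration, a Gradio deployment of the captioning model can be as small as the sketch below (the interface configuration is illustrative, not the repository's exact setup):

```python
import gradio as gr
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def caption(image):
    # Gradio passes a PIL image when type="pil"
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)

demo = gr.Interface(fn=caption, inputs=gr.Image(type="pil"), outputs="text")
demo.launch(share=True)  # share=True prints a temporary public link
```

With share=True, Gradio prints a temporary public URL, which matches the sharing workflow described above.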
