This project is a Sentiment Analysis web application built with a fine-tuned BERT model using LoRA (Low-Rank Adaptation). It allows users to analyze the sentiment (positive or negative) of movie reviews. The application includes a backend API (FastAPI) for predictions and a simple HTML frontend with Bootstrap for user interaction.
- Machine Learning Model:
  - A fine-tuned BERT model (bert-base-uncased) optimized with LoRA for efficient training.
  - Achieves 87.4% accuracy on the IMDB dataset.
  - Supports single and batch predictions.
- Backend:
  - Implemented with FastAPI for serving predictions.
  - Supports two endpoints:
    - /predict: Analyze a single review.
    - /predict_batch: Analyze multiple reviews in one request.
  - CORS enabled for frontend communication.
- Frontend:
  - Simple and clean user interface created using HTML and Bootstrap.
  - Features:
    - Dynamic input fields for multiple reviews.
    - A loading spinner indicating processing status.
    - Real-time display of prediction results.
- Deployment:
  - Dockerized application for easy deployment.
  - Managed using docker-compose to run both backend and frontend as separate services.
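The two backend endpoints described above accept different payload shapes: one review under "text" for /predict and a list of reviews under "texts" for /predict_batch. The sketch below illustrates that contract with plain Python functions. It is not the project's FastAPI code: `keyword_stub` is a toy stand-in for the fine-tuned model, and the response keys `sentiment` and `sentiments` are assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical stand-in type for the fine-tuned model: review text -> label.
Classifier = Callable[[str], str]


def keyword_stub(text: str) -> str:
    """Toy classifier that only exists to make this sketch runnable."""
    negative_words = {"bad", "boring", "awful", "terrible"}
    words = {w.strip(".,!?").lower() for w in text.split()}
    return "negative" if words & negative_words else "positive"


def handle_predict(payload: Dict, classify: Classifier) -> Dict:
    """Single-review handler: expects {"text": "<review>"}."""
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError('expected a non-empty string under "text"')
    # "sentiment" is an assumed response key, not taken from the project.
    return {"sentiment": classify(text)}


def handle_predict_batch(payload: Dict, classify: Classifier) -> Dict:
    """Batch handler: expects {"texts": ["<review>", ...]}."""
    texts = payload.get("texts")
    if not isinstance(texts, list) or not all(isinstance(t, str) for t in texts):
        raise ValueError('expected a list of strings under "texts"')
    # "sentiments" is an assumed response key, not taken from the project.
    return {"sentiments": [classify(t) for t in texts]}


print(handle_predict({"text": "This movie is fantastic!"}, keyword_stub))
print(handle_predict_batch(
    {"texts": ["This movie was amazing!", "Bad actors and movie!"]},
    keyword_stub,
))
```

Passing the classifier in as an argument keeps the payload validation separate from the model, so the same handlers can be exercised with a stub in tests and with the real model in the service.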
project/
├── app.py # FastAPI backend
├── saved_model/ # Directory containing the fine-tuned BERT model and tokenizer
├── frontend/ # Frontend HTML files
│ └── index.html
├── requirements.txt # Python dependencies
├── Dockerfile # Dockerfile for building the backend API
├── docker-compose.yml # Compose file for managing frontend and backend services
├── logs/ # Directory for storing training and API logs
├── results/ # Directory for saving initial training results and checkpoints
├── results_improved/ # Directory for saving improved training results (e.g., with hyperparameter tuning)
└── wandb/ # Directory automatically created by Weights & Biases for experiment tracking
- BERT (bert-base-uncased) fine-tuned on the IMDB dataset.
- LoRA (Low-Rank Adaptation) for efficient parameter fine-tuning.
- FastAPI for high-performance API development.
- Transformers library (HuggingFace) for model inference.
- PyTorch as the deep learning framework.
- HTML, CSS, Bootstrap for a responsive and clean user interface.
- JavaScript for dynamic functionality (API calls, loading spinner).
- Docker for containerizing backend and frontend services.
- Docker Compose for orchestrating multiple services.
- Nginx for serving static frontend files.
- Weights & Biases (W&B) for experiment tracking, including metrics and hyperparameters.
- Logs and training results are stored in dedicated directories:
  - logs/
  - results/
  - results_improved/
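To make the "efficient parameter fine-tuning" claim for LoRA concrete, the arithmetic below compares the size of one full weight matrix with the low-rank update LoRA trains in its place. The hidden size of 768 is that of bert-base-uncased; the rank of 8 is an assumption chosen for illustration, since the rank used in this project is not stated here.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA freezes the original weight W and learns a low-rank update B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


HIDDEN = 768  # hidden size of bert-base-uncased
RANK = 8      # assumed LoRA rank; not taken from this project

full = HIDDEN * HIDDEN                         # one 768 x 768 projection
lora = lora_param_count(HIDDEN, HIDDEN, RANK)  # low-rank update for it

print(f"full matrix: {full} weights")
print(f"LoRA update: {lora} weights ({lora / full:.2%} of the full matrix)")
```

Under these assumptions each adapted matrix trains 12,288 weights instead of 589,824, which is where the reduction in training cost comes from.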
Ensure the following software is installed on your machine: Git, Docker, and Docker Compose (all three are used in the steps below).
- Clone the Repository
  First, clone this repository to your local machine:

      git clone https://github.com/Andreevromano/HSE_LSML2_FP.git
      cd HSE_LSML2_FP
- Build and Run the Project
  Use docker-compose to build the Docker images and start the backend (API) and frontend services:

      docker-compose up --build
- Access the Application
  - Frontend: Open a browser and navigate to http://localhost:8080
  - Backend API: The API is available at http://localhost:8000
- Test the API
  You can test the API endpoints using tools like curl, Postman, or Python’s requests library.
  - Single Prediction Endpoint (/predict): Send a POST request with a single review:

        curl -X POST http://localhost:8000/predict \
          -H "Content-Type: application/json" \
          -d '{"text":"This movie is fantastic!"}'

    Response:
  - Batch Prediction Endpoint (/predict_batch): Send a POST request with multiple reviews:

        curl -X POST http://localhost:8000/predict_batch \
          -H "Content-Type: application/json" \
          -d '{"texts": ["This movie was amazing!", "Bad actors and movie!"]}'

    Response:
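The same two calls can be made from Python. The sketch below uses the standard library's urllib in place of requests so that it has no third-party dependencies; the URL, paths, and payload shapes come from the curl commands above, while the helper names `build_request` and `post_json` are hypothetical. Building the requests works offline; `post_json` only succeeds while the backend container is running.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # backend address from the steps above


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON reply (backend must be up)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payloads as the curl examples.
single = build_request("/predict", {"text": "This movie is fantastic!"})
batch = build_request(
    "/predict_batch",
    {"texts": ["This movie was amazing!", "Bad actors and movie!"]},
)
print(single.get_method(), single.full_url)
print(batch.get_method(), batch.full_url)
```

With the services running, `post_json("/predict", {"text": "This movie is fantastic!"})` returns the decoded JSON body of the API's reply.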
This Sentiment Analysis Web Application combines a fine-tuned BERT model with a deployable API and a browser frontend. Fine-tuned with LoRA (Low-Rank Adaptation), the model achieves 87.4% accuracy on the IMDB dataset.
The application includes:
- A FastAPI backend for real-time predictions.
- A user-friendly frontend built with HTML and Bootstrap for dynamic interaction.
- Deployment using Docker and Docker Compose, ensuring portability and ease of scaling.
Experiment tracking through Weights & Biases (W&B) records metrics and hyperparameters, which supports transparency and reproducibility in model training and improvement.
- Efficient Model Fine-Tuning: LoRA reduces the computational costs while maintaining high accuracy.
- End-to-End Deployment: From model training to a functional API and frontend interface.
- User Interaction: Dynamic UI with support for batch predictions and visual feedback.
This project serves as a foundation for real-world NLP applications and can be extended further to include:
- Neutral sentiment classification.
- More advanced analytics and confidence scores.
- Integration with cloud providers for production-grade deployment.
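As an illustration of the confidence-score extension listed above, a classifier's raw output scores (logits) can be turned into probabilities with a softmax, and the largest probability reported alongside the label. This is a minimal sketch, not part of the current application: the label order in `LABELS` and the example logits are assumptions.

```python
import math
from typing import List, Tuple

# Assumed index order; the real mapping should come from the model's config.
LABELS = ("negative", "positive")


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits: List[float]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Made-up logits for one review, used only to exercise the functions.
label, confidence = label_with_confidence([-1.2, 2.3])
print(f"{label} ({confidence:.1%})")
```

A neutral class could later be added by extending `LABELS`, provided the model is retrained with a matching third output.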
With its modular design, the application can be scaled, extended, and adapted to suit various sentiment analysis use cases in domains such as customer feedback, social media monitoring, and business intelligence.