Hosted Client: https://ai-pdf-analyzer-questioner.vercel.app/
Hosted API: https://aipdf-analyzer.onrender.com
Disclaimer: the API is hosted on a free instance that spins down when idle, so the first request after a period of inactivity takes 1-2 minutes while the server wakes up. Please be patient while testing.
This project consists of a FastAPI server for uploading PDF documents, storing their content in a vector database, and querying them using natural language. Additionally, it includes a frontend client built with Vite, React, TypeScript, and Tailwind CSS that provides a user-friendly interface for interacting with the API.
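For orientation, the two core endpoints documented later in this README could be declared roughly as in the sketch below. This is a hedged illustration, not the project's actual `main.py`: the route paths and response shapes come from the API section further down, while the function bodies are placeholders.

```python
from fastapi import FastAPI, File, Form, UploadFile

app = FastAPI()

@app.post("/upload_pdf/")
async def upload_pdf(file: UploadFile = File(...)):
    # Placeholder: save the PDF, record its metadata in PostgreSQL, build the FAISS index.
    return {"message": "Vector database created successfully and metadata saved."}

@app.post("/ask_question/")
async def ask_question(question: str = Form(...)):
    # Placeholder: retrieve relevant chunks from FAISS and generate an answer.
    return {"answer": "..."}
```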
- Upload PDF documents via API and store their metadata in a PostgreSQL database.
- Process PDFs into vector embeddings using HuggingFace embeddings and store them in FAISS for efficient querying.
- Query uploaded PDFs with natural language and receive context-based answers.
- Intuitive interface for uploading PDF files and asking questions.
- Real-time feedback and responses displayed to the user.
- Built with modern technologies including Vite, TypeScript, and Tailwind CSS for a sleek, responsive UI.
- Chat scrolling for browsing through the conversation history.
- A status indicator that shows whether the server is ready to answer questions.
- Chat export to PDF so users can save their conversation for future reference.
- FastAPI: For building the backend API.
- PostgreSQL: To store metadata about uploaded PDFs.
- FAISS: For efficient vector storage and retrieval.
- LangChain: Framework for language model integrations.
- HuggingFace Embeddings: For generating vector representations of PDF content.
- React: For building the UI.
- TypeScript: For a type-safe codebase.
- Vite: For fast development and bundling.
- Tailwind CSS: For styling and responsive design.
- Python 3.8+
- Node.js and npm
- PostgreSQL database (local or remote)
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/your-repo-name.git
  cd your-repo-name
  ```
- Create a virtual environment and activate it:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```
- Install backend dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Configure environment variables by creating a `.env` file in the root directory with the following content (a sketch of how the server might read these values follows the backend setup steps):

  ```env
  DB_NAME=your_database_name
  DB_USER=your_database_user
  DB_PASSWORD=your_database_password
  DB_HOST=your_database_host
  DB_PORT=your_database_port
  groq_api=your_groq_api_key
  ```
- Start the server:

  ```bash
  uvicorn main:app --host 0.0.0.0 --port 5000 --reload
  ```

  The server will run at http://0.0.0.0:5000.
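For reference, here is a minimal sketch of how the server might read the `.env` values from the configuration step above, assuming `python-dotenv` is installed; the `db_config` dictionary is purely illustrative and not taken from the project code.

```python
import os

from dotenv import load_dotenv

load_dotenv()  # loads the .env file from the project root

# Connection settings and API key, matching the .env keys shown above.
db_config = {
    "dbname": os.getenv("DB_NAME"),
    "user": os.getenv("DB_USER"),
    "password": os.getenv("DB_PASSWORD"),
    "host": os.getenv("DB_HOST"),
    "port": os.getenv("DB_PORT"),
}
groq_api_key = os.getenv("groq_api")
```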
- Navigate to the `client` directory:

  ```bash
  cd client
  ```
- Install dependencies:

  ```bash
  npm install
  ```
- Start the development server:

  ```bash
  npm run dev
  ```

  The client will be accessible at http://localhost:5173.
- Endpoint: `/upload_pdf/`
- Method: POST
- Description: Upload a PDF document, store its metadata in the database, and save its content in a vector store.
- Parameters:
  - `file`: The PDF file to upload (multipart/form-data).
Example Request:

```bash
curl -X 'POST' \
  'http://0.0.0.0:5000/upload_pdf/' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@path_to_your_pdf.pdf'
```

Response:

```json
{
  "message": "Vector database created successfully and metadata saved."
}
```
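Server-side, this endpoint is what builds the vector store. The exact implementation lives in `server/main.py`; a hedged sketch of how it could look with the stack listed above (LangChain, HuggingFace embeddings, FAISS) is shown below. The file path, chunking parameters, embedding model, and index directory are assumptions for illustration only.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Load the uploaded PDF and split it into overlapping chunks.
pages = PyPDFLoader("path_to_your_pdf.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(pages)

# Embed the chunks and index them in FAISS for similarity search.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = FAISS.from_documents(chunks, embeddings)
vector_store.save_local("faiss_index")
```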
- Endpoint: `/ask_question/`
- Method: POST
- Description: Ask a question based on the uploaded PDFs and receive an answer.
- Parameters:
  - `question`: The question to ask (form data).
Example Request:

```bash
curl -X 'POST' \
  'http://0.0.0.0:5000/ask_question/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'question=What is the purpose of this document?'
```

Response:

```json
{
  "answer": "This document is a guide to using the API for PDF uploading and querying."
}
```
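The same two calls can also be made programmatically. A small `requests`-based example (not part of the repository, shown here only as a convenience) follows:

```python
import requests

BASE_URL = "http://0.0.0.0:5000"  # adjust if the server runs elsewhere

# Upload a PDF (multipart/form-data, field name "file").
with open("path_to_your_pdf.pdf", "rb") as f:
    upload = requests.post(f"{BASE_URL}/upload_pdf/", files={"file": f})
print(upload.json())

# Ask a question about the uploaded document (form field "question").
answer = requests.post(
    f"{BASE_URL}/ask_question/",
    data={"question": "What is the purpose of this document?"},
)
print(answer.json()["answer"])
```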
The application uses a PostgreSQL database to store metadata about the uploaded PDF documents. The table schema is as follows:
| Column | Type | Description |
|---|---|---|
| id | SERIAL | Primary key, auto-incremented ID |
| filename | VARCHAR(255) | The name of the uploaded PDF file |
| upload_date | TIMESTAMP | The date and time when the PDF was uploaded |
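A table matching this schema could be created as in the sketch below. Note that the table name `pdf_documents` is an assumption (the README does not state it), and the connection simply reuses the `.env` variables from the setup section via `psycopg2`.

```python
import os

import psycopg2
from dotenv import load_dotenv

load_dotenv()

conn = psycopg2.connect(
    dbname=os.getenv("DB_NAME"),
    user=os.getenv("DB_USER"),
    password=os.getenv("DB_PASSWORD"),
    host=os.getenv("DB_HOST"),
    port=os.getenv("DB_PORT"),
)
with conn, conn.cursor() as cur:
    # Table name assumed for illustration; align it with the actual server code.
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS pdf_documents (
            id SERIAL PRIMARY KEY,
            filename VARCHAR(255) NOT NULL,
            upload_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
conn.close()
```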
The client provides a bot-like interface where users can:
- Upload a PDF document directly from their browser.
- Input questions related to the uploaded document.
- Receive context-aware answers from the server.
- Export or save the chat history as a PDF.
```
AiPdf_Analyzer/
├── client/                  # Frontend application
│   ├── public/              # Public assets
│   ├── src/                 # Source code
│   │   ├── components/      # React components
│   │   ├── types/           # Application Interfaces
│   │   ├── App.tsx          # Main application component
│   │   ├── main.tsx         # Entry point
│   │   └── ...              # Other source files
│   ├── index.html           # HTML template
│   ├── package.json         # NPM package configuration
│   ├── postcss.config.js    # PostCSS configuration
│   ├── tailwind.config.js   # Tailwind CSS configuration
│   └── tsconfig.json        # TypeScript configuration
│
├── server/                  # Backend application
│   ├── main.py              # FastAPI application
│   ├── requirements.txt     # Python dependencies
│   └── ...                  # Other server files
│
├── .env                     # Environment variables
├── .gitignore               # Git ignore file
├── LICENSE                  # License file
└── README.md                # Project documentation
```
This project is licensed under the MIT License. Feel free to use and contribute.
Demo video: Final_Demon_Video_Aryan_Swaroop.mp4
- Fork the repository.
- Create a new branch.
- Make your changes.
- Submit a pull request.
For any issues or questions, feel free to open an issue or reach out to the maintainer.