This project extracts text from images using OCR (Tesseract), structures the extracted data as JSON, and stores it in a MySQL database.
- Python
- OpenCV
- Tesseract OCR
- MySQL
- JSON
📂 Your-GitHub-Repo
│── 📄 README.md # Project documentation with setup instructions
│── 📄 extract_and_structure.py # Python script for OCR and JSON structuring
│── 📄 store_to_mysql.py # Python script for storing extracted data into MySQL
│── 📂 data
│ ├── sample_image.png # Sample input image
│ ├── sample_output.json # Sample extracted JSON output
│── 📂 sql
│ ├── schema.sql # SQL schema for database tables
│ ├── insert_sample.sql # Sample SQL insert statements
Ensure Python is installed, then install the required libraries:
pip install pytesseract opencv-python mysql-connector-python
Install Tesseract OCR and add it to the system PATH:
- Windows: download and run the Tesseract installer for Windows, then add its install directory to PATH.
- Linux (Debian/Ubuntu):
sudo apt install tesseract-ocr
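Before running the scripts, it can help to confirm that Tesseract is actually reachable from PATH. The following is a minimal sketch using only the Python standard library; the helper name `find_tesseract` is hypothetical and not part of this repository.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH")
    else:
        # `tesseract --version` prints the installed version and exits.
        result = subprocess.run([path, "--version"], capture_output=True, text=True)
        print(result.stdout or result.stderr)
```

If the executable lives outside PATH, pytesseract also allows setting `pytesseract.pytesseract.tesseract_cmd` to its full path.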
To extract text and structure it as JSON:
python extract_and_structure.py
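As a rough illustration of what this step involves, here is a minimal sketch of OCR followed by structuring. It is not the contents of extract_and_structure.py: the function names, the Otsu preprocessing step, and the assumption that the image contains one `Label: value` pair per line are all illustrative. The field labels come from the sample output.

```python
import json
import re

# Field labels taken from the sample output; the "Label: value" layout is an assumption.
FIELDS = ("Patient Name", "DOB", "Pain Level", "Comments")


def ocr_image(image_path):
    """Run Tesseract on an image file and return the raw recognized text."""
    # Imported here so structure_text() can be used without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding is one common preprocessing choice for scanned forms.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)


def structure_text(text):
    """Turn 'Label: value' lines into a dict keyed by the known field labels."""
    record = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if not sep or label not in FIELDS:
            continue  # Skip lines that are not a recognized field.
        # Store purely numeric values (such as the pain level) as integers.
        record[label] = int(value) if re.fullmatch(r"\d+", value) else value
    return record


if __name__ == "__main__":
    sample = "Patient Name: John Doe\nDOB: 01/05/1980\nPain Level: 6\nComments: Not good"
    print(json.dumps(structure_text(sample), indent=2))
```

Keeping the parsing separate from the OCR call makes it possible to test the structuring logic on plain strings, without an image or a Tesseract install.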
Create the database and tables using:
mysql -u root -p < sql/schema.sql
Run the MySQL storage script:
python store_to_mysql.py
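The storage step could look roughly like the sketch below, which inserts one structured record with a parameterized query. The table name `patients`, its column names, and the `MM/DD/YYYY` reading of the DOB field are assumptions made for illustration; the actual definitions are in sql/schema.sql, and connection details depend on your setup.

```python
from datetime import datetime

# Hypothetical table and column names; the real ones are defined in sql/schema.sql.
INSERT_SQL = (
    "INSERT INTO patients (patient_name, dob, pain_level, comments) "
    "VALUES (%s, %s, %s, %s)"
)


def to_row(record):
    """Convert a structured record into a tuple matching INSERT_SQL's placeholders."""
    # Assumes DOB is written as MM/DD/YYYY; adjust the format if the forms differ.
    dob = datetime.strptime(record["DOB"], "%m/%d/%Y").date()
    return (
        record["Patient Name"],
        dob,
        int(record["Pain Level"]),
        record.get("Comments", ""),
    )


def store_record(record, **connect_kwargs):
    """Insert one record; connect_kwargs are passed to mysql.connector.connect()."""
    # Imported here so to_row() can be used without the MySQL driver installed.
    import mysql.connector

    conn = mysql.connector.connect(**connect_kwargs)
    try:
        cursor = conn.cursor()
        # Placeholders let the driver escape values, avoiding SQL injection.
        cursor.execute(INSERT_SQL, to_row(record))
        conn.commit()
        cursor.close()
    finally:
        conn.close()
```

A call might look like `store_record(record, host="localhost", user="root", password="...", database="...")`, with the credentials and database name replaced by your own.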
Example of the structured JSON output:
{
  "Patient Name": "John Doe",
  "DOB": "01/05/1980",
  "Pain Level": 6,
  "Comments": "Not good"
}
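Because OCR output can be noisy, it may be worth checking a record like the sample above before storing it. The following is a minimal sketch; the expected keys and types are inferred from the sample record only, and `validate_record` is a hypothetical helper rather than part of the repository.

```python
import json

# Expected keys and value types, inferred from the sample record only.
EXPECTED_TYPES = {
    "Patient Name": str,
    "DOB": str,
    "Pain Level": int,
    "Comments": str,
}


def validate_record(record):
    """Return a list of problems found in a record; an empty list means none were found."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in record:
            problems.append(f"missing field: {key}")
        elif type(record[key]) is not expected:
            # Exact type check, so a boolean is not accepted where an integer is expected.
            problems.append(f"{key}: expected {expected.__name__}")
    return problems


if __name__ == "__main__":
    text = '{"Patient Name": "John Doe", "DOB": "01/05/1980", "Pain Level": 6, "Comments": "Not good"}'
    print(validate_record(json.loads(text)))
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that went wrong with a single scanned form.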
Feel free to fork and improve the project. Submit a pull request for any enhancements.
This project is licensed under the MIT License.