OCR to MySQL Data Pipeline

Overview

This project extracts text from images using Tesseract OCR, structures the extracted data as JSON, and stores it in a MySQL database.

Technologies Used

Python
OpenCV
Tesseract OCR
MySQL
JSON

Setup Instructions

1. Install Dependencies

Ensure Python is installed, then install the required libraries:

pip install pytesseract opencv-python mysql-connector-python

Install Tesseract OCR and add it to the system PATH:

Windows: download and run the Tesseract installer.
Linux:
```
sudo apt install tesseract-ocr
```
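
Before running the scripts, it can help to confirm that the Tesseract binary is actually reachable on the PATH. The following is a minimal sketch using only the standard library; the helper name `find_tesseract` is illustrative and not part of this project.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; install it or update PATH")
    else:
        print(f"tesseract found at {path}")
```

If the binary is installed but not on the PATH, pytesseract also accepts an explicit location through `pytesseract.pytesseract.tesseract_cmd`.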

2. Run OCR Script

To extract text and structure it as JSON:

python extract_and_structure_image.py
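
The extraction step can be pictured roughly as follows. This is a sketch, not the contents of `extract_and_structure_image.py`: it assumes the OCR output consists of `Key: Value` lines, and the function names `parse_fields` and `ocr_image` are hypothetical.

```python
import json


def parse_fields(text):
    """Turn OCR output made of 'Key: Value' lines into a dictionary."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that do not look like a labelled field
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if not key:
            continue
        # Store purely numeric values (e.g. a pain level) as integers
        record[key] = int(value) if value.isdecimal() else value
    return record


def ocr_image(path):
    """Run Tesseract on an image file and return the recognised text."""
    # Imported lazily so parse_fields can be used without the OCR stack
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # grayscale often helps OCR
    return pytesseract.image_to_string(gray)


if __name__ == "__main__":
    # Assumed example of OCR output; a real run would call ocr_image(path)
    sample = "Patient Name: John Doe\nDOB: 01/05/1980\nPain Level: 6\nComments: Not good"
    print(json.dumps(parse_fields(sample), indent=2))
```

Keeping the parsing separate from the OCR call means the structuring logic can be checked on plain strings, without Tesseract or OpenCV installed.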

3. Set Up MySQL Database

Create the database and tables using:

mysql -u root -p < sql/schema.sql

4. Insert Extracted Data

Run the MySQL storage script:

python store_to_mysql_image.py
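
As an illustration of what the storage step involves, the sketch below builds a parameterized INSERT and runs it with `mysql.connector`. The table name `patient_records` and the column names are assumptions made for this example; the real names are whatever `sql/schema.sql` defines. Values are passed through unchanged, so a `dob` column of type DATE would need the date string converted first.

```python
# Hypothetical table and column names; adjust to match sql/schema.sql
TABLE = "patient_records"
COLUMNS = {
    "Patient Name": "patient_name",
    "DOB": "dob",
    "Pain Level": "pain_level",
    "Comments": "comments",
}


def build_insert(record):
    """Return a parameterized INSERT statement and its values for one record."""
    names = ", ".join(COLUMNS.values())
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    sql = f"INSERT INTO {TABLE} ({names}) VALUES ({placeholders})"
    # Missing keys become None, which the driver sends as NULL
    values = tuple(record.get(key) for key in COLUMNS)
    return sql, values


def store_record(record, **connect_args):
    """Insert one record; connect_args are passed to mysql.connector.connect."""
    import mysql.connector  # imported lazily so build_insert works without the driver

    conn = mysql.connector.connect(**connect_args)
    try:
        cursor = conn.cursor()
        sql, values = build_insert(record)
        cursor.execute(sql, values)
        conn.commit()
    finally:
        conn.close()
```

A call might look like `store_record(record, host="localhost", user="root", password=pw, database="ocr_data")`, where the database name is again only a placeholder. Using `%s` placeholders rather than string formatting lets the driver escape OCR text that happens to contain quotes.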

Sample JSON Output

{
  "Patient Name": "John Doe",
  "DOB": "01/05/1980",
  "Pain Level": 6,
  "Comments": "Not good"
}
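
Before a record like this is stored, it may be worth normalizing its types. The sketch below is one possible approach, not part of the project: it assumes the date is written month/day/year (the sample alone does not settle this), and the helper name `normalize` is hypothetical.

```python
import json
from datetime import datetime

SAMPLE = """
{
  "Patient Name": "John Doe",
  "DOB": "01/05/1980",
  "Pain Level": 6,
  "Comments": "Not good"
}
"""

# Assumption: DOB is written month/day/year; use "%d/%m/%Y" if it is day/month/year
DOB_FORMAT = "%m/%d/%Y"


def normalize(record):
    """Return a copy of the record with DOB as an ISO date string and Pain Level as int."""
    clean = dict(record)
    clean["DOB"] = datetime.strptime(record["DOB"], DOB_FORMAT).date().isoformat()
    clean["Pain Level"] = int(record["Pain Level"])
    return clean


if __name__ == "__main__":
    print(normalize(json.loads(SAMPLE)))
```

`datetime.strptime` raises `ValueError` on a malformed date, which is a convenient place to catch OCR misreads before they reach the database.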

Contributing

Feel free to fork and improve the project. Submit a pull request for any enhancements.

License

This project is licensed under the MIT License.

Repository: Atul-vaibhav/OCR-Extraction-Using-Python