Controls IO devices through the pywin32 library, driven by Tesseract OCR recognition results.
- Windows 10 22H2 / Windows 11 23H2
- Python 3.11
- tesseract-ocr-w64 5.3.4.20240503
- PostgreSQL 15.7
Use the Windows build of Tesseract provided by Tesseract at UB Mannheim and download the installer from that project. Its notes are quoted below:
Tesseract installer for Windows
Normally we run Tesseract on Debian GNU Linux, but there was also the need for a Windows version. That's why we have built a Tesseract installer for Windows.
WARNING: Tesseract should be either installed in the directory which is suggested during the installation or in a new directory. The uninstaller removes the whole installation directory. If you installed Tesseract in an existing directory, that directory will be removed with all its subdirectories and files.
1. Add the dependencies
poetry add pillow
poetry add pytesseract
2. Set the command path
# Set the tesseract command path used by pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\custom\Tesseract-OCR\tesseract.exe'
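As a minimal sketch of how the path setting above might be used, the helpers below pick the first existing `tesseract.exe` from a list of candidate paths and then run OCR on one image. The names `find_tesseract` and `ocr_image` are hypothetical, and the default language `chi_sim` assumes the Chinese language data has been installed.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_tesseract(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate path that exists as a file, else None."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None


def ocr_image(image_path: str, tesseract_cmd: str, lang: str = "chi_sim") -> str:
    """Run OCR on one image file and return the recognized text."""
    # Imported lazily so the path helper above works without the OCR stack.
    import pytesseract
    from PIL import Image

    # Point pytesseract at the chosen executable before calling it.
    pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

A call could look like `ocr_image("screenshot.png", r"C:\custom\Tesseract-OCR\tesseract.exe")`, where `screenshot.png` is a placeholder file name.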
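During Tesseract installation, downloading the Chinese language packs offered under "Additional language data" fails, so leave them unchecked. Instead, clone the entire tessdata repository over SSH (git@github.com:tesseract-ocr/tessdata.git) and copy the following files into the tessdata directory under the installation directory: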
chi_sim.traineddata(43327KB)
chi_sim_vert.traineddata(2414KB)
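To confirm that the copy worked, a small check such as the following can list which of the two files are still missing. This is only a sketch: `missing_language_files` is a hypothetical helper, and the directory passed to it is whatever tessdata path the local installation uses.

```python
from pathlib import Path
from typing import Iterable, List

# The two language files named above.
REQUIRED = ("chi_sim.traineddata", "chi_sim_vert.traineddata")


def missing_language_files(tessdata_dir: str,
                           required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the required file names that are not present in tessdata_dir."""
    base = Path(tessdata_dir)
    return [name for name in required if not (base / name).is_file()]
```

With the install path used earlier, `missing_language_files(r"C:\custom\Tesseract-OCR\tessdata")` should return an empty list once both files are in place.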
poetry add pywin32
After that, import win32gui in the project to implement IO device control.
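The text does not say which IO devices are controlled, so the following is only an illustrative sketch that assumes mouse control: it looks up a window by its title and moves the cursor to the window's center. The helper names `rect_center` and `move_cursor_to_window` are hypothetical; `FindWindow`, `GetWindowRect`, and `SetCursorPos` are pywin32 wrappers around the Win32 functions of the same names.

```python
from typing import Tuple

# (left, top, right, bottom) in screen coordinates, as GetWindowRect returns.
Rect = Tuple[int, int, int, int]


def rect_center(rect: Rect) -> Tuple[int, int]:
    """Return the integer center point of a window rectangle."""
    left, top, right, bottom = rect
    return (left + right) // 2, (top + bottom) // 2


def move_cursor_to_window(title: str) -> Tuple[int, int]:
    """Move the mouse cursor to the center of the window with this exact title."""
    # Imported lazily because pywin32 is only available on Windows.
    import win32api
    import win32gui

    hwnd = win32gui.FindWindow(None, title)
    if not hwnd:
        # Treat a zero handle as "no such window".
        raise LookupError(f"no top-level window titled {title!r}")
    point = rect_center(win32gui.GetWindowRect(hwnd))
    win32api.SetCursorPos(point)
    return point
```

Keeping the geometry in a separate pure function makes it possible to check the coordinate math without a Windows desktop session.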
SQLAlchemy: sqlalchemy
PostgreSQL database adapter: psycopg2
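One way these two packages fit together is through a SQLAlchemy connection URL that names psycopg2 as the driver. The sketch below builds such a URL and creates an engine from it. The helper names and all connection values are placeholders, not settings taken from this project.

```python
from urllib.parse import quote


def build_postgres_url(user: str, password: str, host: str,
                       port: int, database: str) -> str:
    """Build a SQLAlchemy URL for PostgreSQL using the psycopg2 driver."""
    # Percent-encode the credentials so characters like '@' or ':' stay unambiguous.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql+psycopg2://{safe_user}:{safe_password}@{host}:{port}/{database}"


def make_engine(url: str):
    """Create a SQLAlchemy engine for the given URL."""
    # Imported lazily so the URL helper works without SQLAlchemy installed.
    from sqlalchemy import create_engine

    return create_engine(url)
```

For example, `make_engine(build_postgres_url("app", "secret", "localhost", 5432, "ocr"))` would target a local database named `ocr`, with every value here being hypothetical.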
0.1 (2024-06-23)
- Initial release