Using PDF-Extract-Kit on macOS

Overview

PDF-Extract-Kit was originally developed with Linux servers as the default environment, so running it directly on macOS can be challenging. This guide collects the problems we have run into on macOS, together with workarounds. Not every solution here will apply to your setup. If you have questions, please open an issue.

Preparation

To run the project smoothly on macOS, follow the steps below for your hardware:

Running on an Intel Mac

1. Create a Virtual Environment

Use either venv or conda. Python 3.10 is recommended; the precompiled detectron2 wheel in the next step is tagged cp310, so it installs only on CPython 3.10.

2. Install Dependencies

pip install -r requirements+cpu.txt

# detectron2: either build it yourself, following https://github.com/facebookresearch/detectron2/issues/5114,
# or install our precompiled wheel:
pip install https://github.com/opendatalab/PDF-Extract-Kit/raw/main/assets/whl/detectron2-0.6-cp310-cp310-macosx_10_9_universal2.whl

3. Modify the Config to Use the CPU for Inference

In PDF-Extract-Kit/configs/model_configs.yaml, line 2:

device: cpu

In PDF-Extract-Kit/modules/layoutlmv3/layoutlmv3_base_inference.yaml, line 72:

DEVICE: cpu
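
If you switch devices often, the two edits above can be scripted. The following is a minimal sketch, not part of PDF-Extract-Kit: `rewrite_device` and `set_device` are hypothetical helper names, and the code assumes each file spells the key exactly as shown above (`device:` or `DEVICE:` at the start of a line, after optional indentation). It rewrites every matching line, so check the result if a config contains more than one such key.

```python
import re
from pathlib import Path

# Matches lines such as "device: cuda" or "  DEVICE: cuda  # comment".
# Groups: 1 = indentation, 2 = key as written, 3 = colon and spacing,
# 4 = anything after the value (for example a trailing comment).
_DEVICE_LINE = re.compile(
    r"^([ \t]*)(device)([ \t]*:[ \t]*)\S+(.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def rewrite_device(text: str, device: str) -> str:
    """Return `text` with every `device:` / `DEVICE:` value set to `device`."""
    return _DEVICE_LINE.sub(
        lambda m: f"{m[1]}{m[2]}{m[3]}{device}{m[4]}",
        text,
    )


def set_device(config_path: str, device: str) -> None:
    """Apply rewrite_device to a file in place (assumes UTF-8 text)."""
    path = Path(config_path)
    path.write_text(
        rewrite_device(path.read_text(encoding="utf-8"), device),
        encoding="utf-8",
    )
```

For example, `set_device("PDF-Extract-Kit/configs/model_configs.yaml", "cpu")` followed by the same call on `PDF-Extract-Kit/modules/layoutlmv3/layoutlmv3_base_inference.yaml` reproduces this step. Passing `"mps"` instead gives the Apple silicon settings. A regular expression is used rather than a YAML parser so that comments and formatting in the files are left untouched.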

4. Run the Application

python pdf_extract.py --pdf demo/demo1.pdf

Running on an Apple Silicon (M-series) Mac

1. Create a Virtual Environment

Use either venv or conda. Python 3.10 is recommended; the precompiled detectron2 wheel in the next step is tagged cp310, so it installs only on CPython 3.10.

2. Install Dependencies

pip install -r requirements+cpu.txt

# detectron2: either build it yourself, following https://github.com/facebookresearch/detectron2/issues/5114,
# or install our precompiled wheel:
pip install https://github.com/opendatalab/PDF-Extract-Kit/raw/main/assets/whl/detectron2-0.6-cp310-cp310-macosx_11_0_arm64.whl
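
The Intel and M-series instructions differ only in which precompiled wheel they install. As a rough sketch, the choice can be made from the interpreter's architecture and version. `detectron2_wheel_url` and `wheel_for_this_interpreter` are hypothetical names introduced here; the wheel file names and base URL are the ones given in this guide.

```python
import platform
import sys

# Base URL and file names are taken from the install commands in this guide.
BASE_URL = "https://github.com/opendatalab/PDF-Extract-Kit/raw/main/assets/whl/"
WHEELS = {
    "x86_64": "detectron2-0.6-cp310-cp310-macosx_10_9_universal2.whl",  # Intel
    "arm64": "detectron2-0.6-cp310-cp310-macosx_11_0_arm64.whl",  # M-series
}


def detectron2_wheel_url(machine: str, python_version: tuple) -> str:
    """Return the wheel URL this guide lists for the given architecture."""
    if tuple(python_version[:2]) != (3, 10):
        # Both wheels carry the cp310 tag, so pip rejects them elsewhere.
        raise RuntimeError("the precompiled wheels require Python 3.10")
    if machine not in WHEELS:
        raise RuntimeError(f"no precompiled wheel listed for {machine!r}")
    return BASE_URL + WHEELS[machine]


def wheel_for_this_interpreter() -> str:
    """Look up the wheel for the Python process that is currently running."""
    return detectron2_wheel_url(platform.machine(), sys.version_info)
```

Note that `platform.machine()` describes the running Python process, not the hardware: an x86_64 Python running under Rosetta on an M-series Mac reports `x86_64`, and in that case MPS is not available to it either.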

3. Modify the Config to Use MPS for Accelerated Inference

In PDF-Extract-Kit/configs/model_configs.yaml, line 2:

device: mps

In PDF-Extract-Kit/modules/layoutlmv3/layoutlmv3_base_inference.yaml, line 72:

DEVICE: mps

4. Run the Application

python pdf_extract.py --pdf demo/demo1.pdf

5. FAQ

  • On some newer M-series devices, MPS acceleration fails to activate.
    • Uninstall torch and torchvision, then install their nightly builds:
    • pip uninstall torch torchvision
      pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
    • Reference: opendatalab#23
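
Before and after reinstalling, it can help to check what PyTorch itself reports about MPS. The sketch below is a suggested diagnostic, not part of PDF-Extract-Kit; `mps_status` and `pick_device` are hypothetical names. It relies only on `torch.backends.mps.is_built()` and `torch.backends.mps.is_available()`, and reports MPS as unavailable if torch is missing or too old to have that module.

```python
def mps_status() -> dict:
    """Report the installed torch version and whether MPS can be used."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment.
        return {"torch": None, "built": False, "available": False}
    # Older torch releases have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    return {
        "torch": torch.__version__,
        "built": bool(mps is not None and mps.is_built()),
        "available": bool(mps is not None and mps.is_available()),
    }


def pick_device() -> str:
    """Return "mps" when PyTorch can use it, otherwise fall back to "cpu"."""
    return "mps" if mps_status()["available"] else "cpu"


if __name__ == "__main__":
    print(mps_status())
    print("suggested device:", pick_device())
```

If `built` is false, the installed torch build has no MPS support, which is the case a reinstall is meant to fix. If `built` is true but `available` is false, PyTorch could not find a usable MPS device, for example because the macOS version is older than 12.3. The value returned by `pick_device()` is what you would put in the two config files from step 3.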