This project was originally developed for Linux servers, so running it directly on macOS can be difficult. This guide collects the problems we have encountered on macOS and the workarounds for them. Some of the solutions may not apply to your specific setup. If you have any questions, please open an issue.
- Intel CPU: see the instructions for Intel CPU machines
- M-series CPU: see the instructions for M-series chip machines
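If you are unsure which of the two cases applies to your machine, Python can report the CPU architecture. The sketch below is only illustrative: `mac_chip_family` is a hypothetical helper, not part of PDF-Extract-Kit, and it assumes the interpreter runs natively (a Python binary running under Rosetta on an M-series machine reports `x86_64`).

```python
import platform


def mac_chip_family(machine: str) -> str:
    """Map an architecture string to the matching section of this guide."""
    name = machine.lower()
    if name in ("arm64", "aarch64"):
        return "m-series"
    if name in ("x86_64", "amd64", "i386"):
        return "intel"
    raise ValueError(f"unrecognized architecture: {machine!r}")


def current_chip_family() -> str:
    """Classify the machine this interpreter is running on."""
    return mac_chip_family(platform.machine())
```

Calling `current_chip_family()` on the target machine tells you which set of steps to follow.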
To run the project smoothly on macOS, perform the following preparations:
- Install ImageMagick:
- Modify configurations:
- PDF-Extract-Kit/pdf_extract.py:148
dataloader = DataLoader(dataset, batch_size=128, num_workers=0)
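Setting `num_workers=0` makes the `DataLoader` load batches in the main process instead of starting worker processes. A likely reason this helps on macOS (an assumption, since the guide does not state it) is that Python on macOS starts worker processes with the `spawn` method, which is stricter than the `fork` method used by default on Linux. The following is a minimal sketch of choosing the worker count by platform; `dataloader_workers` and `build_loader` are hypothetical names, not functions from `pdf_extract.py`.

```python
import sys


def dataloader_workers(requested: int, platform: str = sys.platform) -> int:
    """Return 0 on macOS ("darwin"); otherwise keep the requested worker count."""
    return 0 if platform == "darwin" else requested


def build_loader(dataset, batch_size: int = 128, requested_workers: int = 4):
    """Build a DataLoader whose worker count is adjusted for the current platform."""
    # Imported lazily so dataloader_workers() can be used without torch installed.
    from torch.utils.data import DataLoader

    return DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=dataloader_workers(requested_workers),
    )
```

With this approach the same script could keep multiple workers on Linux while falling back to in-process loading on macOS.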
On Intel CPU machines, use either venv or conda; Python 3.10 is recommended.
pip install -r requirements+cpu.txt
# For detectron2, compile it yourself as per https://github.com/facebookresearch/detectron2/issues/5114
# Or use our precompiled wheel
pip install https://github.com/opendatalab/PDF-Extract-Kit/raw/main/assets/whl/detectron2-0.6-cp310-cp310-macosx_10_9_universal2.whl
PDF-Extract-Kit/configs/model_configs.yaml:2
device: cpu
PDF-Extract-Kit/modules/layoutlmv3/layoutlmv3_base_inference.yaml:72
DEVICE: cpu
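The two device settings above can be edited by hand. If you prefer to script the change, the sketch below shows one possible approach. `set_device` and `patch_file` are hypothetical helpers, and the code assumes the key is written as `device:` or `DEVICE:` at the start of a line (after optional indentation), as in the two lines shown above. Only the first matching key is changed, so check the result if a file contains several.

```python
import re
from pathlib import Path

# Matches an optionally indented "device:" key (any letter case) and its value.
_DEVICE_LINE = re.compile(r"^([ \t]*device[ \t]*:[ \t]*)\S+", re.IGNORECASE | re.MULTILINE)


def set_device(config_text: str, device: str) -> str:
    """Return config_text with the first device value replaced by `device`."""
    new_text, count = _DEVICE_LINE.subn(lambda m: m.group(1) + device, config_text, count=1)
    if count == 0:
        raise ValueError("no device key found in the configuration text")
    return new_text


def patch_file(path: str, device: str) -> None:
    """Rewrite the configuration file at `path` in place with the new device."""
    config_path = Path(path)
    config_path.write_text(set_device(config_path.read_text(), device))
```

For example, `patch_file("configs/model_configs.yaml", "cpu")` run from the repository root would apply the first of the two changes.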
python pdf_extract.py --pdf demo/demo1.pdf
On M-series chip machines, use either venv or conda; Python 3.10 is recommended.
pip install -r requirements+cpu.txt
# For detectron2, compile it yourself as per https://github.com/facebookresearch/detectron2/issues/5114
# Or use our precompiled wheel
pip install https://github.com/opendatalab/PDF-Extract-Kit/raw/main/assets/whl/detectron2-0.6-cp310-cp310-macosx_11_0_arm64.whl
PDF-Extract-Kit/configs/model_configs.yaml:2
device: mps
PDF-Extract-Kit/modules/layoutlmv3/layoutlmv3_base_inference.yaml:72
DEVICE: mps
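Before setting the device to `mps`, it may be worth confirming that the installed PyTorch can actually use it. The sketch below falls back to `cpu` when MPS is not usable; `pick_device` is a hypothetical helper, and the fallback policy is an assumption rather than something the project prescribes.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch reports MPS as available, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without torch installed there is nothing to accelerate.
        return "cpu"

    # Older torch releases have no torch.backends.mps module at all.
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available():
        return "mps"
    return "cpu"
```

The returned string can be used as the value for both `device` and `DEVICE` in the configuration files above.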
python pdf_extract.py --pdf demo/demo1.pdf
- On some newer M-series devices, MPS acceleration fails to activate.
- If this happens, uninstall torch and torchvision, then reinstall the nightly builds of both.
pip uninstall torch torchvision
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
- Reference source: opendatalab#23
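After reinstalling, you can check why MPS is or is not active. PyTorch exposes `torch.backends.mps.is_built()` (whether the installed build includes MPS support) and `torch.backends.mps.is_available()` (whether it can be used on this machine). The sketch below turns those two flags into a short message; `diagnose_mps` and `check_installed_torch` are hypothetical helpers, and the wording of the messages is illustrative.

```python
def diagnose_mps(built: bool, available: bool) -> str:
    """Explain the MPS state given the is_built() and is_available() flags."""
    if available:
        return "MPS is available"
    if not built:
        return "this torch build was compiled without MPS support"
    return "torch supports MPS, but macOS is older than 12.3 or no MPS device was found"


def check_installed_torch() -> str:
    """Report the installed torch version together with its MPS state."""
    import torch  # imported lazily; requires torch 1.12 or newer for torch.backends.mps

    state = diagnose_mps(
        torch.backends.mps.is_built(),
        torch.backends.mps.is_available(),
    )
    return f"torch {torch.__version__}: {state}"
```

Running `print(check_installed_torch())` in the project environment shows whether the reinstalled build can use MPS.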