Dataset Download

# Download the dataset archive from the official URL, then extract it
wget https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/open_data/cc_ocr/cc_ocr_data.zip
unzip cc_ocr_data.zip

The final directory structure should be as follows:

Benchmarks/CC-OCR/data
├── doc_parsing
│   ├── doc
│   ├── formula
│   ├── molecular
│   └── table
├── kie
│   ├── constrained_category
│   └── open_category
├── multi_lan_ocr
│   ├── Arabic
│   ├── French
│   ├── German
│   ├── Italian
│   ├── Japanese
│   ├── Korean
│   ├── Portuguese
│   ├── Russian
│   ├── Spanish
│   └── Vietnamese
└── multi_scene_ocr
    ├── document_text
    ├── scene_text
    └── ugc_text
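
After extraction, it can help to confirm that the layout matches the tree above before running any evaluation. The following is a minimal sketch, not part of the official tooling: the `find_missing_dirs` helper and the `EXPECTED_LAYOUT` mapping are hypothetical names introduced here, and the data root `Benchmarks/CC-OCR/data` is assumed to be relative to the current working directory.

```python
from pathlib import Path

# Expected layout, copied from the directory tree above.
# EXPECTED_LAYOUT and find_missing_dirs are illustrative names, not part of the benchmark code.
EXPECTED_LAYOUT = {
    "doc_parsing": ["doc", "formula", "molecular", "table"],
    "kie": ["constrained_category", "open_category"],
    "multi_lan_ocr": [
        "Arabic", "French", "German", "Italian", "Japanese",
        "Korean", "Portuguese", "Russian", "Spanish", "Vietnamese",
    ],
    "multi_scene_ocr": ["document_text", "scene_text", "ugc_text"],
}


def find_missing_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root."""
    root = Path(data_root)
    missing = []
    for track, subsets in EXPECTED_LAYOUT.items():
        for subset in subsets:
            # Each subset must exist as a directory under its track folder.
            if not (root / track / subset).is_dir():
                missing.append(str(Path(track) / subset))
    return missing


if __name__ == "__main__":
    # Assumed location of the extracted data, relative to the working directory.
    missing = find_missing_dirs("Benchmarks/CC-OCR/data")
    if missing:
        print("Missing directories:")
        for name in missing:
            print("  " + name)
    else:
        print("All expected directories are present.")
```

If any directory is reported missing, the archive was probably extracted in a different location, so re-run `unzip` from the directory that should contain `data`.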