Merge pull request PaddlePaddle#7279 from an1018/dyg/update_doc

update doc
m986883511 · Aug 22, 2022 · d048fe8
2 parents 8af214d + 27215c6
commit d048fe8

Showing 13 changed files with 284 additions and 103 deletions.
diff --git a/ppstructure/docs/models_list.md b/ppstructure/docs/models_list.md
@@ -10,13 +10,17 @@
 <a name="1"></a>
 ## 1. Layout Analysis Models
 
-|Model name|Description|Download|label_map|
-| --- | --- | --- | --- |
-| ppyolov2_r50vd_dcn_365e_publaynet | Layout analysis model trained on the PubLayNet dataset; it can distinguish 5 types of regions: **text, title, table, figure, and list** | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) |{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}|
-| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset; it can only detect tables | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | {0:"Table"}|
-| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset; it can only detect tables | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | {0:"Table"}|
+|Model name|Description|Inference model size|Download|dict path|
+| --- | --- | --- | --- | --- |
+| picodet_lcnet_x1_0_fgd_layout | English layout analysis model trained on the PubLayNet dataset, based on PicoDet LCNet_x1_0 with FGD distillation; it can distinguish 5 types of regions: **text, title, table, figure, and list** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](../../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) |
+| ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model trained on the PubLayNet dataset, based on PP-YOLOv2 | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | Same as above |
+| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset; it can distinguish 10 types of regions: **text, title, figure, figure caption, table, table caption, header, footer, reference, and equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](../../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) |
+| picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on a table dataset; it detects table regions in Chinese and English documents | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](../../ppocr/utils/dict/layout_dict/layout_table_dict.txt) |
+| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset, based on PP-YOLOv2; it detects table regions in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | Same as above |
+| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset, based on PP-YOLOv2; it detects table regions in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | Same as above |
 
 <a name="2"></a>
+
 ## 2. OCR and Table Recognition Models
 
 <a name="21"></a>
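
The `label_map` column in the table above is replaced by a `dict path` column that points at a text file. A minimal sketch of rebuilding an index-to-label map from such a file, assuming (this diff does not confirm it) that the file holds one label per line in class-index order. `load_label_map` and the sample lines are hypothetical and not part of PaddleOCR:

```python
from typing import Dict, Iterable


def load_label_map(lines: Iterable[str]) -> Dict[int, str]:
    """Map class index -> label, assuming one label per non-empty line."""
    labels = [line.strip() for line in lines if line.strip()]
    return dict(enumerate(labels))


# Hypothetical file content, ordered like the old PubLayNet label_map.
sample = ["text\n", "title\n", "list\n", "table\n", "figure\n"]
label_map = load_label_map(sample)
print(label_map)
```

With a real dict file, the same function can be fed an open file handle, since iterating a text file yields its lines.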

diff --git a/ppstructure/docs/models_list_en.md b/ppstructure/docs/models_list_en.md
@@ -6,15 +6,18 @@
   - [2.2 Table Recognition](#22-table-recognition)
 - [3. KIE](#3-kie)
 
-
 <a name="1"></a>
+
 ## 1. Layout Analysis
 
-|model name| description                                                                                                                                             |download|label_map|
-| --- |---------------------------------------------------------------------------------------------------------------------------------------------------------| --- | --- |
-| ppyolov2_r50vd_dcn_365e_publaynet | The layout analysis model trained on the PubLayNet dataset, the model can recognition 5 types of areas such as **text, title, table, picture and list** | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) |{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}|
-| ppyolov2_r50vd_dcn_365e_tableBank_word | The layout analysis model trained on the TableBank Word dataset, the model can only detect tables                                                       | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | {0:"Table"}|
-| ppyolov2_r50vd_dcn_365e_tableBank_latex | The layout analysis model trained on the TableBank Latex dataset, the model can only detect tables                                                      | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | {0:"Table"}|
+|model name| description                                                                                                                                             | inference model size                                                                                                                         |download|dict path|
+| --- |---------------------------------------------------------------------------------------------------------------------------------------------------------| --- | --- | --- |
+| picodet_lcnet_x1_0_fgd_layout | English layout analysis model trained on the PubLayNet dataset, based on PicoDet LCNet_x1_0 and FGD. The model can recognize 5 types of areas: **Text, Title, Table, Picture and List** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](../../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) |
+| ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model trained on the PubLayNet dataset, based on PP-YOLOv2 | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | same as above |
+| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset. The model can recognize 10 types of areas: **Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference and Equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](../../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) |
+| picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on a table dataset. The model can detect tables in Chinese and English documents | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](../../ppocr/utils/dict/layout_dict/layout_table_dict.txt) |
+| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset, based on PP-YOLOv2. The model can detect tables in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | same as above |
+| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset, based on PP-YOLOv2. The model can detect tables in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | same as above |
 
 <a name="2"></a>
 ## 2. OCR and Table Recognition

diff --git a/ppstructure/docs/quickstart.md b/ppstructure/docs/quickstart.md
@@ -8,27 +8,34 @@
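     - [2.1.3 Layout analysis](#213-版面分析)
     - [2.1.4 Table recognition](#214-表格识别)
     - [2.1.5 Key information extraction](#215-关键信息抽取)
+    - [2.1.6 Layout recovery](#216-版面恢复)
   - [2.2 Use by code](#22-代码使用)
-    - [2.2.1 Image orientation classification layout analysis table recognition](#221-图像方向分类版面分析表格识别)
+
+    - [2.2.1 Image orientation classification + layout analysis + table recognition](#221-图像方向分类版面分析表格识别)
     - [2.2.2 Layout analysis + table recognition](#222-版面分析表格识别)
     - [2.2.3 Layout analysis](#223-版面分析)
     - [2.2.4 Table recognition](#224-表格识别)
+
     - [2.2.5 Key information extraction](#225-关键信息抽取)
+    - [2.2.6 Layout recovery](#226-版面恢复)
+
   - [2.3 Result description](#23-返回结果说明)
     - [2.3.1 Layout analysis + table recognition](#231-版面分析表格识别)
     - [2.3.2 Key information extraction](#232-关键信息抽取)
+
   - [2.4 Parameter description](#24-参数说明)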
 
 
 <a name="1"></a>
 ## 1. Install dependencies
 
 ```bash
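-# Install paddleocr; version 2.5+ is recommended
-pip3 install "paddleocr>=2.5"
+# Install paddleocr; version 2.6 is recommended
+pip3 install "paddleocr>=2.6"
 # Install the KIE dependencies (skip this if you do not need KIE)
 pip install -r kie/requirements.txt
-
+# Install paddleclas, the image orientation classification dependency (skip this if you do not need image orientation classification)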
+pip3 install paddleclas
 ```
 
 <a name="2"></a>
@@ -62,15 +69,24 @@ paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structur
 ```
 
 <a name="215"></a>
-#### 2.1.5 Key information extraction
 
+#### 2.1.5 Key information extraction
 Please refer to the [key information extraction tutorial](../kie/README_ch.md).
 
+<a name="216"></a>
+
+#### 2.1.6 Layout recovery
+
+```bash
+paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true
+```
+
 <a name="22"></a>
+
 ### 2.2 Use by code
 
 <a name="221"></a>
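-#### 2.2.1 Image orientation classification layout analysis table recognition
+#### 2.2.1 Image orientation classification + layout analysis + table recognition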
 
 ```python
 import os
@@ -149,6 +165,7 @@ for line in result:
 ```
 
 <a name="224"></a>
+
 #### 2.2.4 Table recognition
 
 ```python
@@ -174,6 +191,33 @@ for line in result:
 
 Please refer to the [key information extraction tutorial](../kie/README_ch.md).
 
+<a name="226"></a>
+
+#### 2.2.6 Layout recovery
+
+```python
+import os
+import cv2
+from paddleocr import PPStructure, save_structure_res
+from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx
+
+table_engine = PPStructure(layout=False, show_log=True)
+
+save_folder = './output'
+img_path = 'PaddleOCR/ppstructure/docs/table/1.png'
+img = cv2.imread(img_path)
+result = table_engine(img)
+save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0])
+
+for line in result:
+    line.pop('img')
+    print(line)
+
+h, w, _ = img.shape
+res = sorted_layout_boxes(result, w)
+convert_info_docx(img, res, save_folder, os.path.basename(img_path).split('.')[0])
+```
+
 <a name="23"></a>
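 ### 2.3 Result description
 PP-Structure returns a list of dicts, for example:
@@ -235,6 +279,7 @@ The fields of the dict are described as follows
 | table  | Whether to run table recognition in the forward pass  | True   |
 | ocr    | Whether to run OCR on the non-table regions found by layout analysis. Automatically set to False when layout is False | True |
 | recovery    | Whether to run layout recovery in the forward pass | False |
+| save_pdf | Whether to also export a pdf file when layout recovery exports the docx file | False |
 | structure_version |  Model version; the options are PP-structure and PP-structurev2  | PP-structure |
 
 Most parameters are consistent with the PaddleOCR whl package; see the [whl package documentation](../../doc/doc_ch/whl.md)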
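
The layout recovery snippet in this file derives the output name with `os.path.basename(img_path).split('.')[0]`, which cuts the file name at its first dot. A small sketch of a variant that only strips the final extension, using `os.path.splitext`. `result_basename` is a hypothetical helper for illustration, not part of paddleocr, and the second path is made up:

```python
import os


def result_basename(img_path: str) -> str:
    """Return the file name without its final extension."""
    return os.path.splitext(os.path.basename(img_path))[0]


# Same result as split('.')[0] for the path used in the docs.
print(result_basename('PaddleOCR/ppstructure/docs/table/1.png'))  # 1

# Differs when the name itself contains a dot.
dotted = 'scans/report.v2.png'
print(result_basename(dotted))                  # report.v2
print(os.path.basename(dotted).split('.')[0])   # report
```

For the sample image both approaches give `1`, so the documented snippet works as written; the difference only matters for file names with extra dots.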
diff --git a/ppstructure/docs/quickstart_en.md b/ppstructure/docs/quickstart_en.md
@@ -8,12 +8,15 @@
     - [2.1.3 layout analysis](#213-layout-analysis)
     - [2.1.4 table recognition](#214-table-recognition)
     - [2.1.5 Key Information Extraction](#215-Key-Information-Extraction)
+    - [2.1.6 layout recovery](#216-layout-recovery)
   - [2.2 Use by code](#22-use-by-code)
     - [2.2.1 image orientation + layout analysis + table recognition](#221-image-orientation--layout-analysis--table-recognition)
     - [2.2.2 layout analysis + table recognition](#222-layout-analysis--table-recognition)
     - [2.2.3 layout analysis](#223-layout-analysis)
     - [2.2.4 table recognition](#224-table-recognition)
+    - [2.2.5 DocVQA](#225-dockie)
     - [2.2.5 Key Information Extraction](#225-Key-Information-Extraction)
+    - [2.2.6 layout recovery](#226-layout-recovery)  
   - [2.3 Result description](#23-result-description)
     - [2.3.1 layout analysis + table recognition](#231-layout-analysis--table-recognition)
     - [2.3.2 Key Information Extraction](#232-Key-Information-Extraction)
@@ -24,14 +27,16 @@
 ## 1. Install package
 
 ```bash
-# Install paddleocr, version 2.5+ is recommended
-pip3 install "paddleocr>=2.5"
+# Install paddleocr, version 2.6 is recommended
+pip3 install "paddleocr>=2.6"
 # Install the KIE dependency packages (if you do not use the KIE, you can skip it)
 pip install -r kie/requirements.txt
-
+# Install paddleclas, the image orientation classification dependency (if you do not use image orientation classification, you can skip it)
+pip3 install paddleclas
 ```
 
 <a name="2"></a>
+
 ## 2. Use
 
 <a name="21"></a>
@@ -66,6 +71,12 @@ paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structur
 
 Please refer to: [Key Information Extraction](../kie/README.md) .
 
+<a name="216"></a>
+#### 2.1.6 layout recovery
+```bash
+paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true
+```
+
 <a name="22"></a>
 ### 2.2 Use by code
 
@@ -174,6 +185,32 @@ for line in result:
 
 Please refer to: [Key Information Extraction](../kie/README.md) .
 
+<a name="226"></a>
+#### 2.2.6 layout recovery
+
+```python
+import os
+import cv2
+from paddleocr import PPStructure, save_structure_res
+from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx
+
+table_engine = PPStructure(layout=False, show_log=True)
+
+save_folder = './output'
+img_path = 'PaddleOCR/ppstructure/docs/table/1.png'
+img = cv2.imread(img_path)
+result = table_engine(img)
+save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0])
+
+for line in result:
+    line.pop('img')
+    print(line)
+
+h, w, _ = img.shape
+res = sorted_layout_boxes(result, w)
+convert_info_docx(img, res, save_folder, os.path.basename(img_path).split('.')[0])
+```
+
 <a name="23"></a>
 ### 2.3 Result description
 
@@ -235,6 +272,7 @@ Please refer to: [Key Information Extraction](../kie/README.md) .
 | table  | Whether to perform table recognition in forward  | True   |
 | ocr    | Whether to perform ocr for non-table areas in layout analysis. When layout is False, it will be automatically set to False| True |
 | recovery    | Whether to perform layout recovery in forward| False |
+| save_pdf    | Whether to also export a pdf file alongside the docx during layout recovery | False |
 | structure_version |  Structure version, optional PP-structure and PP-structurev2  | PP-structure |
 
 Most of the parameters are consistent with the PaddleOCR whl package, see [whl package documentation](../../doc/doc_en/whl.md)
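
The layout recovery example passes the results through `sorted_layout_boxes` before writing the docx. As a rough illustration of what putting regions into reading order means, here is a naive top-to-bottom, left-to-right sort. It is not PaddleOCR's implementation and it ignores the image width that the real function receives. The function name and sample regions are made up, and it assumes each region dict carries a `bbox` of `[x1, y1, x2, y2]`:

```python
from typing import Any, Dict, List


def naive_reading_order(regions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort regions by top edge (y1), breaking ties by left edge (x1)."""
    return sorted(regions, key=lambda r: (r["bbox"][1], r["bbox"][0]))


# Hypothetical regions in arbitrary detection order.
regions = [
    {"type": "text", "bbox": [50, 200, 300, 260]},
    {"type": "title", "bbox": [50, 20, 300, 60]},
    {"type": "figure", "bbox": [10, 200, 40, 260]},
]

ordered = naive_reading_order(regions)
print([r["type"] for r in ordered])  # ['title', 'figure', 'text']
```

A plain coordinate sort like this breaks down on multi-column pages, where a whole column should be read before moving to the next one.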
diff --git a/ppstructure/docs/recovery/recovery.jpg b/ppstructure/docs/recovery/recovery.jpg
diff --git a/ppstructure/docs/table/recovery.jpg b/ppstructure/docs/table/recovery.jpg