Named entity recognition baseline (using LTP)
memeda edited this page Aug 30, 2016 · 1 revision
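The People's Daily corpus from January 1998 is used for training (with the last 10% of the data held out as the development set), and the first 10,000 sentences from June are used as the test set.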
Corpus | Instances (lines) | Entities |
---|---|---|
pku-train | 34,426 | 40,922 |
pku-holdout | 3,000 | 4,269 |
pku-test | 10,000 | 11,340 |
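The corpus uses 13 tags in total. `O` marks tokens outside any entity; the other 12 are the combinations {S-, B-, I-, E-} x {Nh, Ns, Ni}, which mark a single-token entity, the beginning, the inside, and the end of an entity for person names (Nh), place names (Ns), and organization names (Ni).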
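
The tag scheme above can be illustrated with a minimal Python sketch that enumerates the 13 tags and decodes a tag sequence into entity spans. The names `TAGS` and `extract_entities` are hypothetical and not part of LTP, and the strict handling of malformed sequences (an `I-` or `E-` tag without a matching `B-` is dropped) is an assumption rather than the behavior of the evaluation used on this page.

```python
from itertools import product

POSITIONS = ["S", "B", "I", "E"]  # single, begin, inside, end
TYPES = ["Nh", "Ns", "Ni"]        # person, place, organization

# "O" plus every position/type combination: 1 + 4 * 3 = 13 tags.
TAGS = ["O"] + [f"{p}-{t}" for p, t in product(POSITIONS, TYPES)]


def extract_entities(tags):
    """Decode a tag sequence into (start, end_inclusive, type) spans.

    Assumption: malformed fragments (for example an E- tag with no
    open B- tag of the same type) are silently discarded.
    """
    spans = []
    start, cur_type = None, None  # state of the currently open entity
    for i, tag in enumerate(tags):
        if tag == "O":
            start, cur_type = None, None
            continue
        pos, typ = tag.split("-", 1)
        if pos == "S":
            spans.append((i, i, typ))
            start, cur_type = None, None
        elif pos == "B":
            start, cur_type = i, typ
        elif pos == "I":
            # Keep the entity open only if it continues the same type.
            if start is None or typ != cur_type:
                start, cur_type = None, None
        elif pos == "E":
            if start is not None and typ == cur_type:
                spans.append((start, i, typ))
            start, cur_type = None, None
    return spans


example = ["B-Ni", "I-Ni", "E-Ni", "O", "S-Nh", "O", "B-Ns", "E-Ns"]
print(len(TAGS))
print(extract_entities(example))
```

For the example sequence, the decoder yields one three-token organization, one single-token person name, and one two-token place name.
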
Corpus | P | R | F1 |
---|---|---|---|
pku-train | 99.45% | 99.74% | 99.59% |
pku-holdout | 91.66% | 91.10% | 91.38% |
pku-test | 93.18% | 94.39% | 93.78% |
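
As a rough sketch of how scores like those in the table are typically computed, the snippet below derives entity-level precision, recall, and F1 from sets of gold and predicted spans, assuming exact-match scoring (a predicted entity counts as correct only if its boundaries and type both match). The function names and the span layout `(sentence_id, start, end, type)` are hypothetical; the page does not state which evaluation script was used.

```python
def f1_from_pr(p, r):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if p + r else 0.0


def prf1(gold_spans, pred_spans):
    """Exact-match entity-level precision, recall, and F1.

    Each span is assumed to be (sentence_id, start, end, type), so that
    identical positions in different sentences stay distinct.
    """
    gold, pred = set(gold_spans), set(pred_spans)
    correct = len(gold & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    return p, r, f1_from_pr(p, r)


# Toy data: 3 gold entities, 4 predicted, 2 of them exactly right.
gold = [(0, 0, 2, "Ni"), (0, 4, 4, "Nh"), (1, 1, 2, "Ns")]
pred = [(0, 0, 2, "Ni"), (0, 4, 4, "Ns"), (1, 1, 2, "Ns"), (1, 5, 5, "Nh")]
p, r, f1 = prf1(gold, pred)
print(f"P={p:.4f} R={r:.4f} F1={f1:.4f}")

# Consistency check against the pku-test row: P=93.18, R=94.39.
print(round(f1_from_pr(93.18, 94.39), 2))  # 93.78
```

The final line confirms that the F1 reported for pku-test is the harmonic mean of the precision and recall in the same row.
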
Per-tag details
PKU-TRAIN
Nh: precision: 99.83%; recall: 99.95%; FB1: 99.89 13150
Ni: precision: 99.10%; recall: 99.71%; FB1: 99.40 8978
Ns: precision: 99.36%; recall: 99.60%; FB1: 99.48 18912
PKU-HOLDOUT
Nh: precision: 94.31%; recall: 92.58%; FB1: 93.44 1336
Ni: precision: 85.71%; recall: 83.92%; FB1: 84.81 749
Ns: precision: 92.08%; recall: 92.72%; FB1: 92.40 2158
PKU-TEST
Nh: precision: 97.48%; recall: 97.87%; FB1: 97.67 3249
Ni: precision: 87.11%; recall: 87.74%; FB1: 87.42 2653
Ns: precision: 93.56%; recall: 95.54%; FB1: 94.54 5586
Time taken:
PKU-TRAIN : 18.065 s
PKU-HOLDOUT: 1.718 s
PKU-TEST : 5.655 s
Neural-network-based sequence labeling tasks - WIKI (see gollum for the wiki syntax)