This is the code for the paper NFLAT: Non-Flat-Lattice Transformer for Chinese Named Entity Recognition.
We advocate a novel lexical enhancement method, InterFormer, that effectively reduces computational and memory costs by constructing non-flat lattices. Furthermore, with InterFormer as the backbone, we implement NFLAT for Chinese NER. NFLAT decouples lexicon fusion from context feature encoding. Compared with FLAT, it avoids the unnecessary "word-character" and "word-word" attention computations, which reduces memory usage by about 50% and allows more extensive lexicons or larger batch sizes for network training.
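Concretely, the lexicon fusion step lets the character sequence attend to the matched lexicon words, so the inter-attention matrix only covers character-word pairs; the "word-character" and "word-word" blocks of a flat lattice are never computed. The sketch below is only an illustration of that idea: it assumes a single attention head and omits the multi-head splitting and relative position information used in the actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterAttentionSketch(nn.Module):
    """Illustrative sketch of the non-flat "character attends to word" idea.

    Characters supply the queries; matched lexicon words supply the keys and
    values, so attention is computed only for character-word pairs. This is a
    simplified single-head version without the relative position encoding of
    the real InterAttention module.
    """

    def __init__(self, d_model):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.scale = d_model ** 0.5

    def forward(self, char_hidden, word_hidden, word_mask=None):
        # char_hidden: [batch, n_char, d], word_hidden: [batch, n_word, d]
        q, k, v = self.q(char_hidden), self.k(word_hidden), self.v(word_hidden)
        scores = torch.matmul(q, k.transpose(-1, -2)) / self.scale   # [batch, n_char, n_word]
        if word_mask is not None:
            # mask out padded words (assumes every sentence has at least one matched word)
            scores = scores.masked_fill(~word_mask.unsqueeze(1), float('-inf'))
        attn = F.softmax(scores, dim=-1)
        return char_hidden + torch.matmul(attn, v)                   # lexicon-fused character features
```

The fused character representations can then be passed through an ordinary character-level Transformer encoder for context feature encoding, which is the decoupling of the two stages described above.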
The code has been tested under Python 3.7. The required packages are as follows:
- torch==1.5.1
- numpy==1.18.5
- FastNLP==0.5.0
- fitlog==0.3.2
See the FastNLP documentation to learn more about FastNLP, and the fitlog documentation to learn more about fitlog.
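For example, the pinned versions can be installed with `pip install torch==1.5.1 numpy==1.18.5 FastNLP==0.5.0 fitlog==0.3.2` (a CUDA-specific torch wheel may be preferable for GPU training).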
- Download the pretrained character embeddings and word embeddings and put them in the `data` folder (a short sketch of how such `.vec` files can be read follows this list).
    - Character embeddings (gigaword_chn.all.a2b.uni.ite50.vec): Google Drive or Baidu Pan
    - Bi-gram embeddings (gigaword_chn.all.a2b.bi.ite50.vec): Baidu Pan
    - Word (lattice) embeddings (ctb.50d.vec): Baidu Pan
    - If you want to use larger word embeddings, you can refer to Chinese Word Vectors 中文词向量 and Tencent AI Lab Embedding.
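These embedding files are plain-text `.vec` files with one token per line followed by its vector values. The sketch below only illustrates the file format, assuming whitespace-separated fields and an optional `count dim` header line; the training code is not required to use it.

```python
import numpy as np

def load_vec(path):
    """Read a plain-text .vec embedding file: one token per line followed by its vector."""
    vectors = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) <= 2:            # skip an optional "count dim" header or blank line
                continue
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

# Example: char_vectors = load_vec('data/gigaword_chn.all.a2b.uni.ite50.vec')
```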
- Modify `utils/paths.py` to add the paths of the pretrained embeddings and the datasets.
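For illustration, such a path file usually contains plain module-level constants. The variable names and directory names below are hypothetical; the actual names used in `utils/paths.py` may differ.

```python
# Hypothetical sketch of utils/paths.py entries; the real variable names may differ.
char_emb_path = 'data/gigaword_chn.all.a2b.uni.ite50.vec'    # character embeddings
bigram_emb_path = 'data/gigaword_chn.all.a2b.bi.ite50.vec'   # bi-gram embeddings
word_emb_path = 'data/ctb.50d.vec'                           # word (lattice) embeddings

# Dataset locations (directory names are placeholders)
weibo_path = 'data/WeiboNER'
resume_path = 'data/ResumeNER'
ontonotes_path = 'data/OntoNotes4'
msra_path = 'data/MSRA'
```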
- Long sentence clipping for MSRA and OntoNotes: run `python sentence_clip.py`.
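The clipping script itself is not reproduced here. Conceptually, this preprocessing step splits sentences that exceed a maximum length; the rough, hypothetical sketch below prefers punctuation positions as cut points. The actual `sentence_clip.py` may use different rules and length limits.

```python
def clip_sentence(chars, labels, max_len=200, split_chars=('。', '，', '；', '！', '？')):
    """Split one over-long character/label sequence into shorter pieces,
    preferring punctuation positions as cut points (illustrative only)."""
    pieces, start = [], 0
    while len(chars) - start > max_len:
        end = start + max_len
        # back up to the last punctuation mark inside the current window, if any
        cut = next((i for i in range(end - 1, start, -1) if chars[i] in split_chars), end - 1)
        pieces.append((chars[start:cut + 1], labels[start:cut + 1]))
        start = cut + 1
    pieces.append((chars[start:], labels[start:]))
    return pieces
```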
- Merge the char embeddings and word embeddings: run `python char_word_mix.py`.
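Conceptually, this step builds a single mixed embedding file covering both the character vocabulary and the lexicon words. The sketch below simply concatenates two `.vec` files while dropping duplicate tokens; it is an assumption about the idea, not the actual `char_word_mix.py`.

```python
def merge_vec_files(char_vec_path, word_vec_path, out_path):
    """Write a mixed embedding file containing every token from both inputs,
    keeping the first vector seen for any duplicated token (illustrative only)."""
    seen = set()
    with open(out_path, 'w', encoding='utf-8') as out:
        for path in (char_vec_path, word_vec_path):
            with open(path, encoding='utf-8') as f:
                for line in f:
                    parts = line.split()
                    if len(parts) <= 2:          # skip header or blank lines
                        continue
                    if parts[0] not in seen:
                        seen.add(parts[0])
                        out.write(line.rstrip('\n') + '\n')

# Example: merge_vec_files('data/gigaword_chn.all.a2b.uni.ite50.vec', 'data/ctb.50d.vec',
#                          'data/char_and_word_mix.vec')
```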
- Model training and evaluation (a minimal sketch of the `--dataset` option follows this list):
    - Weibo dataset: `python main.py --dataset weibo`
    - Resume dataset: `python main.py --dataset resume`
    - OntoNotes dataset: `python main.py --dataset ontonotes`
    - MSRA dataset: `python main.py --dataset msra`
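The `--dataset` flag selects which corpus, lexicon, and embeddings are loaded. The snippet below is only a hypothetical sketch of how such a flag is typically parsed; the real `main.py` accepts many more hyperparameter options.

```python
import argparse

parser = argparse.ArgumentParser(description='Train and evaluate NFLAT on a Chinese NER dataset.')
parser.add_argument('--dataset', choices=['weibo', 'resume', 'ontonotes', 'msra'], default='weibo',
                    help='which corpus to train on')
args = parser.parse_args()
print('Training NFLAT on the %s dataset' % args.dataset)
```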
- Thanks to Dr. Li and his team for contributing the FLAT source code.
- Thanks to the authors and contributors of the TENER source code.
- Thanks to the authors and contributors of FastNLP.