UEA

The source code for the DASFAA 2021 paper: Towards Entity Alignment in the Open World: An Unsupervised Approach.

Dependencies

Python>=3.7 (tested on Python=3.8.10)
Tensorflow-gpu=2.x (tested on Tensorflow-gpu=2.6.0)
Scipy
Numpy
Scikit-learn
python-Levenshtein

Datasets

The original datasets are obtained from DBP15K, GCN-Align, and JAPE.

Taking DBP15K (ZH-EN) as an example, the folder "zh_en" contains:

ent_ids_1: ids for entities in source KG (ZH);
ent_ids_1_trans_goo: entities in source KG (ZH) with translated names;
ent_ids_2: ids for entities in target KG (EN);
ref_ent_ids: entity links for testing/validation;
sup_ent_ids: entity links for training;
triples_1: relation triples encoded by ids in source KG (ZH);
triples_2: relation triples encoded by ids in target KG (EN);
zh_vectorList.json: the input entity feature matrix initialized by word vectors.
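As a rough illustration of how these files might be read, the sketch below parses the ID and triple files. It assumes the tab-separated layout commonly used by the GCN-Align and JAPE releases (one "id<TAB>name" pair per line for ent_ids_*, and rows of integer ids for triples_* and the link files). The function names load_ent_ids and load_id_rows are hypothetical and do not come from this repository.

```python
def load_ent_ids(path):
    """Read an ent_ids_* file, assuming one 'id<TAB>name' pair per line."""
    id_to_name = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            ent_id, name = line.split("\t", 1)
            id_to_name[int(ent_id)] = name
    return id_to_name


def load_id_rows(path, width):
    """Read rows of `width` whitespace-separated integer ids.

    Assumed usage: width=2 for ref_ent_ids / sup_ent_ids (entity links),
    width=3 for triples_1 / triples_2 (head, relation, tail).
    """
    rows = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            if len(parts) != width:
                raise ValueError(f"{path}:{line_no}: expected {width} fields")
            rows.append(tuple(int(p) for p in parts))
    return rows
```

For example, load_id_rows("data/zh_en/triples_1", 3) would return a list of (head, relation, tail) tuples if the files follow the assumed layout; the "data/zh_en" path is itself an assumption about where the folder lives.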

Semantic Information

For the semantic information, we obtain the entity name embeddings for DBP15K from RDGCN. You may also obtain them from here. Note that before running, you need to place the corresponding _vectorList.json file (e.g., zh_vectorList.json for zh_en) under the matching dataset directory.
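Before running, it can help to check that the embedding file is in place and well formed. The sketch below assumes the _vectorList.json file is a JSON array of equal-length numeric vectors, one per entity; that layout is an assumption, and load_name_embeddings is a hypothetical helper rather than part of this repository.

```python
import json


def load_name_embeddings(path):
    """Load a *_vectorList.json file and return (vectors, dimension).

    Assumes the file holds a JSON array of equal-length numeric vectors.
    """
    with open(path, encoding="utf-8") as f:
        vectors = json.load(f)
    if not isinstance(vectors, list) or not vectors:
        raise ValueError("expected a non-empty JSON array of vectors")
    dim = len(vectors[0])
    for row_index, vector in enumerate(vectors):
        if len(vector) != dim:
            raise ValueError(f"row {row_index} has {len(vector)} values, expected {dim}")
    return vectors, dim
```

The returned list can then be converted to a matrix with numpy.asarray(vectors, dtype=numpy.float32) if a NumPy array is needed.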

Running

First, generate the string similarity by running python stringsim.py --lan "fr_en". The dataset can be chosen from zh_en, ja_en, and fr_en.
Then run

python main.py --lan "fr_en"
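To give a sense of what a string similarity between entity names looks like, the sketch below computes a normalized edit-distance score in pure Python. It is only an illustration: the normalization (one minus distance over the longer length) is an assumption chosen for simplicity, and the actual stringsim.py may compute its scores differently, for example through python-Levenshtein. All function names here are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def name_similarity(a, b):
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def similarity_matrix(source_names, target_names):
    """Pairwise similarities: rows are source names, columns are target names."""
    return [[name_similarity(s, t) for t in target_names] for s in source_names]
```

This pure-Python version is quadratic per pair and is meant for reading, not for the full 15K-entity datasets.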

You may also directly run

bash auto.sh
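The two manual steps above can also be chained from Python. The following is a minimal sketch that assumes it is run from the repository root, where stringsim.py and main.py live; build_commands and run_pipeline are hypothetical helpers, and the sketch makes no claim about what auto.sh itself does.

```python
import subprocess
import sys

# Language pairs named in this README.
LANGUAGE_PAIRS = ("zh_en", "ja_en", "fr_en")


def build_commands(lan):
    """Return the two commands from this README for one language pair."""
    if lan not in LANGUAGE_PAIRS:
        raise ValueError(f"unknown language pair: {lan!r}")
    return [
        [sys.executable, "stringsim.py", "--lan", lan],  # step 1: string similarity
        [sys.executable, "main.py", "--lan", lan],       # step 2: main alignment run
    ]


def run_pipeline(lan):
    """Run both steps in order, stopping at the first failure."""
    for command in build_commands(lan):
        subprocess.run(command, check=True)
```

Calling run_pipeline("fr_en") would then execute both scripts in sequence for the FR-EN dataset.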

Because embedding-based methods are inherently unstable, the results may fluctuate slightly across repeated runs.

If you have any questions about reproduction, please feel free to email [email protected].

Citation

If you use this model or code, please cite it as follows:

@inproceedings{DBLP:conf/dasfaa/ZengZTLLZ21,
  author    = {Weixin Zeng and
               Xiang Zhao and
               Jiuyang Tang and
               Xinyi Li and
               Minnan Luo and
               Qinghua Zheng},
  title     = {Towards Entity Alignment in the Open World: An Unsupervised Approach},
  booktitle = {DASFAA},
  pages     = {272--289},
  publisher = {Springer},
  year      = {2021},
}
