This repository contains the source code for Adapting Open Domain Fact Extraction and Verification to COVID-FACT through In-Domain Language Modeling.

More information about the SCIFACT shared task can be found on its website.

In this repository, we mainly focus on continually training language models for the scientific domain. For more advanced retrieval techniques for COVID search, please refer to our paper and the OpenMatch toolkit, which achieve top retrieval performance for COVID search.
- Install the dependencies listed in `requirements.txt`.
- We provide annotations and some scripts for users in the `script` and `data` folders.
- Run inference with our best pipeline and get the evaluation results:

      bash script/inference.sh
      bash script/eval.sh
- Train the rationale selection model with the same code as our baseline:

      bash script/train_rs.sh
- Train the KGAT model with this command (background on the kernel attention in KGAT is sketched after this list):

      bash script/train_kgat.sh
- We present continued in-domain training with two methods: Masked Language Modeling (MLM) and Rationale Prediction (RP). For MLM, please see `mlm_models` for more information; a sketch of this step also appears after this list. RP uses the same code as rationale selection training. Downstream models can be warmed up from these continued-training checkpoints.
- All our experimental results are provided in the `prediction` folder.
- We provide scripts for training, inference, and evaluation in the `script` folder.
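As background for the KGAT step above, here is a minimal sketch of KNRM-style Gaussian kernel pooling, the building block behind KGAT's kernel attention. The kernel means, widths, and tensor shapes are illustrative assumptions, not the exact configuration used by `script/train_kgat.sh`:

```python
import torch

def kernel_pooling(sim, mus, sigmas):
    """Apply K Gaussian kernels to a similarity matrix and pool.

    sim:    [L_q, L_d] token-token cosine similarities
    mus:    [K] kernel means
    sigmas: [K] kernel widths
    returns [K] soft match features
    """
    # [L_q, L_d, K]: each kernel softly counts similarities near its mean
    k = torch.exp(-((sim.unsqueeze(-1) - mus) ** 2) / (2 * sigmas ** 2))
    # sum over document tokens -> soft term frequency per query token
    soft_tf = k.sum(dim=1)
    # log-normalize, then sum over query tokens
    return torch.log(soft_tf.clamp(min=1e-10)).sum(dim=0)

# illustrative kernel configuration: one exact-match kernel plus soft kernels
mus = torch.tensor([1.0, 0.7, 0.4, 0.1, -0.2, -0.5, -0.8])
sigmas = torch.tensor([1e-3] + [0.1] * 6)
sim = torch.rand(12, 30) * 2 - 1  # random similarities in [-1, 1]
print(kernel_pooling(sim, mus, sigmas))
```

The exact-match kernel (mean 1.0, tiny width) softly counts identical tokens, while the wider kernels capture progressively weaker matches.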
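For the MLM method, the essential step is continued masked-language-model training on in-domain text. Below is a minimal sketch with HuggingFace Transformers; the corpus file, model choice, and hyperparameters are illustrative assumptions rather than our exact `mlm_models` setup:

```python
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "allenai/scibert_scivocab_uncased"  # or a RoBERTa checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical corpus: one in-domain document (e.g., an abstract) per line.
dataset = load_dataset("text", data_files={"train": "cord19_abstracts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens; the model is trained to recover them.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm_checkpoint",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
trainer.save_model("mlm_checkpoint")  # warm-start downstream models from here
```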
Overall performance on the development set:
| Model | Sentence Prec. | Sentence Rec. | Sentence F1 | Abstract Prec. | Abstract Rec. | Abstract F1 |
|---|---|---|---|---|---|---|
| RoBERTa (Large) | 46.51 | 38.25 | 41.98 | 53.30 | 46.41 | 49.62 |
| KGAT | 57.07 | 31.97 | 40.98 | 72.73 | 38.28 | 50.16 |
| SciKGAT (w. A) | 42.07 | 47.81 | 44.76 | 47.66 | 58.37 | 52.47 |
| SciKGAT (w. AR) | 50.00 | 47.81 | 48.88 | 53.15 | 56.46 | 54.76 |
| SciKGAT (Full) | 74.36 | 39.62 | 51.69 | 84.26 | 43.54 | 57.41 |
Overall performance on the testing set:
| Model | Sentence Prec. | Sentence Rec. | Sentence F1 | Abstract Prec. | Abstract Rec. | Abstract F1 |
|---|---|---|---|---|---|---|
| RoBERTa (Large) | 38.6 | 40.5 | 39.5 | 46.6 | 46.4 | 46.5 |
| SciKGAT (w. A) | 40.5 | 48.38 | 44.09 | 47.06 | 57.66 | 51.82 |
| SciKGAT (w. AR) | 41.67 | 45.95 | 43.7 | 47.47 | 54.96 | 50.94 |
| SciKGAT (Full) | 61.15 | 42.97 | 50.48 | 76.09 | 47.3 | 58.3 |
We also conduct ablation studies on abstract retrieval and rationale selection for further work (these parts are omitted from our paper due to space limits). We find that abstract retrieval can be improved with a single SciBERT (MLM), as the following results show. In this experiment, we use SciBERT (MLM) to rerank the top 100 TF-IDF retrieved abstracts and then run the remaining steps of our baseline (RoBERTa-Large) for rationale selection and claim label prediction.
| Model | Hit One Acc. | Hit All Acc. | Abstract Prec. | Abstract Rec. | Abstract F1 | Sentence Prec. | Sentence Rec. | Sentence F1 |
|---|---|---|---|---|---|---|---|---|
| TF-IDF | 84.67 | 83.33 | 53.30 | 46.41 | 49.62 | 46.51 | 38.25 | 41.98 |
| w. SciBERT | 94.67 | 93.00 | 48.18 | 56.94 | 52.19 | 42.09 | 47.27 | 44.53 |
| w. SciBERT (MLM) | 95.33 | 93.67 | 47.66 | 58.37 | 52.47 | 42.07 | 47.81 | 44.76 |
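As a reference for the reranking step described above, here is a minimal sketch that scores each (claim, abstract) pair with a fine-tuned sequence classifier and reorders the TF-IDF candidates. The checkpoint path, candidate format, and label convention are illustrative assumptions:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical relevance classifier fine-tuned from the MLM checkpoint.
checkpoint = "mlm_checkpoint"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()

def rerank(claim, candidates, top_k=3):
    """Score each (claim, abstract) pair and return the top_k abstracts.

    candidates: list of (doc_id, abstract_text), e.g. the top-100
    TF-IDF retrieved abstracts.
    """
    scores = []
    for doc_id, abstract in candidates:
        inputs = tokenizer(claim, abstract, truncation=True,
                           max_length=512, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # probability of the "relevant" class (assumed to be label 1)
        scores.append((doc_id, logits.softmax(dim=-1)[0, 1].item()))
    return sorted(scores, key=lambda x: x[1], reverse=True)[:top_k]
```

The top-ranked abstracts then feed the remaining baseline steps, rationale selection and claim label prediction.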
We also conduct ablation studies on rationale selection:
| Model | Ranking Prec. | Ranking Rec. | Ranking F1 | Sentence Prec. | Sentence Rec. | Sentence F1 | Abstract Prec. | Abstract Rec. | Abstract F1 |
|---|---|---|---|---|---|---|---|---|---|
| SciBERT | 36.90 | 65.03 | 47.08 | 43.22 | 46.99 | 45.03 | 48.94 | 55.02 | 51.80 |
| SciBERT (MLM) | 43.73 | 60.93 | 50.91 | 50.00 | 47.81 | 48.88 | 53.15 | 56.46 | 54.76 |
| RoBERTa-Base | 37.56 | 61.48 | 46.63 | 43.64 | 45.90 | 44.74 | 46.06 | 53.11 | 49.33 |
| RoBERTa-Base (MLM) | 29.82 | 61.75 | 40.21 | 41.45 | 48.36 | 44.64 | 45.02 | 54.07 | 49.13 |
| RoBERTa-Large | 36.78 | 64.21 | 46.77 | 42.07 | 47.81 | 44.76 | 47.66 | 58.37 | 52.47 |
| RoBERTa-Large (MLM) | 38.44 | 63.11 | 47.78 | 42.93 | 46.45 | 44.62 | 47.03 | 53.11 | 49.89 |
Please cite our paper if you use this code:

    @inproceedings{liu2020adapting,
      title = {Adapting Open Domain Fact Extraction and Verification to COVID-FACT through In-Domain Language Modeling},
      author = {Liu, Zhenghao and Xiong, Chenyan and Dai, Zhuyun and Sun, Si and Sun, Maosong and Liu, Zhiyuan},
      booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2020},
      year = {2020}
    }
If you have questions, suggestions, or bug reports, please email: