BERT for Coreference Resolution: Baselines and Analysis (EMNLP19) #8

hideaki-j opened this issue Sep 9, 2021 · 1 comment

hideaki-j commented Sep 9, 2021

BERT for Coreference Resolution: Baselines and Analysis

Contribution summary

  • Joshi et al. proposed a BERT-based CR method that exploits BERT's passage-level understanding.
  • The model achieved SOTA on the GAP and OntoNotes benchmarks. Qualitative analysis showed that (1) handling pronouns in conversations and (2) mention paraphrasing remain difficult for the model.

Authors

Mandar Joshi, Omer Levy, Daniel S. Weld, and Luke Zettlemoyer
(University of Washington, AI2, FAIR)

Motivation

  • BERT's major improvement is passage-level training, which allows it to better model longer sequences.
  • Can this be applied to the CR task?

Method

  • Proposed a BERT-based CR method.
  • Two ways of extending c2f-coref, an ELMo-based CR model (a segmentation sketch follows this list):
    • The independent variant uses non-overlapping segments, each of which acts as an independent instance for BERT.
    • The overlap variant splits the document into overlapping segments so as to provide the model with context beyond 512 tokens.
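
Below is a minimal, illustrative sketch (not the authors' code) of the two segmentation strategies, assuming the document has already been tokenized into BERT wordpieces. The 512-token limit comes from the paper; the function names and the stride value are assumptions for illustration only.

```python
def split_independent(tokens, max_len=512):
    """'independent' variant: non-overlapping segments, each fed to BERT
    as its own instance (no cross-segment attention)."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def split_overlap(tokens, max_len=512, stride=256):
    """'overlap' variant: consecutive segments share `stride` tokens,
    giving the model context beyond a single 512-token window."""
    segments, start = [], 0
    while start < len(tokens):
        segments.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += max_len - stride
    return segments


# Toy usage on a "document" of 1200 pseudo-tokens.
doc = [f"tok{i}" for i in range(1200)]
print([len(s) for s in split_independent(doc)])  # [512, 512, 176]
print([len(s) for s in split_overlap(doc)])      # [512, 512, 512, 432]
```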

Results / Insight

Dataset

  • GAP: human-labeled dataset of pronoun-name pairs from Wikipedia snippets
  • OntoNotes 5.0: document-level dataset from the CoNLL-2012 shared task

Results

  • Achieved SOTA on the GAP and OntoNotes benchmarks
    • with +6.2 F1 (baseline: BERT+RR) and +0.3 F1 (baseline: EE)
  • The overlap variant offers no improvement over the independent variant

Insight

  • Unable to handle conversations: Modeling pronouns, especially in the context of conversations (Table 3), continues to be difficult for all models, perhaps partly because c2f-coref does very little to model the dialog structure of the document.
  • Importance of entity information: The models are unable to resolve cases requiring mention paraphrasing.
    • E.g., bridging "the Royals" with "Prince Charles and his wife Camilla" likely requires pretraining models to encode relations between entities.

hideaki-j commented Sep 9, 2021

Others

Related

QA

  • Q. Why is BERT bad at handling longer documents?
  • A. "Recent work (Joshi et al., 2019) suggests that BERT’s inability to use longer sequences effectively is likely a by-product of pretraining on short sequences for a vast majority of updates."
