Dialogue NLI

This repository contains the 2024 DNLI dataset and baseline models published with the following paper:

Adam Ek, Bill Noble, Stergios Chatzikyriakidis, Robin Cooper, Simon Dobnik, Eleni Gregoromichelaki, Christine Howes, Staffan Larsson, Vladislav Maraev, Gregory Mills, and Gijs Wijnholds. 2024. I hea- umm think that’s what they say: A Dataset of Inferences from Natural Language Dialogues. In Proceedings of the 28th Workshop on the Semantics and Pragmatics of Dialogue.

data/pretraining contains the BNC pretraining corpus used to fine-tune BERT. This was created with create_pretraining_corpus.py.
data/compiled contains the DNLI dataset provided with different context lengths for context ablation studies. This was created with create_data_files.py.
baselines contains code for the BERT and LSTM baselines.
baselines/llm contains code for the Llama 2 and Zephyr baselines.
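
To check that a local checkout matches the layout described above, a minimal sketch along these lines may help. It only assumes the four directory paths named in this README. The function name summarize_layout and the constant EXPECTED_DIRS are hypothetical and not part of the repository, and nothing is assumed about the file names or formats inside each directory.

```python
from pathlib import Path

# Directories named in this README; the names below are illustrative helpers,
# not part of the repository's own code.
EXPECTED_DIRS = ["data/pretraining", "data/compiled", "baselines", "baselines/llm"]


def summarize_layout(repo_root):
    """Map each expected directory to its sorted file names, or None if it is missing."""
    root = Path(repo_root)
    summary = {}
    for rel in EXPECTED_DIRS:
        directory = root / rel
        if directory.is_dir():
            # List only regular files directly inside the directory (no recursion).
            summary[rel] = sorted(p.name for p in directory.iterdir() if p.is_file())
        else:
            summary[rel] = None
    return summary


if __name__ == "__main__":
    # Run from the repository root.
    for rel, files in summarize_layout(".").items():
        status = "missing" if files is None else f"{len(files)} file(s)"
        print(f"{rel}: {status}")
```

Running the script from the repository root prints one line per directory, which is a quick way to confirm that the data was checked out before running create_data_files.py or the baselines.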

