Authors: Noah Lee*, Na Min An*, James Thorne
- We jointly evaluate generative LLMs on NLI performance and on their alignment with human disagreement.
- We propose two probability distribution estimation techniques that allow LLMs to represent disagreement, and empirically evaluate the estimated distributions against the human disagreement distribution (one way to build such an estimate is sketched after this list).
- LLMs do not excel as expected on NLI tasks and fail to align with human disagreement levels.
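As a rough illustration of what estimating a label distribution from an LLM can look like, the sketch below samples a model repeatedly on a single NLI item and normalises the label counts. This is only a minimal sketch, not the repository's implementation; `query_model`, the prompt text, and the sample count are hypothetical placeholders.

```python
# Illustrative sketch (not the repository's exact implementation): one way to
# turn repeated LLM samples into a label distribution for a single NLI item.
import random
from collections import Counter

LABELS = ["entailment", "neutral", "contradiction"]

def query_model(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for an actual LLM call; replace with a real request."""
    return random.choice(LABELS)  # placeholder behaviour only

def estimate_label_distribution(premise: str, hypothesis: str, n_samples: int = 30):
    """Sample the model n times and normalise the label counts into a distribution."""
    counts = Counter(query_model(premise, hypothesis) for _ in range(n_samples))
    return {label: counts[label] / n_samples for label in LABELS}

if __name__ == "__main__":
    dist = estimate_label_distribution(
        "A man is playing a guitar.", "A person is making music.")
    print(dist)  # e.g. {'entailment': 0.6, 'neutral': 0.3, 'contradiction': 0.1}
```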
```bash
conda create -n humllm
conda activate humllm
pip install -r requirements.txt
```
The datasets used in this research are as follows:
- All the script examples can be found in `./scripts/`.
Sample 100 random samples and the 100 hardest samples:
```bash
bash ./scripts/sample.sh
```
LLM outputs can be generated with:
```bash
bash ./scripts/generate.sh
```
or directly with:
```bash
# Note: num_iter x num_samples = total sample size
python generate.py --data_dir <input data directory> \
    --data_type <input data type> \
    --model <model name> \
    --file_name <output file name> \
    --out_dir <output directory> \
    --max_length <maximum token length> \
    --gen_type <generation type> \
    --num_iter <iteration number> \
    --num_samples <sample number> \
    --prompt_variations <use prompt variations> \
    --few_shot <few shot number>
```
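For reference, a concrete invocation might look like the following. Every value here is a hypothetical placeholder (the accepted options for `--data_type`, `--model`, and `--gen_type` are defined by the script's argument parser), so adjust them to your setup.

```bash
# Hypothetical example values only; adjust to your data, model, and output paths.
python generate.py --data_dir ./data \
    --data_type snli \
    --model gpt-3.5-turbo \
    --file_name snli_run1 \
    --out_dir ./outputs \
    --max_length 256 \
    --gen_type sampling \
    --num_iter 10 \
    --num_samples 10 \
    --prompt_variations False \
    --few_shot 0
```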
The generated distributions can be evaluated with:
```bash
bash ./scripts/evaluate.sh
```
or directly with:
```bash
python evaluate.py --data_dir <input data directory> \
    --data_type <input data type> \
    --gen_type <generation type>
```
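To make the comparison concrete, the sketch below computes the Jensen-Shannon divergence between a human label distribution and a model's estimated distribution. This is an illustrative assumption, not necessarily the metric that `evaluate.py` reports, and the example distributions are made up.

```python
# Illustrative sketch: comparing a model's label distribution against the
# human annotation distribution with Jensen-Shannon divergence (base 2).
import numpy as np

def kl_divergence(p, q, eps: float = 1e-12) -> float:
    """KL(p || q) with a small epsilon to avoid log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log2(p / q)))

def js_divergence(p, q) -> float:
    """Symmetric Jensen-Shannon divergence between two label distributions."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Made-up example: annotators split 60/30/10, the model is more confident.
human = np.array([0.6, 0.3, 0.1])    # entailment / neutral / contradiction
model = np.array([0.9, 0.05, 0.05])
print(f"JSD = {js_divergence(human, model):.3f}")
```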
Please consider citing our work if you find it helpful for your research.
```bibtex
@misc{lee2023large,
    title={Can Large Language Models Capture Dissenting Human Voices?},
    author={Noah Lee and Na Min An and James Thorne},
    year={2023},
    eprint={2305.13788},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
- Noah Lee: [email protected]
- Na Min An: [email protected]