# Idiosyncrasies in Large Language Models

Official code of *Idiosyncrasies in Large Language Models*.

**Idiosyncrasies in Large Language Models**
Mingjie Sun\*, Yida Yin\*, Zhiqiu Xu, J. Zico Kolter, Zhuang Liu (\* indicates equal contribution)
Carnegie Mellon University, UC Berkeley, University of Pennsylvania, and Princeton University
[Paper] [Project page]

```bibtex
@article{sun2025idiosyncrasies,
    title   = {Idiosyncrasies in Large Language Models},
    author  = {Sun, Mingjie and Yin, Yida and Xu, Zhiqiu and Kolter, J. Zico and Liu, Zhuang},
    year    = {2025},
    journal = {arXiv preprint arXiv:2502.12150}
}
```

We study idiosyncrasies in Large Language Models (LLMs) -- unique patterns in their outputs. We consider a simple classification task: given a text output, a neural network is trained to predict the source LLM that generated it.

## Setup

Installation instructions can be found in INSTALL.md.

## Pre-generated Responses

We host a collection of pre-generated responses for Chat APIs, Instruct LLMs, and Base LLMs.

|       | ChatGPT | Claude | Grok | Gemini | DeepSeek | Phi-4 |
| ----- | ------- | ------ | ---- | ------ | -------- | ----- |
| links | download | download | download | download | download | download |

|       | Llama3.1-8b-it | Gemma2-9b-it | Qwen2.5-7b-it | Mistral-7b-v3-it |
| ----- | -------------- | ------------ | ------------- | ---------------- |
| links | download | download | download | download |

|       | Llama3.1-8b | Gemma2-9b | Qwen2.5-7b | Mistral-7b-v3 |
| ----- | ----------- | --------- | ---------- | ------------- |
| links | download | download | download | download |
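
Each download is a JSON file of generated responses. Here is a minimal sketch for loading and inspecting one, assuming a list-of-records layout (the filename and field names are illustrative; check the actual files for the exact schema):

```python
import json

# Filename and record fields are assumptions; inspect your download
# to confirm the exact schema used by this repo.
with open("chatgpt_ultrachat.json") as f:
    responses = json.load(f)

print(len(responses))   # number of responses in the file
print(responses[0])     # one record, e.g. a prompt/response pair
```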

## Response Generation

### Chat APIs

We call the official APIs to generate responses for the chat models.

Below is an example command to generate 11K responses for ChatGPT on the UltraChat dataset.

- Change the `--model` argument to generate responses for different Chat API models, including ChatGPT, Claude, Grok, Gemini, and DeepSeek.

```bash
python generate_responses.py \
    --model ChatGPT --api_key $api_key \
    --dataset UltraChat --num_samples 11_000 \
    --output_path /path/to/output.json
```
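
Under the hood, `generate_responses.py` calls the provider's API once per prompt. A minimal sketch of the ChatGPT case with the `openai` package (the model id, prompt, and error handling are assumptions, not the repo's exact code):

```python
from openai import OpenAI

# Hypothetical sketch of a single API call; the actual model ids,
# prompting, and retry logic in generate_responses.py may differ.
client = OpenAI(api_key="sk-...")  # pass the same key as --api_key

completion = client.chat.completions.create(
    model="gpt-4o",  # assumed model id behind "ChatGPT"
    messages=[{"role": "user", "content": "Explain overfitting in one paragraph."}],
)
print(completion.choices[0].message.content)
```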

### Instruct and Base LLMs

We use vLLM to generate responses for the instruct and base LLMs in our paper.

Below is an example command to generate 11K responses for Llama3.1-8b-it on the UltraChat dataset with greedy decoding.

- The `--model` argument controls the LLM used to generate responses. Our code currently supports the nine LLMs in our paper: Llama3.1-8b-it, Gemma2-9b-it, Qwen2.5-7b-it, Mistral-7b-v3-it, Phi-4, Llama3.1-8b, Gemma2-9b, Qwen2.5-7b, and Mistral-7b-v3. We recommend a temperature of 0.6 and a repetition penalty of 1.1 for base LLMs.
- The `--dataset` argument specifies the prompt dataset to generate responses on: UltraChat, Cosmopedia, LmsysChat, WildChat, or FineWeb.
- To generate responses with multiple GPUs, change the `--num_gpus` argument; this is implemented through tensor parallelism in vLLM (see the sketch after the command below).

```bash
python generate_responses.py \
    --model Llama3.1-8b-it --temperature 0 \
    --dataset UltraChat --num_samples 11_000 \
    --output_path /path/to/output.json
```
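
This path uses vLLM's offline inference API. A minimal sketch of the same setup (the Hugging Face model id and prompt are illustrative):

```python
from vllm import LLM, SamplingParams

# Illustrative sketch of vLLM offline inference; the repo's prompt
# formatting and output handling may differ.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed HF id for Llama3.1-8b-it
    tensor_parallel_size=1,                    # raise for multi-GPU, like --num_gpus
)

params = SamplingParams(temperature=0.0, max_tokens=512)  # greedy decoding
# For base LLMs, the recommended settings above would be:
# params = SamplingParams(temperature=0.6, repetition_penalty=1.1, max_tokens=512)

outputs = llm.generate(["Explain overfitting in one paragraph."], params)
print(outputs[0].outputs[0].text)
```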

## Transformations

Below we provide scripts to perform various transformations on the generated responses. The supported transformations are `remove_special_characters`, `shuffle_word`, `shuffle_letter`, `markdown_elements_only`, `paraphrase`, `translate`, and `summarize`.

Here is an example command to shuffle the words in the generated responses.

```bash
python transform.py \
    --input_path /path/to/input.json \
    --output_path /path/to/output.json \
    --transform_mode shuffle_word
```
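
For intuition, `shuffle_word` destroys word order while preserving the bag of words. A minimal sketch of the idea (the repo's exact tokenization may differ):

```python
import random

def shuffle_words(text: str, seed: int = 0) -> str:
    """Randomly permute the whitespace-separated words of a response."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

print(shuffle_words("The quick brown fox jumps over the lazy dog."))
```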

To rewrite (e.g., paraphrase, translate, or summarize) the generated responses, you also need to provide an API key for the rewriting model (e.g., GPT-4o-mini) through the `--api_key` argument.

```bash
python transform.py \
    --input_path /path/to/input.json \
    --output_path /path/to/output.json \
    --transform_mode paraphrase \
    --api_key $api_key
```
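
For reference, a rewrite call of this kind might look as follows with the `openai` package; this is a sketch only, and the actual prompt and model wiring in `transform.py` may differ:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")

def paraphrase(text: str) -> str:
    # Hypothetical rewrite prompt; transform.py's actual prompt may differ.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Paraphrase the following text:\n\n{text}"}],
    )
    return resp.choices[0].message.content
```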

## Classification

Below is an example command to classify responses from two different models. For $N$-way classification, change the `--response_paths` argument to include $N$ response paths, separated by whitespace.

You can change the `--classifier` argument to use different classifiers. Our code currently supports the following classifiers: `llm2vec`, `gpt2`, `t5`, and `bert`. Each classifier can run on a single GPU with 24 GB of memory (bfloat16 supported).

```bash
python classification.py \
    --response_paths /path/to/model1.json /path/to/model2.json \
    --classifier llm2vec \
    --output_dir /path/to/output_dir
```
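
As a rough picture of what the `bert` classifier does, here is a minimal sketch of fine-tuning BERT to distinguish two response files with Hugging Face `transformers`; the JSON schema, splits, and hyperparameters are assumptions, not the repo's exact settings:

```python
import json
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical two-way setup; classification.py's data handling,
# hyperparameters, and JSON schema may differ.
def load(path, label):
    with open(path) as f:
        return [{"text": r["response"], "label": label} for r in json.load(f)]

records = load("model1.json", 0) + load("model2.json", 1)
dataset = Dataset.from_list(records).train_test_split(test_size=0.1, seed=0)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables padded batching via the default collator
)
trainer.train()
print(trainer.evaluate())
```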

## License

This project is released under the MIT license. Please see the LICENSE file for more information.

## Questions

Feel free to discuss the paper or code with us through GitHub issues or email!

mingjies at cs.cmu.edu
davidyinyida0609 at berkeley.edu