This repository provides code and resources for our paper, Looking Beyond The Top-1: Transformers Determine Top Tokens In Order.
- Ensure you have Python 3.8+ and `pip` installed.
- Clone the repository:

```bash
git clone https://github.com/daria-lioubashevski/beyond_top1
cd beyond_top1
```
- Install required Python packages:

```bash
pip install -r requirements.txt
```
All scripts support the following models: GPT2-XL (pre-trained and randomly initialized), ViT-L/16, Whisper-large.
- To investigate the order of saturation layers for top-k tokens, use `analysis/order_of_saturation_layer_analysis.py` (a minimal sketch of the underlying analysis follows the example below).
Arguments:

- `-model` or `--model_name` (str): The model to analyze. Choose from `gpt2`, `vit`, `whisper`, or `random_gpt2`.
- `-a` or `--analysis` (str): The type of analysis to perform. Options are `rank_corr` or `kendalls_tau`.
- `-n` or `--num_samples` (int): Number of samples to use in the analysis.
- `-o` or `--output_path` (str): Path to save the analysis results.
Example usage for the ViT model and rank correlation analysis over 200 images:

```bash
python -m analysis.order_of_saturation_layer_analysis -model vit -a rank_corr -n 200 -o rank_corr.png
```
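For intuition about what the script measures, here is a minimal, self-contained sketch of the idea using a logit-lens readout. It is an illustration only, not the repository's implementation: the prompt, the small `gpt2` checkpoint (the paper uses GPT2-XL), and the saturation-layer definition used here (the earliest layer from which a token keeps its final relative rank among the top-k) are all simplifying assumptions.

```python
import torch
from scipy.stats import kendalltau
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Small GPT-2 so the sketch runs anywhere; the paper uses GPT2-XL.
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2Tokenizer.from_pretrained("gpt2")
k = 5

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states

# Logit-lens readout: project each layer's last-position state through
# the final layer norm and the unembedding matrix.
layer_logits = [model.lm_head(model.transformer.ln_f(h[0, -1])) for h in hidden]
final_topk = torch.topk(layer_logits[-1], k).indices  # final rank i -> token final_topk[i]

# Relative rank of each final top-k token at every layer.
per_layer_ranks = []
for logits in layer_logits:
    order = torch.argsort(logits[final_topk], descending=True)
    ranks = torch.empty(k, dtype=torch.long)
    ranks[order] = torch.arange(k)
    per_layer_ranks.append(ranks)

# Saturation layer = earliest layer from which the token holds its final rank.
sat_layers = []
for i in range(k):
    layer = len(per_layer_ranks) - 1
    while layer > 0 and per_layer_ranks[layer - 1][i] == i:
        layer -= 1
    sat_layers.append(layer)

tau, p = kendalltau(range(k), sat_layers)
print("saturation layer per final rank:", sat_layers)
print(f"Kendall's tau(final rank, saturation layer) = {tau:.2f} (p = {p:.3g})")
```

A tau close to 1 on many samples would mean tokens with better final rank saturate earlier, which is the ordering effect the paper studies.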
- To probe the task information in the model's embeddings, first run `analysis/create_data_for_task_probing.py` to create training data (a probing sketch follows the example below).
Arguments:

- `-model` or `--model_name` (str): The model to analyze. Choose from `gpt2`, `vit`, `whisper`, or `random_gpt2`.
- `-n` or `--num_samples` (int): Number of samples to use.
- `-k` or `--num_tasks` (int): Number of tasks (should probably be between 3 and 5).
- `-o` or `--output_path` (str): Path to save the pkl containing the extracted embeddings.

Then use `analysis/run_task_transition_probing.py` to train and evaluate the classifier.
Arguments:

- `-d` or `--data_path` (str): Path to the data (extracted embeddings) for probing.
- `-n` or `--num_tasks` (int): Number of tasks (should probably be between 3 and 5).
- `-k` or `--kfolds` (int): Number of folds for k-fold cross-validation during training.
- `-o` or `--output_path` (str): Path to save the analysis results.

Example usage with the GPT2 model for tasks 1 to 5 over 50 texts:
```bash
python -m analysis.create_data_for_task_probing -model gpt2 -n 50 -k 5 -o gpt2_embds_for_probing.pkl
python -m analysis.run_task_transition_probing -n 4 -k 5 -d gpt2_embds_for_probing.pkl -o probing_results.txt
```
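As a rough illustration of the probing step, the sketch below fits a linear probe with k-fold cross-validation. The pickle field names (`embeddings`, `task_labels`) and the choice of logistic regression are assumptions made for the sake of a runnable example; the actual file layout produced by `create_data_for_task_probing.py` and the repository's classifier may differ.

```python
import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical pickle layout: a dict with per-sample hidden states and
# the task index each sample belongs to.
with open("gpt2_embds_for_probing.pkl", "rb") as f:
    data = pickle.load(f)
X = np.asarray(data["embeddings"])    # shape: (n_samples, hidden_dim)
y = np.asarray(data["task_labels"])   # shape: (n_samples,)

# Linear probe: if a simple classifier recovers the task index from the
# hidden state, the task information is linearly decodable there.
probe = LogisticRegression(max_iter=1000)
scores = cross_val_score(probe, X, y, cv=5)  # 5-fold CV, matching -k 5
print(f"mean probing accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```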
All scripts support the following models: GPT2-XL (pre-trained only), ViT-L/16, Whisper-large.
- To perform a causal intervention on the model's activations, causing it to switch from task-1 to task-2, use `intervention/run_intervention_procedure.py` (a simplified patching sketch follows the example below).
Arguments:

- `-model` or `--model_name` (str): The model to analyze. Choose from `gpt2`, `vit`, `whisper`, or `random_gpt2`.
- `-n` or `--num_pairs` (int): Number of pairs to use in the intervention procedure.
- `-o` or `--output_path` (str): Path to save the intervention figure.
Example usage for the Whisper model over 100 pairs:

```bash
python -m intervention.run_intervention_procedure -model whisper -n 100 -o interv_result.png
```
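The intervention is in the spirit of activation patching. The sketch below is a heavily simplified illustration, not the repository's procedure: it captures a layer's hidden states from one prompt and overwrites the same layer's output while running another prompt. The layer index, the prompts, and the small `gpt2` checkpoint are placeholders.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2Tokenizer.from_pretrained("gpt2")
layer_idx = 6  # placeholder choice of intervention layer

src = tok("Paris is the capital of", return_tensors="pt")  # "source" run
tgt = tok("Rome is the capital of", return_tensors="pt")   # run to intervene on
assert src.input_ids.shape == tgt.input_ids.shape  # patching needs matching shapes

# 1) Cache the source run's hidden states right after block `layer_idx`.
with torch.no_grad():
    cached = model(**src, output_hidden_states=True).hidden_states[layer_idx + 1]

# 2) Overwrite that block's output during the target run via a forward hook.
def patch(module, inputs, output):
    # GPT2Block returns a tuple whose first element is the hidden states.
    return (cached,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(patch)
with torch.no_grad():
    logits = model(**tgt).logits[0, -1]
handle.remove()

print("patched top-1 continuation:", tok.decode(logits.argmax().item()))
```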
All scripts currently support only the GPT-2 model.
- To compare the performance of our new early exit measure against existing ones in terms of the performance/efficiency tradeoff, run `practical_applications/new_early_exit.py` (a baseline sketch follows the example below).
Arguments:

- `-n` or `--num_samples` (int): Number of samples (texts) to use in the comparison.
- `-c` (str): Path to an already trained task index classifier (pkl), as used in the example below.
- `-o` or `--output_path` (str): Path to save the resulting figure.
Example usage with an already trained task index classifier:

```bash
python -m practical_applications.new_early_exit -n 10 -c GPT2_top5_clf.pkl -o ee_compar.png
```
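For context, a common existing early-exit measure is a softmax-confidence threshold on an intermediate readout, sketched below with a logit lens. This is only a baseline illustration under assumed choices (threshold, prompt, small `gpt2` checkpoint); the new measure implemented in `new_early_exit.py` exits based on the task index classifier instead.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2Tokenizer.from_pretrained("gpt2")
threshold = 0.9  # placeholder confidence threshold

inputs = tok("The largest planet in the solar system is", return_tensors="pt")
with torch.no_grad():
    hs = model(**inputs, output_hidden_states=True).hidden_states

# Walk up the layers; exit as soon as the logit-lens prediction is confident.
for layer, h in enumerate(hs[1:], start=1):
    probs = torch.softmax(model.lm_head(model.transformer.ln_f(h[0, -1])), dim=-1)
    conf, tok_id = probs.max(dim=-1)
    if conf >= threshold:
        word = tok.decode(tok_id.item())
        print(f"exit at layer {layer}/{len(hs) - 1}: {word!r} (confidence {conf:.2f})")
        break
else:
    print("no early exit; falling back to the final layer's prediction")
```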
- To see how the saturation layer affects language modeling, use `practical_applications/better_language_modeling.py` (a logit-lens comparison sketch follows the example below).
Arguments:

- `-n` or `--num_samples` (int): Number of samples (texts) to use in the comparison.
- `-o` or `--output_path` (str): Path to save the results.

Example usage:
```bash
python -m practical_applications.better_language_modeling -n 20 -o lang_modeling.txt
```
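The sketch below illustrates the kind of comparison involved: reading out next-token predictions at an intermediate layer versus the final layer and comparing language-modeling loss. The logit-lens readout, the fixed mid-network layer, and the `gpt2` checkpoint are assumptions for illustration; see the script for the actual procedure.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2Tokenizer.from_pretrained("gpt2")

text = "Language models refine their predictions layer by layer."
ids = tok(text, return_tensors="pt").input_ids
with torch.no_grad():
    hs = model(ids, output_hidden_states=True).hidden_states

def lm_loss(hidden):
    # Logit-lens readout of next-token cross-entropy from a given layer.
    logits = model.lm_head(model.transformer.ln_f(hidden))
    return F.cross_entropy(logits[0, :-1], ids[0, 1:]).item()

for layer in (len(hs) // 2, len(hs) - 1):  # a mid layer vs. the final layer
    print(f"layer {layer}: cross-entropy {lm_loss(hs[layer]):.3f}")
```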
To cite our work, please use the following BibTeX entry: