The code has been verified on Python 3.8.19.
$ conda create -n icu python=3.8
$ conda activate icu
# Install the correct torch version depending on CUDA version from https://pytorch.org/
$ conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
$ pip install -r requirements.txt
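After installing, you can optionally confirm that PyTorch sees your GPU before launching any training job (this sanity check is not part of the original setup instructions):
# optional: check the installed torch version and CUDA visibility
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"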
Training scripts can be found under the scripts directory.
With one GPU ("cuda:0"), the command below starts training of GPT-NEO-125m.
# train-125m.sh
$ python ./train_neo_meanwhile_update.py \
--exp "exp0" --model_name "EleutherAI/gpt-neo-125m" \
--tokenizer_name "EleutherAI/gpt-neo-125m" \
--gpt2_name "openai-community/gpt2" \
--bert_name "google-bert/bert-base-uncased" \
--prefix_length 200 --suffix_length 200 --target_length 200 \
--device "cuda:0" --batch_size 8 --num_workers 8 --lr 5e-6 \
--uw 1.0 --lw 0.5 --kl 1.0 --f1 0.3 --bleu 0.01 --acc 0.5994 \
--el 0.0499 --dir "result/test"
With three GPUs (0, 1, 2), the command below starts training of GPT-NEO-1.3B.
# train-1_3b.sh
$ deepspeed --include localhost:0,1,2 \
./train_neo_meanwhile_update_deepspeed.py \
--deepspeed_config ./config/deepspeed3.json \
--exp "exp0" --model_name "EleutherAI/gpt-neo-1.3B" \
--tokenizer_name "EleutherAI/gpt-neo-1.3B" \
--gpt2_name "openai-community/gpt2" \
--bert_name "google-bert/bert-base-uncased" \
--prefix_length 200 --suffix_length 200 --target_length 200 \
--batch_size 4 --num_workers 8 --lr 5e-6 \
--uw 1.0 --lw 0.5 --kl 1.0 --f1 0.3 --bleu 0.01 --acc 0.5994 \
--el 0.0499 --dir "result/test"
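The DeepSpeed config files referenced above (config/deepspeed3.json and config/deepspeed6.json) are not reproduced in this README. The snippet below is only a minimal sketch of what such a config typically contains (per-GPU micro batch size, gradient accumulation, fp16, ZeRO stage); the actual keys and values in the repository may differ, so check the files themselves.
# illustrative only; inspect config/deepspeed3.json for the real settings
$ cat ./config/deepspeed3.json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 1,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}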
With six GPUs (0, 1, 2, 3, 4, 5), the command below starts training of GPT-NEO-2.7B.
# train-2_7b.sh
$ deepspeed --include localhost:0,1,2,3,4,5 \
./train_neo_meanwhile_update_deepspeed.py \
--deepspeed_config ./config/deepspeed6.json \
--exp "exp0" --model_name "EleutherAI/gpt-neo-2.7B" \
--tokenizer_name "EleutherAI/gpt-neo-2.7B" \
--gpt2_name "openai-community/gpt2" \
--bert_name "google-bert/bert-base-uncased" \
--prefix_length 200 --suffix_length 200 --target_length 200 \
--batch_size 4 --num_workers 8 --lr 5e-6 \
--uw 1.0 --lw 0.5 --kl 1.0 --f1 0.3 --bleu 0.01 --acc 0.5994 \
--el 0.0499 --dir "result/test1"
You can test the trained model on the downstream tasks using the command below.
# valid.sh
$ python ./valid.py \
--model_name "./result/test/EleutherAI/gpt-neo-125m_exp0_lr5e-06_uw1.0_lw0.5_kl1.0_epoch19_updateboth" \
--tokenizer_name "EleutherAI/gpt-neo-125m" \
--prefix_length 512 --suffix_length 512 --device "cuda:0" \
--batch_size 32 --num_workers 48 \
--dir "./result/test" --cache "./.cache"
The original model can be evaluated using the command below.
# eval.sh
$ python ./eval.py --exp "all" \
--model_name "EleutherAI/gpt-neo-125m" \
--tokenizer_name "EleutherAI/gpt-neo-125m" \
--gpt2_name "openai-community/gpt2" \
--bert_name "google-bert/bert-base-uncased" \
--prefix_length 200 --suffix_length 200 --target_length 200 \
--device "cuda:0" --batch_size 8 --num_workers 8 \
--dir "./result/test"
The related code is in the evaluation directory. test.ipynb is more convenient than api.py.
- Fill in your GPT-4 API key in the code.
- Use convert.py to convert the result files produced in the Evaluating step above. (Rearranging the files according to the code may be necessary.)
- Run the code inside the evaluation directory (see the sketch below).
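The exact invocation of these scripts is not documented here, so the following is only a sketch under the assumption that convert.py and api.py run without extra arguments; check the scripts themselves for options.
# run from inside the evaluation directory (argument-free invocation is an assumption)
$ cd evaluation
$ python convert.py   # rearrange earlier results into the expected layout
$ python api.py       # query GPT-4; test.ipynb does the same interactively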
The files should be rearranged into the tree structure below:
evaluation
│   125mneo.csv              # the results of gpt-neo-125m on all
│   125mopt.csv              # the results of opt-125m on all
│   api.py
│   convert.py
│   lm_extraction_128_0.csv
│   prompt.py
│   test.ipynb
│
├───125m-0
│       and.csv              # the results of KUMPR on 0
│       neo.csv              # generated
│       opt.csv              # generated
│       ours.csv             # our results
│       results.json         # generated
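As a hedged sketch, the per-experiment folder can be assembled as follows; the source paths are placeholders and should point to wherever your result CSVs actually live (neo.csv, opt.csv, and results.json are generated later by the evaluation code):
# placeholders: replace <your-results-dir> with the real location of your CSVs
$ mkdir -p evaluation/125m-0
$ cp <your-results-dir>/ours.csv evaluation/125m-0/
$ cp <your-results-dir>/and.csv evaluation/125m-0/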
The target data can be downloaded from this link.
The validation datasets we used are open source and can be downloaded from their original releases.
First, place train_dataset.npy under the datasets directory. Then run data_prep.py; this completes the data conversion and the KNN sampling process.
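For example (the argument-free call to data_prep.py is an assumption; check the script for options):
# place the raw data, then run the preparation script
$ mkdir -p datasets
$ mv train_dataset.npy datasets/
$ python ./data_prep.py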
The data used in the 5 runs in our paper is under the directory datasets/exp/exp{0/1/2/3/4}, respectively.
Our codebase is based on the following repo. Thanks for open-sourcing!