When unfreeze_vision_tower is set to True, the model outputs the exact same answer no matter how the image changes #140

Open
Gary2018X opened this issue Dec 5, 2024 · 3 comments

@Gary2018X

With unfreeze_vision_tower set to False, the output is normal.
I would like to know how to solve this.
My train.sh is as follows:

#!/bin/bash

MODEL_TYPE=qwen1.5-1.8b
DATA_VERSION=s_all
PRETRAIN_DIR=bunny-$MODEL_TYPE-pretrain
OUTPUT_DIR=bunny-lora-juzao-$DATA_VERSION-7k-uvit-$MODEL_TYPE

mkdir -p ./checkpoints-$MODEL_TYPE/$OUTPUT_DIR

deepspeed bunny/train/train.py \
    --lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5 \
    --deepspeed ./script/deepspeed/zero3.json \
    --model_name_or_path models/Qwen1.5-1.8B \
    --model_type $MODEL_TYPE \
    --version bunny \
    --data_path data/$DATA_VERSION/Bunny_all_filtered.json \
    --image_folder / \
    --vision_tower models/siglip-so400m-patch14-384 \
    --pretrain_mm_mlp_adapter models/bunny-pretrain-qwen1.5-1.8b-siglip/mm_projector.bin \
    --mm_projector_type mlp2x_gelu \
    --image_aspect_ratio pad \
    --group_by_modality_length False \
    --bf16 True \
    --output_dir ./checkpoints-$MODEL_TYPE/$OUTPUT_DIR \
    --num_train_epochs 3 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 500 \
    --save_total_limit 1 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --unfreeze_vision_tower True \
    --report_to none 2>&1 | tee ./checkpoints-$MODEL_TYPE/$OUTPUT_DIR/log.txt
@Isaachhh
Collaborator

Isaachhh commented Dec 5, 2024

What is the "exact same" answer? Maybe overfitting?
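
One way to answer this concretely is to run inference on several different images, save one answer per line, and count how often each answer appears. The sketch below assumes a hypothetical file `answers.txt` holding those answers; the function name `count_distinct_answers` is illustrative and not part of the Bunny code.

```shell
#!/bin/bash
# Hypothetical helper: summarize the answers the model gave for a set of images.
# Assumes the given file holds one model answer per line (one line per image).
count_distinct_answers() {
    local answers_file="$1"
    # Group identical lines, then list the most frequent answers first.
    sort "$answers_file" | uniq -c | sort -rn
}

# Usage (answers.txt is an assumed file name):
#   count_distinct_answers answers.txt
```

If a single line accounts for every image, the output does not depend on the image at all. If the counts are spread over several answers, the behavior looks more like ordinary overfitting to frequent training answers.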

@Gary2018X
Author

The answers in my training data are diverse.
I have also trained several times.
I found that the output is one of my training answers, seemingly picked at random.
My previous code ran this training script without problems.
The problem appeared after I updated to the latest code.

@Isaachhh
Collaborator

Isaachhh commented Dec 6, 2024

The recent commit 08273ac was made in response to #130.
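
Since the older code reportedly trained normally, one way to test whether this commit is what changed the behavior is to retrain from the revision just before it. The sketch below assumes the working directory is a local clone of the repository and that the abbreviated hash `08273ac` resolves there; the function name `checkout_parent_of` is illustrative.

```shell
#!/bin/bash
# Hypothetical helper: move the working tree to the parent of a given commit,
# so the same train.sh can be rerun on the code as it was before that commit.
checkout_parent_of() {
    local commit="$1"
    # Show which commit we are stepping behind; fail early if it does not resolve.
    git log -1 --oneline "$commit" || return 1
    # "<commit>~1" names the first parent of <commit>; --detach avoids moving a branch.
    git checkout --detach "${commit}~1"
}

# Usage (inside a clone of the repository):
#   checkout_parent_of 08273ac
```

Running `git checkout -` afterwards returns to the previously checked-out branch. If training with unfreeze_vision_tower set to True behaves normally at the parent revision, that would point to the commit as the source of the change.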
