Can't reproduce the result of the Bunny-based Phi-3 #142

Open
bollossom opened this issue Dec 14, 2024 · 1 comment

Comments

bollossom commented Dec 14, 2024

I fine-tuned with bunny_695k with the SigLIP vision tower unfrozen, but I only get 68.3% on ScienceQA.
Is this because the fine-tuning dataset is too small, so the vision tower weights should not be unfrozen? (A frozen-tower variant is sketched after the training script below.)

[Screenshot 截屏2024-12-14 01.31.15.png failed to upload; the results image is not available.]

Training script:

deepspeed bunny/train/train.py \
    --lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5 \
    --deepspeed ./script/deepspeed/zero3.json \
    --model_name_or_path ./LLaVA/llms/Phi_3_mini_4k \
    --model_type $MODEL_TYPE \
    --version phi3 \
    --data_path ./finetune/bunny_695k.json \
    --image_folder ./bunny/finetune/images \
    --vision_tower ./LLaVA/vision_tower/siglip_L_384 \
    --use_s2 True \
    --unfreeze_vision_tower True \
    --pretrain_mm_mlp_adapter ./checkpoints-pretrain/$PRETRAIN_DIR/mm_projector.bin \
    --mm_projector_type mlp2x_gelu \
    --image_aspect_ratio pad \
    --group_by_modality_length False \
    --bf16 True \
    --output_dir ./checkpoints-$MODEL_TYPE/$OUTPUT_DIR \
    --num_train_epochs 1 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 500 \
    --save_total_limit 1 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 4096 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --run_name bunny_phi3_finetune \
    --report_to wandb 2>&1 | tee ./checkpoints-$MODEL_TYPE/$OUTPUT_DIR/log.txt
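
If the concern in the question above is right and bunny_695k is too small to fine-tune the vision tower safely, the conservative variant keeps the tower frozen so that only the LoRA weights and the projector are trained. Only one flag changes relative to the script above; this is a sketch of that variant, not a confirmed fix:

    # Hypothetical variant of the command above: keep the vision tower frozen
    # so only the LoRA adapters and the mm projector receive gradients.
    --unfreeze_vision_tower False \
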
Isaachhh (Collaborator) commented:

Bunny uses SigLIP-SO (SO400M), while your script points --vision_tower at a SigLIP-L checkpoint.
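
If the vision tower is the mismatch, the fix would be to point --vision_tower at a SigLIP-SO400M checkpoint instead of SigLIP-L. The Hugging Face ID below is the public SigLIP-SO400M/14 384px model; whether this setup loads it by ID or expects a local copy of that checkpoint is an assumption:

    # Hypothetical replacement for the --vision_tower line in the script above
    --vision_tower google/siglip-so400m-patch14-384 \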

Besides, you may refer to the paper and the reported results for Bunny-v1.0-4B.

[image: reported Bunny-v1.0-4B results from the paper]
