inference with LLM and vision frozen #400

Open
simoneriggi opened this issue Jan 22, 2025 · 0 comments
Dear all,
I have fine-tuned LLaVA-OneVision models (0.5B and 7B) with the LLM and vision components frozen. The checkpoint output directory contains these files:

runs
checkpoint-1000     
...
...
checkpoint-22000
trainer_state.json
mm_projector.bin
config.json

Loading the trained model with load_pretrained_model(model_path, model_base=None, model_name='llava_qwen', device_map='auto') fails because no tokenizer files are found in model_path. When I copy the tokenizer files (tokenizer_config.json, tokenizer.json) from the base model, loading fails with this error: OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory.
I had a look at the load_pretrained_model function in builder.py. It seems I should set model_base to the base model (e.g. lmms-lab/llava-onevision-qwen2-0.5b-ov) rather than to None. It also seems that the logic for loading a Qwen-based model from a model_base is missing from that function.
I tried to add this code:

elif "qwen" in model_name.lower():
    from llava.model.language_model.llava_qwen import LlavaQwenConfig, LlavaQwenForCausalLM
            	
    tokenizer = AutoTokenizer.from_pretrained(model_base, use_fast=False)
    if overwrite_config is not None:
        llava_cfg = LlavaQwenConfig.from_pretrained(model_path)
        rank0_print(f"Overwriting config with {overwrite_config}")
        for k, v in overwrite_config.items():
             setattr(llava_cfg, k, v)
        model = LlavaQwenForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, attn_implementation=attn_implementation, config=llava_cfg, **kwargs)
    else:
        model = LlavaQwenForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, attn_implementation=attn_implementation, **kwargs)

right after:

elif model_base is not None:
    ...
    ...
    elif (
        "wizardlm-2" in model_name.lower()
        and "vicuna" in model_name.lower()
        or "llama" in model_name.lower()
        or "yi" in model_name.lower()
        or "nous-hermes" in model_name.lower()
        or "llava-v1.6-34b" in model_name.lower()
        or "llava-v1.5" in model_name.lower()
    ):
        ....
        ....
        model = LlavaLlamaForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, config=llava_cfg, **kwargs)

   [ADD CODE HERE]

I managed to load the model with this fix. Can you please confirm whether this is correct, or whether I am doing something wrong?
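
For reference, this is roughly how I call the loader after the change (the model_path value is illustrative):

from llava.model.builder import load_pretrained_model

# Fine-tuned output directory (contains config.json and mm_projector.bin) and the
# base model the frozen LLM/vision weights should come from.
model_path = "output_dir"  # illustrative
model_base = "lmms-lab/llava-onevision-qwen2-0.5b-ov"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, model_base=model_base, model_name="llava_qwen", device_map="auto"
)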
Thanks a lot for your help.
