[NPU] Expose parameter to control blob / IR save logic #12767

Merged on Feb 6, 2025 (5 commits)

@rnwang04 rnwang04 commented Feb 5, 2025

Description

1. Why the change?

https://github.com/analytics-zoo/nano/issues/1833#issuecomment-2635643123
https://github.com/intel-analytics/llm.cpp/pull/807

2. User API changes

Expose two parameters, keep_ir and compile_blob, for AutoModelForCausalLM.from_pretrained. keep_ir defaults to False and compile_blob defaults to True.
In the C++ convert.py script:

  • Default blob conversion: python convert.py --repo-id-or-model-path D:\llm-models\Llama-3.2-3B-Instruct --save-directory D:\Llama-3.2-3B-Instruct-npu-q4_0-blob
  • IR conversion: python convert.py --repo-id-or-model-path D:\llm-models\Llama-3.2-3B-Instruct --save-directory D:\Llama-3.2-3B-Instruct-npu-q4_0-ir --keep-ir --disable-compile-blob
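
The two command lines above suggest how the flags relate to the Python parameters: --keep-ir would turn keep_ir on, and --disable-compile-blob would turn compile_blob off. The following is a minimal sketch of that mapping, not the actual convert.py code. The flag names come from the commands above; the helper names build_parser and to_save_kwargs, and the use of argparse, are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in the PR description."""
    parser = argparse.ArgumentParser(description="Sketch of the convert.py save flags")
    parser.add_argument("--repo-id-or-model-path", required=True)
    parser.add_argument("--save-directory", required=True)
    # Both flags are off unless given, which yields keep_ir=False, compile_blob=True.
    parser.add_argument("--keep-ir", action="store_true")
    parser.add_argument("--disable-compile-blob", action="store_true")
    return parser


def to_save_kwargs(args: argparse.Namespace) -> dict:
    """Translate parsed flags into the two keyword arguments named in this PR."""
    return {
        "keep_ir": args.keep_ir,
        # The CLI flag is negative ("disable"), so it is inverted here.
        "compile_blob": not args.disable_compile_blob,
    }


# Default blob conversion: neither flag is passed.
blob_args = build_parser().parse_args([
    "--repo-id-or-model-path", r"D:\llm-models\Llama-3.2-3B-Instruct",
    "--save-directory", r"D:\Llama-3.2-3B-Instruct-npu-q4_0-blob",
])

# IR conversion: both flags are passed.
ir_args = build_parser().parse_args([
    "--repo-id-or-model-path", r"D:\llm-models\Llama-3.2-3B-Instruct",
    "--save-directory", r"D:\Llama-3.2-3B-Instruct-npu-q4_0-ir",
    "--keep-ir", "--disable-compile-blob",
])

blob_kwargs = to_save_kwargs(blob_args)
ir_kwargs = to_save_kwargs(ir_args)
```

Using a negative flag for compile_blob keeps the no-flag invocation identical to the library defaults, which matches the "default blob convert" command above.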

3. Summary of the change

  • Expose two parameters, keep_ir and compile_blob, for AutoModelForCausalLM.from_pretrained; keep_ir defaults to False and compile_blob defaults to True
  • Update the C++ convert.py script
  • Remove unnecessary bin files to save disk space
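
On the Python side, the two parameters from the summary above are passed as keyword arguments to from_pretrained. The sketch below only illustrates the call shape and the stated defaults. The wrapper load_for_npu and the stand-in class _RecordingModel are hypothetical, and the model class is passed in as an argument so that no import path is assumed. Any other arguments the real from_pretrained needs are not shown in this PR and are left to other_kwargs.

```python
from typing import Any


def load_for_npu(model_cls: Any, model_path: str, *, keep_ir: bool = False,
                 compile_blob: bool = True, **other_kwargs: Any) -> Any:
    """Forward the two save-control flags to model_cls.from_pretrained.

    model_cls is expected to be the NPU AutoModelForCausalLM class.
    The defaults mirror the ones stated in the PR description.
    """
    return model_cls.from_pretrained(
        model_path, keep_ir=keep_ir, compile_blob=compile_blob, **other_kwargs
    )


class _RecordingModel:
    """Stand-in for illustration only: returns the arguments it received."""

    @classmethod
    def from_pretrained(cls, model_path: str, **kwargs: Any) -> dict:
        return {"model_path": model_path, **kwargs}


# Default behaviour: compile the blob, do not keep the IR.
default_call = load_for_npu(_RecordingModel, "Llama-3.2-3B-Instruct")

# IR-only behaviour, the counterpart of --keep-ir --disable-compile-blob.
ir_call = load_for_npu(
    _RecordingModel, "Llama-3.2-3B-Instruct", keep_ir=True, compile_blob=False
)
```

With the real class, the equivalent direct call would be AutoModelForCausalLM.from_pretrained(model_path, keep_ir=True, compile_blob=False, ...), where the elided arguments depend on the model and are not specified here.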

4. How to test?

  • Unit test: Manually trigger the PR Validation here by entering the PR number (e.g., 1234), then paste the action link here once it has finished successfully.

@plusbang plusbang left a comment

LGTM

@rnwang04 rnwang04 merged commit 094a25b into intel:main Feb 6, 2025
1 check passed
@rnwang04 rnwang04 deleted the update_save_api branch February 6, 2025 02:07