Transformer migration #334
Closed
qcdipankar wants to merge 15 commits into quic:tf_upgrade_4.50 from qcdipankar:transformer_migration
+699 −213
Conversation
Force-pushed from d194e98 to 492ef1f
Fixed compilation and enabled MXFP6 for the vision encoder. Signed-off-by: Amit Raj <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
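For reference, MXFP6 weight compression is toggled at compile time; a minimal sketch, assuming the `mxfp6_matmul` flag shown in the qconfig dump later in this thread (the model and remaining arguments are illustrative):

```python
from QEfficient import QEFFAutoModelForCausalLM

# Illustrative only: compile with MXFP6 MatMul weight compression enabled.
model = QEFFAutoModelForCausalLM.from_pretrained("gpt2")
model.compile(num_cores=16, mxfp6_matmul=True)
```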
Signed-off-by: Mohit Soni <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
Removing the onnx_defer_loading flag, which was originally removed in _[Removed onnx_defer_loading from Immutable Convertor Args. PR: 230]_ but got added back later in _[Mllama(single + dual) + InternVL(single) + Llava (single) PR: 267]_, possibly because of rebasing. Signed-off-by: Shubham Agrawal <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
This will create a config JSON file, which contains all the details about compilation and SDK versions. Currently, this code is added in the code block of QEFFAutoModelForCausalLM.compile. The config would look like the following:

```
{
  "huggingface_config": {
    "vocab_size": 50257,
    "n_positions": 1024,
    "n_embd": 768,
    "n_layer": 12,
    "n_head": 12,
    "n_inner": null,
    "activation_function": "gelu_new",
    "resid_pdrop": 0.1,
    "embd_pdrop": 0.1,
    "attn_pdrop": 0.1,
    "layer_norm_epsilon": 1e-05,
    "initializer_range": 0.02,
    "summary_type": "cls_index",
    "summary_use_proj": true,
    "summary_activation": null,
    "summary_first_dropout": 0.1,
    "summary_proj_to_labels": true,
    "scale_attn_weights": true,
    "use_cache": true,
    "scale_attn_by_inverse_layer_idx": false,
    "reorder_and_upcast_attn": false,
    "bos_token_id": 50256,
    "eos_token_id": 50256,
    "return_dict": true,
    "output_hidden_states": false,
    "output_attentions": false,
    "torchscript": false,
    "torch_dtype": null,
    "use_bfloat16": false,
    "tf_legacy_loss": false,
    "pruned_heads": {},
    "tie_word_embeddings": true,
    "chunk_size_feed_forward": 0,
    "is_encoder_decoder": false,
    "is_decoder": false,
    "cross_attention_hidden_size": null,
    "add_cross_attention": false,
    "tie_encoder_decoder": false,
    "max_length": 20,
    "min_length": 0,
    "do_sample": false,
    "early_stopping": false,
    "num_beams": 1,
    "num_beam_groups": 1,
    "diversity_penalty": 0.0,
    "temperature": 1.0,
    "top_k": 50,
    "top_p": 1.0,
    "typical_p": 1.0,
    "repetition_penalty": 1.0,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 0,
    "encoder_no_repeat_ngram_size": 0,
    "bad_words_ids": null,
    "num_return_sequences": 1,
    "output_scores": false,
    "return_dict_in_generate": false,
    "forced_bos_token_id": null,
    "forced_eos_token_id": null,
    "remove_invalid_values": false,
    "exponential_decay_length_penalty": null,
    "suppress_tokens": null,
    "begin_suppress_tokens": null,
    "architectures": ["GPT2LMHeadModel"],
    "finetuning_task": null,
    "id2label": {"0": "LABEL_0", "1": "LABEL_1"},
    "label2id": {"LABEL_0": 0, "LABEL_1": 1},
    "tokenizer_class": null,
    "prefix": null,
    "pad_token_id": null,
    "sep_token_id": null,
    "decoder_start_token_id": null,
    "task_specific_params": {"text-generation": {"do_sample": true, "max_length": 50}},
    "problem_type": null,
    "_name_or_path": "gpt2",
    "_commit_hash": "607a30d783dfa663caf39e06633721c8d4cfcd7e",
    "_attn_implementation_internal": "eager",
    "transformers_version": null,
    "model_type": "gpt2",
    "n_ctx": 1024
  },
  "qpc_config": {
    "QEff_config": {
      "pytorch_transforms": [
        "AwqToMatmulNbitsTransform",
        "GPTQToMatmulNbitsTransform",
        "CustomOpsTransform",
        "KVCacheTransform"
      ],
      "onnx_transforms": ["FP16ClipTransform", "SplitTensorsTransform"],
      "onnx_path": "/root/.cache/qeff_models/GPT2LMHeadModel-36f0eca92731bb47/GPT2LMHeadModel.onnx"
    },
    "aic_compiler_config": {
      "apps_sdk_version": "1.20.0",
      "compile_dir": "/root/.cache/qeff_models/GPT2LMHeadModel-36f0eca92731bb47",
      "specializtions_file_path": "/root/.cache/qeff_models/GPT2LMHeadModel-36f0eca92731bb47/specializations.json",
      "prefill_seq_len": 32,
      "ctx_len": 128,
      "batch_size": 1,
      "full_batch_size": null,
      "num_devices": 1,
      "num_cores": 16,
      "mxfp6_matmul": false,
      "mxint8_kv_cache": false,
      "num_speculative_tokens": null
    },
    "qnn_config": {
      "enable_qnn": true,
      "qnn_config_path": "QEfficient/compile/qnn_config.json",
      "product": "QAIRT",
      "os": {"Ubuntu": 22.04, "Windows": 11},
      "sdk_flavor": ["aic"],
      "version": "2.31.0",
      "build_id": "250109072054_3882",
      "qnn_backend_api_version": "2.18.0",
      "tensorflow": "2.10.1",
      "tflite": "2.3.0",
      "torch": "1.13.1",
      "onnx": "1.16.1",
      "onnxruntime": "1.17.1",
      "onnxsimplifier": "0.4.36",
      "android-ndk": "r26c",
      "platform": "AIC.1.20.0.14"
    }
  }
}
```

Note: The code structure may change.

Signed-off-by: Abukhoyer Shaik <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
… validation page (quic#303) Signed-off-by: Abukhoyer Shaik <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
This is just a small fix for printing a `QEFFAutoModelForCausalLM` instance, done by changing the `__repr__(self)` method. Signed-off-by: Abukhoyer Shaik <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
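Illustratively, a `__repr__` along these lines (a sketch, not the PR's exact implementation):

```python
class QEFFAutoModelForCausalLM:
    def __init__(self, model):
        self.model = model  # the wrapped, transformed torch.nn.Module

    def __repr__(self) -> str:
        # Show the class name plus the wrapped module's own repr, so that
        # printing an instance reveals the transformed model architecture.
        return f"{self.__class__.__name__}\n{repr(self.model)}"
```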
… models (quic#286) Minor fixes to generate and compile to be more consistent with how other models are called. Signed-off-by: Kushal Dulla <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
…re computed (quic#233) 1) Adds support for resuming fine-tuning from the checkpoints of a previous run that stopped partway through. 2) With these changes, checkpoints, both intermediate and per completed epoch, are saved for each epoch. 3) There is no need to pass tokenizer_name when a model_name is passed; it defaults to the same value as model_name. If a tokenizer_name different from the model_name is required, it can be passed separately as a command-line argument. Signed-off-by: Swati Allabadi <[email protected]> Co-authored-by: Swati Allabadi <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
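A minimal sketch of the resume pattern, assuming plain PyTorch checkpoints (the directory layout, file naming, and state-dict keys are illustrative, not the PR's exact scheme):

```python
import os
import torch

def maybe_resume(model, optimizer, ckpt_dir: str) -> int:
    # Returns the epoch to start training from (0 if no checkpoint exists).
    if not os.path.isdir(ckpt_dir):
        return 0
    # Assumes zero-padded epoch numbers in filenames so lexical sort works.
    ckpts = sorted(f for f in os.listdir(ckpt_dir) if f.endswith(".pt"))
    if not ckpts:
        return 0
    state = torch.load(os.path.join(ckpt_dir, ckpts[-1]), map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["epoch"] + 1  # resume after the last completed epoch
```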
BUGFIX: added a patch for InternVL so that the 0th dim of vit_embeds is dynamic, based on num_patches. Signed-off-by: quic-dhirajku <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
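The underlying mechanism is ONNX dynamic axes at export time; a hedged sketch with a stand-in module (names and shapes are illustrative, not the real InternVL encoder):

```python
import torch

class TinyVisionEncoder(torch.nn.Module):
    # Stand-in for the real ViT: (num_patches, 3, 14, 14) -> (num_patches, 8)
    def forward(self, pixel_values):
        return pixel_values.flatten(1)[:, :8]

# Marking dim 0 as dynamic lets the exported graph accept a varying
# number of image patches at run time, which is the fix described above.
torch.onnx.export(
    TinyVisionEncoder(),
    (torch.randn(4, 3, 14, 14),),
    "vision_encoder.onnx",
    input_names=["pixel_values"],
    output_names=["vit_embeds"],
    dynamic_axes={"pixel_values": {0: "num_patches"}, "vit_embeds": {0: "num_patches"}},
)
```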
Made a few changes in the modeling files of both models so that this method now works appropriately. Signed-off-by: quic-dhirajku <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
1. When fine-tuning on QAIC, the torch_qaic GradScaler will be used. 2. Moving back to lora_dropout = 0.05 at the ML Framework team's request. Signed-off-by: Swati Allabadi <[email protected]> Co-authored-by: Swati Allabadi <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
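A sketch of the device-dependent scaler selection; the torch_qaic import path below is an assumption based on the commit text, not a verified API:

```python
import torch

try:
    # On Qualcomm Cloud AI 100, prefer the QAIC-aware grad scaler
    # (import path assumed from the commit description).
    from torch_qaic.amp import GradScaler
    scaler = GradScaler()
except ImportError:
    # Elsewhere, fall back to the stock CUDA scaler.
    scaler = torch.cuda.amp.GradScaler()
```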
The absence of CustomRMSNorm caused GraniteCausalLM to fail on AIC with the full model on transformers 4.46.3. Adding CustomRMSNormAIC fixes the issue. Signed-off-by: Dipankar Sarkar <[email protected]> Co-authored-by: Dipankar Sarkar <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
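For reference, RMSNorm computes x · w / sqrt(mean(x²) + ε); a minimal sketch of the computation a CustomRMSNorm-style op replaces (not the AIC custom-op implementation itself):

```python
import torch

class RMSNorm(torch.nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Normalize by the root mean square over the feature dim, then scale.
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states
```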
Signed-off-by: Rishin Raj <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
The absence of CustomRMSNorm caused GraniteCausalLM to fail on AIC with the full model on transformers 4.46.3. Adding CustomRMSNormAIC fixes the issue. Signed-off-by: Dipankar Sarkar <[email protected]>
1. Changes to the libraries used during context binary generation. 2. Changed the "convertor" spelling to "converter" to align with the qairt-converter string. Signed-off-by: Shubham Agrawal <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]>
Force-pushed from 14fb609 to 8b72e27
Please rebase.
Force-pushed from 917a501 to fc89e8b
Force-pushed from e7f885e to 99a2063
Force-pushed from 6eff2ef to e4b1cb7
Enabled Models for transformers==4.50.0
Models Enabled:
1. GPT2
2. GPTJ
3. Phi
4. Phi3
5. Granite
6. Whisper
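A typical end-to-end flow with one of the models above (a sketch; the compile arguments mirror the qconfig fields shown earlier in this thread and are illustrative):

```python
from transformers import AutoTokenizer
from QEfficient import QEFFAutoModelForCausalLM

model_name = "gpt2"  # any of the enabled models listed above
tokenizer = AutoTokenizer.from_pretrained(model_name)

model = QEFFAutoModelForCausalLM.from_pretrained(model_name)
model.compile(prefill_seq_len=32, ctx_len=128, num_cores=16)  # builds the QPC
model.generate(prompts=["Hello, my name is"], tokenizer=tokenizer)
```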