When trying to create a Task Bundle using a TFLite file, I'm not allowed to enter the stop token of the model #5715

Open
Arya-Hari opened this issue Nov 3, 2024 · 8 comments
Assignees
Labels
platform:python MediaPipe Python issues task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup type:bug Bug in the Source Code of MediaPipe Solution

Comments

@Arya-Hari

Arya-Hari commented Nov 3, 2024

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

Linux Ubuntu 16.04

Mobile device if the issue happens on mobile device

No response

Browser and version if the issue happens on browser

No response

Programming Language and version

Python

MediaPipe version

No response

Bazel version

No response

Solution

LLM Inference

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

I created a .tflite file for the Llama 3.2 1B model using ai-edge-torch and am now trying to deploy it for inference on the edge. When creating the task bundle, I am asked for the stop token. When I provide "<|end_of_text|>", the bundler is not able to resolve it. I had previously converted the tokenizer to the SentencePiece format using the code given in the ai-edge-torch repository.

Describe the expected behaviour

The task bundle should be created without errors.

Standalone code/steps you may have used to try to get what you need

I manually checked the tokens the model can identify by inspecting its vocabulary, and "<|end_of_text|>" is a token in its vocab.

I also tried changing the stop token, and the task bundle was created. However, when using that bundle for deployment, I got a "Failed to initialize engine : modelError building tflite model" error. Also, just as a side question: can the .task file that is created be used interchangeably with the .bin file that is given as the model path in the repository examples?

Other info / Complete Logs

No response

@Arya-Hari Arya-Hari added the type:bug Bug in the Source Code of MediaPipe Solution label Nov 3, 2024
@kuaashish kuaashish added os:linux-non-arm Issues on linux distributions which run on x86-64 architecture. DOES NOT include ARM devices. python Pull requests that update Python code task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup and removed os:linux-non-arm Issues on linux distributions which run on x86-64 architecture. DOES NOT include ARM devices. labels Nov 4, 2024
@kuaashish
Collaborator

Hi @Arya-Hari,

Could you please share the complete example you are using from our documentation? Additionally, if you have any error logs, sharing them would help us better understand the issue.

Thank you!!

@kuaashish kuaashish added the stat:awaiting response Waiting for user response label Nov 4, 2024
@Arya-Hari
Author

Hello @kuaashish,

So I converted the tokenizer to the SentencePiece-compatible format using the code given in the ai-edge-torch repository. This generated a llama3.spm.model file.

Then I ran this script:

import sentencepiece as spm

# Load the SentencePiece model
sp = spm.SentencePieceProcessor()
sp.load("/content/llama3.spm.model")

# Check special tokens or tokens that might indicate sequence ends
print("End token ID:", sp.eos_id())  # Check if the model has a predefined EOS token ID
print("Start token ID:", sp.bos_id())  # BOS may also indicate a start-of-sequence token

vocab_size = sp.get_piece_size()
for i in range(vocab_size):
    print(f"ID {i}: {sp.id_to_piece(i)}")

This printed 128255 tokens along with their IDs. The token with ID 128001 was <|end_of_text|>. According to the official config files for Llama 3.2 1B, this is the stop token, and <|begin_of_text|> is the start token.

When running this piece of code, as given in llm_bundling.ipynb:

from mediapipe.tasks.python.genai import bundler  # provides BundleConfig and create_bundle used below

tflite_model="/content/gemma_2b_quantized.tflite" # @param {type:"string"}
tokenizer_model="/content/llama3.spm.model" # @param {type:"string"}
start_token="<|begin_of_text|>" # @param {type:"string"}
stop_token="<|end_of_text|>" # @param {type:"string"}
output_filename="/content/llama.task" # @param {type:"string"}
enable_bytes_to_unicode_mapping=False # @param ["False", "True"] {type:"raw"}

config = bundler.BundleConfig(
    tflite_model=tflite_model,
    tokenizer_model=tokenizer_model,
    start_token=start_token,
    stop_tokens=[stop_token],
    output_filename=output_filename,
    enable_bytes_to_unicode_mapping=enable_bytes_to_unicode_mapping,
)
bundler.create_bundle(config)

I get this error: ValueError: Failed to encode stop token <|end_of_text|> with tokenizer. When I try any other valid token from the list of 128255 tokens, the code executes properly and generates a .task file. This is the first issue.

Secondly, when pushing the model onto the device, the documentation requires that a .bin file be pushed, and I did not understand how to generate the .bin file after creating the Task Bundle.

Your help is much appreciated. Thank you!


@Arya-Hari Arya-Hari reopened this Nov 5, 2024
@Arya-Hari
Author

@kuaashish Hello, is there a way to resolve this?

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Nov 7, 2024
@talumbau
Contributor

Thanks for all of the detail provided. Two quick items:

  1. re: .task vs. .bin: Yes, you can use the .task file wherever you would use a .bin file. The .task extension indicates that the file is a "converted TF Lite model + metadata/tokenizer" (a quick way to peek inside the bundle is sketched at the end of this comment).
  2. I noticed in your provided script that you have this line:
tflite_model="/content/gemma_2b_quantized.tflite" # @param {type:"string"}

Is this just a copy/paste error? I assumed you would have something like llama3_1_1b_quantized.tflite, not a Gemma model.
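
On the first point, if you want to see what the bundle contains, a .task file can typically be opened as an ordinary zip archive. The snippet below is only an illustrative sketch: it assumes the bundle is zip-packaged, reuses the output_filename from the bundling script above, and the exact entry names may differ.

import zipfile

# Assumption: the .task bundle is a plain zip archive; listing its entries
# shows the packed TF Lite model, tokenizer, and metadata. The path is the
# output_filename used in the bundling script earlier in this thread.
with zipfile.ZipFile("/content/llama.task") as task_bundle:
    for entry in task_bundle.infolist():
        print(entry.filename, entry.file_size)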

@Arya-Hari
Author

Hi @talumbau. To clarify, I used the quantization script provided in the ai-edge-torch repository to quantize the model and convert it to the TFLite format. By default, that script saves the output file under the name gemma_2b_quantized.tflite, and I forgot to change it before using it. I changed everything else in the script to work for Llama 3.2 1B instead. Sorry for the confusion.

@hheydary
Contributor

Hello @Arya-Hari,
Thank you for reporting this issue. The task bundler code has been updated at HEAD to allow the end-of-text token to be the same as the unknown token. Please pull the latest changes and create the task bundle again.
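
For anyone hitting the same error, a quick way to check whether this is what is happening (the stop token resolving to the unknown-token ID) is a short SentencePiece probe like the sketch below. It is illustrative only: it reuses the llama3.spm.model path from the earlier comment, and the exact IDs will depend on your tokenizer export.

import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.load("/content/llama3.spm.model")  # path from the earlier comment

stop_token = "<|end_of_text|>"

# If the piece (or its encoding) resolves to the unknown-token ID, this appears
# to be the case the bundler previously rejected with
# "Failed to encode stop token ... with tokenizer".
print("unk id:      ", sp.unk_id())
print("piece_to_id: ", sp.piece_to_id(stop_token))
print("encoded ids: ", sp.encode(stop_token, out_type=int))
print("encoded str: ", sp.encode(stop_token, out_type=str))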

@Arya-Hari
Author

Okay thank you

@kuaashish kuaashish added platform:python MediaPipe Python issues and removed python Pull requests that update Python code labels Nov 20, 2024