[BUG] Exceeding 63 character limit for Volume names in Flyte Propeller Container Plugin #4824
Comments
Thank you for opening your first issue here! 🛠
🤖 Hey @alexbeach-bc, great to see you back here! Hope you're doing well.
Based on the information available in the Flyte repository, it's not clear how volume names are generated in the Flyte Propeller Container Plugin. The error message you're seeing suggests that the volume name is exceeding the Kubernetes limit of 63 characters, but it's not clear from the provided context why this is happening.
The registry name you're using does not appear to play a direct role in the generation of volume names, based on the information available. The volume name is generated by the template "flyte-sandbox.persistence.minioVolumeName" as seen in charts/flyte-sandbox/templates/storage/minio/pvc.yaml. However, without the content of this template, it's not possible to confirm this.
It's possible that there's a bug or a misconfiguration causing this issue, but without more information, it's hard to say for sure. If you could provide more details about your configuration and the exact steps you're taking, it might be possible to identify the cause of this issue.
@alexbeach-bc apologies for just now getting to this issue. I reproduced the invalid volume name error by running a task with a PodSpec whose volume name is longer than 63 characters. The latest version of Flyte properly handles the error, so I don't believe there was a regression with launching pods. This is sufficient to close the concern on propeller's end. I will follow up on this issue to see how the volume name was generated. What Flyte version are you running?
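For anyone who wants to check a name before submitting a PodSpec, here is a minimal sketch of the rule Kubernetes applies to volume names: they must be DNS-1123 labels (at most 63 characters, lowercase alphanumerics or `-`, starting and ending with an alphanumeric). The function name `volume_name_problems` is made up for illustration; it is not part of Flyte or the Kubernetes client.

```python
import re
from typing import List

# DNS-1123 label pattern: lowercase alphanumerics or '-',
# starting and ending with an alphanumeric character.
_DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LEN = 63


def volume_name_problems(name: str) -> List[str]:
    """Return a list of reasons why `name` is not a valid volume name.

    Hypothetical helper for illustration only; an empty list means
    the name passes both checks.
    """
    problems = []
    if len(name) > MAX_LABEL_LEN:
        problems.append(f"length {len(name)} exceeds {MAX_LABEL_LEN}")
    if not _DNS1123_LABEL.fullmatch(name):
        problems.append("not a valid DNS-1123 label")
    return problems
```

Running this against the volume names in a PodSpec before launching the task should flag the same names that the propeller log complains about.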
Closing after discussion offline.
Encountered this again. Any suggestions?
Describe the bug
I am running this tutorial:
https://github.com/unionai-oss/llm-fine-tuning/tree/main/flyte_llama
I am running the command in the readme:
pyflyte -c $FLYTECTL_CONFIG run --remote \
    --copy-all \
    flyte_llama/workflows.py train_workflow \
    --config config/flyte_llama_7b_qlora_v0.json
The only code modification is that I am using a GCP registry and a self-hosted Flyte cluster in GKE. The cluster contains node pools with the GPU, memory, and CPU resources required to run the tutorial.
Note that the registry name is very long because of how GCP constructs the URLs.
The first task succeeds, but the second task is stuck in the Queued state:
After looking at the flyte-propeller k8s logs, I see this error.
How are the volume names generated? Could it be because the long registry name is used in the volume name?
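This thread does not establish how the volume name is actually generated, so the following is only a sketch of one generic way a long string (such as an image URL) could be turned into a valid name: sanitize it, truncate it, and append a short hash so distinct inputs stay distinct. The function `shorten_to_label` and the sample image URL in the usage note are hypothetical and are not Flyte's implementation.

```python
import hashlib
import re

MAX_LABEL_LEN = 63  # Kubernetes limit for DNS-1123 labels


def shorten_to_label(raw: str, max_len: int = MAX_LABEL_LEN) -> str:
    """Derive a DNS-1123 label of at most `max_len` characters from `raw`.

    Hypothetical helper for illustration; not how Flyte builds names.
    """
    # Lowercase, then collapse every run of disallowed characters into '-'.
    cleaned = re.sub(r"[^a-z0-9-]+", "-", raw.lower()).strip("-")
    if not cleaned:
        raise ValueError("no usable characters in name")
    if len(cleaned) <= max_len:
        return cleaned
    # Keep a readable prefix and add a hash of the full input for uniqueness.
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:8]
    prefix = cleaned[: max_len - len(digest) - 1].rstrip("-")
    return f"{prefix}-{digest}"
```

For example, calling `shorten_to_label("us-central1-docker.pkg.dev/my-project/my-repo/flyte-llama-image-with-a-very-long-name:latest")` yields a name that fits within 63 characters, while a short name such as `"data"` is returned unchanged.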
Expected behavior
I would expect valid Volume names to be generated and the tutorial task to complete.
Additional context to reproduce
No response
Screenshots
No response