display available cached versions in TGI server error message #776

Open · wants to merge 1 commit into main

Conversation

jimburtoft (Contributor)

If a model is cached with a different configuration, I want to display alternative options to the user.

If someone copies the deploy code from Hugging Face and changes something (e.g. the sequence length), it is not obvious from this code why it isn't working, especially if they don't understand compilation, because they are referencing the original model.

Based on a true story!

I also added some carriage returns to make the message more readable.
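
For context, here is a minimal sketch of the kind of error message this change produces. The module path and the existence of get_hub_cached_entries come from the traceback below; the one-argument call shape, the return format, and the cache_error_message helper are assumptions for illustration only, not the actual diff.

    # Hypothetical sketch, not the actual PR diff.
    # Assumes get_hub_cached_entries(model_id) returns a list of cached
    # configuration entries (e.g. batch size, sequence length, num cores).
    from optimum.neuron.utils.hub_cache_utils import get_hub_cached_entries

    def cache_error_message(model_id: str) -> str:
        entries = get_hub_cached_entries(model_id)  # assumed call shape
        lines = [f"No cached version found for {model_id} with this configuration."]
        if entries:
            lines.append("The following cached configurations are available:")
            # one entry per line; the '\n' separators are what makes it readable
            lines.extend(f"  {entry}" for entry in entries)
        return "\n".join(lines)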

get_hub_cached_entries does raise an error if it is fed a model that doesn't have a model_type. For example, with a randomly selected model_id = "hexgrad/Kokoro-82M":

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/aws_neuronx_venv_pytorch_2_1/lib/python3.10/site-packages/optimum/neuron/utils/hub_cache_utils.py", line 431, in get_hub_cached_entries
    model_type = target_entry.config["model_type"]
KeyError: 'model_type'

However, we already call that function inside of is_cached at the top of this block, so I don't know whether we are filtering for certain types of models before we get to this point. If not, the existing code would raise that error before it ever gets here.
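
If that KeyError can indeed reach this code path, a caller-side guard along these lines would keep the error-reporting path itself from crashing. This is a hedged sketch: the one-argument call shape and the safe_cached_entries helper are assumptions, not part of the PR.

    # Hypothetical defensive wrapper around the call shown in the traceback.
    from optimum.neuron.utils.hub_cache_utils import get_hub_cached_entries

    def safe_cached_entries(model_id: str) -> list:
        try:
            return get_hub_cached_entries(model_id)  # assumed call shape
        except KeyError:
            # e.g. "hexgrad/Kokoro-82M" has no "model_type" in its config
            return []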

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@jimburtoft changed the title from "display available cached versions in error message" to "display available cached versions in TGI server error message" on Feb 6, 2025
jimburtoft (Contributor, Author)

@dacorvo Not urgent. Do carriage returns break logging in any way?
