Skip to content

Fall back on guessed MIME type when requested multi-modal content doesn't have Content-Type header #1613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

kylesf
Copy link
Contributor

@kylesf kylesf commented Apr 28, 2025

@Kludex I needed to add during testing of agents I built. I am not sure why tests did not catch the issue and whats the best long term approach for the project. Please share your thoughts.

@kylesf kylesf changed the title Gemini Video Implementation Gaurds Gemini Video Implementation Guards Apr 28, 2025
@kylesf
Copy link
Contributor Author

kylesf commented Apr 28, 2025

I did see difference in how bed rock is handling audio, video etc compared to that of other models.

@kylesf
Copy link
Contributor Author

kylesf commented Apr 30, 2025

@DouweM Thoughts here?

@DouweM
Copy link
Contributor

DouweM commented Apr 30, 2025

@kylesf Thanks! We'll want to address the issue of Content-Type not containing a useful value everywhere we currently use it, not just for Gemini. The fallback logic should be centralized in {Image,Video,Audio,Document}Url.media_type so all models can use it. The one on DocumentUrl already uses the same guess_type function you're using here, which likely makes sense for the other types as well. Can you have a look at implementing this in that more generic way, please?

@DouweM DouweM marked this pull request as draft April 30, 2025 21:04
@DouweM DouweM self-assigned this Apr 30, 2025
@kylesf
Copy link
Contributor Author

kylesf commented May 1, 2025

idk lot of things going on. guessing type is not ideal if we have fixed supported types. i.e. VideoMediaType - at the same time we have a test case like image_url = ImageUrl(url='https://goo.gle/instrument-img') where we need to do web req to get our type. prob needs more thinking.

@kylesf
Copy link
Contributor Author

kylesf commented May 1, 2025

i do think this change allows better support as if we do req and that info is not given up, then we can use the extension.

@kylesf
Copy link
Contributor Author

kylesf commented May 1, 2025

@DouweM thoughts?

@DouweM
Copy link
Contributor

DouweM commented May 1, 2025

@kylesf Falling back on media_type if there's no Content-Type header makes sense. Can you please see if we can also fix that wherever else we currently use Content-Type (also written lowercase in some places)?

@kylesf
Copy link
Contributor Author

kylesf commented May 1, 2025

@DouweM got them all I believe

@DouweM DouweM changed the title Gemini Video Implementation Guards Fall back on guessed MIME type when requested multi-modal content doesn't have Content-Type header May 2, 2025
@DouweM
Copy link
Contributor

DouweM commented May 2, 2025

@kylesf Thank you! The code looks good, but CI is complaining about coverage. To prevent having to add a test of this for each model, I suggest moving this logic to a helper function, and then adding a single test for this scenario that passes through that function.

@kylesf
Copy link
Contributor Author

kylesf commented May 6, 2025

@DouweM - in my attempts to do this, I have not been happy with the resulting code, more complex and not as simple to implement. I will probably leave it here, for the team to decide what to do. I will just monkey patch on my side for now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants