Add support for tgi multimodal models #531

Merged: 34 commits, Nov 16, 2023

@nsarrazin (Collaborator) commented Oct 24, 2023

Working for now but things that still need fixing:

  • Better UI for deleting an uploaded image (like a ❌ button or something)
  • Better UI for displaying an image in the chat
  • Handle resizing of images in the front-end
  • Add a button for uploading an image without dragging it (for mobile for example)
  • Disable dragging feature if multimodal:true is not set in the config for the model
  • Check image size on backend
  • Retry button doesn't retry with images
  • Share conversations with images
  • When sending a message, show images in optimistic updates
  • Only allow 1 image upload max
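
Two of the items above, the backend size check and the one-image limit, could look roughly like the following. This is only a sketch under stated assumptions: `UploadedFile`, `validateUploads`, and the 2 MB limit are hypothetical names and values chosen for illustration, not the code or limits used in this PR.

```typescript
// Hypothetical shape of an uploaded file; not the type used in this PR.
interface UploadedFile {
  name: string;
  type: string; // MIME type, e.g. "image/png"
  size: number; // size in bytes
}

// Assumed limits, chosen for illustration only.
const MAX_IMAGES = 1;
const MAX_SIZE_BYTES = 2 * 1024 * 1024;

// Returns an error message, or null when the uploads are acceptable.
function validateUploads(files: UploadedFile[], multimodal: boolean): string | null {
  if (files.length === 0) return null;
  // Models without multimodal support should not receive any files.
  if (!multimodal) return "This model does not accept images";
  if (files.length > MAX_IMAGES) return `Only ${MAX_IMAGES} image allowed per message`;
  for (const file of files) {
    if (!file.type.startsWith("image/")) return `Unsupported file type: ${file.type}`;
    if (file.size > MAX_SIZE_BYTES) return `File too large: ${file.name}`;
  }
  return null;
}
```

Running the same check on the server as well as in the front-end matters because a client can bypass any limit enforced only in the browser.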

How to test

  1. pull this pr
  2. npm run updateLocalEnv
  3. npm run dev

Start a conversation with IDEFICS to see the new ui

Screenshots

image
image

@nsarrazin marked this pull request as draft October 24, 2023 19:13

@nsarrazin (Collaborator, Author)

@julien-blanchon I reused your dropzone component for this feature 🤗 thanks a lot for making it!

@nsarrazin added the enhancement, front, back, and models labels Oct 25, 2023

@julien-blanchon (Contributor)

Nice! I'm pretty hyped about this PR btw 👀

@julien-blanchon (Contributor) commented Oct 26, 2023

Hey @nsarrazin, I'm thinking of dropping the Mathpix dependency from my implementation of Convert PDF to Markdown inside Chat UI (#441) and including two text extractors:

  • A basic text extractor that extracts pure text from the PDF
  • A more advanced text extractor that uses an OCR model such as https://huggingface.co/facebook/nougat-base via the hosted Inference API, with a user-provided HF token

Are you interested in this functionality on the HuggingChat side?

If so, how can we work together? Is this functionality included in your multimodal TGI implementation?
We could refactor the code a bit to support multiple file types and multiple "agents". What are your plans in this regard?

@nsarrazin (Collaborator, Author)

Hey @julien-blanchon!

I think as a rule of thumb it's good to decouple any dependencies (especially remote APIs) from the feature itself if possible (see web search for example where we support three different providers now), so that people can configure the pdf parsing that they want. Maybe have some kind of standardized interface that takes a pdf file and returns the extracted text, so that people can copy the method and implement their own versions in future PRs?
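
A standardized interface of the kind described here might be sketched as below. Everything in it is hypothetical: `PdfTextExtractor`, `extractPdfText`, and the toy `passthrough` provider are illustrative names only, and the toy provider simply decodes bytes as UTF-8 rather than parsing a real PDF.

```typescript
// Hypothetical interface: each extractor takes raw PDF bytes and returns text.
interface PdfTextExtractor {
  name: string;
  extract(pdf: Uint8Array): Promise<string>;
}

// Toy provider for demonstration only. It decodes the bytes as UTF-8
// and does NOT parse actual PDF structure.
const passthroughExtractor: PdfTextExtractor = {
  name: "passthrough",
  async extract(pdf) {
    return new TextDecoder().decode(pdf);
  },
};

// Registry keyed by provider name, so configuration can pick a provider
// and new providers can be added without touching the calling code.
const extractors = new Map<string, PdfTextExtractor>([
  [passthroughExtractor.name, passthroughExtractor],
]);

async function extractPdfText(provider: string, pdf: Uint8Array): Promise<string> {
  const extractor = extractors.get(provider);
  if (!extractor) throw new Error(`Unknown PDF extractor: ${provider}`);
  return extractor.extract(pdf);
}
```

With this shape, a basic text extractor and an OCR-backed one would each be another `PdfTextExtractor` entry in the registry, and the caller would only ever depend on `extractPdfText`.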

And I think that could be a cool feature. Maybe when this PR (#531) is merged we can have a look at how to hook it up? This PR already adds support for passing files to the backend, so we could have a logic check that handles files differently based on MIME type, like you mentioned.
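
The MIME-type check mentioned here could be sketched as a small routing table. This is an illustration under assumptions, not code from the PR: `IncomingFile`, `mimeMatches`, `routeFile`, and the handler strings are all hypothetical, and MIME parameters such as `; charset=utf-8` are not handled.

```typescript
// Hypothetical file descriptor; only the fields needed for routing.
interface IncomingFile {
  name: string;
  mime: string;
}

type FileHandler = (file: IncomingFile) => string;

// Matches a MIME type such as "image/png" against a pattern such as
// "image/*" (wildcard subtype) or "application/pdf" (exact match).
function mimeMatches(pattern: string, mime: string): boolean {
  const [patternType, patternSubtype] = pattern.split("/");
  const [type, subtype] = mime.split("/");
  return patternType === type && (patternSubtype === "*" || patternSubtype === subtype);
}

// Ordered list of handlers; the first matching pattern wins.
const handlers: Array<[string, FileHandler]> = [
  ["image/*", (file) => `pass ${file.name} to the multimodal model`],
  ["application/pdf", (file) => `extract text from ${file.name}`],
];

function routeFile(file: IncomingFile): string {
  for (const [pattern, handler] of handlers) {
    if (mimeMatches(pattern, file.mime)) return handler(file);
  }
  throw new Error(`No handler for MIME type ${file.mime}`);
}
```

For example, `routeFile({ name: "cat.png", mime: "image/png" })` takes the image branch, while an unregistered type such as `text/plain` raises an error rather than being silently dropped.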

@nsarrazin (Collaborator, Author)

This is pretty much done and ready for review, I think I covered every edge case of the feature! 😄

.env.template (outdated)
]
},
{
"name": "HuggingFaceM4/idefics-80b-instruct",

Collaborator Author

We can remove this change to .env.template if we don't want IDEFICS in production for HuggingChat

Collaborator Author

cc @julien-c to confirm this

@mishig25 (Collaborator) commented Nov 3, 2023

Except for the nits I left, this looks very close to being merged 👍

@gary149 (Collaborator) left a comment

LGTM. We could add paste support and full-screen dropzone in another PR.

@nsarrazin (Collaborator, Author)

Going to remove IDEFICS from the prod template, add a readme note about using it and then merge this

@nsarrazin merged commit 0e5c445 into main Nov 16, 2023
@nsarrazin deleted the feature/idefics branch November 16, 2023 11:04
This was referenced Dec 6, 2023
ice91 pushed a commit to ice91/chat-ui that referenced this pull request Oct 30, 2024
* wip: add support for tgi multimodal models

* wip work on passing images to prompt

* working idefics config!

* rm allowed conv feature

* lint

* Add image resizing

* fix ssr

* add upload button

* add delete button

* misc formatting

* lint

* server file size check

* optimistic update of images

* retry with images

* fix websearch button

* lint

* better error handling & max one image at a time

* replace test image by blank one

* disable loading on page change

* Fix sharing of images

* fix comments

* Update filedropzone (huggingface#544)

* Update src/lib/buildPrompt.ts

Co-authored-by: Mishig <[email protected]>

* small tweaks

* Fix merge conflicts

* lint

* wildcard image mime type

* fix lint and comment

* added comments

* added comment about file size

* Readme update

---------

Co-authored-by: Mishig <[email protected]>
Co-authored-by: Victor Mustar <[email protected]>
Labels: back, enhancement, front, models

4 participants