Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: image input for VLMs #34

Open
yagil opened this issue Mar 9, 2025 · 5 comments
Open

Feature: image input for VLMs #34

yagil opened this issue Mar 9, 2025 · 5 comments
Assignees

Comments

@yagil
Copy link
Member

yagil commented Mar 9, 2025

Equivalent to lmstudio-js’s https://lmstudio.ai/docs/typescript/llm-prediction/image-input

@ncoghlan ncoghlan self-assigned this Mar 10, 2025
@ncoghlan
Copy link
Collaborator

ncoghlan commented Mar 10, 2025

Breaking down the individual pieces of this:

  • Add prepare_image to the files session API
  • Make the client files session attribute public
  • Add a top-level prepare_image convenience API for easy interactive use
  • Make the FileHandle type public
  • Make the images parameter in Chat.add_user_message public, but without the implicit local file input support
  • Drop the implicit local file handling from Chat instances entirely (as we're going to take a different approach to the chat session history management convenience API)

@ncoghlan
Copy link
Collaborator

#36 makes the APIs for adding file handles to chat history instances public.

Passing file handles to Chat.from_history/.add_entry/.append will still require the full multi-part user message format for now (it seems preferable to wait for lmstudio-ai/lmstudio-js#270 to be considered first, rather than jumping directly to independently replicating the required input types on the Python side)

ncoghlan added a commit that referenced this issue Mar 11, 2025
@ncoghlan
Copy link
Collaborator

While not directly part of this issue, #37 makes the file preparation interface public, which will be used as the base for the image preparation interface.

ncoghlan added a commit that referenced this issue Mar 11, 2025
Implements the remaining components of #34
ncoghlan added a commit that referenced this issue Mar 11, 2025
Also make debug logging for file uploads less noisy.

Implements the remaining components of #34
@ncoghlan
Copy link
Collaborator

#38 adds the image-specific preparation APIs

ncoghlan added a commit that referenced this issue Mar 11, 2025
Also make debug logging for file uploads less noisy.

Implements the remaining components of #34
ncoghlan added a commit that referenced this issue Mar 11, 2025
ncoghlan added a commit that referenced this issue Mar 11, 2025
@ncoghlan
Copy link
Collaborator

Draft SDK docs PR is up at lmstudio-ai/docs#48

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants