Added Streaming Capability to SageMaker LLMs #10535
Conversation
cc @3coins
Added formatting fix using black
@dazajuandaniel
This is awesome! Thanks for adding this. I don't have a model set up to test, but the code looks good. 🚀
One minor suggestion for a follow-up PR: it would be nice to move the streaming/non-streaming parts into separate private functions for better readability.
Hey @3coins, yes, I agree. I am happy to work on the follow-up PR, no issues. I tried to follow the pattern that …
@baskaryan - please let me know if anything is missing :)
lgtm aside from linting issues. would also be nice to factor out the stream implementation into the …
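For readers following along, here is a minimal sketch of the refactor suggested above. The class and helper names are illustrative, not the PR's actual code; the point is only that `_call` stays a thin dispatcher over private streaming and non-streaming helpers:

```python
from typing import Any, Iterator

class SagemakerEndpointSketch:
    """Illustrative skeleton only; not the PR's real class."""

    def __init__(self, client: Any, endpoint_name: str, streaming: bool = False):
        self.client = client  # a boto3 "sagemaker-runtime" client
        self.endpoint_name = endpoint_name
        self.streaming = streaming

    def _call_streaming(self, body: bytes) -> Iterator[bytes]:
        """Streaming path: yield raw byte chunks from the response event stream."""
        resp = self.client.invoke_endpoint_with_response_stream(
            EndpointName=self.endpoint_name,
            Body=body,
            ContentType="application/json",
        )
        for event in resp["Body"]:
            yield event.get("PayloadPart", {}).get("Bytes", b"")

    def _call_non_streaming(self, body: bytes) -> bytes:
        """Non-streaming path: one blocking invocation, whole payload back."""
        resp = self.client.invoke_endpoint(
            EndpointName=self.endpoint_name,
            Body=body,
            ContentType="application/json",
        )
        return resp["Body"].read()

    def _call(self, body: bytes) -> bytes:
        # Branch once here; each mode lives in its own readable helper.
        if self.streaming:
            return b"".join(self._call_streaming(body))
        return self._call_non_streaming(body)
```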
@dazajuandaniel, @3coins - Is this good to be merged?
@baskaryan
@pinak-p I was missing a mypy fix; I've done it now.
Sorry for the delay, thank you @dazajuandaniel!
Is there any documentation on this? I can't seem to find any. @dazajuandaniel
This PR adds the ability to declare a streaming response in the SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream` capability in `boto3`. It is heavily based on the AWS Blog Post announcement linked here. It does not add any additional dependencies, since it uses the existing `boto3` version.
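For context, here is a minimal sketch of how the underlying `boto3` streaming call is consumed, following the pattern from the AWS blog post the PR references. The endpoint name and payload shape are hypothetical and depend on the deployed model container:

```python
import json

import boto3

# Hypothetical endpoint name and request payload; adjust for your model.
ENDPOINT_NAME = "my-streaming-llm-endpoint"

client = boto3.client("sagemaker-runtime")

response = client.invoke_endpoint_with_response_stream(
    EndpointName=ENDPOINT_NAME,
    Body=json.dumps(
        {"inputs": "Tell me a joke.", "parameters": {"max_new_tokens": 128}}
    ),
    ContentType="application/json",
)

# The response body is an event stream; each event carries a chunk of
# bytes in event["PayloadPart"]["Bytes"]. Chunks are not guaranteed to
# align with line or JSON boundaries, so buffer and split on newlines.
buffer = b""
for event in response["Body"]:
    buffer += event.get("PayloadPart", {}).get("Bytes", b"")
    while b"\n" in buffer:
        line, buffer = buffer.split(b"\n", 1)
        if line:
            print(line.decode("utf-8"))
```

Because this relies only on `invoke_endpoint_with_response_stream`, which ships with recent `boto3` releases, no new dependency is needed, which matches the PR description above.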