[Core][Feature] Input metadata dump on crash #13407
base: main
Conversation
Signed-off-by: Wallas Santos <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can add 🚀
@wallashss Thanks for writing up this PR. I think it will be useful to have details for debugging printed to the logs at crash! When I try out these changes in my dev environment running online mode with
and sending a request with a large prompt and requesting
I see the
The above seems to happen only when the first request to the server crashes it. If I send a shortened request first (e.g. prompt from
This PR adds a feature to dump input metadata when the vLLM engine crashes. In essence, this change is the spiritual successor to #8305, which was recently removed in #12582. However, I tried to solve it differently, since this feature can give us more hints to help debug crashes in production environments. So, I would like to propose it again to the community and give it a second chance.
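To illustrate the general idea (not vLLM's actual implementation — the class and function names below are simplified stand-ins), a crash-time metadata dump can be sketched as a wrapper around the model execution step that serializes the offending batch's metadata before re-raising:

```python
import dataclasses
import json


@dataclasses.dataclass
class ExecuteModelRequest:
    """Stand-in for the engine's per-step input batch object."""
    request_ids: list
    num_prompt_tokens: int
    num_decode_tokens: int


def dump_input_metadata(req: ExecuteModelRequest) -> str:
    # Dump only metadata (IDs, token counts) rather than raw prompt text,
    # so the dump is safe to capture in production logs.
    return json.dumps({
        "request_ids": req.request_ids,
        "num_prompt_tokens": req.num_prompt_tokens,
        "num_decode_tokens": req.num_decode_tokens,
    })


def execute_with_dump(execute_fn, req: ExecuteModelRequest):
    """Run one engine step; on any crash, log input metadata, then re-raise."""
    try:
        return execute_fn(req)
    except Exception:
        print("Engine crashed; input metadata dump:")
        print(dump_input_metadata(req))
        raise
```

The key design point is that the wrapper never swallows the exception: the dump is purely additive logging, and the original traceback still propagates to the caller.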
Summary:
V0 dump sample
V1 dump sample