
Issue with Pod Termination in InferenceService using nvcr.io/nim/meta/llama-3.1-8b-instruct:1.1.2 - No Terminate Signal Sent to Nim Server #91

Open
test-1pro opened this issue Sep 23, 2024 · 1 comment

Comments

@test-1pro

I am using the nvcr.io/nim/meta/llama-3.1-8b-instruct:1.1.2 image to create an InferenceService, following the provided guide. The service runs properly: Pod creation and API calls work fine. However, I run into a problem when I try to delete the Pod.

It seems that no terminate signal is sent to the NIM server when I request deletion of the InferenceService or the Pod. There are no KILL signals in the internal logs either, and the Pod is only forcefully deleted once the terminationGracePeriodSeconds of 300 has elapsed.

Do I need to provide any additional options when starting the NIM server, or is this a known issue?
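
For context, a general cause of this symptom in containers is an entrypoint wrapper that starts the real server as a child process and never relays SIGTERM to it. Whether that applies to this image is not confirmed. The sketch below is only an illustration of signal forwarding in Python; `run_with_signal_forwarding` is a hypothetical helper, not part of NIM or KServe.

```python
import signal
import subprocess


def run_with_signal_forwarding(cmd):
    """Run cmd as a child process and relay SIGTERM/SIGINT to it.

    Hypothetical helper for illustration only. Returns the child's exit
    code, using the shell convention 128 + N if the child died from
    signal N.
    """
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Pass the received signal on to the child instead of ignoring it.
        child.send_signal(signum)

    # Handlers must be installed from the main thread.
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    rc = child.wait()
    # Popen reports death-by-signal as a negative return code.
    return 128 - rc if rc < 0 else rc
```

If a wrapper like this sits at PID 1, a SIGTERM sent on Pod deletion reaches the child, so the child can shut down before the grace period runs out.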

@bmcfeeters

I'm seeing the same behavior when deploying with KServe. Scale-up works fine and is fast, but when the pod is no longer needed, removing it during scale-down takes five minutes.
