This template is only for usage issues encountered.
I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
I have searched for existing issues, including closed ones, and couldn't find a solution.
I confirm that I am using English to submit this report in order to facilitate communication.
Environment Details
Hi, I'm using the API with Docker. I'm noticing notably slow boot times (usually 35-50 seconds), and I'd like to fix this to cut down cold start times. I've tried debugging to find out what exactly it's doing during this delay, but couldn't find any culprits.
If anyone knows what might be causing this, or is able to help investigate, I'd be happy to contribute a PR towards solving it. I'm not sure whether it would be a Dockerfile change or a setup change, or whether this is even within our control.
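One way to narrow this down is to check whether the delay comes from heavy Python imports. Python's built-in `python -X importtime api.py` prints a per-module import timing report. Below is a minimal sketch of the same idea as a small helper; `time_imports` is a hypothetical name, and the modules listed are stdlib placeholders, since I don't know which imports in `api.py` are the slow ones.

```python
import importlib
import time


def time_imports(module_names):
    """Import each module in order and return (name, seconds) pairs.

    Modules that were already imported return almost instantly, so run
    this in a fresh interpreter for meaningful numbers.
    """
    timings = []
    for name in module_names:
        start = time.perf_counter()
        importlib.import_module(name)
        timings.append((name, time.perf_counter() - start))
    return timings


if __name__ == "__main__":
    # Placeholder module names (assumption): replace them with the
    # top-level imports that api.py actually uses.
    for name, seconds in time_imports(["json", "asyncio"]):
        print(f"{name}: {seconds:.3f}s")
```

If one import dominates, that would suggest the time goes into loading a library rather than into the Docker setup itself.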
Steps to Reproduce
Build the Docker image
Run the container, with api.py as CMD
✔️ Expected Behavior
(Desired behavior; this isn't a bug so much as a lack of optimization.) Boots relatively quickly (within 10s) and is ready to take requests
❌ Actual Behavior
Prints out the NVIDIA message, then waits 40s+ before it is able to handle requests
```
2024-11-29 20:25:18 ==========
2024-11-29 20:25:18 == CUDA ==
2024-11-29 20:25:18 ==========
2024-11-29 20:25:18
2024-11-29 20:25:18 CUDA Version 12.4.0
2024-11-29 20:25:18
2024-11-29 20:25:18 Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2024-11-29 20:25:18
2024-11-29 20:25:18 This container image and its contents are governed by the NVIDIA Deep Learning Container License.
2024-11-29 20:25:18 By pulling and using the container, you accept the terms and conditions of this license:
2024-11-29 20:25:18 https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
2024-11-29 20:25:18
2024-11-29 20:25:18 A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
2024-11-29 20:25:18
2024-11-29 20:25:18 WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
2024-11-29 20:25:18 Use the NVIDIA Container Toolkit to start this container with GPU support; see
2024-11-29 20:25:18 https://docs.nvidia.com/datacenter/cloud-native/ .
2024-11-29 20:25:18
2024-11-29 20:27:51 INFO: Started server process [1]
2024-11-29 20:27:51 INFO: Waiting for application startup.
2024-11-29 20:27:51 INFO: Application startup complete.
2024-11-29 20:27:51 INFO: Uvicorn running on http://0.0.0.0:9888 (Press CTRL+C to quit)
```
Yes, but I don't think that's relevant. I captured those logs on a device without a GPU, but that isn't the cause of the slow boots: I still see the slow boot times when I run the Docker image on servers with GPUs (with the driver detected), and everything works fast after the boot. The first request might take 40 seconds while waiting for the server to launch; afterwards, requests take only 3s each.
The goal of this issue is to fix the slow boot times. If anyone knows what might be happening, I'm happy to take a deeper look.
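For anyone who wants to reproduce the numbers, here is a minimal sketch of how the boot delay could be measured from outside the container: it polls a TCP port until the server accepts connections and reports the elapsed time. `wait_for_port` is a hypothetical helper, not part of the project, and the timeout and polling interval are arbitrary choices.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds.

    Returns the seconds elapsed since the call started, or raises
    TimeoutError if the port is still closed after `timeout` seconds.
    """
    start = time.perf_counter()
    deadline = start + timeout
    while time.perf_counter() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return time.perf_counter() - start
        except OSError:
            time.sleep(interval)
    raise TimeoutError(f"{host}:{port} not reachable after {timeout}s")
```

Calling `wait_for_port("127.0.0.1", 9888)` right after `docker run` (9888 is the port Uvicorn reports in the log above, assuming it is published to the host) gives a rough cold start figure that can be compared before and after any change.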