Slow boot times / cold start (Docker with API) #557

Open
4 tasks done
Jhappy77 opened this issue Nov 30, 2024 · 3 comments
Labels
help wanted Extra attention is needed

Comments


Jhappy77 commented Nov 30, 2024

Checks

  • This template is only for usage issues encountered.
  • I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones, and couldn't find a solution.
  • I confirm that I am using English to submit this report in order to facilitate communication.

Environment Details

Hi, I'm using the API with Docker. I'm seeing slow boot times (usually 35-50 seconds) and I'd like to cut down the cold start time. I've tried debugging to find out what exactly it's doing during this boot delay, but couldn't find any culprits.

If anyone knows what might be causing this, or is able to help investigate, I'd be happy to contribute a PR to fix it. I'm not sure whether it would be a Dockerfile change or a setup change, or whether this is even within our control.
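One way to narrow down where the boot delay goes is to time each startup stage separately. The following is a minimal sketch, not code from this project: the `timed` helper and the stage labels are hypothetical, and the real imports and model loading in api.py would go where the stand-ins are. CPython's built-in `python -X importtime` flag is another option for a per-module import breakdown.

```python
import importlib
import time
from contextlib import contextmanager

timings = {}  # stage label -> elapsed wall-clock seconds


@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def report():
    """Return (label, seconds) pairs sorted slowest first."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Stand-ins for real startup work (heavy imports, model loading, etc.).
    with timed("import json"):
        importlib.import_module("json")
    with timed("simulated model load"):
        time.sleep(0.05)
    for label, seconds in report():
        print(f"{label}: {seconds:.3f}s")
```

Wrapping each heavy import and each load call in its own `timed` block should show whether the time goes to imports, to reading weights, or to something else entirely.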

Steps to Reproduce

  1. Build the Docker image
  2. Run the container, with api.py as CMD

✔️ Expected Behavior

(Desired behavior; this isn't a bug so much as a lack of optimization.) Boots relatively quickly (within 10s) and is ready to take requests.

❌ Actual Behavior

Prints the NVIDIA message, then waits 40s+ before it is able to handle requests.

2024-11-29 20:25:18 ==========
2024-11-29 20:25:18 == CUDA ==
2024-11-29 20:25:18 ==========
2024-11-29 20:25:18 
2024-11-29 20:25:18 CUDA Version 12.4.0
2024-11-29 20:25:18 
2024-11-29 20:25:18 Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2024-11-29 20:25:18 
2024-11-29 20:25:18 This container image and its contents are governed by the NVIDIA Deep Learning Container License.
2024-11-29 20:25:18 By pulling and using the container, you accept the terms and conditions of this license:
2024-11-29 20:25:18 https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
2024-11-29 20:25:18 
2024-11-29 20:25:18 A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
2024-11-29 20:25:18 
2024-11-29 20:25:18 WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
2024-11-29 20:25:18    Use the NVIDIA Container Toolkit to start this container with GPU support; see
2024-11-29 20:25:18    https://docs.nvidia.com/datacenter/cloud-native/ .
2024-11-29 20:25:18
2024-11-29 20:27:51 INFO:     Started server process [1]
2024-11-29 20:27:51 INFO:     Waiting for application startup.
2024-11-29 20:27:51 INFO:     Application startup complete.
2024-11-29 20:27:51 INFO:     Uvicorn running on http://0.0.0.0:9888 (Press CTRL+C to quit)
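The silent period can be read directly off the timestamps prefixing each log line. As a small sketch (the `largest_gap` helper is hypothetical, and it assumes every line starts with a `YYYY-MM-DD HH:MM:SS` stamp as above), this finds the longest pause between consecutive lines. For this particular capture the pause is 153 seconds, longer than the 35-50 seconds described.

```python
from datetime import datetime


def largest_gap(lines):
    """Return (seconds, before, after) for the longest pause between
    consecutive timestamped lines, or None if fewer than two parse."""
    stamps = []
    for line in lines:
        try:
            stamps.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
        except ValueError:
            continue  # skip lines without a leading timestamp
    best = None
    for prev, cur in zip(stamps, stamps[1:]):
        gap = (cur - prev).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, prev, cur)
    return best


# A few lines copied from the log above.
log = [
    "2024-11-29 20:25:18 WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.",
    "2024-11-29 20:25:18",
    "2024-11-29 20:27:51 INFO:     Started server process [1]",
    "2024-11-29 20:27:51 INFO:     Uvicorn running on http://0.0.0.0:9888 (Press CTRL+C to quit)",
]
gap, before, after = largest_gap(log)
print(f"{gap:.0f}s between {before.time()} and {after.time()}")
# prints "153s between 20:25:18 and 20:27:51"
```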
Jhappy77 added the help wanted label Nov 30, 2024
Owner

SWivid commented Nov 30, 2024

2024-11-29 20:25:18 WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.

Seems like it's running on CPU?
Try nvidia-smi.
Make sure the GPU is actually available to your Docker container.
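The nvidia-smi check can also be scripted so it runs at container start. This is a rough sketch under assumptions: `visible_gpus` is a hypothetical helper, and it relies on `nvidia-smi -L` printing one line per GPU beginning with `GPU `. An empty result usually means the container was started without GPU access (for example, without `docker run --gpus all`).

```python
import shutil
import subprocess


def visible_gpus():
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # binary not on PATH: driver is not exposed to this environment
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # nvidia-smi exists but could not talk to a driver
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


gpus = visible_gpus()
print(f"{len(gpus)} GPU(s) visible" if gpus else "No GPU visible; running on CPU")
```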

Author

Jhappy77 commented Dec 3, 2024

Yes, but I don't think that's relevant. I captured those logs on a GPU-less device, but that isn't the cause of the slow boots: I see the same slow boot times when I run the Docker image on servers with GPUs, where the driver is detected and everything is fast after boot. The first request might take 40 seconds waiting for the server to launch; afterwards each request takes only about 3s.
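To put a repeatable number on the cold start, one option is to poll the server port from outside the container and record when it first accepts a connection. This is a sketch under assumptions: `wait_for_port` is a hypothetical helper, and a successful TCP connect is only a proxy for the API being ready to serve requests. The demo uses a throwaway local listener so it runs anywhere; against the real container it would target the published port (9888 in the log above) right after starting it.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=0.5):
    """Return seconds until a TCP connect to (host, port) succeeds,
    or None if `timeout` seconds pass without success."""
    start = time.perf_counter()
    while time.perf_counter() - start < timeout:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return time.perf_counter() - start
        except OSError:
            time.sleep(interval)  # not listening yet; retry shortly
    return None


# Demo: a local listener stands in for the API server.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
elapsed = wait_for_port("127.0.0.1", listener.getsockname()[1], timeout=5.0)
listener.close()
print(f"ready after {elapsed:.3f}s")
```

Running the same measurement on a GPU server and on a GPU-less machine would show whether the delay depends on the hardware at all.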

The goal of this issue is to fix the slow boot times. If anyone knows what might be happening, I'm happy to take a deeper look.

Owner

SWivid commented Dec 3, 2024

Sure, no specific idea yet.
Let's see if someone can help.
