
Simplify GPU distribution #16

Open

mpoemsl wants to merge 3 commits into base: torchserve-23mt-v0.8.0

Conversation

@mpoemsl (Collaborator) commented Mar 4, 2024

Simplify GPU distribution so that it can be controlled better in a multi-model server setup.

Changes:

  • Add an env var override_gpu_id that makes GPUManager always assign the specified GPU to new workers. The default value is -1, which has no effect.
  • Remove the GPU failure tracking logic, since it is not practical.
  • Stochastic GPU distribution weighted by the current free-memory distribution remains (see the sketch after this list).
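For illustration, here is a minimal sketch of the selection logic described above, in the spirit of the frontend's GPUManager (the class and method names here are hypothetical, not the fork's actual code, and the per-GPU free-memory values are assumed to be queried elsewhere, e.g. via NVML):

```java
import java.util.Random;

public class GpuAssignmentSketch {
    private static final Random RNG = new Random();

    // Pick a GPU for a new worker. If override_gpu_id is set to a
    // non-negative value, always return it; otherwise sample an index
    // with probability proportional to that GPU's free memory.
    static int pickGpu(long[] freeMemoryBytes) {
        int override = Integer.parseInt(
                System.getenv().getOrDefault("override_gpu_id", "-1"));
        if (override >= 0) {
            return override; // pinned assignment, skips the stochastic path
        }
        long total = 0;
        for (long free : freeMemoryBytes) {
            total += free;
        }
        // Sample r uniformly in [0, total) and find the GPU whose
        // cumulative free-memory interval contains it.
        long r = (long) (RNG.nextDouble() * total);
        for (int i = 0; i < freeMemoryBytes.length; i++) {
            r -= freeMemoryBytes[i];
            if (r < 0) {
                return i;
            }
        }
        return freeMemoryBytes.length - 1; // fallback for rounding edge cases
    }

    public static void main(String[] args) {
        // Example: GPU 1 has the most free memory, so it is chosen most often.
        long[] free = {2_000_000_000L, 6_000_000_000L, 1_000_000_000L};
        System.out.println("assigned GPU: " + pickGpu(free));
    }
}
```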

@mpoemsl requested a review from @pypae on Mar 4, 2024, 10:41
@mpoemsl (Collaborator, Author) commented Mar 4, 2024

I'm building multi-platform images for this now as textshuttle/pytorch-serve:torchserve-23mt-v0.8.0-v3-{DEVICE}.

@mpoemsl (Collaborator, Author) commented Mar 4, 2024

A multi-arch image textshuttle/pytorch-serve:torchserve-23mt-v0.8.0-v3-cpu and a single-arch amd64 image textshuttle/pytorch-serve:torchserve-23mt-v0.8.0-v3-gpu are now available.
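Assuming the images are published on Docker Hub under those tags, either one can be fetched with, for example, `docker pull textshuttle/pytorch-serve:torchserve-23mt-v0.8.0-v3-cpu`.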

@JeffWigger left a comment

These changes look good.
