map workers and GPUs, deviceIds not considered in ts_config #3393

Open
RuDevKu opened this issue Feb 25, 2025 · 0 comments

RuDevKu commented Feb 25, 2025

tl;dr: setting the "deviceIds" property in my existing configuration has no effect.

I am successfully hosting three different models on a server with two GPUs.
Each model can run on a single GPU, but one is more demanding, so I'd like to control how the workers are distributed across the GPUs.

The deviceIds property seems to be exactly what I need for that.
It is described here for the archiver, and here for the archiver's YAML and the model configuration.
It also seems to be implemented here.

However, my existing configuration, which successfully controls the worker counts and timeouts, shows no effect whatsoever from the deviceIds or deviceType properties. Is this only implemented for the YAML file supplied upon archiving? (A sketch of that route follows the configuration excerpt below.)

Is there a way to set the deviceIds via the API?

Configuration excerpt:
...
"defaultVersion": true,
"marName": "model.mar",
"deviceIds": [1,],
"minWorkers": 4,
"maxWorkers": 4,
"batchSize": 1,
"maxBatchDelay": 50,
"responseTimeout": 120
...
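
For comparison, here is roughly what I understand the archiving-time route to look like: a minimal sketch of a model-config.yaml passed to torch-model-archiver via --config-file. The property names are taken from the TorchServe model-configuration docs; the file names and the GPU id are just placeholders.

# model-config.yaml (sketch, placeholder values)
minWorkers: 4
maxWorkers: 4
batchSize: 1
maxBatchDelay: 50
responseTimeout: 120
deviceType: "gpu"
deviceIds: [1]

# packaged into the .mar at archive time:
torch-model-archiver --model-name model --version 1.0 --serialized-file model.pt --handler handler.py --config-file model-config.yaml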


Environment headers

Torchserve branch:

torchserve==0.12.0
torch-model-archiver==0.12.0

Python version: 3.10 (64-bit runtime)
Python executable: /opt/conda/bin/python

Versions of relevant python libraries:
captum==0.6.0
numpy==2.2.3
pillow==10.3.0
psutil==5.9.8
requests==2.32.0
torch==2.4.0+cu121
torch-model-archiver==0.12.0
torch-workflow-archiver==0.2.15
torchaudio==2.4.0+cu121
torchelastic==0.2.2
torchserve==0.12.0
torchvision==0.19.0+cu121
wheel==0.42.0
**Warning: torchtext not present ..

Java Version:

OS: N/A
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: N/A
CMake version: version 3.26.4

Is CUDA available: Yes
CUDA runtime version: 12.1
NVIDIA GPU models and configuration:
NVIDIA RTX 4000 Ada Generation
NVIDIA RTX 4000 Ada Generation
Nvidia driver version: 565.77
Nvidia driver cuda version: 12.7
cuDNN version: 9.1.0

Environment:
library_path (LD_/DYLD_): /usr/local/nvidia/lib:/usr/local/nvidia/lib64

RuDevKu changed the title from "deviceIds not considered in ts_config" to "map workers and GPUs, deviceIds not considered in ts_config" on Feb 26, 2025