tl;dr: using my existing configuration shows no effect when using the "deviceIds" property.
I am successfully hosting three different models on a server with two GPUs.
Each model can run on a single GPU, but one is more demanding, so I'd like to control the distribution of workers per GPU.
The deviceIds property seems to be exactly what I need for that.
It is described here for the archiver and here for the archiver's YAML and/or the model configuration, and it seems to be implemented here.
However, my existing configuration - which successfully controls the worker counts and timeouts - shows no effect whatsoever when I add the deviceIds or deviceType properties. Is this only implemented for the YAML file upon archiving?
Is there a way to set the deviceIds via the API? (A sketch of the archive-time YAML route follows the configuration excerpt below.)
Configuration excerpt:
...
"defaultVersion": true,
"marName": "model.mar",
"deviceIds": [1,],
"minWorkers": 4,
"maxWorkers": 4,
"batchSize": 1,
"maxBatchDelay": 50,
"responseTimeout": 120
...
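For comparison, here is a minimal sketch of the archive-time route, assuming hypothetical file names (model_config.yaml, model.pt, handler.py) and the model name "model". torch-model-archiver accepts a YAML config file via --config-file, and the deviceIds/deviceType values in it are baked into the .mar at packaging time:

# model_config.yaml (hypothetical name) - worker and device settings
minWorkers: 4
maxWorkers: 4
batchSize: 1
maxBatchDelay: 50
responseTimeout: 120
deviceType: "gpu"
deviceIds: [1]    # pin this model's workers to GPU 1

# package the model with the YAML config embedded in the archive
torch-model-archiver --model-name model \
    --version 1.0 \
    --serialized-file model.pt \
    --handler handler.py \
    --config-file model_config.yaml

The resulting archive can then be registered as usual, e.g. curl -X POST "http://localhost:8081/models?url=model.mar&initial_workers=4".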
Environment headers
Torchserve branch:
torchserve==0.12.0
torch-model-archiver==0.12.0
Python version: 3.10 (64-bit runtime)
Python executable: /opt/conda/bin/python
Versions of relevant python libraries:
captum==0.6.0
numpy==2.2.3
pillow==10.3.0
psutil==5.9.8
requests==2.32.0
torch==2.4.0+cu121
torch-model-archiver==0.12.0
torch-workflow-archiver==0.2.15
torchaudio==2.4.0+cu121
torchelastic==0.2.2
torchserve==0.12.0
torchvision==0.19.0+cu121
wheel==0.42.0
Warning: torchtext not present
Java Version:
OS: N/A
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: N/A
CMake version: version 3.26.4
Is CUDA available: Yes
CUDA runtime version: 12.1
NVIDIA GPU models and configuration:
NVIDIA RTX 4000 Ada Generation
NVIDIA RTX 4000 Ada Generation
Nvidia driver version: 565.77
Nvidia driver cuda version: 12.7
cuDNN version: 9.1.0
Environment:
library_path (LD_/DYLD_): /usr/local/nvidia/lib:/usr/local/nvidia/lib64