
[Build] Issues with Multithreading in the New Versions of onnxruntime-directml #22867

Open
lianshiye0 opened this issue Nov 18, 2024 · 1 comment
Labels
build build issues; typically submitted using template ep:DML issues related to the DirectML execution provider

Comments

@lianshiye0

Describe the issue

Issue Description:

In versions 1.17.0 and earlier of onnxruntime-directml, when using an AMD GPU and the onnxruntime.InferenceSession() method to load an ONNX model onto the GPU, a model session is created. If the program utilizes multithreading, multiple threads may compete for the model session, leading to deadlocks and crashes. Implementing a queue mechanism to avoid resource contention resolves the issue in these versions.

However, from version 1.18.0 onwards, despite using various mechanisms such as queueing, locks, and thread semaphores to limit resource contention in a multithreaded environment, these solutions have no effect. The problem persists, resulting in deadlocks and crashes.

Steps to Reproduce:

Use an AMD GPU.

Load an ONNX model using onnxruntime.InferenceSession() in a multithreaded program.

Observe deadlocks and crashes due to multiple threads competing for the model session.

Implement queueing, locks, and thread semaphores to manage resource contention.

Observe that these mechanisms do not resolve the issue in versions 1.18.0 and later.
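The locking pattern described in the steps above can be sketched as follows. This is a minimal, hedged illustration of serializing `session.run()` calls with a `threading.Lock` (the workaround that reportedly helped on 1.17.0 and earlier); `DummySession` is a hypothetical stand-in for `onnxruntime.InferenceSession` so the pattern is runnable without a GPU or the onnxruntime package.

```python
import threading

class DummySession:
    """Hypothetical stand-in for onnxruntime.InferenceSession."""
    def run(self, output_names, input_feed):
        return [sum(input_feed["x"])]

session = DummySession()
run_lock = threading.Lock()
results = []

def worker(x):
    # Serialize inference: only one thread may call run() at a time.
    with run_lock:
        results.append(session.run(None, {"x": x})[0])

threads = [threading.Thread(target=worker, args=([i, i],)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In a real program, `DummySession` would be replaced by the actual `InferenceSession` loaded with the DML provider; the point is only that every `run()` call goes through the same lock.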

Expected Behavior: Multithreading mechanisms should effectively manage resource contention, preventing deadlocks and crashes.

Actual Behavior: Resource contention management mechanisms are ineffective in versions 1.18.0 and later, resulting in persistent deadlocks and crashes.

Environment:

ONNX Runtime DirectML Versions: 1.17.0 and earlier (issue resolved with queueing), 1.18.0 and later (issue persists)

Hardware: AMD GPU

Operating System: Windows 10 or Windows 11

Request for Assistance: Given my observations, there seems to be a resource contention issue, but I am not entirely certain of the underlying cause. Could you provide guidance or solutions for resolving this issue in the newer versions of onnxruntime-directml?

Urgency

No response

Target platform

Windows 10 or Windows 11

Build script

session = onnxruntime.InferenceSession(onnx_model_path, providers=['DmlExecutionProvider', 'CPUExecutionProvider'])
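If concurrent `run()` calls on one shared session are the trigger, a common alternative pattern is to give each thread its own session via `threading.local()`. This is a hedged sketch, not a confirmed fix for this issue; `make_session` is a hypothetical factory that in real code would call `onnxruntime.InferenceSession(onnx_model_path, providers=['DmlExecutionProvider', 'CPUExecutionProvider'])`.

```python
import threading

tls = threading.local()

def make_session():
    # Placeholder: a real implementation would construct an
    # onnxruntime.InferenceSession here (one per thread).
    return object()

def get_session():
    # Lazily create one session per thread and cache it in
    # thread-local storage; repeated calls in the same thread
    # reuse the same session object.
    if not hasattr(tls, "session"):
        tls.session = make_session()
    return tls.session
```

Note this trades memory (one model copy per thread) for the isolation of never sharing a session across threads.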

Error / output

The program deadlocks and crashes without generating any error messages or logs.

Visual Studio Version

No response

GCC / Compiler Version

No response

@lianshiye0 lianshiye0 added the build build issues; typically submitted using template label Nov 18, 2024
@github-actions github-actions bot added the ep:DML issues related to the DirectML execution provider label Nov 18, 2024
@lianshiye0
Author

session = onnxruntime.InferenceSession(onnx_model_path, providers=['DmlExecutionProvider', 'CPUExecutionProvider'])
"After loading the model onto the GPU, the crash occurs when calling session.run()."
