zesDeviceProcessesGetState is returning 78000003 (ZE_RESULT_ERROR_UNSUPPORTED_FEATURE) #809
Comments
Adding additional context: it looks like the device handle I was using was for the integrated Intel UHD 770. Output while the UHD 770 is running a workload and I monitor the UHD 770:
I had mistakenly thought the B580 would have engine groups, so I mistook the existence of engine groups as meaning the workload was running on the B580. Output when I run the workload on the B580 and monitor its usage:
An oddity: when the workload runs on the integrated GPU (i915), the process-stats query to the B580 shows the process that the i915 driver is using, but with no engine group flags. Output while the UHD 770 is running a workload and I monitor the B580:
@jketreno we will look into this internally and update you.
[Sai] The Xe driver upstream patch is in review and waiting for merge; it will be merged once it is ready. Regarding the other issue you raised for the UHD 770, we are able to see it working, as per the log below:
root@DUT6051BMGSVC:/home/gta/level_zero/bin# export ZELLO_SYSMAN_USE_ZESINIT=1; export ZES_ENABLE_SYSMAN=1; export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/gta/level_zero/libs/:/home/gta/level_zero/latest_loader/:/home/gta/level_zero/bin/;
This looks like the relevant kernel patch series, but it's for the Xe KMD tree, not upstream: https://patchwork.freedesktop.org/series/144408/
Reproduction of UHD 770 failure
Running Ubuntu Oracular (24.10) with the linux-intel kernel and all other packages updated to the latest versions as of 2025-02-20.
Find version of libze-intel-gpu1 on system
Get compute-runtime source matching the version of libze-intel-gpu1
Build zello_sysman
Test
Output:
I see it is failing in the patch tests:
Assuming those errors get fixed, am I correct that the flow will be Xe KMD tree -> DRM next -> DRM -> kernel.org? Or would these go straight to kernel.org as a bug fix to the existing Xe KMD driver? Or might they get picked up in the linux-intel kernel in the Ubuntu intel-graphics PPA?
I'm just trying to figure out whether I should give up on getting the B580 to work for a few more months while I wait for these patches to land, or whether there might be a shorter path. I'm not too keen on ripping out my system's kernel and replacing it with one I build from source, as I tend to end up with other random system failures any time I use a tip-of-tree kernel, and I'm trying to keep this system as a "production" config vs. a franken-developer config :)
Thanks,
I'm not a kernel developer, but it's a new feature (for the Xe KMD), and I would think even bug fixes normally go through the DRM integration tree, to make sure they do not break anything.
I'm not familiar with that. Ubuntu HWE packages are LTS backports of things that have been tested for a few months in the latest non-LTS releases, so those would have quite a lot of delay, but I guess PPAs could include anything. I don't think they would do backporting though, at least not for things like metrics, which do not block using the HW. So it would be either an upstream kernel with the Xe stuff already merged, or a kernel test package from the Xe driver repo. In the latter case, I personally would rather build test kernels myself. One might be able to fork-lift the latest driver source from the driver integration repo into the distro (HWE) kernel version sources; either the whole driver, or specific source file(s). If you do that, you could notify upstream whether it worked or not (add your
While one could use a stripped distro kernel config for building one's own kernels, to speed up the builds, I'd use the configs as-is (as much as possible) when wanting to make sure everything works as expected. If you do build your own kernel and it fails, it would be good to notify upstream about that, at least about reproducible issues.
I'm writing a small ze-top-like utility to monitor the B580. It looks like zesDeviceProcessesGetState should be able to give me information about the processes using the GPU. However, it always returns ZE_RESULT_ERROR_UNSUPPORTED_FEATURE. That return code is documented for other APIs, but doesn't seem to be in the list of valid return codes for zesDeviceProcessesGetState.
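For reference while debugging, the hexadecimal value in the issue title can be mapped back to its enumerator name. The following is a minimal sketch in C, not part of Level Zero: `result_entry` and `ze_result_name` are hypothetical helpers, and the numeric values are copied by hand on the assumption that they match the `ze_result_t` enum in `ze_api.h` (only the 0x78000003 pairing is confirmed by this thread).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup entry; not a Level Zero type. */
typedef struct {
    uint32_t code;
    const char *name;
} result_entry;

/* Values copied by hand so the sketch compiles without <level_zero/ze_api.h>.
   Treat them as assumptions and check them against the installed header. */
static const result_entry k_results[] = {
    {0x00000000u, "ZE_RESULT_SUCCESS"},
    {0x70010000u, "ZE_RESULT_ERROR_INSUFFICIENT_PERMISSIONS"},
    {0x78000001u, "ZE_RESULT_ERROR_UNINITIALIZED"},
    {0x78000003u, "ZE_RESULT_ERROR_UNSUPPORTED_FEATURE"},
    {0x78000004u, "ZE_RESULT_ERROR_INVALID_ARGUMENT"},
};

/* Return the enumerator name for a result code, or "UNKNOWN" if the code
   is not in the (deliberately partial) table above. */
const char *ze_result_name(uint32_t code)
{
    size_t n = sizeof k_results / sizeof k_results[0];
    for (size_t i = 0; i < n; i++) {
        if (k_results[i].code == code) {
            return k_results[i].name;
        }
    }
    return "UNKNOWN";
}
```

With this table, `ze_result_name(0x78000003u)` returns the string `ZE_RESULT_ERROR_UNSUPPORTED_FEATURE`, which is the code reported in the title.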
I have a valid device handle, which I'm using to call zesDeviceEnumEngineGroups to get usage info from the engines, and that's working well.
I've tried running as sudo in case there was a permissions issue, but that didn't help.
The above outputs:
I've tried setting processCount to 0 to have it tell me how many process items to use, but that has the same error code returned.
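The count query described above is the first half of the usual two-call pattern: ask for the count, allocate, then ask again for the data. The sketch below illustrates that pattern in C against a mock, so it compiles without Level Zero headers. Everything in it is hypothetical: `proc_state` stands in for `zes_process_state_t`, `mock_get_state` stands in for `zesDeviceProcessesGetState`, and the `RES_*` codes are local placeholders rather than real `ze_result_t` values. The mock's "buffer too small" behaviour reflects my reading of the API documentation and is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for zes_process_state_t. */
typedef struct {
    uint32_t process_id;
    uint64_t mem_size;
    uint64_t engines;   /* bit mask of engine types; 0 means no flags set */
} proc_state;

/* Local placeholder result codes, not real ze_result_t values. */
enum { RES_OK = 0, RES_TOO_SMALL = 1, RES_NO_MEMORY = 2 };

typedef int (*get_state_fn)(void *device, uint32_t *count, proc_state *out);

static const proc_state k_fake[] = {
    {101, 4096, 0x2}, {202, 8192, 0x0}, {303, 0, 0x8},
};

/* Mock driver call: reports three fixed processes. */
int mock_get_state(void *device, uint32_t *count, proc_state *out)
{
    const uint32_t total = 3;
    (void)device;
    if (*count == 0 || out == NULL) {   /* count query only */
        *count = total;
        return RES_OK;
    }
    if (*count < total) {               /* caller's buffer is too small */
        *count = total;
        return RES_TOO_SMALL;
    }
    for (uint32_t i = 0; i < total; i++) {
        out[i] = k_fake[i];
    }
    *count = total;
    return RES_OK;
}

/* Two-call pattern: query the count, allocate, fetch. Retries a few times
   if the process list grows between the two calls. Returns a malloc'd
   array (caller frees) or NULL; *result holds the last result code. */
proc_state *query_processes(get_state_fn get, void *device,
                            uint32_t *n, int *result)
{
    *n = 0;
    for (int attempt = 0; attempt < 4; attempt++) {
        uint32_t count = 0;
        *result = get(device, &count, NULL);
        if (*result != RES_OK || count == 0) {
            return NULL;                /* error, or nothing to report */
        }
        proc_state *list = malloc(count * sizeof *list);
        if (list == NULL) {
            *result = RES_NO_MEMORY;
            return NULL;
        }
        *result = get(device, &count, list);
        if (*result == RES_OK) {
            *n = count;
            return list;
        }
        free(list);
        if (*result != RES_TOO_SMALL) {
            return NULL;                /* any other error: give up */
        }
    }
    return NULL;
}
```

Any error from the first call, such as the unsupported-feature result reported here, is passed straight back to the caller through `result`, so a monitoring tool can tell "no processes" apart from "query not supported on this device".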
I'm using libze-intel-gpu1 version 24.52.32224.5-1
24.10ppa2, and libze1 version 1.19.2.0-1076~24.10.
Thanks,
James