FrameID-labeled Present/Vsync Queue, Vulkan fixes, plus other improvements #196

Open
wants to merge 17 commits into
base: master

@SRSaunders SRSaunders commented Feb 18, 2024

I have been meaning to post these changes for a while, but am finally getting around to it now. They were developed for use in RBDoom3BFG and run on all platforms: Windows, Linux, and macOS. They are a combination of bug fixes and feature enhancements as follows:

  1. Clear threadTLS on thread exit to improve runtime stability and to avoid memory leaks
  2. Use Vulkan events and double-precision floating point to improve Vulkan GPU clock sync, reducing offset errors and drift.
  3. Eliminate Vulkan validation errors (see issue #124, "Vulkan GPU implementation not conforming with Validation Layers") by using vkResetQueryPool() instead of vkCmdResetQueryPool() when resolving timestamps. Note that this requires instance initialization using Vulkan 1.2 with the hostQueryReset feature enabled or, for non-1.2 applications, enabling the VK_EXT_host_query_reset extension.
  4. Use the monotonic clock on macOS to match other platforms and to be compatible with the timing info returned by the VK_GOOGLE_display_timing extension.
  5. Capture and resolve delayed GPU timestamps before dumping data on stop capture; these were previously lost. Also protect the data dump with a mutex lock to prevent random failures on stop capture.
  6. Add support for a Vulkan Present/Vsync queue with frameID labeling via the VK_GOOGLE_display_timing extension. The feature is not platform-specific, but as far as I can tell the extension is currently supported only on macOS/MoltenVK. If other platforms were to implement the extension, the feature would work there as well.
  7. Add support for labeling the DX12 Present/Vsync queue with frameID information. This is useful for determining CPU to GPU to Present frame latency.
  8. Add typeless prototype for OPTICK_GPU_CONTEXT / GPUContextScope() which allows runtime selection of the graphics API for Optick (i.e. DX12 or Vulkan) without recompilation or reconfiguration.
  9. Add support for: a) disabling static Vulkan functions via a config setting, and b) configuring dynamic Vulkan functions at runtime by providing only the vkGetInstanceProcAddr() function pointer. Now discovers and assigns dynamic Vulkan functions separately for each device/node vs. assigning them globally. Static and manually assigned Vulkan functions remain global across devices/nodes.
  10. Change existing infrastructure to enable reporting of runtime errors with text descriptions, both in the debugger and on console stderr, across all platforms.
  11. Fix clang warnings for vsprintf() deprecation, as well as unsigned long vs. uint32_t and int vs. size_t type mismatches.
  12. Add support for data tags on custom storage events: OPTICK_STORAGE_TAG(STORAGE, CPU_TIMESTAMP, NAME, ...)
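
To illustrate item 2, here is a minimal sketch of why double precision matters when mapping GPU ticks onto the CPU timeline. It is not the code from this PR: `ClockSync`, `gpuToCpuNs`, and `gpuToCpuNsFloat` are hypothetical names, and the sketch assumes a calibration pair (a CPU time and a GPU tick count captured together) plus a tick period in nanoseconds.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical calibration record: a CPU timestamp and a GPU tick count
// assumed to have been captured at (nearly) the same moment.
struct ClockSync {
    int64_t  cpuNs;     // CPU monotonic time at calibration, in nanoseconds
    uint64_t gpuTicks;  // GPU timestamp counter at calibration
    double   nsPerTick; // tick period in ns (e.g. derived from timestampPeriod)
};

// Map a GPU tick value onto the CPU timeline using double precision.
inline int64_t gpuToCpuNs(const ClockSync& s, uint64_t gpuTicks) {
    // Use the signed delta from the calibration point so ticks before
    // the calibration also map correctly.
    const int64_t deltaTicks = static_cast<int64_t>(gpuTicks - s.gpuTicks);
    return s.cpuNs + std::llround(static_cast<double>(deltaTicks) * s.nsPerTick);
}

// The same mapping in single precision, for comparison only. A float has a
// 24-bit significand, so large tick deltas lose their low-order bits.
inline int64_t gpuToCpuNsFloat(const ClockSync& s, uint64_t gpuTicks) {
    const int64_t deltaTicks = static_cast<int64_t>(gpuTicks - s.gpuTicks);
    return s.cpuNs + std::llround(static_cast<float>(deltaTicks) *
                                  static_cast<float>(s.nsPerTick));
}
```

With a 1 ns tick period, a delta of 100000001 ticks is preserved by the double version, while the float version cannot represent it exactly and lands on a neighboring value. Errors of this kind grow with capture length, which is one plausible source of the drift mentioned above.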

Note: To implement these improvements I had to make a small number of Optick API changes. I tried to keep these to a minimum to preserve portability. Specifically, I had to add arguments to OPTICK_GPU_INIT_VULKAN() and OPTICK_GPU_FLIP(), and add new Vulkan function pointers to VulkanFunctions. Please be aware that if you adopt this PR into an existing Optick implementation, you may need to make a few changes.
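
For adopters, the per-device function resolution described in item 9 can be pictured with the rough sketch below. It is an illustration under stated assumptions, not this PR's implementation: `FakeHandle`, `DeviceFunctions`, `FunctionRegistry`, and `demoLoader` are all hypothetical, and the loader signature is simplified so the sketch compiles without the Vulkan SDK (the real `vkGetInstanceProcAddr()` takes a `VkInstance` and returns a `PFN_vkVoidFunction`).

```cpp
#include <cstring>
#include <unordered_map>

// Stand-ins for Vulkan types (assumptions, not the real definitions).
using FakeHandle    = void*;
using VoidFn        = void (*)();
using GetProcAddrFn = VoidFn (*)(FakeHandle, const char*);

// Hypothetical per-device function table; the real VulkanFunctions
// struct in Optick has different members.
struct DeviceFunctions {
    VoidFn resetQueryPool    = nullptr;
    VoidFn cmdWriteTimestamp = nullptr;
};

// Resolves functions once per device from a single loader entry point,
// instead of storing one global set shared by every device.
class FunctionRegistry {
public:
    explicit FunctionRegistry(GetProcAddrFn loader) : loader_(loader) {}

    const DeviceFunctions& resolve(FakeHandle device) {
        auto it = tables_.find(device);
        if (it != tables_.end()) return it->second; // already resolved

        DeviceFunctions f;
        f.resetQueryPool    = loader_(device, "vkResetQueryPool");
        f.cmdWriteTimestamp = loader_(device, "vkCmdWriteTimestamp");
        return tables_.emplace(device, f).first->second;
    }

private:
    GetProcAddrFn loader_;
    std::unordered_map<FakeHandle, DeviceFunctions> tables_;
};

// Dummy target so the demo loader has something to return.
inline void dummyVulkanFn() {}

// Demo loader for this sketch only; an application would pass its own
// entry point here.
inline VoidFn demoLoader(FakeHandle, const char* name) {
    return std::strncmp(name, "vk", 2) == 0 ? &dummyVulkanFn : nullptr;
}
```

Keeping one table per device means two devices backed by different drivers never share function pointers, which is the motivation given in item 9 for moving away from global assignment of dynamic functions.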
