This file outlines some of the most important concepts detailing how the layer works internally. Useful mainly for development on it.
subprojects
: place for meson-built subprojects
dlg
: dlg is a lightweight C/C++ logging library. It pretty much does exactly what we need for logging/assertion functionality, not more and not less. Care has to be taken when an application also uses dlg since then, they share a handler (by design).
swa
: swa is a lightweight C window abstraction. It is used to create the external window to display the introspection gui. On X11, it is also used to create an overlay window for the hooked input overlay. We use this over something like glfw because we have some special needs (as we need overlay windows and create windows from inside the layer) that were easier to implement in a self-written, more lightweight library. swa was furthermore not primarily designed for opengl, it's even possible to completely build it without opengl support. We use it over something like gtk or Qt to keep the number of dependencies minimal and the overall layer lightweight.
pml
: pml is a lightweight C posix main loop. Dependency of swa on unix.
src
: pretty much all source code and non-public headers go here
command
: for all command-related utility
util
: utility headers that are not directly involved with the vulkan API and/or are used in potentially multiple places throughout the layer. Most of these utilities were imported from previous projects.
gui
: all logic for the introspection gui goes here. Resource viewer, command viewer, vertex viewer, image viewer, overview UI and so on.
commandHook
: the central CommandHook class and its utilities are implemented here. They are responsible for modifying the recorded command buffers submitted by applications.
data
: shader sources. As we have a small number of rather static shared sources, there is a prebuilt subfolder that should contain compiled versions of the latest shader sources. When glslang is found, the shaders are automatically recompiled as part of the compilation process. The prebuilt folder is mainly there for windows. Since it does not have easy dependency management, it spares us from including glslang as subproject.
vk
: The place for vulkan headers and external utility. We include our own headers so that we aren't dependent on the version of the installed ones.
vkutil
: The place for internal low-level Vulkan utility functions and wrappers, such as the enumString.hpp utility generated by vkpp that is used to transform enum values into strings.
imgui
: The sources for Dear ImGui which we use for our introspection gui. We use static sources instead of building it as subproject as it's easier to build this way. We compile it directly into the layer library.
tracy
: Source of the profiling tool we use, directly baked into the library. See performance.md for more details on profiling.
spirv-cross
: Source of SPIRV-Cross. We use it for SPIR-V reflection and patching. As of May 2021, we still use spirv-reflect in most places but this will likely be replaced with SPIRV-Cross. We need SPIRV-Cross for the xfb patching we have to do for the vertex viewer. Doing this can involve things such as evaluating (spec-)constant shader expressions (e.g. array sizes), which isn't something we want to implement ourselves in the layer.
spvm
: Contains sources for the SPIRV-VM library vil uses for shader debugging. We should probably move that to a git subproject sooner or later since we often need to change/extend/fix it during development.
minhook
: sources of the minhook library. Used only for the hooked win32 overlay, to grab/block input from the application.
backward
: heavily modified/stripped sources of the backward library used to capture stack traces
tao/pegtl
: We use the tao/pegtl library to parse expressions. Used for instance by the buffer viewer to convert a glsl-like type specification to an internal type representation used to format buffer content, see src/util/bufparser.cpp.
test/unit
: To test the functions and classes not exported from the layer, we compile the tests directly into the shared library (when tests are enabled; they are off by default, but always run on the CI). The tests compiled into the shared library are defined in this folder. We also have a main.cpp here that is able to execute the tests. Unit tests are mainly for internal utility.
test/integration
: We have some integration tests here. They just use Vulkan like any app would, with the vil layer loaded. Preferably, and to make this work easily in CI, we use the null Vulkan mock_icd driver but load the validation layers after vil to make sure they catch our errors.
include
: Only the public API header lives here. Future API or otherwise public headers should go here.
docs
: Documentation, examples, pictures. The own subfolder contains many incredibly smart concepts, ideas and design documents disguised as error-ridden gibberish with spelling errors, rhetorical questions and inconsistencies. There you will also find the ever-growing todo list.
vil is written in C++17. When all supported platforms support C++20,
its features will be used as well.
In most places, vil does not expect, handle or throw exceptions. We
simply assume, for instance, that no vector re-allocation will throw
since we couldn't properly handle it anyway. And throwing from inside the
layer during an API call is undefined behavior (since it crosses a shared
library C boundary). Therefore most code isn't written in an exception-safe
manner. If an exception is thrown, something we can't fix anymore already
went wrong.
There are encapsulated subsystems that may use exceptions (e.g. the buffer format parser).
Everything in the layer lives in the vil namespace.
For almost every vulkan handle, there is a representation on our side.
VkInstance is represented by vil::Instance, VkDevice by vil::Device,
VkImageView by vil::ImageView and so on.
vil::Device has tables mapping the vulkan handles to their
representations inside the layer. For dispatchable handles (Instance, Device,
CommandBuffer, Queue; we also use it for VkSurfaceKHR), there is a global
table in src/data.hpp.
The layer optionally wraps handles; see [env.md](docs/env.md) for configuration. It's even possible to decide on a per-object-type basis whether wrapping is done. See this post for more details on handle wrapping. Wrapping handles allows us to bypass those (potentially large, synchronized) global or per-device lookup tables and instead directly cast into our representation of the handles. But it also decreases the chance that the layer works with extensions it does not explicitly support.
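To make the trade-off concrete, here is a minimal sketch of both lookup strategies. It is not the layer's actual code: `ApiHandle`, `HandleTable`, `wrap` and `unwrap` are hypothetical names, and a plain 64-bit integer stands in for a non-dispatchable Vulkan handle.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a non-dispatchable Vulkan handle.
using ApiHandle = std::uint64_t;

// Hypothetical layer-side representation of an image.
struct Image {
    ApiHandle handle {}; // handle as seen by the next layer/the driver
    int width {};
};

// Variant 1 (no wrapping): a synchronized table maps handles to our objects.
// Every lookup has to take the lock and hash the handle.
class HandleTable {
public:
    Image& add(ApiHandle handle, int width) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto& slot = map_[handle];
        slot = std::make_unique<Image>(Image{handle, width});
        return *slot;
    }

    Image* find(ApiHandle handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(handle);
        return it == map_.end() ? nullptr : it->second.get();
    }

private:
    std::mutex mutex_;
    std::unordered_map<ApiHandle, std::unique_ptr<Image>> map_;
};

// Variant 2 (wrapping): the handle given to the application is simply the
// address of our object, so getting back to it is a cast, no lock, no table.
inline ApiHandle wrap(Image& img) {
    return reinterpret_cast<std::uintptr_t>(&img);
}

inline Image& unwrap(ApiHandle wrapped) {
    return *reinterpret_cast<Image*>(static_cast<std::uintptr_t>(wrapped));
}
```

With wrapping, every API call that receives the handle has to unwrap it before forwarding the call down the chain. That is why calls from extensions the layer does not know about can break: they would pass the wrapped value on to the driver.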
Synchronization is currently done mainly over a single mutex in vil::Device.
Accesses that are not guaranteed to be externally synchronized by vulkan
must be protected by this mutex. This may lead to a lot of critical
sections and slow down multithreaded applications. During early development
of the layer, there were experiments with more local per-object mutex objects
but this created unmaintainable complexity due to the various new
deadlock possibilities.
Example cases where the mutex is needed:
- A new ImageView is created and adds a reference to itself in the associated Image. A critical section is needed since other threads could create ImageViews for the same image, or the gui could currently be iterating over all ImageViews of a given Image in another thread.
- When a CommandBuffer is reset (e.g. by vkBeginCommandBuffer or vkResetCommandPool), we need to change its internal state that might at the same time be read from another thread (e.g. by our gui rendering). So we have to lock a mutex.
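The first case could be sketched as follows. The types are heavily simplified, hypothetical stand-ins (the real objects carry far more state); only the locking pattern matters here.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

struct ImageView;

// Simplified stand-ins for the layer-side objects.
struct Image {
    std::vector<ImageView*> views; // guarded by Device::mutex
};

struct ImageView {
    Image* image {};
};

struct Device {
    std::mutex mutex; // the single device-wide mutex
};

// Called when the application creates an image view: the back-reference in
// the image is shared state, so it is only touched with the mutex held.
void linkView(Device& dev, Image& img, ImageView& view) {
    view.image = &img; // the view is not visible to other threads yet
    std::lock_guard<std::mutex> lock(dev.mutex);
    img.views.push_back(&view);
}

// What a reader such as the gui would do: take the same mutex while
// inspecting the connection.
std::size_t countViews(Device& dev, const Image& img) {
    std::lock_guard<std::mutex> lock(dev.mutex);
    return img.views.size();
}
```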
Since DescriptorSet updates can be a bottleneck and are often done from multiple threads, we have a separate synchronization mechanism for that. Basically, we separate DescriptorSet handles from their state at a given point in time. We can attach a DescriptorSetCow (copy-on-write) object to a DescriptorSet to make sure we can later view its state as of that point in time, even if the DescriptorSet has since been destroyed or updated.
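A rough sketch of the copy-on-write idea, under loud assumptions: the member names are invented, bindings are reduced to a vector of integers, and all locking is left out so that only the snapshot logic remains. The real mechanism is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct DescriptorSet;

// Snapshot object. While `source` is set, the snapshot still aliases the
// live set; once the set changes or dies, the old state is moved into `copy`.
struct DescriptorSetCow {
    DescriptorSet* source {};
    std::vector<int> copy;
};

struct DescriptorSet {
    std::vector<int> bindings; // stand-in for the real binding state
    std::shared_ptr<DescriptorSetCow> cow;

    // Attaches a snapshot (or returns the one already attached). No copy yet.
    std::shared_ptr<DescriptorSetCow> addCow() {
        if(!cow) {
            cow = std::make_shared<DescriptorSetCow>();
            cow->source = this;
        }
        return cow;
    }

    void update(std::size_t index, int value) {
        detachCow(); // preserve the old state before writing
        bindings[index] = value;
    }

    ~DescriptorSet() { detachCow(); }

private:
    // The copy only happens here, and only if somebody holds a snapshot.
    void detachCow() {
        if(!cow) return;
        cow->copy = bindings;
        cow->source = nullptr;
        cow.reset();
    }
};

// Reads the state as it was when the snapshot was attached.
inline std::vector<int> readState(const DescriptorSetCow& cow) {
    return cow.source ? cow.source->bindings : cow.copy;
}
```

The point of the design is that sets which are never inspected pay nothing beyond a null check on each update.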
In general, we keep track of some connections between handles where performance allows it to make it possible to view them in the gui. While the gui is rendered, it will lock the device mutex in many places when accessing these connections.
In addition to the standard device mutex, there is also the device queue mutex, used to synchronize submissions to queues (since vulkan does not allow calling queue operations on the same queue from multiple threads at the same time).
API objects created by the application generally have an object counterpart inside the layer, e.g. vil::Image, vil::DescriptorSet, vil::CommandBuffer, or vil::Device. They all derive from vil::Handle which just knows its own debug name.
Many device-level handles have an embedded, intrusive, atomic reference
count and have shared ownership. As long as their API counterpart is
alive, the device has one owned reference to them and they are guaranteed
to be kept alive. But most handles can stay alive beyond that (e.g. a
vil::ImageView object existing even though the application has called
vkDestroyImageView). There are multiple reasons for that:
- For transient handles, it allows us to show some information (such as create info or debug name) in the UI even if the handle was destroyed. It's also useful for command matching, which is done to find a selected command in future submissions as well.
- Without shared ownership, we would have to track which CommandRecord or DescriptorSet objects use which handles. That was done in the beginning and had huge overhead (mainly because it's hard to do in parallel with proper synchronization), shared ownership made this a lot easier and faster.
- In some cases, the Vulkan spec allows destroying handles very early, e.g. destroying the PipelineLayout right after recording of a command buffer. We might still need those API handles later on (e.g. for command hooking, see below). Therefore we will not just keep our object-representation of the handle alive, but the handle (with further layers/the driver) itself as well, so we can use it later on. But this only happens for some layout handles.
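The shared-ownership scheme described above can be illustrated with a small intrusive pointer. This is a sketch only: `IntrusivePtr`, the `refCount` member and the `alive` counter are names invented for the example and are not taken from the layer's sources.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Stand-in for a layer-side handle with an embedded reference count.
struct ImageView {
    static inline std::atomic<int> alive {0}; // only for observing lifetime
    std::atomic<unsigned> refCount {0};

    ImageView() { ++alive; }
    ~ImageView() { --alive; }
};

// Minimal intrusive smart pointer: the count lives inside the object, so no
// separate control block has to be allocated.
template<typename T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    explicit IntrusivePtr(T* ptr) : ptr_(ptr) {
        if(ptr_) ptr_->refCount.fetch_add(1, std::memory_order_relaxed);
    }
    IntrusivePtr(const IntrusivePtr& rhs) : IntrusivePtr(rhs.ptr_) {}
    IntrusivePtr(IntrusivePtr&& rhs) noexcept
        : ptr_(std::exchange(rhs.ptr_, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr rhs) noexcept {
        std::swap(ptr_, rhs.ptr_);
        return *this;
    }
    ~IntrusivePtr() { reset(); }

    // Drops this reference; the last owner deletes the object.
    void reset() {
        if(ptr_ && ptr_->refCount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete ptr_;
        }
        ptr_ = nullptr;
    }

    T* get() const { return ptr_; }
    T* operator->() const {
        assert(ptr_);
        return ptr_;
    }

private:
    T* ptr_ {nullptr};
};
```

In this picture the device holds one `IntrusivePtr` per live API handle and drops it in the destroy call, while records or the gui may hold further references that keep the object around for display.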
vil::Command (see commands.hpp) is our internal representation
of a single command added to a command buffer. When the application
records a command buffer, we build a list (in truth: rather a hierarchy) of
these objects for multiple purposes:
- Viewing the recorded command buffer and its state in the gui
- Knowing what the command buffer does, e.g. the image layout transitions it performs are needed when the command buffer is submitted, to keep track of current image layouts.
- Command hooking, as explained below.
vil::CommandRecord (see record.hpp) holds all state for a
recorded command buffer, i.e.
all commands, the usage flags with which the record was begun, which handles are used.
It's disconnected from the command buffer itself and can outlive it. You can imagine
the CommandBuffer as a builder for a CommandRecord object.
CommandRecords are used in multiple places and ownership is shared via an
intrusive reference count.
One speciality is its custom memory allocation mechanism, see
linalloc.hpp and command/alloc.hpp.
Since CommandBuffer recording can be a bottleneck and might involve many thousands
of commands, we don't have any time to waste there. Therefore, we always allocate larger
blocks of memory and place all commands and command-related data into them.
We then link the commands together so that they form a hierarchy of linked lists.
The memory is simply freed when the CommandRecord is no longer needed (we
explicitly skip all destructor calls of vil::Command and its derived classes).
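The allocation scheme can be sketched roughly like this. It is an illustration under assumptions, not the code from linalloc.hpp: `LinearAllocator`, `addCommand` and the reduced `Command` struct are hypothetical, and unlike the layer, this sketch only accepts trivially destructible types so that skipping destructors is obviously safe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Bump allocator: hands out memory from large blocks and never frees single
// objects. Everything is released at once when the allocator is destroyed.
class LinearAllocator {
public:
    explicit LinearAllocator(std::size_t blockSize = 16 * 1024)
        : blockSize_(blockSize) {}

    template<typename T, typename... Args>
    T& construct(Args&&... args) {
        static_assert(std::is_trivially_destructible_v<T>,
            "destructors are never run for objects placed here");
        static_assert(alignof(T) <= alignof(std::max_align_t));
        void* mem = allocate(sizeof(T), alignof(T));
        return *new(mem) T(std::forward<Args>(args)...);
    }

    std::size_t blockCount() const { return blocks_.size(); }

private:
    void* allocate(std::size_t size, std::size_t align) {
        assert(size <= blockSize_);
        // Round the current offset up to the required alignment.
        std::size_t offset = (offset_ + align - 1) & ~(align - 1);
        if(blocks_.empty() || offset + size > blockSize_) {
            blocks_.push_back(std::make_unique<std::byte[]>(blockSize_));
            offset = 0;
        }
        offset_ = offset + size;
        return blocks_.back().get() + offset;
    }

    std::size_t blockSize_;
    std::size_t offset_ {};
    std::vector<std::unique_ptr<std::byte[]>> blocks_;
};

// Reduced command node: siblings are chained via `next`, nested commands
// (e.g. everything inside a render pass) hang off `children`.
struct Command {
    const char* name {};
    Command* next {};
    Command* children {};
};

// Creates a command in the allocator and stores it in `link`, which is either
// the `next` of the previous sibling or the `children` of the parent.
inline Command& addCommand(LinearAllocator& alloc, Command*& link, const char* name) {
    Command& cmd = alloc.construct<Command>();
    cmd.name = name;
    link = &cmd;
    return cmd;
}

// Walks the hierarchy of linked lists.
inline std::size_t countCommands(const Command* cmd) {
    std::size_t count = 0;
    for(; cmd; cmd = cmd->next) {
        count += 1 + countCommands(cmd->children);
    }
    return count;
}
```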
vil::CommandHook (in src/gui/commandHook.hpp) and the associated
classes CommandHookRecord, CommandHookSubmission allow us to insert
commands into the submissions done by the application.
A CommandHook can be installed in the vil::Device and will be considered
every time commands are submitted to a queue of that device via
CommandHook::hook. That function gets a submission to a queue
and can replace command buffers with internal, patched replacements.
To "insert" commands, it simply reads from the recorded command buffer
(using our vil::Command objects created at recording time) and records the commands
into an internally created command buffer, adding or altering commands as needed.
The reason we don't already do this directly into the application's command
buffers as it records them is that we might not know at that point of time
what is needed. Consider an application that does not re-record command buffers
every frame. We can't know at the time of recording which command the user
will inspect in the introspection gui so we can't know where to insert
our additional introspection commands that copy data as needed.
For every hooked CommandRecord, the CommandHook creates a CommandHookRecord,
holding all the data it needs to hook that specific record.
That data includes a VkQueryPool to query timings, the newly recorded
VkCommandBuffer itself, copied buffers, locally created replacement render passes
and so on. When the application submits the same recorded version of a command
buffer multiple times, our CommandHookRecord may be used multiple times.
But when what we want to hook is changed (e.g. because another command
or command state I/O resource is selected in the gui) we invalidate
all previously created hooked records.
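The re-recording, reuse and invalidation described above might look like this in a strongly reduced form. Commands are plain strings here, and the class layout, `select` and `rerecordCount` are assumptions made for the example rather than the layer's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for a recorded command buffer: just the command names.
struct CommandRecord {
    std::vector<std::string> commands;
};

// Stand-in for the data kept per hooked record: the patched command list.
struct CommandHookRecord {
    std::vector<std::string> patched;
};

class CommandHook {
public:
    // Changing what we want to hook invalidates all hooked records.
    void select(std::size_t commandIndex) {
        selected_ = commandIndex;
        hooked_.clear();
    }

    // Called at submission time. Re-records only if there is no valid hooked
    // record for this CommandRecord yet, otherwise reuses the existing one.
    const CommandHookRecord& hook(const CommandRecord& record) {
        auto it = hooked_.find(&record);
        if(it == hooked_.end()) {
            it = hooked_.emplace(&record, rerecord(record)).first;
            ++rerecordCount_;
        }
        return it->second;
    }

    std::size_t rerecordCount() const { return rerecordCount_; }

private:
    // Replays the recorded commands and inserts our own after the selection.
    CommandHookRecord rerecord(const CommandRecord& record) const {
        CommandHookRecord out;
        for(std::size_t i = 0; i < record.commands.size(); ++i) {
            out.patched.push_back(record.commands[i]);
            if(i == selected_) {
                out.patched.push_back("copy state (inserted)");
            }
        }
        return out;
    }

    std::size_t selected_ {};
    std::size_t rerecordCount_ {};
    std::unordered_map<const CommandRecord*, CommandHookRecord> hooked_;
};
```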
Every time a CommandHookRecord is used to hook a command buffer submitted by the application,
a CommandHookSubmission object is created. It serves mainly as a handler
to do all processing and copying needed when the submission is finished.
All that state is held in CommandHookState, which is directly accessed
when rendering the command buffer gui.
Aside from copying state at a selected command, we also use CommandHook
to capture bookkeeping data when needed, for instance in
vkCmdBuildAccelerationStructuresKHR.
One of the most difficult things to figure out was how to make gpu state
inspection inside render passes possible. Vulkan does not allow transfer
commands inside a render pass so simply copying bound textures/buffers
before a draw command is executed does not work. Alternatively, moving
the introspection transfer commands just outside the boundaries of the
render pass does not work either as there are cases where resources
used by draw commands can be modified by earlier draw commands. We furthermore
want to see the effects of single draw commands to the framebuffer attachments.
The solution: render pass splitting. We simply split up the render pass
around the selected draw command so that we can insert our needed
transfer commands before/after the split render pass. We achieve this
by splitting the active render pass into 3 parts. Usually, using a pipeline
with a different render pass is not possible but in almost all cases,
we can actually make these 3 new render passes compatible with the old one
since the loadOp and storeOp values are ignored for compatibility
and they are the only ones we really need to change.
The only case where this approach can create problems: a render pass with
multiple subpasses where an attachment is first used as resolve attachment
and then later on used in specific different ways. There is no solution
for this at the moment, we simply don't allow command inspection in this case.
See src/rp.hpp for the details. vil::splittable(...) returns
whether the splitting approach is possible for the given render pass
description. We have a small test rpsplit.cpp that
should be extended when issues with that are found in the future.
vil::splitIterrutable(...) then spits out render pass create infos
that can be used to create the new render passes.
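As an illustration of why only loadOp and storeOp need to change, the following sketch derives the operations of the three partial render passes for one attachment. The enums and `splitOps` are hypothetical simplifications (the real code works on Vulkan render pass descriptions), and the exact choice of operations is an assumption based on the description above: contents must survive every split point, while the first load and the last store keep the application's intent.

```cpp
#include <array>
#include <cassert>

// Simplified stand-ins for VkAttachmentLoadOp / VkAttachmentStoreOp.
enum class LoadOp { load, clear, dontCare };
enum class StoreOp { store, dontCare };

struct AttachmentOps {
    LoadOp loadOp;
    StoreOp storeOp;
};

// Returns the ops for the parts before, around and after the selected draw.
inline std::array<AttachmentOps, 3> splitOps(const AttachmentOps& original) {
    return {{
        {original.loadOp, StoreOp::store}, // part 1: original load, keep result
        {LoadOp::load, StoreOp::store},    // part 2: continue, keep result
        {LoadOp::load, original.storeOp},  // part 3: continue, original store
    }};
}
```

Because formats, sample counts and attachment references stay untouched, pipelines created for the original render pass remain usable with all three parts.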
- write about shader debugging
- write about getting vertex data and transform feedback via spirv injection
- write about local hooks
- write some notes about threading