Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accel struct and faults #14

Merged
merged 10 commits into from
Jul 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .launch
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
/home/jan/code/iro/build/npt
--vil /home/jan/art/assets/sponza/Sponza.gltf --no-validation
48 changes: 43 additions & 5 deletions docs/own/accelStruct.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,14 +50,14 @@ CmdBuildAccelerationStructure(params) {
auto& build = record.accelStructBuilds.emplace_back();
build.accelStruct = ...
// only need to create new state if buffers aren't large
// enough anymore.
// enough anymore.
// Otherwise use existing accelStruct state.
build.dstState = ...
build.dstState->builtStatically = true;

// records the commands needed to capture the data
// directly into this cb
recordCaptureAccelStructData(cb, build.dstDate, params);
recordCaptureAccelStructData(cb, build.dstDate, params);

dispatch.CmdBuildAccelerationStructure(params);
}
Expand Down Expand Up @@ -131,7 +131,7 @@ struct AccelStructBuildOp {
inside the submission/hookRecord that is fired up on activation.
- when a submission that updates AccelStructState (either dynamically
or statically) is submitted (activated? don't think so. Just need to carefully
sync it with potential later submissions that might be executed first),
sync it with potential later submissions that might be executed first),
it first checks if it needs to resolve the cow.
If so, it first submits the copy command.
(wait, but at that point we can't submit it anymore before the activated
Expand All @@ -144,9 +144,47 @@ struct AccelStructBuildOp {

This whole async submission/order thing is making this really hard :(
If we already know at copyDs time that the state is written in future (by
a submission that was not activated yet)(need an extra flag or something)
just copy the state instantly and don't bother with the whole cow thing
a submission that was not activated yet)(need an extra flag or something)
just copy the state instantly and don't bother with the whole cow thing
(important since it won't be resolved correctly otherwise).

we REALLY don't want to delay hooking/submission until a submission is
activated. That is way too intrusive.

---

# Next iteration 2024

- meh, don't do any cow stuff for now. Keep it simple.
- hard case: when viewing a tlas we don't want to blases (or its state)
to be destroyed. Could capture it once.
- current bad case: TLAS is build. BLAS referenced in it is destroyed.
TLAS state is viewed in gui. :(
- capturing would happen in CommandHookSubmission finishing.
Should be enough. But how to capture it? maybe store
"finishedState" in accelStructs? hm when blas is built AFTER
tlas was build (e.g. in same cb), it's wrong again.
At each tlas build, capture a full mapping of all blas
addresses to their current states (at that point in time).
This could be very expensive :/
- fully correct way: maintain some kind of mapping on gpu.
data structure allows getting metadata for blas address.
we then write out the version number of the blas at the time
of building, when copying tlas instances.
uff. Then only need to ensure blas states are still alive when
resolved (for ref count inc).
Complicated!

Hm okay wait. When a tlas is viewed in resource viewer, these
considerations do not matter: when a blas in it was destroyed/updated/rebuilt
afterwards, its invalid anyways. We just show new blas state in ui, no
big deal.
So its only important when viewing tlas in dispatchRays viewer. Or
for shader debugging or something. But when we capture the tlas
via descriptor, we could just create a map of active blas states
at *that* point. `UnorderedMap<u64, IntrusivePtr<AccelStructState>>`

In resource viewer, can compare TLAS build time with BLASes build times
and show "invalid; blas out of date" warning.
When a submission with a TLAS is used, we can be sure that is is valid
*at that point*. So just capture the TLAS and all BLAS states at that point.
14 changes: 8 additions & 6 deletions docs/own/cow.md
Original file line number Diff line number Diff line change
Expand Up @@ -138,7 +138,7 @@ layout for sets that don't have any writable bindings to begin with?
__For now, let's just ignore storage image writes, it's too hard to track
efficiently. We just focus on resources (for now: only images) written
explicitly, i.e. via transfer. Add support for attachments later__

# Support mappable resources

in MapMemory:
Expand Down Expand Up @@ -270,7 +270,7 @@ we know all submissions happening before it). And not resolve cows
before they are 'active'.
But even then, we wouldn't catch all cases. We would incorrectly add
the resolve to the first submitted-out-of-order write submission.
So, in conclusion, we would have to postpone the cow adding (or activating)
So, in conclusion, we would have to postpone the cow adding (or activating)
and resolve op scheduling until we know a submission to be logically ready.

Wow, that complicates things a lot!
Expand All @@ -297,6 +297,8 @@ In the end we could still disable cows if events are used or something (or
based on an environment variable).

EDIT: yeah, seems valid per spec. damn.
EDIT EDIT: no, not really. See discussion in
https://github.com/KhronosGroup/Vulkan-Docs/issues/755

---

Expand Down Expand Up @@ -417,7 +419,7 @@ Semaphore::updateUpper(u64 value) {
for(auto& wait : waits) {
// resolve the wait
if(wait.value > value) {
continue;
continue;
}

// check if this activated the submission
Expand Down Expand Up @@ -457,7 +459,7 @@ struct SubmissionGraph {
// We just need to make sure semaphores aren't destroyed, use
// IntrusivePtrs for that.
using SyncedOp = std::variant<
SwapchainAcquireOp,
SwapchainAcquireOp,
PresentOp,
QueueWaitOp,
IntrusivePtr<SubmissionBatch>,
Expand Down Expand Up @@ -492,7 +494,7 @@ __Detecting__ that a resource is potentially changed is easy, we don't
need the whole complexity of the CoW mechanism fore that.

Our main usecase for cows is the shader debugger. Secondary usecase:
sometimes inputs to shaders are HUGE static resources (like >100MB textures,
sometimes inputs to shaders are HUGE static resources (like >100MB textures,
think highly detailed cubemap or something). Having some CoW mechanism
there would be nice as well.

Expand All @@ -504,7 +506,7 @@ Shader debugger:
being modified. We could still "just use it" for debugging.
This means we might get weird results for one frame. So what, nobody
will notice :D
On a more serious note, performance in the shader debugger is mainly a
On a more serious note, performance in the shader debugger is mainly a
concern when not having frozen state. We could always force copies when
our state is frozen so we guarantee coherent input state in that case.
- alternatively, if we detect the change, we could simply show some
Expand Down
6 changes: 6 additions & 0 deletions docs/own/todo.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,9 @@ urgent, bugs:
- [ ] maybe show full image size on hover?
- [ ] also fix mip/layer selector that sometimes automatically resets itself
(seen with slice 3D selector e.g. npt surfel lookup tex)
- [ ] finish submissions in order, see CommandHookSubmission::finish
comment for accelStruct captures
- [ ] immediately free HookedRecords that are not to be re-used in finish.

- [ ] test more on laptop, intel gpu
- [ ] seems like we do some nasty stuff in the histogram shaders,
Expand Down Expand Up @@ -54,6 +57,8 @@ urgent, bugs:
(try to test with RDR2 again)

new, workstack:
- [ ] add mode where hooking is disabled. Commands can still be inspected
but, like, just statically.
- [ ] add isStateCmd(const Command&) and remove remaining command dynamic casts
- [ ] handle imgui cursor-to-be-shown and clipboard
- [ ] make sure to pass it via interface
Expand Down Expand Up @@ -87,6 +92,7 @@ new, workstack:
Either make tab-background change color on being focused,
make it more transparent or something or maybe use a thin window border?
{think window border looks good imo}
- [ ] cleanup: move out device_default stuff to gfr fork?

- [ ] (low prio, ui) sparse: for images, allow to show popup with bound
memory on hover in imageViewer. Also bond-to-memory overlay
Expand Down
3 changes: 3 additions & 0 deletions docs/own/workstack.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
- [ ] add serialize support for more commands (e.g. Bind)
- [ ] serialize further gui state, e.g. selected I/O
- [ ] better deep-matching for handles, especially pipelines
- [ ] ~hook.cpp:777 TODO PERF: clear out non-reuse hooked records
- [ ] optimize acceleration structure capturing
- optimal: when UI is not open, don't allocate on every rebuilt
- [ ] support viewing a specific array element in the buffer formatter
- [ ] we might not even copy the needed data for some elements.
Only copy the needed data, we already have paging for the gui
Expand Down
2 changes: 2 additions & 0 deletions meson.build
Original file line number Diff line number Diff line change
Expand Up @@ -190,6 +190,7 @@ src = files(
'src/overlay.cpp',
'src/platform.cpp',
'src/lmm.cpp',
'src/fault.cpp',
'src/util/util.cpp',
'src/util/fmt.cpp',
'src/util/f16.cpp',
Expand Down Expand Up @@ -276,6 +277,7 @@ src = files(
'src/submit.hpp',
'src/frame.hpp',
'src/threadContext.hpp',
'src/fault.hpp',
'src/command/commands.hpp',
'src/command/record.hpp',
'src/command/alloc.hpp',
Expand Down
Loading
Loading