HLSL recompilation and loading

megai2 edited this page Sep 24, 2019 · 4 revisions

Recompilation

HLSL shaders as used by DX9 cannot be used in DX12 directly, so at runtime they are reconstructed from the original bytecode into a new HLSL source file and then compiled.

This step is called recompilation.

Typically, this step takes far longer than standard shader loading.

Shader cache

To reduce this recompilation overhead during gameplay, d912pxy saves the bytecode of every recompiled shader, once complete, into the d912pxy/pck/shader_cso.pck file.
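The caching idea above can be sketched roughly as follows. This is a minimal in-memory illustration, not d912pxy's actual code: the `ShaderCache` class, the `HashBytecode` helper, and the choice of an FNV-1a hash as the cache key are all assumptions made for the example, and the real cache is persisted to the .pck file rather than kept in a map.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using Bytecode = std::vector<std::uint8_t>;

// FNV-1a hash over the original DX9 bytecode (hypothetical cache key).
inline std::uint64_t HashBytecode(const Bytecode& code) {
    std::uint64_t h = 14695981039346656037ull;
    for (std::uint8_t b : code) {
        h ^= b;
        h *= 1099511628211ull;
    }
    return h;
}

// In-memory stand-in for a persistent shader cache (hypothetical class).
class ShaderCache {
public:
    using Recompiler = std::function<Bytecode(const Bytecode&)>;

    explicit ShaderCache(Recompiler recompile)
        : recompile_(std::move(recompile)) {}

    // Returns cached bytecode, running the slow recompiler only on a miss.
    const Bytecode& Get(const Bytecode& dx9Code) {
        const std::uint64_t key = HashBytecode(dx9Code);
        auto it = entries_.find(key);
        if (it != entries_.end()) {
            return it->second;  // cache hit: no recompilation needed
        }
        ++recompileCount_;
        Bytecode compiled = recompile_(dx9Code);
        assert(!compiled.empty() && "recompiler returned no bytecode");
        return entries_.emplace(key, std::move(compiled)).first->second;
    }

    int RecompileCount() const { return recompileCount_; }

private:
    Recompiler recompile_;
    std::unordered_map<std::uint64_t, Bytecode> entries_;
    int recompileCount_ = 0;
};
```

Requesting the same DX9 bytecode twice triggers the recompiler only once; every later request is served from the stored result.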

However, this creates a new source of issues: if the recompilation of a given shader fails or produces visual bugs, the bad result stays in the cache unless the user does a clean install between releases.

To combat this problem, the binary package generated by AppVeyor contains an empty shader cache.

In the future, once most errors related to HLSL recompilation are fixed, a complete shader cache will be included in the binary package.

Shader profiles

Not all DX9 "nuances" or features can be translated into DX12 cheaply.

To do this properly, without creating huge overhead, the usage of every shader is tracked by a special Release_ps build to generate a shader profile.

A shader profile is a file with flags that describe how to recompile that shader properly.

If a shader profile changes, the shader cache needs to be invalidated.

When a shader profile is missing, some features will be lost, resulting in wrong visuals.
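One way to picture the link between profiles and the cache is to store, next to each cached shader, the profile flags it was compiled with, and to treat the entry as stale when the current flags differ. The sketch below is only an illustration under that assumption: the flag names (`kFlagA`, `kFlagB`), the `ProfiledShaderCache` class, and the per-entry invalidation strategy are hypothetical, and the real flag set and file layout are not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

using Bytecode = std::vector<std::uint8_t>;

// Hypothetical profile flags; a missing profile is modeled as kFlagNone.
enum ProfileFlag : std::uint32_t {
    kFlagNone = 0,
    kFlagA = 1u << 0,
    kFlagB = 1u << 1,
};

struct CachedShader {
    std::uint32_t profileFlags;  // flags the shader was recompiled with
    Bytecode compiled;
};

// Hypothetical cache that drops entries built with an outdated profile.
class ProfiledShaderCache {
public:
    void Store(std::uint64_t shaderId, std::uint32_t flags, Bytecode compiled) {
        assert(!compiled.empty() && "refusing to cache an empty shader");
        entries_.insert_or_assign(shaderId,
                                  CachedShader{flags, std::move(compiled)});
    }

    // Returns the cached bytecode only if it matches the current profile.
    std::optional<Bytecode> Lookup(std::uint64_t shaderId,
                                   std::uint32_t currentFlags) {
        auto it = entries_.find(shaderId);
        if (it == entries_.end()) {
            return std::nullopt;  // never compiled
        }
        if (it->second.profileFlags != currentFlags) {
            entries_.erase(it);   // profile changed: invalidate the entry
            return std::nullopt;
        }
        return it->second.compiled;
    }

    std::size_t Size() const { return entries_.size(); }

private:
    std::unordered_map<std::uint64_t, CachedShader> entries_;
};
```

A lookup with different flags than the ones stored returns nothing and removes the entry, which forces the caller to recompile with the new profile.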

Shader loading

Shader recompilation creates huge lag if implemented naively.

In addition, DX12 uses monolithic PSOs (pipeline state objects), each of which includes all shaders, so there are even more objects to load than in DX9.

This makes DX9-style immediate shader loading, and even draw command execution, too time-consuming.

d912pxy solves this problem with asynchronous shader recompilation and loading.

This means that draw calls using a newly loaded shader work immediately in DX9, while d912pxy skips them until the shader has finished loading.
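The skip-until-loaded behaviour described above can be sketched with a worker thread and an atomic "ready" flag. This is a simplified illustration rather than d912pxy's implementation: `AsyncShader`, `TryDraw`, and the one-thread-per-shader design are assumptions chosen to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical wrapper: runs the slow recompile-and-load job in the background.
class AsyncShader {
public:
    explicit AsyncShader(std::function<void()> loadJob) {
        assert(loadJob && "a load job is required");
        worker_ = std::thread([this, job = std::move(loadJob)] {
            job();  // stands in for recompilation plus PSO creation
            ready_.store(true, std::memory_order_release);
        });
    }

    ~AsyncShader() { WaitUntilReady(); }

    AsyncShader(const AsyncShader&) = delete;
    AsyncShader& operator=(const AsyncShader&) = delete;

    // Non-blocking check used on the render path.
    bool IsReady() const { return ready_.load(std::memory_order_acquire); }

    // Blocks until the background job has finished.
    void WaitUntilReady() {
        if (worker_.joinable()) {
            worker_.join();
        }
    }

private:
    std::atomic<bool> ready_{false};
    std::thread worker_;
};

// Issues the draw only if the shader is loaded; otherwise skips it.
// Returns true when the draw was issued.
inline bool TryDraw(const AsyncShader& shader, int& issuedDraws) {
    if (!shader.IsReady()) {
        return false;  // skipped: this is where a visual error can appear
    }
    ++issuedDraws;
    return true;
}
```

The render thread never waits on the load job, so frame time stays stable; the cost is that any draw issued before the flag is set is dropped for that frame.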

TL;DR: d912pxy loads shaders asynchronously, because there is no efficient way to load them instantly.

This can create some visual errors, but results in much better performance (a 200% boost in minimum FPS) and smooth frame rates.

If you want to avoid these visual errors, enable PSO precompile in the config file.