-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement a DirectX 11 renderer #675
base: master
Are you sure you want to change the base?
Conversation
Not usable in current state.
This allows users to render to a render target which may either be the swapchain (in most cases it will be) or a render texture which may then be used elsewhere (eg in a 3D scene).
We should provide a image loader (in this case the uncompressed TGA loader), but should also allow users using this renderer backend to overload it with their own one. E.g. a user may have libpng in their project and they want to hook libpng into the load image function to load PNGs.
Currently broken. Needs more work
Also add early exit in the case of a kernel of size 0
Ohh nice PR @hyblocker, I was also trying to implement DX11 renderer by myself. |
The PR is practically ready for merging, I'm waiting for the weekend to find the time to implement the GLFW backend and then I'll move this from draft status. Transient buffers should still be investigated if possible but my schedule has been getting busier lately, so I think it makes more sense to merge it as a separate PR later. |
@mikke89 The DirectX11 renderer is ready for review. |
Awesome, that's great to hear! Very impressive, I'll try to get around to it soon. Would you consider it feature complete compared to the OpenGL 3 renderer? Would be nice to add an entry into the feature table of the renderers in the readme. |
The DirectX 11 renderer has feature parity with the OpenGL 3 renderer, including all effects, shaders, etc. (the effects sample appears to function identically between the two). I've already added the entry to the readme in the PR (see 8ed4753). It also comes with a GLFW and Win32 backend. I've been using the DirectX11 renderer for my own project for around a month and it appears to be stable and function well. |
I will say some words about this implementation and here they're:
It is a very bad practice using .find on render callings. Searching through hash map container without any specific purpose is still slow operation, on real-time rendering applications hash-maps used not often by frequency calling otherwise it will reduce your performance on much complex scenes immediately. You won't even get invalid buffer if you get just don't render it but instead you do .find operation every frame, why? All information about buffers stored and provided by Rml. See: https://github.com/mikke89/RmlUi/blob/master/Backends/RmlUi_Renderer_GL3.cpp#L1023
So you allocate single buffer on init stage say like 1 Mb/512 Kb buffer that's for VB and IB and try to use offset-allocator like this https://github.com/sebbbi/OffsetAllocator (I suggest it because I will use it in my implementations) If you want to write your own you can try why not or you can implement Transient buffers, it will improve performance but at some point using one buffer that stores all things could be a little better I didn't compare a such implementation, but I think it could be just equal. As I understood you're a beginner and it is fine that you have some mistakes 😃, but I expect that you will fix them. Your implementation for many people exists as 'reference' implementation and it is supposed (people expect that at some point) the implementation is done well but in reality if we accept it now after some time some people from community will ask these questions what I state here and will ask: "why did you A instead of B, B is effective and please fix it". I am pretty sure that it might happen so I don't have any choice for staying in shadows and ignoring this review. For you it is improving your understanding of real-time graphics :D, for me I did a review and I do them not often tbh. If you still have some questions or you disagree with my opinions, feel free to write your thoughts. |
As per suggestion 1 from wh1t3lord
I've removed m_geometry_cache, I'll try improving the buffer allocation later when I have some more free time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, I am really impressed. The results look very solid overall, and it's no small feat to get all of this working, especially with all of our new renderer features too.
I haven't taken a detailed dive into the implementation itself, I want to leave you some ownership over this code as well. It would be great if you can help maintain it a bit occasionally in the future, but no obligations required. Looks like it follows the overall structure of the GL3 renderer, which is very nice and helps maintain them together.
I did some quick benchmarks on our benchmark sample:
Renderer | Idle FPS | Active FPS |
---|---|---|
GL2 | 1400 | 83 |
GL3 | 800 | 84 |
DX11 | 1500 | 82 |
VK | 1200 | 134 |
Overall, it's on par with our other renderers. Performance-wise, it is not problematic in my view. It would be nice with some improved allocation techniques as mentioned above, I think that could help with the "active FPS" here. While I would encourage that, it's not a blocker for a first iteration in my view. And we probably want to improve that for our GL3 renderer too.
I implemented the screenshot feature locally and ran a comparison against the GL3 renderer on all our visual tests. It was pixel-perfect equal in 81/102 tests. Most of the remaining ones are only slightly pixel-different along rounded edges and for color interpolation, things that one would expect to be different between renderers. So that's a solid job!
There are a few tests I would like to highlight:
# | similarity scores (%) | max pixel difference | filename | Comment |
---|---|---|---|---|
13 | 30.5% | 86 | filter_blur_orb.rml | Looks good statically, but there's quite a lot of flickering when moving the blur orb. This is very understandable, and quite tricky to get stable. I remember spending a lot of effort with this on the GL3 renderer. |
45 | 29.3% | 238 | ink_overflow_bounding_box.rml | Colors are slightly more saturated for some reason? |
58 | 44.9% | 136 | mask_image.rml | The heart image is slightly more blurry, as if the vertex positions or UV coords are just slightly off. |
84 | 13.5% | 528 | shader_conic_gradient.rml | Looks like the directions are rotated, see the screenshot below. |
Out of these, I would only consider the conic gradient as an actual issue. The other ones are reasonable in a first iteration, not considered blockers, but if you feel like taking a closer look then that would be nice.
Could you format the files according to our .clang-format and .editorconfig? It looks kind of close to our clang-format, but also slightly off, and the indentations are not correct.
Again, great effort! I'm very happy to see some DirectX love, so thank you for that. Let me know if there is anything I can clarify or help out with.
|
||
#include <RmlUi/Core/Platform.h> | ||
|
||
#if defined RMLUI_PLATFORM_WIN32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapping everything in a big #ifdef
seems unnecessary and error prone to me, and it also messes a bit with formatting. As the renderer is only intended for Win32, I suggest instead that we make it an #error
if that's not the case.
float4x4 m_transform; | ||
float2 m_translate; | ||
|
||
// One to one translation of the OpenGL uniforms results in a LOT of wasted space due to CBuffer alignment rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :)
// Helper for a common cleanup pattern in DX11 | ||
#define DX_CLEANUP_RESOURCE_IF_CREATED(resource) \ | ||
if (resource) \ | ||
{ \ | ||
resource->Release(); \ | ||
resource = nullptr; \ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nicer to replace this with a templated function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are okay with including other headers in this header file, I would suggest using CComPtr<>
from atlbase.h
; smart pointers over raw pointers would definitely be an improvement to avoid any possible memory leakage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the headers are commonly available, that should be fine. This is not one of those microsoft language extensions is it, does it require any special compiler flags? Is it compatible with MinGW & co?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, MinGW might become a problem. I just feel like such a smart pointer (or any way to clean up the resources, really) would be a neater way to handle them, given how many times the helper is used.
// Uncomment the following line to enable additional DirectX debugging. | ||
#define RMLUI_DX_DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this supposed to be commented out?
|
||
#include <RmlUi/Core/Platform.h> | ||
|
||
#if defined RMLUI_PLATFORM_WIN32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I would remove this #if
wrapper (or make it an error).
if (window_flipped.Top() == window_flipped.Bottom() || window_flipped.Left() == window_flipped.Right()) | ||
{ | ||
// In some cases the bounds are 0 width, so early exit since there's nothing to blur | ||
return; | ||
} | ||
|
||
int pass_level = 0; | ||
SigmaToParameters(sigma, pass_level, sigma); | ||
if (sigma == 0) | ||
{ | ||
// If sigma is zero we don't have anything to blur, so early exit since blurring will have no effect on the final image | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I made sure that the method still works even if these conditions come through here. And if this is a question of optimization, I feel like all of these conditions should normally be handled at a higher level, so based on that I would rather not add more conditions here.
#ifdef RMLUI_DEBUG | ||
// Enable debug layer | ||
createDeviceFlags |= D3D11_CREATE_DEVICE_DEBUG; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Should this be RMLUI_DX_DEBUG
?
I guess if the performance implication of the debug device is trivial, then we can leave it as is. Otherwise, it's nice if performance is reasonable also in debug mode, so we can make this opt-in with that flag.
|
||
enum class ShaderGradientFunction { Linear, Radial, Conic, RepeatingLinear, RepeatingRadial, RepeatingConic }; // Must match shader definitions below. | ||
|
||
// We need to round up at compile-time so that we can embed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very suspenseful end to that sentence! ;)
| SFML | ✔️ | | | | | ||
| GLFW | ✔️ | ✔️ | ✔️ | | | ||
| SDL¹ | ✔️ | ✔️² | ✔️ | ✔️ | | ||
| Platform \ Renderer | OpengGL 2 | OpengGL 3 | Vulkan | SDLrenderer | DirectX11 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this readme: We also have a table over here with all the renderers and a feature matrix, so it would be great if you could add it there as well.
I'll tackle the issues you've found throughout the week. If I have the time I'll also investigate the tests with poor accuracy to see if I could improve them. |
I've implemented a DirectX 11 backend for RmlUI. Currently this has feature parity with the SDL renderer, but I plan on extending it to match the GL3 renderer in the near future (hence I'm marking this as a draft PR).
As for backends, currently this only has an implementation with the Win32 backend. I'll also be providing a GLFW backend at a later point.
I've introduced a pair of functions pointers only in the DX11
RenderInterface
for this backend to make it more flexible and easily consumable by end users. As the current backends only support uncompressed targa files for images, and in practice one would likely want to connect it to a texture loader or something more sophisticated such as stb_image, I defined the following functions:Where a user may override these as follows for example:
The renderer will fallback to the Targa image loader if the user doesn't override the loader with their own one.
For completeness, below is the entire STB image loader I've been testing with. I have converted the sample assets to PNG to verify that it works correctly.