Is your feature request related to a problem?
With rendergraph2, the renderer is still not parallel at all. We should be using frames in flight (either 2 or 3) to get better performance. We can use the rendergraph and buffer/texture wrapper code to abstract this nicely, so the programmer doesn't need to care about the details at all.
Description
The basic idea is that while we render on the GPU, the CPU side is currently just idle, waiting for the GPU to finish. The reverse is also true: the GPU needs the CPU to finish preparing its data before it can render. A much better solution would be to prepare the data for frame A on the CPU (and GPU), then render frame A, and while frame A is being rendered, already prepare the data for frame B. Then we start rendering frame B, and once frame A is finished, we present it and start preparing the data for the next frame A on the CPU side again, and so on.
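The slot rotation described above can be sketched without any Vulkan calls. This is only a sequential model under stated assumptions: `FrameSlot`, `wait_and_present` and `run_frames` are hypothetical names, the `in_flight` flag stands in for a per-frame fence, and "rendering" is reduced to doubling an integer. It shows only the order of operations: wait on a slot before reusing it, prepare its data, then submit it.

```cpp
#include <array>
#include <cstdint>
#include <vector>

static constexpr std::uint32_t FRAMES_IN_FLIGHT = 2;

// Hypothetical per-frame state; in_flight plays the role of a fence.
struct FrameSlot {
    int prepared_data = 0;
    bool in_flight = false;
};

// Stand-in for waiting on a slot's fence and presenting its image.
// The "rendered" result is simply the prepared value doubled.
inline int wait_and_present(FrameSlot &slot) {
    slot.in_flight = false;
    return slot.prepared_data * 2;
}

// Runs frame_count frames and returns the presented results in order.
inline std::vector<int> run_frames(std::uint32_t frame_count) {
    std::array<FrameSlot, FRAMES_IN_FLIGHT> slots{};
    std::vector<int> presented;

    for (std::uint32_t frame = 0; frame < frame_count; ++frame) {
        FrameSlot &slot = slots[frame % FRAMES_IN_FLIGHT];

        // Before reusing a slot, wait for the frame that used it last.
        if (slot.in_flight) {
            presented.push_back(wait_and_present(slot));
        }

        // The CPU prepares this frame while the other slots are still
        // considered to be rendering.
        slot.prepared_data = static_cast<int>(frame);

        // Submit the frame to the "GPU".
        slot.in_flight = true;
    }

    // Drain the remaining frames, oldest first.
    for (std::uint32_t i = 0; i < FRAMES_IN_FLIGHT; ++i) {
        FrameSlot &slot = slots[(frame_count + i) % FRAMES_IN_FLIGHT];
        if (slot.in_flight) {
            presented.push_back(wait_and_present(slot));
        }
    }
    return presented;
}
```

In a real renderer the wait would be a fence wait and the submit a queue submission, so the CPU would only block when all slots are still in use.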
The catch is that we trade higher memory consumption for better performance here: we need to automatically double or triple buffer all resources which are not read-only. The nice thing is that the buffer and texture wrappers have a VmaBuffer handle hidden as a member, and we can replace this with an array of buffers.
#include <cstdint>
#include <vulkan/vulkan.h>

static constexpr std::uint32_t FRAMES_IN_FLIGHT = 3;

class Buffer {
private:
    VkBuffer m_buffer[FRAMES_IN_FLIGHT]; // One handle per frame in flight
    // More arrays for allocation info here too
};
Here is the nice trick: when the user calls request_update for a buffer, the buffer wrapper just stores a pointer to the source data and the size of the source data. The rendergraph can then call the internal update mechanism of the Buffer class with the current frame-in-flight index and copy the data into the buffer for that frame. Since the rendergraph only reads from the external data, it should be possible to avoid double or triple buffering of data outside of the rendergraph entirely, as long as the rendering of a frame does not depend on data which is not part of the buffer, which should not be the case. The same idea also applies to textures if they need to be updated on a per-frame basis.
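A minimal sketch of this deferred update, assuming a few things that are not in the existing code: `std::vector<std::byte>` stands in for each per-frame `VkBuffer` and its mapped memory, and `update` and `contents` are hypothetical names for the internal update mechanism and a read accessor. Only `request_update`, `Buffer` and `FRAMES_IN_FLIGHT` come from the description above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static constexpr std::uint32_t FRAMES_IN_FLIGHT = 3;

// Sketch only: each vector stands in for one per-frame VkBuffer.
class Buffer {
public:
    // Called by user code. Only remembers where the data lives; the
    // pointer must stay valid until the rendergraph has copied it.
    void request_update(const void *src_data, std::size_t src_size) {
        m_src_data = src_data;
        m_src_size = src_size;
        m_update_requested = true;
    }

    // Called by the rendergraph (hypothetical name) with the index of
    // the current frame in flight. Copies into that frame's storage only.
    void update(std::uint32_t frame_index) {
        if (!m_update_requested) {
            return;
        }
        std::vector<std::byte> &dst = m_storage[frame_index % FRAMES_IN_FLIGHT];
        dst.resize(m_src_size);
        if (m_src_size > 0) {
            std::memcpy(dst.data(), m_src_data, m_src_size);
        }
        m_update_requested = false;
    }

    // Hypothetical accessor used here to inspect a frame's copy.
    const std::vector<std::byte> &contents(std::uint32_t frame_index) const {
        return m_storage[frame_index % FRAMES_IN_FLIGHT];
    }

private:
    std::array<std::vector<std::byte>, FRAMES_IN_FLIGHT> m_storage;
    const void *m_src_data = nullptr;
    std::size_t m_src_size = 0;
    bool m_update_requested = false;
};
```

Note that in this sketch a single request only refreshes the copy for the current frame, so it fits data that is updated every frame. Data that changes rarely would need the update to be repeated for the other slots, which is a design decision this sketch leaves open.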
With this idea, we should be able to keep the GPU fed at all times, so that performance is GPU-bound, as it should be.
One thing should be mentioned here: since we have several frames in flight, we also need to double or triple buffer all Vulkan handles required for rendering. This means, for example, that we need 2 or 3 command pools per thread and per queue:
thread_local VkCommandPool graphics_pool[FRAMES_IN_FLIGHT];
thread_local VkCommandPool transfer_pool[FRAMES_IN_FLIGHT];
// ...
This will be discussed in another issue.
Alternatives
If we don't use any system like this, the CPU and GPU keep waiting on each other and we waste a lot of performance.
Affected Code
The rendergraph and buffer/texture wrapper code.
Operating System
All
Additional Context
None