Helpers nvvk
Table of Contents
- appbase_vk.hpp
- appbase_vkpp.hpp
- appwindowprofiler_vk.hpp
- buffersuballocator_vk.hpp
- buffers_vk.hpp
- commands_vk.hpp
- context_vk.hpp
- debug_util_vk.hpp
- descriptorsets_vk.hpp
- error_vk.hpp
- extensions_vk.hpp
- gizmos_vk.hpp
- images_vk.hpp
- memallocator_dedicated_vk.hpp
- memallocator_dma_vk.hpp
- memallocator_vk.hpp
- memallocator_vma_vk.hpp
- memorymanagement_vk.hpp
- memorymanagement_vkgl.hpp
- pipeline_vk.hpp
- profiler_vk.hpp
- raypicker_vk.hpp
- raytraceKHR_vk.hpp
- raytraceNV_vk.hpp
- renderpasses_vk.hpp
- resourceallocator_vk.hpp
- samplers_vk.hpp
- sbtwrapper_vk.hpp
- shadermodulemanager_vk.hpp
- shaders_vk.hpp
- stagingmemorymanager_vk.hpp
- structs_vk.hpp
- swapchain_vk.hpp
appbase_vk.hpp
class nvvk::AppBaseVk
nvvk::AppBaseVk is used in a few samples and can serve as a base class for various needs. Samples might differ a bit in setup and functionality, but in principle the class aids the setup of context and window, as well as some common event processing.
nvvk::AppBaseVk serves as the base class for many ray tracing examples and makes use of the Vulkan C API. It covers the Vulkan basics by holding a reference to the instance and device, and also comes with optional default setups for the render passes and the swapchain.
Usage
An example will derive from this class:
class VkSample : public AppBaseVk
{
};
Setup
In the main() of an application, call setup(), which takes a Vulkan instance, device, physical device, and a queue family index. Setup copies the given Vulkan handles into AppBase and queries the 0th VkQueue of the specified family, which must support graphics operations and drawing to the surface passed to createSurface. Furthermore, it creates a VkCommandPool.
Prior to calling setup, if you are using the nvvk::Context class to create and initialize Vulkan instances, you may want to create a VkSurfaceKHR from the window (glfw for example) and call setGCTQueueWithPresent(). This will make sure the m_queueGCT queue of nvvk::Context can draw to the surface, and m_queueGCT.familyIndex will meet the requirements of setup().
createSwapchain creates the swapchain used for displaying. Arguments are width and height, color and depth format, and vsync on/off. Defaults will create the best format for the surface.
Creating framebuffers has a dependency on the renderPass and depth buffer. All those functions are virtual and can be overridden in a sample, but default implementations exist.
- createDepthBuffer: creates a 2D depth/stencil image
- createRenderPass : creates a color/depth pass and clears both buffers
Here is the dependency order:
vkSample.createDepthBuffer();
vkSample.createRenderPass();
vkSample.createFrameBuffers();
The nvvk::SwapChain will create n images, typically 3. With this information, AppBase also creates 3 VkFence, 3 VkCommandBuffer and 3 VkFramebuffer.
Frame Buffers
The created framebuffers are display framebuffers, made to be presented on screen. Each framebuffer is created using one of the images from the swapchain and the depth buffer. There is only one depth buffer because that resource is not used simultaneously: for example, when we clear the depth buffer, it is not done immediately, but through a command buffer, which will be executed later.
Note: the imageView(s) are part of the swapchain.
Command Buffers
AppBase works with 3 frame command buffers. Each frame fills one command buffer, which then gets submitted, one after the other. This is a design choice that can be debated, but it keeps things simple. It is still possible to submit other command buffers in a frame, but those command buffers will have to be submitted before the frame one. The frame command buffer, when submitted with submitFrame, will use the current fence.
Fences
There are as many fences as there are images in the swapchain. At the beginning of a frame, we call prepareFrame(). This calls acquire() from nvvk::SwapChain and waits until the image is available. The very first time, the fence will not block, but later it will wait until the submit has completed on the GPU.
ImGui
If the application is using Dear ImGui, there are convenient functions for initializing it and setting the callbacks (glfw). The first one to call is initGUI(0), where the argument is the subpass it will be using. The default is 0, but if the application creates a renderpass with multi-sampling and resolves in the second subpass, this makes that possible.
Glfw Callbacks
Call setupGlfwCallbacks(window) to register all the window callbacks: key, mouse, window resizing.
By default AppBase will handle resizing of the window and will recreate the images and framebuffers.
If a sample needs to be aware of the resize, it can implement onResize(width, height).
To handle the callbacks in ImGui, call ImGui_ImplGlfw_InitForVulkan(window, true), where true installs the default ImGui callbacks.
Note: All the methods are virtual and can be overridden if they do not perform the typical setup.
// Create example
VulkanSample vkSample;
// Window need to be opened to get the surface on which to draw
const VkSurfaceKHR surface = vkSample.getVkSurface(vkctx.m_instance, window);
vkctx.setGCTQueueWithPresent(surface);
vkSample.setup(vkctx.m_instance, vkctx.m_device, vkctx.m_physicalDevice, vkctx.m_queueGCT.familyIndex);
vkSample.createSwapchain(surface, SAMPLE_WIDTH, SAMPLE_HEIGHT);
vkSample.createDepthBuffer();
vkSample.createRenderPass();
vkSample.createFrameBuffers();
vkSample.initGUI(0);
vkSample.setupGlfwCallbacks(window);
ImGui_ImplGlfw_InitForVulkan(window, true);
Drawing loop
The drawing loop in main() is the typical loop you will find in glfw examples. Note that AppBase has a convenient function to tell if the window is minimized; in that case it does no work and contains a sleep(), so the CPU is not spinning.
// Window system loop
while(!glfwWindowShouldClose(window))
{
glfwPollEvents();
if(vkSample.isMinimized())
continue;
vkSample.display(); // infinitely drawing
}
Display
A typical display() function will need the following:
- Acquiring the next image:
prepareFrame()
- Get the command buffer for the frame. There are n command buffers equal to the number of in-flight frames.
- Clearing values
- Start rendering pass
- Drawing
- End rendering
- Submitting frame to display
void VkSample::display()
{
// Acquire
prepareFrame();
// Command buffer for current frame
auto curFrame = getCurFrame();
const VkCommandBuffer& cmdBuf = getCommandBuffers()[curFrame];
VkCommandBufferBeginInfo beginInfo{VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO};
beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
vkBeginCommandBuffer(cmdBuf, &beginInfo);
// Clearing values
std::array<VkClearValue, 2> clearValues{};
clearValues[0].color = {{1.f, 1.f, 1.f, 1.f}};
clearValues[1].depthStencil = {1.0f, 0};
// Begin rendering
VkRenderPassBeginInfo renderPassBeginInfo{VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO};
renderPassBeginInfo.clearValueCount = 2;
renderPassBeginInfo.pClearValues = clearValues.data();
renderPassBeginInfo.renderPass = m_renderPass;
renderPassBeginInfo.framebuffer = m_framebuffers[curFrame];
renderPassBeginInfo.renderArea = {{0, 0}, m_size};
vkCmdBeginRenderPass(cmdBuf, &renderPassBeginInfo, VK_SUBPASS_CONTENTS_INLINE);
// .. draw scene ...
// Draw UI
ImGui_ImplVulkan_RenderDrawData(ImGui::GetDrawData(), cmdBuf);
// End rendering
vkCmdEndRenderPass(cmdBuf);
// End of the frame and present the one which is ready
vkEndCommandBuffer(cmdBuf);
submitFrame();
}
Closing
Finally, all resources can be destroyed by calling destroy()
at the end of main().
vkSample.destroy();
appbase_vkpp.hpp
class nvvk::AppBase
nvvk::AppBase is the same as nvvk::AppBaseVk but makes use of the Vulkan C++ API (vulkan.hpp).
appwindowprofiler_vk.hpp
class nvvk::AppWindowProfilerVK
nvvk::AppWindowProfilerVK derives from nvh::AppWindowProfiler and overrides the context and swapbuffer functions. The nvh class itself provides several utilities and command line options to run automated benchmarks etc.
To influence the Vulkan instance/device creation, modify m_contextInfo prior to running AppWindowProfiler::run, which triggers instance, device, window and swapchain creation etc.
The class comes with a nvvk::ProfilerVK instance that references the AppWindowProfiler::m_profiler's data.
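A minimal sketch of the typical flow, assuming a derived sample class (the class name, window size and the run() argument list shown here are assumptions):
class MySample : public nvvk::AppWindowProfilerVK
{
  // override think()/resize() etc. as needed by the sample
};
int main(int argc, const char** argv)
{
  MySample sample;
  // request extensions before run() creates the instance/device
  sample.m_contextInfo.addDeviceExtension(VK_KHR_SWAPCHAIN_EXTENSION_NAME);
  // run() creates instance, device, window and swapchain, then enters the loop
  return sample.run("mysample", argc, argv, 1280, 720);
}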
buffersuballocator_vk.hpp
class nvvk::BufferSubAllocator
nvvk::BufferSubAllocator provides buffer sub allocation using larger buffer blocks. The blocks are one VkBuffer each and are allocated via the provided nvvk::MemAllocator.
The requested buffer space is sub-allocated and recycled in blocks internally. This way we avoid creating lots of small VkBuffers and can avoid calling the Vulkan API at all when there are blocks with sufficient empty space. While Vulkan is more efficient than previous APIs, creating lots of objects for it is still not good for overall performance: it results in more cache misses and uses more system memory overall.
Be aware that each sub-allocation is always BASE_ALIGNMENT aligned. A custom alignment can be requested at allocation time; it ensures that the returned sub-allocation range (offset & size) can accommodate the originally requested size while respecting the requested alignment.
This, however, means the regular offset may not match the requested alignment, and the regular size can be bigger to account for the shift caused by manual alignment.
It is therefore necessary to pass the alignment that was used at allocation time to the query functions as well.
// alignment <= BASE_ALIGNMENT
handle = subAllocator.subAllocate(size);
binding = subAllocator.getSubBinding(handle);
// alignment > BASE_ALIGNMENT
handle = subAllocator.subAllocate(size, alignment);
binding = subAllocator.getSubBinding(handle, alignment);
buffers_vk.hpp
functions in nvvk
- makeBufferCreateInfo : wraps setup of VkBufferCreateInfo (implicitly sets VK_BUFFER_USAGE_TRANSFER_DST_BIT)
- makeBufferViewCreateInfo : wraps setup of VkBufferViewCreateInfo
- createBuffer : wraps vkCreateBuffer
- createBufferView : wraps vkCreateBufferView
- getBufferDeviceAddressKHR : wraps vkGetBufferDeviceAddressKHR
- getBufferDeviceAddress : wraps vkGetBufferDeviceAddress
VkBufferCreateInfo bufferCreate = makeBufferCreateInfo (size, VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT);
VkBuffer buffer = createBuffer(device, bufferCreate);
VkBufferView bufferView = createBufferView(device, makeBufferViewCreateInfo(buffer, VK_FORMAT_R8G8B8A8_UNORM, size));
commands_vk.hpp
functions in nvvk
- makeAccessMaskPipelineStageFlags : depending on accessMask returns appropriate VkPipelineStageFlagBits
- cmdBegin : wraps vkBeginCommandBuffer with VkCommandBufferUsageFlags and implicitly handles VkCommandBufferBeginInfo setup
- makeSubmitInfo : sets up a VkSubmitInfo struct using the provided arrays of signals and commandbuffers, leaving the rest zeroed
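A small sketch combining two of these helpers (cmd is assumed to be an allocated command buffer):
nvvk::cmdBegin(cmd, VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);  // fills VkCommandBufferBeginInfo internally
... record commands ...
vkEndCommandBuffer(cmd);
// derive the pipeline stages that match an access mask, e.g. for a barrier
VkPipelineStageFlags dstStages = nvvk::makeAccessMaskPipelineStageFlags(VK_ACCESS_SHADER_READ_BIT);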
class nvvk::CommandPool
nvvk::CommandPool stores a single VkCommandPool and provides utility functions to create VkCommandBuffers from it.
Example:
{
nvvk::CommandPool cmdPool;
cmdPool.init(...);
// some setup/one shot work
{
VkCommandBuffer cmd = cmdPool.createAndBegin();
... record commands ...
// trigger execution with a blocking operation
// not recommended for performance
// but useful for sample setup
cmdPool.submitAndWait(cmd, queue);
}
// other cmds you may batch, or recycle
std::vector<VkCommandBuffer> cmds;
{
VkCommandBuffer cmd = cmdPool.createAndBegin();
... record commands ...
cmds.push_back(cmd);
}
{
VkCommandBuffer cmd = cmdPool.createAndBegin();
... record commands ...
cmds.push_back(cmd);
}
// do some form of batched submission of cmds
// after completion destroy cmd
cmdPool.destroy(cmds.size(), cmds.data());
cmdPool.deinit();
}
class nvvk::ScopeCommandBuffer
nvvk::ScopeCommandBuffer provides a single VkCommandBuffer that lives within the scope and is directly submitted and deleted when the scope is left. Not recommended for efficiency, since it results in a blocking operation, but aids sample writing.
Example:
{
ScopeCommandBuffer cmd(device, queueFamilyIndex, queue);
... do stuff
vkCmdCopyBuffer(cmd, ...);
}
class nvvk::RingFences
nvvk::RingFences recycles a fixed number of fences, provides information in which cycle we are currently at, and prevents accidental access to a cycle in-flight.
A typical frame would start by "setCycleAndWait", which waits for the requested cycle to be available.
class nvvk::RingCommandPool
nvvk::RingCommandPool manages a fixed cycle set of VkCommandBufferPools and one-shot command buffers allocated from them.
The usage of multiple command buffer pools also means we get nice allocation behavior (linear allocation from frame start to frame end) without fragmentation. If we were using a single command pool over multiple frames, it could fragment easily.
You must ensure cycle is available manually, typically by keeping in sync with ring fences.
Example:
{
frame++;
// wait until we can use the new cycle
// (very rare if we use the fence at the end once per frame)
ringFences.setCycleAndWait( frame );
// update cycle state, allows recycling of old resources
ringPool.setCycle( frame );
VkCommandBuffer cmd = ringPool.createCommandBuffer(...);
... do stuff / submit etc...
VkFence fence = ringFences.getFence();
// use this fence in the submit
vkQueueSubmit(...fence..);
}
class nvvk::BatchSubmission
nvvk::BatchSubmission batches the submission arguments of VkSubmitInfo for VkQueueSubmit.
vkQueueSubmit is a rather costly operation (depending on OS) and should not be done too often (e.g. keep it < 10 per frame). Therefore this utility class allows adding commandbuffers, semaphores etc. and submitting them later in a batch.
When using manual locks, it can also be useful to feed commandbuffers from different threads and then later kick it off.
Example
// within upload logic
{
semTransfer = handleUpload(...);
// for example trigger async upload on transfer queue here
vkQueueSubmit(... semTransfer ...);
// tell next frame's batch submission
// that its commandbuffers should wait for transfer
// to be completed
graphicsSubmission.enqueueWait(semTransfer);
}
// within present logic
{
// for example ensure the next frame waits until proper present semaphore was triggered
graphicsSubmission.enqueueWait(presentSemaphore);
}
// within drawing logic
{
// enqueue some graphics work for submission
graphicsSubmission.enqueue(getSceneCmdBuffer());
graphicsSubmission.enqueue(getUiCmdBuffer());
graphicsSubmission.execute(frameFence);
}
class nvvk::FencedCommandPools
The nvvk::FencedCommandPools container class combines the typical utilities to handle command submission: it contains RingFences, RingCommandPool and BatchSubmission behind a convenient interface.
context_vk.hpp
class nvvk::Context
The nvvk::Context class helps with creating the Vulkan instance and choosing the logical device with the mandatory extensions. First fill the ContextCreateInfo structure, then call:
// Creating the Vulkan instance and device
nvvk::ContextCreateInfo ctxInfo;
... see above ...
nvvk::Context vkctx;
vkctx.init(ctxInfo);
// after init the ctxInfo is no longer needed
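The "... see above ..." placeholder stands for filling the ContextCreateInfo; a minimal sketch might look like this (the requested version and extension names are only examples):
nvvk::ContextCreateInfo ctxInfo;
ctxInfo.setVersion(1, 2);                                      // requested Vulkan version
ctxInfo.addInstanceExtension(VK_KHR_SURFACE_EXTENSION_NAME);   // instance extensions
ctxInfo.addDeviceExtension(VK_KHR_SWAPCHAIN_EXTENSION_NAME);   // device extensions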
At this point, the class will have created the VkInstance and VkDevice according to the information passed. It also keeps track of, or has queried, the following information:
- Physical Device information that you can later query: PhysicalDeviceInfo, in which lots of VkPhysicalDevice... properties are stored
- VkInstance : the one instance being used for the program
- VkPhysicalDevice : physical device(s) used for the logical device creation. In case of more than one physical device, we have a std::vector for this purpose...
- VkDevice : the logical device instantiated
- VkQueue : By default, 3 queues are created, one per family: Graphics-Compute-Transfer, Compute and Transfer. Any additional queue needs to be requested with ContextCreateInfo::addRequestedQueue(). This creates the information for the best suitable queues but does not create them; to create the additional queues, Context::createQueue() must be called after creating the Vulkan context.
  The following queues are always created and can be directly accessed without calling createQueue:
  - Queue m_queueGCT : Graphics/Compute/Transfer Queue + family index
  - Queue m_queueT : async Transfer Queue + family index
  - Queue m_queueC : async Compute Queue + family index
- maintains what extensions are finally available
- implicitly hooks up the debug callback
Choosing the device
When there are multiple devices, the init method chooses the first compatible device available, but it is also possible to choose another one.
vkctx.initInstance(deviceInfo);
// Find all compatible devices
auto compatibleDevices = vkctx.getCompatibleDevices(deviceInfo);
assert(!compatibleDevices.empty());
// Use first compatible device
vkctx.initDevice(compatibleDevices[0], deviceInfo);
Multi-GPU
When multiple graphics cards should be used as a single device, ContextCreateInfo::useDeviceGroups needs to be set to true.
The above methods will then transparently create the VkDevice using VkDeviceGroupDeviceCreateInfo.
This is especially useful for NVLink-connected cards.
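For instance (a sketch; the flag must be set before init):
nvvk::ContextCreateInfo ctxInfo;
ctxInfo.useDeviceGroups = true;  // create the logical device from a device group
vkctx.init(ctxInfo);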
debug_util_vk.hpp
class nvvk::DebugUtil
This is a companion utility to add debug information to an application. See https://vulkan.lunarg.com/doc/sdk/1.1.114.0/windows/chunked_spec/chap39.html
- User-defined names for objects
- Logically annotate regions of command buffers
- Scoped command buffer labels to make things simpler
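A small usage sketch (assuming VK_EXT_debug_utils is enabled; setObjectName and scopeLabel are the typical helpers of this class, the object names are made up):
nvvk::DebugUtil debugUtil(device);
// give an object a name that shows up in validation messages and debuggers
debugUtil.setObjectName(vertexBuffer, "vertexBuffer");
// annotate a region of a command buffer with a scoped label
{
  auto scope = debugUtil.scopeLabel(cmdBuf, "draw scene");
  ... record draw commands ...
}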
descriptorsets_vk.hpp
functions in nvvk
- createDescriptorPool : wrappers for vkCreateDescriptorPool
- allocateDescriptorSet : allocates a single VkDescriptorSet
- allocateDescriptorSets : allocates multiple VkDescriptorSets
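For example (a sketch; poolSizes and layout are assumed to exist):
// pool sized for a single set
VkDescriptorPool pool = nvvk::createDescriptorPool(device, poolSizes, 1);
// allocate one set from it
VkDescriptorSet set = nvvk::allocateDescriptorSet(device, pool, layout);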
class nvvk::DescriptorSetBindings
nvvk::DescriptorSetBindings is a helper class that keeps a vector of VkDescriptorSetLayoutBinding for a single VkDescriptorSetLayout. It provides helper functions to create a VkDescriptorSetLayout as well as a VkDescriptorPool based on this information, and utilities to fill the VkWriteDescriptorSet structure with binding information stored within the class.
The class comes with the convenience functionality that when you make a VkWriteDescriptorSet you provide the binding slot, rather than the index of the binding's storage within this class. This results in a small linear search, but makes it easy to change the content/order of bindings at creation time.
Example :
DescriptorSetBindings binds;
binds.addBinding( VIEW_BINDING, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 1, VK_SHADER_STAGE_VERTEX_BIT);
binds.addBinding(XFORM_BINDING, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_VERTEX_BIT);
VkDescriptorSetLayout layout = binds.createLayout(device);
#if SINGLE_LAYOUT_POOL
// let's create a pool with 2 sets
VkDescriptorPool pool = binds.createPool(device, 2);
#else
// if you want to combine multiple layouts into a common pool
std::vector<VkDescriptorPoolSize> poolSizes;
bindsA.addRequiredPoolSizes(poolSizes, numSetsA);
bindsB.addRequiredPoolSizes(poolSizes, numSetsB);
VkDescriptorPool pool = nvvk::createDescriptorPool(device, poolSizes,
numSetsA + numSetsB);
#endif
// fill them
std::vector<VkWriteDescriptorSet> updates;
updates.push_back(binds.makeWrite(0, VIEW_BINDING, &view0BufferInfo));
updates.push_back(binds.makeWrite(1, VIEW_BINDING, &view1BufferInfo));
updates.push_back(binds.makeWrite(0, XFORM_BINDING, &xform0BufferInfo));
updates.push_back(binds.makeWrite(1, XFORM_BINDING, &xform1BufferInfo));
vkUpdateDescriptorSets(device, updates.size(), updates.data(), 0, nullptr);
class nvvk::DescriptorSetContainer
nvvk::DescriptorSetContainer is a container class that stores allocated DescriptorSets as well as the reflection, layout and pool for a single VkDescriptorSetLayout.
Example:
container.init(device, allocator);
// setup dset layouts
container.addBinding(0, UBO...)
container.addBinding(1, SSBO...)
container.initLayout();
// allocate descriptorsets
container.initPool(17);
// update descriptorsets
writeUpdates.push_back( container.makeWrite(0, 0, &..) );
writeUpdates.push_back( container.makeWrite(0, 1, &..) );
writeUpdates.push_back( container.makeWrite(1, 0, &..) );
writeUpdates.push_back( container.makeWrite(1, 1, &..) );
writeUpdates.push_back( container.makeWrite(2, 0, &..) );
writeUpdates.push_back( container.makeWrite(2, 1, &..) );
...
// at render time
vkCmdBindDescriptorSets(cmd, GRAPHICS, pipeLayout, 1, 1, container.at(7).getSets());
class nvvk::TDescriptorSetContainer<SETS,PIPES=1>
nvvk::TDescriptorSetContainer is a templated version of DescriptorSetContainer :
- SETS - many DescriptorSetContainers
- PIPES - many VkPipelineLayouts
The pipeline layouts are stored separately, the class does not use the pipeline layouts of the embedded DescriptorSetContainers.
Example :
Usage, e.g. SETS = 2, PIPES = 2
container.init(device, allocator);
// setup dset layouts
container.at(0).addBinding(0, UBO...)
container.at(0).addBinding(1, SSBO...)
container.at(0).initLayout();
container.at(1).addBinding(0, COMBINED_SAMPLER...)
container.at(1).initLayout();
// pipe 0 uses set 0 alone
container.initPipeLayout(0, 1);
// pipe 1 uses sets 0, 1
container.initPipeLayout(1, 2);
// allocate descriptorsets
container.at(0).initPool(1);
container.at(1).initPool(16);
// update descriptorsets
writeUpdates.push_back(container.at(0).makeWrite(0, 0, &..));
writeUpdates.push_back(container.at(0).makeWrite(0, 1, &..));
writeUpdates.push_back(container.at(1).makeWrite(0, 0, &..));
writeUpdates.push_back(container.at(1).makeWrite(1, 0, &..));
writeUpdates.push_back(container.at(1).makeWrite(2, 0, &..));
...
// at render time
vkCmdBindDescriptorSets(cmd, GRAPHICS, container.getPipeLayout(0), 0, 1, container.at(0).getSets());
..
vkCmdBindDescriptorSets(cmd, GRAPHICS, container.getPipeLayout(1), 1, 1, container.at(1).getSets(7));
error_vk.hpp
function nvvk::checkResult
Returns true on critical error result and logs errors.
Use NVVK_CHECK(result) to automatically log the filename/line number.
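For example (sketch):
// logs file and line if the call does not return VK_SUCCESS
NVVK_CHECK(vkCreateSampler(device, &samplerInfo, nullptr, &sampler));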
extensions_vk.hpp
function load_VK_EXTENSIONS
load_VK_EXTENSIONS : Vulkan Extension Loader
The extensions_vk files take care of loading and providing the symbols of Vulkan C API extensions.
They are generated by extensions_vk.py and cover all extensions found in vk.xml. See the script for details.
The framework triggers this implicitly in the nvvk::Context class, immediately after creating the device.
// loads all known extensions
load_VK_EXTENSIONS(instance, vkGetInstanceProcAddr, device, vkGetDeviceProcAddr);
gizmos_vk.hpp
class nvvk::Axis
nvvk::Axis displays an axis representing the orientation of the camera in the bottom left corner of the window.
- Initialize the Axis using init()
- Add display() in an inline rendering pass, as one of the last commands
Example:
m_axis.display(cmdBuf, CameraManip.getMatrix(), windowSize);
images_vk.hpp
functions in nvvk
- makeImageMemoryBarrier : returns VkImageMemoryBarrier for an image based on provided layouts and access flags
- mipLevels : returns the number of mip levels for a 2d/3d extent
- accessFlagsForImageLayout : helps with resource transitions
- pipelineStageForLayout : helps with resource transitions
- cmdBarrierImageLayout : inserts a barrier for an image layout transition
- cmdGenerateMipmaps : basic mipmap creation for images (meant for one-shot operations)
- makeImage2DCreateInfo : aids 2d image creation
- makeImage3DCreateInfo : aids 3d image creation
- makeImageCubeCreateInfo : aids cube image creation
- makeImageViewCreateInfo : aids common image view creation, derives info from VkImageCreateInfo
- makeImage2DViewCreateInfo : aids 2d image view creation
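A minimal sketch combining a few of these helpers (extent, format and usage are placeholder values):
VkImageCreateInfo imageInfo = nvvk::makeImage2DCreateInfo({width, height}, VK_FORMAT_R8G8B8A8_UNORM,
                                                          VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT);
... create the image and bind its memory ...
// the view create info derives most of its fields from the VkImageCreateInfo
VkImageViewCreateInfo viewInfo = nvvk::makeImageViewCreateInfo(image, imageInfo);
// transition the image before uploading data to it
nvvk::cmdBarrierImageLayout(cmd, image, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);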
memallocator_dedicated_vk.hpp
class nvvk::DedicatedMemoryAllocator
nvvk::DedicatedMemoryAllocator is a simple implementation of the MemAllocator interface, using one VkDeviceMemory allocation per allocMemory() call. The simplicity of the implementation comes at the price of potential slowness (vkAllocateMemory tends to be very slow) and of running out of operating system resources quickly (as some OSs limit the number of physical memory allocations per process).
memallocator_dma_vk.hpp
class nvvk::DMAMemoryAllocator
nvvk::DMAMemoryAllocator is using nvvk::DeviceMemoryAllocator internally. nvvk::DeviceMemoryAllocator derives from nvvk::MemAllocator as well, so this class here is for those preferring a reduced wrapper.
memallocator_vk.hpp
class nvvk::MemHandle
nvvk::MemHandle represents a memory allocation or sub-allocation from the generic nvvk::MemAllocator interface. Use nvvk::NullMemHandle for setting it to 'NULL'. MemHandle may change to a non-pointer type in the future.
class nvvk::MemAllocateInfo
nvvk::MemAllocateInfo collects almost all parameters a Vulkan memory allocation could potentially need. This keeps MemAllocator's interface simple and extensible.
class nvvk::MemAllocator
nvvk::MemAllocator is a Vulkan memory allocator interface extensively used by ResourceAllocator. It provides means to allocate, free, map and unmap pieces of Vulkan device memory. Concrete implementations derive from nvvk::MemAllocator. They can implement the allocator functionality themselves or act as an adapter to another memory allocator implementation.
A nvvk::MemAllocator hands out opaque 'MemHandles'. The implementation of the MemAllocator interface may choose any type of payload to store in a MemHandle. A MemHandle's relevant information can be retrieved via getMemoryInfo().
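A rough sketch of the interface (the MemAllocateInfo constructor arguments and the MemInfo field names are assumptions; check memallocator_vk.hpp for the exact interface):
VkMemoryRequirements memReqs = ...;  // e.g. from vkGetBufferMemoryRequirements
nvvk::MemHandle handle = memAllocator.allocMemory(nvvk::MemAllocateInfo(memReqs, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT));
// retrieve VkDeviceMemory and offset for binding the resource
nvvk::MemAllocator::MemInfo memInfo = memAllocator.getMemoryInfo(handle);
vkBindBufferMemory(device, buffer, memInfo.memory, memInfo.offset);
...
memAllocator.freeMemory(handle);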
memallocator_vma_vk.hpp
class nvvk::VMAMemoryAllocator
nvvk::VMAMemoryAllocator is using the GPUOpen Vulkan Memory Allocator (VMA) underneath. As VMA comes as a header-only library, when using it you'll have to:
- provide _add_package_VMA() in your CMakeLists.txt
- put these lines into one of your compilation units:
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"
class nvvk::ResourceAllocatorVMA
nvvk::ResourceAllocatorVMA is a convenience class creating, initializing and owning a VmaAllocator and an associated nvvk::MemAllocator object.
memorymanagement_vk.hpp
functions in nvvk
- getMemoryInfo : fills the VkMemoryAllocateInfo based on the device's memory properties, memory requirements and property flags. Returns true on success.
class nvvk::DeviceMemoryAllocator
The nvvk::DeviceMemoryAllocator allocates and manages device memory in fixed-size memory blocks. It implements the nvvk::MemAllocator interface.
It sub-allocates from the blocks, and can re-use memory if it finds empty regions. Because of the fixed-block usage, you can directly create resources and don't need a phase to compute the allocation sizes first.
It will create compatible chunks according to the memory requirements and usage flags. Therefore you can easily create mappable host allocations and delete them after usage, without interfering with device-side allocations.
An AllocationID is returned rather than the allocation details directly, which one can query separately.
Several utility functions are provided to handle the binding of memory directly with the resource creation of buffers, images and acceleration structures. These utilities also make implicit use of Vulkan's dedicated allocation mechanism.
We recommend the use of the nvvk::ResourceAllocator class, rather than the various create functions provided here, as we may deprecate them.
WARNING : The memory manager serves as a proof of concept for some key concepts, however it is not meant for production use and it currently lacks de-fragmentation logic. You may want to look at VMA for a more production-focused solution.
You can derive from this class and overload a few functions to alter the chunk allocation behavior.
Example :
nvvk::DeviceMemoryAllocator memAllocator;
memAllocator.init(device, physicalDevice);
// low-level
aid = memAllocator.alloc(memRequirements,...);
...
memAllocator.free(aid);
// utility wrapper
buffer = memAllocator.createBuffer(bufferSize, bufferUsage, bufferAid);
...
memAllocator.free(bufferAid);
// It is also possible to not track individual resources
// and free everything in one go. However, this is
// not recommended for general purpose use.
bufferA = memAllocator.createBuffer(sizeA, usageA);
bufferB = memAllocator.createBuffer(sizeB, usageB);
...
memAllocator.freeAll();
memorymanagement_vkgl.hpp
class nvvk::DeviceMemoryAllocatorGL
nvvk::DeviceMemoryAllocatorGL is derived from nvvk::DeviceMemoryAllocator; it uses Vulkan memory that is exported and directly imported into OpenGL. Requires GL_EXT_memory_object.
It is used just like the original class, however a new function exists to get the GL memory object: getAllocationGL.
Look at the source of nvvk::AllocatorDmaGL for usage.
pipeline_vk.hpp
functions in nvvk
- nvprintPipelineStats : prints stats of the pipeline using VK_KHR_pipeline_executable_properties (don't forget to enable extension and set VK_PIPELINE_CREATE_CAPTURE_STATISTICS_BIT_KHR)
- dumpPipelineStats : dumps stats of the pipeline using VK_KHR_pipeline_executable_properties to a text file (don't forget to enable extension and set VK_PIPELINE_CREATE_CAPTURE_STATISTICS_BIT_KHR)
- dumpPipelineBinCodes : dumps shader binaries using VK_KHR_pipeline_executable_properties to multiple binary files (don't forget to enable extension and set VK_PIPELINE_CREATE_CAPTURE_INTERNAL_REPRESENTATIONS_BIT_KHR)
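For example (a sketch; the exact argument lists of these helpers are assumptions, and the extension plus the capture flags mentioned above must be enabled):
// print the shader statistics of a pipeline to the log
nvvk::nvprintPipelineStats(device, pipeline, "my pipeline");
// or dump them to a text file
nvvk::dumpPipelineStats(device, pipeline, "pipeline_stats.txt");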
struct nvvk::GraphicsPipelineState
Most graphics pipelines have similar states, therefore the helper GraphicsPipelineState holds all the elements and initializes the structures with proper default values, such as the primitive type, PipelineColorBlendAttachmentState with its mask, DynamicState for viewport and scissor, depth test settings, and a line width of 1 pixel, for example.
The nvvk::GraphicsPipelineState structure is instantiated using C++ Vulkan objects if VULKAN_HPP is defined, and C otherwise.
Example of usage :
nvvk::GraphicsPipelineState pipelineState;
pipelineState.depthStencilState.setDepthTestEnable(true);
pipelineState.rasterizationState.setCullMode(vk::CullModeFlagBits::eNone);
pipelineState.addBindingDescription({0, sizeof(Vertex)});
pipelineState.addAttributeDescriptions ({
{0, 0, vk::Format::eR32G32B32Sfloat, static_cast<uint32_t>(offsetof(Vertex, pos))},
{1, 0, vk::Format::eR32G32B32Sfloat, static_cast<uint32_t>(offsetof(Vertex, nrm))},
{2, 0, vk::Format::eR32G32B32Sfloat, static_cast<uint32_t>(offsetof(Vertex, col))}});
struct nvvk::GraphicsPipelineGenerator
The graphics pipeline generator takes a GraphicsPipelineState object and pipeline-specific information such as the render pass and pipeline layout to generate the final pipeline.
nvvk::GraphicsPipelineGenerator structure is instantiated using C++ Vulkan objects if VULKAN_HPP is defined, and C otherwise.
Example of usage :
nvvk::GraphicsPipelineState pipelineState;
...
nvvk::GraphicsPipelineGenerator pipelineGenerator(m_device, m_pipelineLayout, m_renderPass, pipelineState);
pipelineGenerator.addShader(readFile("spv/vert_shader.vert.spv"), VkShaderStageFlagBits::eVertex);
pipelineGenerator.addShader(readFile("spv/frag_shader.frag.spv"), VkShaderStageFlagBits::eFragment);
m_pipeline = pipelineGenerator.createPipeline();
class nvvk::GraphicsPipelineGeneratorCombined
In some cases the application may have each state associated to a single pipeline. For convenience, nvvk::GraphicsPipelineGeneratorCombined combines both the state and generator into a single object.
Example of usage :
nvvk::GraphicsPipelineGeneratorCombined pipelineGenerator(m_device, m_pipelineLayout, m_renderPass);
pipelineGenerator.depthStencilState.setDepthTestEnable(true);
pipelineGenerator.rasterizationState.setCullMode(vk::CullModeFlagBits::eNone);
pipelineGenerator.addBindingDescription({0, sizeof(Vertex)});
pipelineGenerator.addAttributeDescriptions ({
{0, 0, vk::Format::eR32G32B32Sfloat, static_cast<uint32_t>(offsetof(Vertex, pos))},
{1, 0, vk::Format::eR32G32B32Sfloat, static_cast<uint32_t>(offsetof(Vertex, nrm))},
{2, 0, vk::Format::eR32G32B32Sfloat, static_cast<uint32_t>(offsetof(Vertex, col))}});
pipelineGenerator.addShader(readFile("spv/vert_shader.vert.spv"), VkShaderStageFlagBits::eVertex);
pipelineGenerator.addShader(readFile("spv/frag_shader.frag.spv"), VkShaderStageFlagBits::eFragment);
m_pipeline = pipelineGenerator.createPipeline();
profiler_vk.hpp
class nvvk::ProfilerVK
nvvk::ProfilerVK derives from nvh::Profiler and uses vkCmdWriteTimestamp to measure the gpu time within a section.
If profiler.setLabelUsage(true) was used then it will make use of vkCmdDebugMarkerBeginEXT and vkCmdDebugMarkerEndEXT for each section so that it shows up in tools like NsightGraphics and renderdoc.
Currently the commandbuffers must support vkCmdResetQueryPool as well.
When multiple queues are used there could be problems with the "nesting" of sections. In that case multiple profilers, one per queue, are most likely better.
Example:
nvvk::ProfilerVK profiler;
std::string profilerStats;
profiler.init(device, physicalDevice, queueFamilyIndex);
profiler.setLabelUsage(true); // depends on VK_EXT_debug_utils
while(true)
{
profiler.beginFrame();
... setup frame ...
{
// use the Section class to time the scope
auto sec = profiler.timeRecurring("draw", cmd);
vkCmdDraw(cmd, ...);
}
... submit cmd buffer ...
profiler.endFrame();
// generic print to string
profiler.print(profilerStats);
// or access data directly
nvh::Profiler::TimerInfo info;
if( profiler.getTimerInfo("draw", info)) {
// do some updates
updateProfilerUi("draw", info.gpu.average);
}
}
raypicker_vk.hpp
class nvvk::RayPickerKHR
nvvk::RayPickerKHR is a utility to get hit information under a screen coordinate.
The information returned is:
- origin and direction in world space
- hitT, the distance of the hit along the ray direction
- primitiveID, instanceID and instanceCustomIndex
- the barycentric coordinates in the triangle
Setting up:
- call setup() once with the Vulkan device and allocator
- call setTlas with the previously built TLAS
Getting results, for example, on mouse down:
- fill the PickInfo structure
- call run()
- call getResult() to get all the information above
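A sketch of issuing a pick, e.g. on mouse click (the PickInfo field names and the normalized coordinates are assumptions based on typical usage):
nvvk::RayPickerKHR::PickInfo pickInfo;
pickInfo.pickX          = mouseX / float(windowWidth);   // normalized screen coordinates
pickInfo.pickY          = mouseY / float(windowHeight);
pickInfo.modelViewInv   = nvmath::invert(CameraManip.getMatrix());
pickInfo.perspectiveInv = nvmath::invert(perspectiveMatrix);
m_picker.run(cmdBuf, pickInfo);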
Example to set the camera interest point
RayPickerKHR::PickResult pr = m_picker.getResult();
if(pr.instanceID != ~0) // Hit something
{
nvmath::vec3f worldPos = pr.worldRayOrigin + pr.worldRayDirection * pr.hitT;
nvmath::vec3f eye, center, up;
CameraManip.getLookat(eye, center, up);
CameraManip.setLookat(eye, worldPos, up, false); // Nice with CameraManip.updateAnim();
}
raytraceKHR_vk.hpp
class nvvk::RaytracingBuilderKHR
nvvk::RaytracingBuilderKHR provides the base functionality for ray tracing.
This class acts as an owning container for a single top-level acceleration structure referencing any number of bottom-level acceleration structures. We provide functions for building (on the device) an array of BLASs and a single TLAS from vectors of BlasInput and Instance, respectively, and a destroy function for cleaning up the created acceleration structures.
Generally, we reference BLASs by their index in the stored BLAS array, rather than using raw device pointers as the pure Vulkan acceleration structure API uses.
This class does not support replacing acceleration structures once built, but you can update the acceleration structures. For educational purposes, this class prioritizes (relative) understandability over performance, so vkQueueWaitIdle is implicitly used everywhere.
Setup and Usage
// Borrow a VkDevice and memory allocator pointer (must remain
// valid throughout our use of the ray trace builder), and
// instantiate an unspecified queue of the given family for use.
m_rtBuilder.setup(device, memoryAllocator, queueIndex);
// You create a vector of RayTracingBuilderKHR::BlasInput then
// pass it to buildBlas.
std::vector<RayTracingBuilderKHR::BlasInput> inputs = // ...
m_rtBuilder.buildBlas(inputs);
// You create a vector of VkAccelerationStructureInstanceKHR and pass it to
// buildTlas. Each instance references one of the BLASs built above, typically
// via getBlasDeviceAddress() with an index below inputs.size().
std::vector<VkAccelerationStructureInstanceKHR> instances = // ...
m_rtBuilder.buildTlas(instances);
// Retrieve the handle to the acceleration structure.
const VkAccelerationStructureKHR tlas = m_rtBuilder.getAccelerationStructure();
raytraceNV_vk.hpp
class nvvk::RaytracingBuilderNV
nvvk::RaytracingBuilderNV provides the base functionality for ray tracing.
This class does not implement everything you need to do ray tracing, but helps with creating the BLAS and TLAS, which can then be used for different ray tracing purposes.
Setup and Usage
m_rtBuilder.setup(device, memoryAllocator, queueIndex);
// Create array of VkGeometryNV
m_rtBuilder.buildBlas(allBlas);
// Create array of RaytracingBuilder::instance
m_rtBuilder.buildTlas(instances);
// Retrieve the acceleration structure
const VkAccelerationStructureNV& tlas = m_rtBuilder.getAccelerationStructure();
renderpasses_vk.hpp
functions in nvvk
- findSupportedFormat : returns supported VkFormat from a list of candidates (returns first match)
- findDepthFormat : returns supported depth format (24, 32, 16-bit)
- findDepthStencilFormat : returns supported depth-stencil format (24/8, 32/8, 16/8-bit)
- createRenderPass : wrapper for vkCreateRenderPass
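For example (sketch):
// pick formats supported by the device
VkFormat depthFormat        = nvvk::findDepthFormat(physicalDevice);
VkFormat depthStencilFormat = nvvk::findDepthStencilFormat(physicalDevice);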
resourceallocator_vk.hpp
class nvvk::ResourceAllocator
The goal of nvvk::ResourceAllocator is to aid creation of typical Vulkan resources (VkBuffer, VkImage and VkAccelerationStructure). All memory is allocated using the provided nvvk::MemAllocator and bound to the appropriate resources. The allocator contains a nvvk::StagingMemoryManager and nvvk::SamplerPool to aid this process.
ResourceAllocator separates object creation and memory allocation by delegating allocation of memory to an object of interface type 'nvvk::MemAllocator'. This way the ResourceAllocator can be used with different memory allocation strategies, depending on needs. nvvk provides three implementations of MemAllocator:
- nvvk::DedicatedMemoryAllocator is using a very simple allocation scheme, one VkDeviceMemory object per allocation. This strategy is only useful for very simple applications due to the overhead of vkAllocateMemory and an implementation dependent bounded number of vkDeviceMemory allocations possible.
- nvvk::DMAMemoryAllocator delegates memory requests to a nvvk::DeviceMemoryAllocator, as an example implementation of a sub-allocator
- nvvk::VMAMemoryAllocator delegates memory requests to a Vulkan Memory Allocator
Utility wrapper structs contain the appropriate Vulkan resource and the appropriate nvvk::MemHandle :
- nvvk::Buffer
- nvvk::Image
- nvvk::Texture contains VkImage and VkImageView as well as an optional VkSampler stored within VkDescriptorImageInfo
- nvvk::AccelNV
- nvvk::AccelKHR
nvvk::Buffer, nvvk::Image, nvvk::Texture, nvvk::AccelKHR and nvvk::AccelNV objects can be copied by value. They do not track the lifetime of the underlying Vulkan objects and memory allocations. The corresponding destroy() functions of nvvk::ResourceAllocator destroy created objects and free up their memory. ResourceAllocator does not track usage of objects either. Thus, one has to make sure that objects are no longer in use by the GPU when they get destroyed.
Note: These classes are foremost to showcase principle components that a Vulkan engine would most likely have. They are geared towards ease of use in this sample framework, and not optimized nor meant for production code.
nvvk::DeviceMemoryAllocator memAllocator;
nvvk::ResourceAllocator resAllocator;
memAllocator.init(device, physicalDevice);
resAllocator.init(device, physicalDevice, &memAllocator);
...
VkCommandBuffer cmd = ... transfer queue command buffer
// creates new resources and
// implicitly triggers staging transfer copy operations into cmd
nvvk::Buffer vbo = resAllocator.createBuffer(cmd, vboSize, vboData, vboUsage);
nvvk::Buffer ibo = resAllocator.createBuffer(cmd, iboSize, iboData, iboUsage);
// use functions from staging memory manager
// here we associate the temporary staging resources with a fence
resAllocator.finalizeStaging( fence );
// submit cmd buffer with staging copy operations
vkQueueSubmit(... cmd ... fence ...)
...
// if you do async uploads you would
// trigger garbage collection somewhere per frame
resAllocator.releaseStaging();
Separation of memory allocation and resource creation is very flexible, but it can be tedious to set up for simple use cases. nvvk offers three helper classes derived from ResourceAllocator which internally contain the MemAllocator object and manage its lifetime:
- [ResourceAllocatorDedicated](#class nvvk::ResourceAllocatorDedicated)
- [ResourceAllocatorDma](#class nvvk::ResourceAllocatorDma)
- [ResourceAllocatorVma](#class nvvk::ResourceAllocatorVma)
In these cases, only one object needs to be created and initialized.
ResourceAllocator can also be subclassed to specialize some of its functionality. Examples are [ExportResourceAllocator](#class ExportResourceAllocator) and [ExplicitDeviceMaskResourceAllocator](#class ExplicitDeviceMaskResourceAllocator). ExportResourceAllocator injects itself into the object allocation process such that the resulting allocations can be exported or created objects may be bound to exported memory. ExplicitDeviceMaskResourceAllocator overrides the devicemask of allocations such that objects can be created on a specific device in a device group.
class nvvk::ResourceAllocatorDma
nvvk::ResourceAllocatorDma is a convenience class owning a nvvk::DMAMemoryAllocator and nvvk::DeviceMemoryAllocator object.
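A minimal sketch (assuming init takes device and physical device like the base ResourceAllocator):
nvvk::ResourceAllocatorDma alloc;
alloc.init(device, physicalDevice);   // creates and owns the DMA allocator internally
nvvk::Buffer buf = alloc.createBuffer(size, usage);
...
alloc.destroy(buf);
alloc.deinit();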
class nvvk::ResourceAllocatorDedicated
nvvk::ResourceAllocatorDedicated is a convenience class automatically creating and owning a DedicatedMemoryAllocator object.
class nvvk::ExportResourceAllocator
ExportResourceAllocator specializes the object allocation process such that resulting memory allocations are exportable and buffers and images can be bound to external memory.
class nvvk::ExportResourceAllocatorDedicated
nvvk::ExportResourceAllocatorDedicated is a resource allocator that uses a DedicatedMemoryAllocator to allocate memory and at the same time makes all allocations exportable.
class nvvk::ExplicitDeviceMaskResourceAllocator
nvvk::ExplicitDeviceMaskResourceAllocator is a resource allocator that will inject a specific devicemask into each allocation, making the created allocations and objects available to only the devices in the mask.
samplers_vk.hpp
class nvvk::SamplerPool
This nvvk::SamplerPool class manages unique VkSampler objects. To minimize the total number of sampler objects, this class ensures that identical configurations return the same sampler.
Example :
nvvk::SamplerPool pool(device);
for (auto it : textures) {
VkSamplerCreateInfo info = {...};
// acquire ensures we create the minimal subset of samplers
it.sampler = pool.acquireSampler(info);
}
// you can manage releases individually, or just use deinit/destructor of pool
for (auto it : textures) {
pool.releaseSampler(it.sampler);
}
- makeSamplerCreateInfo : aids for sampler creation
sbtwrapper_vk.hpp
class nvvk::SBTWrapper
nvvk::SBTWrapper is a generic SBT builder for ray tracing pipelines.
The builder iterates through the pipeline create info VkRayTracingPipelineCreateInfoKHR to find how many raygen, miss, hit and callable shader groups were created. The handles for those groups are retrieved from the pipeline and written in the right order into separate buffers.
Convenient functions exist to retrieve all the information to be used in vkCmdTraceRaysKHR.
Usage
- Setup the builder (setup())
- After the pipeline creation, call create() with the same info used for the creation of the pipeline.
- Use getRegions() to get all the VkStridedDeviceAddressRegionKHR needed by vkCmdTraceRaysKHR()
Example
m_sbtWrapper.setup(m_device, m_graphicsQueueIndex, &m_alloc, m_rtProperties);
// ...
m_sbtWrapper.create(m_rtPipeline, rayPipelineInfo);
// ...
auto& regions = m_stbWrapper.getRegions();
vkCmdTraceRaysKHR(cmdBuf, ®ions[0], ®ions[1], ®ions[2], ®ions[3], size.width, size.height, 1);
Extra
If data is attached to a shader group (see shaderRecord), it needs to be provided independently. In this case, the user must know the group index for the group type.
Here hit groups 1 and 2 have data, but group 0 does not. Those functions must be called before create.
m_sbtWrapper.addData(SBTWrapper::eHit, 1, m_hitShaderRecord[0]);
m_sbtWrapper.addData(SBTWrapper::eHit, 2, m_hitShaderRecord[1]);
Special case
It is also possible to create a pipeline with only a few groups but having a SBT representing many more groups.
The following example shows a more complex setup. There are: 1 raygen, 2 miss, 2 hit. BUT the SBT will have 3 hit entries, by duplicating the second hit in its table. So the same hit shader defined in the pipeline can be called with different data.
In this case, the user must provide the information to the SBT manually. All extra groups must be explicitly added.
The following shows how to get the handle indices provided in the pipeline; we then add another hit group, re-using the 4th pipeline entry. Note: we are not providing the pipelineCreateInfo, because we are defining it manually.
// Manually defining group indices
m_sbtWrapper.addIndices(rayPipelineInfo); // Add raygen(0), miss(1), miss(2), hit(3), hit(4) from the pipeline info
m_sbtWrapper.addIndex(SBTWrapper::eHit, 4); // Adding a 3rd hit, duplicated from hit:1, which makes hit:2 available.
m_sbtWrapper.addHitData(SBTWrapper::eHit, 2, m_hitShaderRecord[1]); // Adding data to this hit shader
m_sbtWrapper.create(m_rtPipeline);
shadermodulemanager_vk.hpp
class nvvk::ShaderModuleManager
The nvvk::ShaderModuleManager manages VkShaderModules stored in files (SPIR-V or GLSL)
Using ShaderFileManager it will find the files and resolve #include for GLSL. You must add include directories to the base-class for this.
It also comes with some convenience functions to reload shaders etc. That is why we pass out the ShaderModuleID rather than a VkShaderModule directly.
To change the compilation behavior, manipulate the public member variables prior to createShaderModule.
m_filetype is crucial for this. You can pass raw SPIR-V files or GLSL. If GLSL is used, shaderc must be used as well (which must be added via _add_package_ShaderC() in the project's CMake).
Example:
ShaderModuleManager mgr(myDevice);
// derived from ShaderFileManager
mgr.addDirectory("spv/");
// all shaders get this injected after #version statement
mgr.m_prepend = "#define USE_NOISE 1\n";
vid = mgr.createShaderModule(VK_SHADER_STAGE_VERTEX_BIT, "object.vert.glsl");
fid = mgr.createShaderModule(VK_SHADER_STAGE_FRAGMENT_BIT, "object.frag.glsl");
// ... later use module
info.module = mgr.get(vid);
shaders_vk.hpp
functions in nvvk
- createShaderModule : create the shader module from various binary code inputs
- createShaderStageInfo: create the shader module and setup the stage from the incoming binary code
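For example (a sketch; the exact overloads are assumptions, several variants taking binary code exist):
std::vector<uint32_t> spirv = ...;  // SPIR-V binary loaded from disk
VkShaderModule module = nvvk::createShaderModule(device, spirv);
VkPipelineShaderStageCreateInfo stageInfo = nvvk::createShaderStageInfo(device, spirv, VK_SHADER_STAGE_VERTEX_BIT);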
stagingmemorymanager_vk.hpp
class nvvk::StagingMemoryManager
nvvk::StagingMemoryManager class is a utility that manages host visible buffers and their allocations in an opaque fashion to assist asynchronous transfers between device and host. The memory for this is allocated using the provided nvvk::MemAllocator.
The collection of the transfer resources is represented by nvvk::StagingID.
The necessary buffer space is sub-allocated and recycled by using one nvvk::BufferSubAllocator per transfer direction (to or from device).
WARNING:
- cannot manage a copy > 4 GB
Usage:
- Enqueue transfers into your VkCommandBuffer and then finalize the copy operations.
- Associate the copy operations with a VkFence or retrieve a SetID
- The release of the resources allows to safely recycle the buffer space for future transfers.
We use fences as a way to garbage collect here; however, a more robust solution may be to implement some sort of ticketing/timeline system. If a fence is recycled, this class may not be aware that the fence represents a different submission; likewise, if the fence is deleted elsewhere, problems can occur. You may want to use the manual "SetID" system in that case.
Example :
StagingMemoryManager staging;
staging.init(memAllocator);
// Enqueue copy operations of data to target buffer.
// This internally manages the required staging resources
staging.cmdToBuffer(cmd, targetBuffer, 0, targetSize, targetData);
// you can also get access to a temporary mapped pointer and fill
// the staging buffer directly
vertices = staging.cmdToBufferT<Vertex>(cmd, targetBuffer, 0, targetSize);
// OPTION A:
// associate all previous copy operations with a fence (or not)
staging.finalizeResources( fence );
..
// every once in a while call
staging.releaseResources();
// this will release all those without fence, or those
// who had a fence that completed (but never manual SetIDs, see next).
// OPTION B
// alternatively manage the resource release yourself.
// The SetID represents the staging resources
// used since the last finalize.
sid = staging.finalizeResourceSet();
...
// You need to ensure these transfers and their staging
// data access completed yourself prior releasing the set.
//
// This is particularly useful for managing downloads from
// device. The "from" functions return a pointer where the
// data will be copied to. You want to use this pointer
// after the device-side transfer completed, and then
// release its resources once you are done using it.
staging.releaseResourceSet(sid);
structs_vk.hpp
function nvvk::make
Contains templated nvvk::make<T>
function that is
auto-generated by structs.lua
. The function provide default
structs for the Vulkan C api by initializing the VkStructureType sType
field (also for nested structs) and clearing the rest to zero.
auto compCreateInfo = nvvk::make<VkComputePipelineCreateInfo>;
function nvvk::clear
Contains templated nvvk::clear<T>
function
auto-generated by structs.lua
.
swapchain_vk.hpp
class nvvk::SwapChain
nvvk::SwapChain is a helper to handle swapchain setup and use
In Vulkan, we have to use VkSwapchainKHR
to request a swap chain
(front and back buffers) from the operating system and manually
synchronize our and OS's access to the images within the swap chain.
This helper abstracts that process.
For each swap chain image there is an ImageView, and one read and write
semaphore synchronizing it (see SwapChainAcquireState
).
To start, you need to call init
, then update
with the window's
initial framebuffer size (for example, use glfwGetFramebufferSize
).
Then, in your render loop, you need to call acquire()
to get the
swap chain image to draw to, draw your frame (waiting and signalling
the appropriate semaphores), and call present()
.
Sometimes, the swap chain needs to be re-created (usually due to
window resizes). nvvk::SwapChain
detects this automatically and
re-creates the swap chain for you. Every new swap chain is assigned a
unique ID (getChangeID()
), allowing you to detect swap chain
re-creations. This usually triggers a VkDeviceWaitIdle
; however, if
this is not appropriate, see setWaitQueue()
.
Finally, there is a utility function to setup the image transitions from VK_IMAGE_LAYOUT_UNDEFINED to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, which is the format an image must be in before it is presented.
Example in combination with nvvk::Context :
- get the window handle
- create its related surface
- make sure the Queue is the one we need to render in this surface
// could be arguments of a function/method :
nvvk::Context ctx;
NVPWindow win;
...
// get the surface of the window in which to render
VkWin32SurfaceCreateInfoKHR createInfo = {};
... populate the fields of createInfo ...
createInfo.hwnd = glfwGetWin32Window(win.m_internal);
result = vkCreateWin32SurfaceKHR(ctx.m_instance, &createInfo, nullptr, &m_surface);
...
// make sure we assign the proper Queue to m_queueGCT, from what the surface tells us
ctx.setGCTQueueWithPresent(m_surface);
The initialization can happen now :
m_swapChain.init(ctx.m_device, ctx.m_physicalDevice, ctx.m_queueGCT, ctx.m_queueGCT.familyIndex,
m_surface, VK_FORMAT_B8G8R8A8_UNORM);
...
// after init or update you also have to setup the image layouts at some point
VkCommandBuffer cmd = ...
m_swapChain.cmdUpdateBarriers(cmd);
During a resizing of a window, you can update the swapchain as well :
bool WindowSurface::resize(int w, int h)
{
...
m_swapChain.update(w, h);
// be cautious to also transition the image layouts
...
}
A typical renderloop would look as follows:
// handles vkAcquireNextImageKHR and setting the active image
// w,h only needed if update(w,h) not called reliably.
int w, h;
bool recreated;
glfwGetFramebufferSize(window, &w, &h);
if(!m_swapChain.acquire(w, h, &recreated, [, optional SwapChainAcquireState ptr]))
{
... handle acquire error (shouldn't happen)
}
VkCommandBuffer cmd = ...
// acquire might have recreated the swap chain: respond if needed here.
// NOTE: you can also check the recreated variable above, but this
// only works if the swap chain was recreated this frame.
if (m_swapChain.getChangeID() != lastChangeID){
// after init or resize you have to setup the image layouts
m_swapChain.cmdUpdateBarriers(cmd);
lastChangeID = m_swapChain.getChangeID();
}
// do render operations either directly using the imageview
VkImageView swapImageView = m_swapChain.getActiveImageView();
// or you may always render offline into your own framebuffer
// and then simply blit into the backbuffer. NOTE: use
// m_swapChain.getWidth() / getHeight() to get blit dimensions,
// actual swap chain image size may differ from requested width/height.
VkImage swapImage = m_swapChain.getActiveImage();
vkCmdBlitImage(cmd, ... swapImage ...);
// setup submit
VkSubmitInfo submitInfo = {VK_STRUCTURE_TYPE_SUBMIT_INFO};
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &cmd;
// we need to ensure to wait for the swapchain image to have been read already
// so we can safely blit into it
VkSemaphore swapchainReadSemaphore = m_swapChain->getActiveReadSemaphore();
VkPipelineStageFlags swapchainReadFlags = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
submitInfo.waitSemaphoreCount = 1;
submitInfo.pWaitSemaphores = &swapchainReadSemaphore;
submitInfo.pWaitDstStageMask = &swapchainReadFlags;
// once this submit completed, it means we have written the swapchain image
VkSemaphore swapchainWrittenSemaphore = m_swapChain->getActiveWrittenSemaphore();
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores = &swapchainWrittenSemaphore;
// submit it
vkQueueSubmit(m_queue, 1, &submitInfo, fence);
// present via a queue that supports it
// this will also setup the dependency for the appropriate written semaphore
// and bump the semaphore cycle
m_swapChain.present(m_queue);