Hi there!
A few weeks ago, on February 16th to be precise, Khronos released Vulkan, its new graphics API. It gives much more control over the GPU than OpenGL (an API I loved before Vulkan ^_^).
OpenGL’s problems
Driver Overhead
Rendering performance problems can come from the driver: video games do not use the GPU perfectly (maybe 80% utilization instead of 95-100%). Driver overhead is costly, and more recent OpenGL versions try to reduce it with bindless textures, multi-draw, direct state access, etc.
Keep in mind that each GPU call can have a big cost.
Cass Everitt, Tim Foley, John McDonald and Graham Sellers presented Approaching Zero Driver Overhead with OpenGL in 2014.
Multithreading
With OpenGL, efficient multithreading is not really possible, because an OpenGL context can be current on one and only one thread at a time. That is why it is not so easy to issue a draw call from another thread ^_^.
Vulkan
Vulkan is not really a low-level API, but it provides a far better abstraction for modern hardware. Vulkan is more than AZDO: it is, as Graham Sellers said, PDCTZO (Pretty Darn Close To Zero Overhead).
Series of articles about Lava
What is Lava?
Lava is the name I gave to my new graphics (physics?) engine. It will let me learn how Vulkan works, play with it, implement some global illumination algorithms, and probably share with you what I learn and how I feel about Vulkan. I may make some mistakes, so if I do, please let me know!
Why Lava?
Vulkan makes me think of a volcano, which makes me think of lava, so… I chose it 😀.
Initialization
Now we get to what I wanted to discuss: the initialization of Vulkan.
First of all, you have to really know and understand what you intend to do. To begin with, we are going to see how to get a simple pink window.
When you are developing with Vulkan, I advise you to keep the Khronos specification open in another window (or on another screen if you have several).
To manage windows more easily, I am using GLFW 3.2, and yes, you have to compile it yourself ^_^, but it is not difficult at all, so it is not a big deal.
Instance
Contrary to OpenGL, Vulkan has no global state; an instance could be compared to an OpenGL context. An instance doesn’t know anything about other instances and is utterly isolated. Creating an instance is really easy.
    Instance::Instance(unsigned int nExtensions, const char * const *extensions) {
        VkInstanceCreateInfo info;

        info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        info.pNext = nullptr;
        info.flags = 0;
        info.pApplicationInfo = nullptr;
        info.enabledLayerCount = 0;
        info.ppEnabledLayerNames = nullptr;
        info.enabledExtensionCount = nExtensions;
        info.ppEnabledExtensionNames = extensions;

        vulkanCheckError(vkCreateInstance(&info, nullptr, &mInstance));
    }
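Before handing extension names to vkCreateInstance, it can be handy to check that each of them is really available. Here is a minimal sketch of that comparison on plain strings only. The helper name missingExtensions is hypothetical and is not part of Lava; in a real program the list of available names would come from vkEnumerateInstanceExtensionProperties.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not part of Lava or Vulkan): returns the requested
// extension names that do not appear in the list reported as available.
std::vector<std::string> missingExtensions(const std::vector<std::string> &requested,
                                           const std::vector<std::string> &available) {
    std::vector<std::string> missing;

    for(const auto &name : requested) {
        if(std::find(available.begin(), available.end(), name) == available.end())
            missing.push_back(name);
    }

    return missing;
}
```

If the returned vector is not empty, you know which names to report before instance creation fails.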
Physical devices, devices and queues
From this instance, you can retrieve all the GPUs in your computer.
You can create a connection between your application and the GPU you want using a VkDevice.
When creating this connection, you have to create queues as well.
Queues are used to perform tasks: you submit a task to a queue and it will be executed.
Queues are grouped into several families.
A good approach could be to use several queues, for example one for physics and one for graphics (or even two or three for the latter).
You can also give each queue a priority (between 0 and 1). Thanks to that, if you consider a task not so important, you just have to give the queue it uses a low priority :).
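    Device::Device(const PhysicalDevices &physicalDevices, unsigned i,
                   std::vector<float> const &priorities, unsigned nQueuePerFamily) {
        VkDeviceCreateInfo info;
        std::vector<VkDeviceQueueCreateInfo> infoQueue;

        mPhysicalDevice = physicalDevices[i];
        // One VkDeviceQueueCreateInfo per queue family of the chosen GPU
        infoQueue.resize(physicalDevices.queueFamilyProperties(i).size());

        info.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
        info.pNext = nullptr;
        info.flags = 0;
        info.queueCreateInfoCount = infoQueue.size();
        info.pQueueCreateInfos = infoQueue.data();
        info.enabledLayerCount = 0;
        info.ppEnabledLayerNames = nullptr;
        // No device extension is enabled here. Note that the specification
        // requires VK_KHR_swapchain to be enabled before creating a swapchain.
        info.enabledExtensionCount = 0;
        info.ppEnabledExtensionNames = nullptr;
        info.pEnabledFeatures = &physicalDevices.features(i);

        // pQueuePriorities must point to queueCount floats, so every family
        // shares the same priority array and never asks for more queues than it holds.
        const uint32_t requested = std::min<uint32_t>(nQueuePerFamily, priorities.size());

        for(auto j(0u); j < infoQueue.size(); ++j) {
            infoQueue[j].sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
            infoQueue[j].pNext = nullptr;
            infoQueue[j].flags = 0;
            infoQueue[j].pQueuePriorities = priorities.data();
            infoQueue[j].queueCount = std::min<uint32_t>(requested, physicalDevices.queueFamilyProperties(i)[j].queueCount);
            infoQueue[j].queueFamilyIndex = j;
        }

        vulkanCheckError(vkCreateDevice(physicalDevices[i], &info, nullptr, &mDevice));
    }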
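Since queues are grouped into families with different capabilities, you usually have to look for a family that can do what you need. Here is a minimal sketch of that search. The QueueFamily struct is only a simplified stand-in for VkQueueFamilyProperties, and findQueueFamily is a hypothetical helper that is not in Lava. The bit values are the same as VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for VkQueueFamilyProperties (only the fields used here)
struct QueueFamily {
    std::uint32_t flags;      // capability bits of the family
    std::uint32_t queueCount; // number of queues the family exposes
};

// Same numeric values as the corresponding VkQueueFlagBits
constexpr std::uint32_t GraphicsBit = 0x1;
constexpr std::uint32_t ComputeBit  = 0x2;
constexpr std::uint32_t TransferBit = 0x4;

// Hypothetical helper: index of the first family that has at least one queue
// and supports every required capability, or -1 when there is none.
int findQueueFamily(const std::vector<QueueFamily> &families, std::uint32_t required) {
    for(std::size_t i = 0; i < families.size(); ++i) {
        if(families[i].queueCount > 0 && (families[i].flags & required) == required)
            return static_cast<int>(i);
    }

    return -1;
}
```

The returned index is what you would then put in queueFamilyIndex when filling the VkDeviceQueueCreateInfo.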
Image, ImageViews and FrameBuffers
Images represent a mono- or multi-dimensional array (1D, 2D or 3D).
Images do not provide any getter or setter for their data. If you want to use them in your application, you must go through ImageViews.
An ImageView is directly tied to an image. Creating an ImageView is not really complicated.
    ImageView::ImageView(Device &device, Image image, VkFormat format, VkImageViewType viewType,
                         VkImageSubresourceRange const &subResourceRange) :
        mDevice(device), mImage(image) {
        VkImageViewCreateInfo info;

        info.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
        info.pNext = nullptr;
        info.flags = 0;
        info.image = image;
        info.viewType = viewType;
        info.format = format;
        info.components.r = VK_COMPONENT_SWIZZLE_R;
        info.components.g = VK_COMPONENT_SWIZZLE_G;
        info.components.b = VK_COMPONENT_SWIZZLE_B;
        info.components.a = VK_COMPONENT_SWIZZLE_A;
        info.subresourceRange = subResourceRange;

        vulkanCheckError(vkCreateImageView(device, &info, nullptr, &mImageView));
    }
You can write into ImageViews via FrameBuffers. A FrameBuffer owns multiple ImageViews (attachments) and is used to write into them.
    FrameBuffer::FrameBuffer(Device &device, RenderPass &renderPass,
                             std::vector<ImageView> &&imageViews,
                             uint32_t width, uint32_t height, uint32_t layers) :
        mDevice(device), mRenderPass(renderPass), mImageViews(std::move(imageViews)),
        mWidth(width), mHeight(height), mLayers(layers) {
        VkFramebufferCreateInfo info;
        std::vector<VkImageView> views(mImageViews.size());

        for(auto i(0u); i < views.size(); ++i)
            views[i] = mImageViews[i];

        info.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
        info.pNext = nullptr;
        info.flags = 0;
        info.renderPass = renderPass;
        info.attachmentCount = views.size();
        info.pAttachments = &views[0];
        info.width = width;
        info.height = height;
        info.layers = layers;

        vulkanCheckError(vkCreateFramebuffer(mDevice, &info, nullptr, &mFrameBuffer));
    }
The way to render something
A window is associated with a Surface (VkSurfaceKHR). To draw something, you have to render into this surface via a swapchain.
From notions of Swapchains
In Vulkan, you have to manage double buffering yourself via a swapchain. When you create a swapchain, you link it to a Surface and tell it how many images you need. For double buffering, you need 2 images.
Once the swapchain has been created, you should retrieve its images and create framebuffers from them.
The steps to get a working swapchain are:
- Create a Window
- Create a Surface assigned to this Window
- Create a Swapchain with several images assigned to this Surface
- Create FrameBuffers using all of these images.
vulkanCheckError(glfwCreateWindowSurface(instance, mWindow, nullptr, &mSurface));
    void SurfaceWindow::createSwapchain() {
        VkSwapchainCreateInfoKHR info;
        uint32_t nFormat;

        // Query the formats supported by the surface (count first, then data)
        vkGetPhysicalDeviceSurfaceFormatsKHR(mDevice, mSurface, &nFormat, nullptr);
        std::vector<VkSurfaceFormatKHR> formats(nFormat);
        vkGetPhysicalDeviceSurfaceFormatsKHR(mDevice, mSurface, &nFormat, &formats[0]);

        // A single VK_FORMAT_UNDEFINED entry means the surface has no preferred format
        if(nFormat == 1 && formats[0].format == VK_FORMAT_UNDEFINED)
            formats[0].format = VK_FORMAT_B8G8R8A8_SRGB;

        mFormat = formats[0].format;
        mRenderPass = std::make_unique<RenderPass>(mDevice, mFormat);

        info.sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR;
        info.pNext = nullptr;
        info.flags = 0;
        info.imageFormat = formats[0].format;
        info.imageColorSpace = formats[0].colorSpace;
        info.imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
        info.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE;
        info.preTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
        info.compositeAlpha = VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR;
        info.presentMode = VK_PRESENT_MODE_MAILBOX_KHR;
        info.surface = mSurface;
        info.minImageCount = 2; // Double buffering...
        info.imageExtent.width = mWidth;
        info.imageExtent.height = mHeight;
        // The remaining fields were left uninitialized in the first version
        info.imageArrayLayers = 1;
        info.queueFamilyIndexCount = 0;
        info.pQueueFamilyIndices = nullptr;
        info.clipped = VK_TRUE;
        info.oldSwapchain = VK_NULL_HANDLE;

        vulkanCheckError(vkCreateSwapchainKHR(mDevice, &info, nullptr, &mSwapchain));
        initFrameBuffers();
    }
    void SurfaceWindow::initFrameBuffers() {
        VkImage images[2];
        uint32_t nImg = 2;

        vkGetSwapchainImagesKHR(mDevice, mSwapchain, &nImg, images);

        for(auto i(0u); i < nImg; ++i) {
            std::vector<ImageView> allViews;
            allViews.emplace_back(mDevice, images[i], mFormat);
            mFrameBuffers[i] = std::make_unique<FrameBuffer>(mDevice, *mRenderPass, std::move(allViews), mWidth, mHeight, 1);
        }
    }
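The code above asks for exactly 2 images and for the mailbox present mode without checking what the surface supports. Here is a minimal sketch of how those two choices could be made safer. Both helpers and the PresentMode enum are hypothetical stand-ins that are not in Lava. They rely on two facts from the specification: a maxImageCount of 0 in VkSurfaceCapabilitiesKHR means "no upper limit", and VK_PRESENT_MODE_FIFO_KHR is the only present mode that is always supported.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: clamps the wanted image count to the range reported
// by the surface. A maxCount of 0 means the surface imposes no upper limit.
std::uint32_t chooseImageCount(std::uint32_t desired, std::uint32_t minCount, std::uint32_t maxCount) {
    std::uint32_t count = std::max(desired, minCount);

    if(maxCount != 0)
        count = std::min(count, maxCount);

    return count;
}

// Simplified stand-in for VkPresentModeKHR
enum class PresentMode { Immediate, Mailbox, Fifo, FifoRelaxed };

// Hypothetical helper: keeps the preferred mode when the surface supports it,
// otherwise falls back to Fifo, which is always available.
PresentMode choosePresentMode(const std::vector<PresentMode> &supported, PresentMode preferred) {
    if(std::find(supported.begin(), supported.end(), preferred) != supported.end())
        return preferred;

    return PresentMode::Fifo;
}
```

In a real program the inputs would come from vkGetPhysicalDeviceSurfaceCapabilitiesKHR and vkGetPhysicalDeviceSurfacePresentModesKHR.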
Using the swapchain is not difficult:
- Acquire the index of the next image
- Present the image through a queue
    void SurfaceWindow::begin() {
        // No checking because could be in lost state if change res
        // NOTE: the specification requires a valid semaphore or fence here;
        // passing VK_NULL_HANDLE for both relies on a lenient driver.
        vkAcquireNextImageKHR(mDevice, mSwapchain, UINT64_MAX, VK_NULL_HANDLE, VK_NULL_HANDLE, &mCurrentSwapImage);
    }

    void SurfaceWindow::end(Queue &queue) {
        VkPresentInfoKHR info;

        info.sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;
        info.pNext = nullptr;
        info.waitSemaphoreCount = 0;
        info.pWaitSemaphores = nullptr;
        info.swapchainCount = 1;
        info.pSwapchains = &mSwapchain;
        info.pImageIndices = &mCurrentSwapImage;
        info.pResults = nullptr;

        vkQueuePresentKHR(queue, &info);
    }
To notions of Render Pass
At this point, Vulkan should be initialized. To render something, we have to use a render pass and a command buffer.
Command Buffers
A command buffer is quite similar to a vertex array object (VAO) or a display list (old old old OpenGL 😀).
You begin recording, you record some “information”, and you end recording.
Command buffers are allocated from a CommandPool.
Vulkan provides two levels of command buffer:
- Primary level: they are submitted to a queue.
- Secondary level: they are executed by a primary level command buffer.
    std::size_t CommandPool::allocateCommandBuffer() {
        VkCommandBuffer cmd;
        VkCommandBufferAllocateInfo info;

        info.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
        info.pNext = nullptr;
        info.commandPool = mCommandPool;
        info.commandBufferCount = 1;
        info.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;

        vulkanCheckError(vkAllocateCommandBuffers(mDevice, &info, &cmd));
        mCommandBuffers.emplace_back(cmd);

        return mCommandBuffers.size() - 1;
    }
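To make the "begin, record, end, then submit" idea more concrete, here is a toy analog written in plain C++. It is NOT the Vulkan API: ToyCommandBuffer is a hypothetical class that only mimics the principle, namely that nothing runs while you record, and that the recorded commands can be replayed as many times as you want afterwards.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy analog of a command buffer (hypothetical, not the Vulkan API)
class ToyCommandBuffer {
public:
    // Starts a new recording and forgets the previous one
    void begin() {
        mCommands.clear();
        mRecording = true;
    }

    // Stores the command without executing it
    void record(std::function<void()> command) {
        assert(mRecording && "record() is only valid between begin() and end()");
        mCommands.push_back(std::move(command));
    }

    void end() { mRecording = false; }

    // Replays every recorded command in order; can be called many times
    void submit() const {
        assert(!mRecording && "submit() requires a finished recording");
        for(const auto &command : mCommands)
            command();
    }

private:
    std::vector<std::function<void()>> mCommands;
    bool mRecording = false;
};
```

This is why the main loop further down can record its command buffers once and only submit them every frame.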
Renderpass
One render pass is executed on one framebuffer. Creating one is not easy at all. A render pass is composed of one or several subpasses.
As a reminder, framebuffers can have several attachments.
Not every attachment has to be used by every subpass.
This piece of code to create a render pass is not definitive at all and will be changed as soon as possible ^^. But for our example, it is correct.
    RenderPass::RenderPass(Device &device, VkFormat format) : mDevice(device) {
        VkRenderPassCreateInfo info;
        VkAttachmentDescription attachmentDescription;
        VkSubpassDescription subpassDescription;
        VkAttachmentReference attachmentReference;

        attachmentReference.attachment = 0;
        attachmentReference.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

        attachmentDescription.flags = VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT;
        attachmentDescription.format = format;
        attachmentDescription.samples = VK_SAMPLE_COUNT_1_BIT;
        attachmentDescription.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
        attachmentDescription.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
        attachmentDescription.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
        attachmentDescription.stencilStoreOp = VK_ATTACHMENT_STORE_OP_STORE;
        attachmentDescription.initialLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
        attachmentDescription.finalLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

        subpassDescription.flags = 0;
        subpassDescription.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
        subpassDescription.inputAttachmentCount = 0;
        subpassDescription.colorAttachmentCount = 1;
        subpassDescription.pColorAttachments = &attachmentReference;
        subpassDescription.pResolveAttachments = nullptr;
        subpassDescription.pDepthStencilAttachment = nullptr;
        subpassDescription.preserveAttachmentCount = 0;
        subpassDescription.pPreserveAttachments = nullptr;

        info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
        info.pNext = nullptr;
        info.flags = 0;
        info.attachmentCount = 1;
        info.pAttachments = &attachmentDescription;
        info.subpassCount = 1;
        info.pSubpasses = &subpassDescription;
        info.dependencyCount = 0;
        info.pDependencies = nullptr;

        vulkanCheckError(vkCreateRenderPass(mDevice, &info, nullptr, &mRenderPass));
    }
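The relation between attachments and subpasses can be pictured with a tiny model: the render pass declares a number of attachments, and each subpass only lists the indices of the ones it uses. The sketch below is a simplified illustration and does not use Vulkan types. Subpass, referencesAreValid and usesAttachment are hypothetical names that are not in Lava.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Simplified model (not a Vulkan type): a subpass is just the list of
// attachment indices it reads or writes.
using Subpass = std::vector<std::uint32_t>;

// True when every index used by every subpass refers to an existing attachment
bool referencesAreValid(std::uint32_t attachmentCount, const std::vector<Subpass> &subpasses) {
    for(const auto &subpass : subpasses)
        for(auto index : subpass)
            if(index >= attachmentCount)
                return false;

    return true;
}

// True when the given subpass uses the given attachment
bool usesAttachment(const Subpass &subpass, std::uint32_t attachment) {
    return std::find(subpass.begin(), subpass.end(), attachment) != subpass.end();
}
```

With 3 attachments, a first subpass could use attachments 0 and 1 while a second one only uses attachment 2, which is the situation described above.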
Just like a command buffer, a render pass has to be begun and ended!
    void CommandPool::beginRenderPass(std::size_t index, FrameBuffer &frameBuffer,
                                      const std::vector<VkClearValue> &clearValues) {
        assert(index < mCommandBuffers.size());

        VkRenderPassBeginInfo info;
        VkRect2D area;

        area.offset = VkOffset2D{0, 0};
        area.extent = VkExtent2D{frameBuffer.width(), frameBuffer.height()};

        info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
        info.pNext = nullptr;
        info.renderPass = frameBuffer.renderPass();
        info.framebuffer = frameBuffer;
        info.renderArea = area;
        info.clearValueCount = clearValues.size();
        info.pClearValues = &clearValues[0];

        vkCmdBeginRenderPass(mCommandBuffers[index], &info, VK_SUBPASS_CONTENTS_INLINE);
    }
Our engine in action
For now, our “engine” is not really usable ^^.
But in the future, the command pool and the render pass should not appear in user files!
    #include <iostream>

    #include "System/contextinitializer.hpp"
    #include "System/Vulkan/instance.hpp"
    #include "System/Vulkan/physicaldevices.hpp"
    #include "System/Vulkan/device.hpp"
    #include "System/Vulkan/queue.hpp"
    #include "System/surfacewindow.hpp"
    #include "System/Vulkan/exception.hpp"
    #include "System/Vulkan/commandpool.hpp"
    #include "System/Vulkan/fence.hpp"

    void init(CommandPool &commandPool, SurfaceWindow &window) {
        commandPool.reset();

        // Clear color (RGBA)
        VkClearValue value;
        value.color.float32[0] = 0.8;
        value.color.float32[1] = 0.2;
        value.color.float32[2] = 0.2;
        value.color.float32[3] = 1;

        // One pre-recorded command buffer per swapchain image
        for(int i = 0; i < 2; ++i) {
            commandPool.allocateCommandBuffer();
            commandPool.beginCommandBuffer(i);
            commandPool.beginRenderPass(i, window.frameBuffer(i), {value});
            commandPool.endRenderPass(i);
            commandPool.endCommandBuffer(i);
        }

        commandPool.allocateCommandBuffer();
    }

    void mainLoop(SurfaceWindow &window, Device &device, Queue &queue) {
        Fence fence(device, 1);
        CommandPool commandPool(device, 0);

        while(window.isRunning()) {
            window.updateEvent();

            if(window.neetToInit()) {
                init(commandPool, window);
                std::cout << "Initialisation" << std::endl;
                window.initDone();
            }

            window.begin();
            queue.submit(commandPool.commandBuffer(window.currentSwapImage()), 1, *fence.fence(0));
            fence.wait();
            window.end(queue);
        }
    }

    int main() {
        ContextInitializer context;
        Instance instance(context.extensionNumber(), context.extensions());
        PhysicalDevices physicalDevices(instance);
        Device device(physicalDevices, 0, {1.f}, 1);
        Queue queue(device, 0, 0);
        SurfaceWindow window(instance, device, 800, 600, "Lava");

        mainLoop(window, device, queue);

        glfwTerminate();
        return 0;
    }
If you want the whole source code:
GitHub
Reference
Approaching Zero Driver Overhead: Lecture
Approaching Zero Driver Overhead: Slides
Vulkan Overview 2015
Vulkan in 30 minutes
VkCube
GLFW with Vulkan