Simple and efficient Vulkan loading with flextGL

Playing with Vulkan but don’t want to include thousands of lines of various headers just to call a few functions? FlextGL just learned Vulkan support and it’s here to speed up your turnaround times.

If you don’t know what flextGL is, it’s a function loader generator for OpenGL, OpenGL ES and now also Vulkan. In comparison to GLEW, GL3W, GLAD, glLoadGen and all other function pointer loaders, it allows you to provide a template and a whitelist of versions, extensions and functions to load, so you can load what you want, however you want.

Chances are you’re using flextGL for function pointer loading in your GL / GLES code, so now you can use the same tool for your Vulkan backend as well.

How?

FlextGL contains a built-in Vulkan template that you can use to generate a basic loader. In addition you need Python 3 and Wheezy Template:

pip3 install wheezy.template
git clone git://github.com/mosra/flextGL --branch vulkan

The desired Vulkan version, extensions and an optional function whitelist or blacklist are specified using a profile file:

version 1.0 vulkan

# Extensions to include on top of the core functionality. The VK_ prefix
# is omitted.
extension KHR_swapchain optional
extension KHR_maintenance1 optional
# ...

# Function whitelist. If you omit this section, all functions from the
# above version and extensions will be pulled in. The vk prefix is omitted.
begin functions
    CreateInstance
    EnumeratePhysicalDevices
    GetPhysicalDeviceProperties
    GetPhysicalDeviceQueueFamilyProperties
    GetPhysicalDeviceMemoryProperties
    # ...
end functions

# You can also choose to have a function blacklist instead, delimited by
# begin functions blacklist and end functions blacklist.

With a profile file you can then generate the Vulkan loader header + source file like this. The generated/ directory will then contain a pair of flextVk.h and flextVk.cpp files:

./flextGLgen.py -D generated -t vulkan profile.txt

Now you can just include the header and use it. The actual function pointer loading is done by calling flextVkInit() and after that all Vulkan functions are available globally:

#include <cstdint>

#include "flextVk.h"

int main() {
    /* Create an instance, load function pointers */
    VkInstance instance;
    {
        VkInstanceCreateInfo info{};
        info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        vkCreateInstance(&info, nullptr, &instance);
    }
    flextVkInit(instance);

    /* Query up to five physical devices; count is updated to the number
       actually written */
    VkPhysicalDevice physicalDevices[5];
    std::uint32_t count = 5;
    vkEnumeratePhysicalDevices(instance, &count, physicalDevices);
    // ...
}

Why bother?

Compared to OpenGL, Vulkan is still taking baby steps. However, the number of available extensions is growing at an alarming rate, and soon the size of stock “all you can eat” headers will have a significant impact on your build times. Because the Vulkan API is more about various types than just function pointers, flextGL ensures that only the structures, enums and defines that are actually referenced by the requested functions are pulled in, shrinking header sizes even further.
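The pruning described above amounts to a reachability walk over the dependencies between functions and types. Here is a minimal sketch of that idea, assuming a hypothetical in-memory map from each name to the names its declaration references. It is not flextGL’s actual implementation; the Registry type and the collectReferenced() function are made up for illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

/* Hypothetical registry: each function or type name maps to the names its
   declaration references directly */
using Registry = std::map<std::string, std::vector<std::string>>;

/* Returns the whitelisted functions plus everything transitively reachable
   from them. Names without a registry entry are treated as leaves. */
std::set<std::string> collectReferenced(const Registry& registry,
                                        const std::vector<std::string>& whitelist) {
    std::set<std::string> seen;
    std::vector<std::string> stack(whitelist.begin(), whitelist.end());
    while(!stack.empty()) {
        std::string name = stack.back();
        stack.pop_back();

        /* Skip names that were already visited */
        if(!seen.insert(name).second) continue;

        auto found = registry.find(name);
        if(found == registry.end()) continue;

        for(const std::string& dependency: found->second)
            stack.push_back(dependency);
    }
    return seen;
}
```

Anything that never ends up in the returned set (for example a structure used only by a function that isn’t whitelisted) would simply not be emitted into the generated header.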

So, let’s have some measurements!

Header sizes

The following table compares raw line count and line count of preprocessed output when using various Vulkan loaders, generated by the following two commands using GCC 7.3.1 for Vulkan 1.1.74:

wc -l /path/to/header

echo "#include <header>" | g++ -std=c++11 -E -x c++ - | wc -l

Header                                            Line count    After preprocessing
#include "flextVk.h" [1]                          1 710         1 929
#include <MagnumExternal/Vulkan/flextVk.h> [2]    3 577         3 592
#include "volk.h"                                 837 [3]       6 352
#include <vulkan/vulkan.h> [4]                    7 470         7 363
#include <vulkan/vulkan.hpp> [5]                  42 544 !      83 530 !!
#include <GL/glew.h> (for comparison)             23 686 ?!     7 464

[1] A minimal generated header whitelisting only functions required to build my First Triangle in Vulkan. The profile file used to generate the header is included in the gist.
[2] Experimental Vulkan header included in latest Magnum master, including everything from Vulkan 1.1 + all extensions that were promoted to 1.1 for backwards compatibility
[3] The Volk meta-loader. While small on its own, it depends on the stock vulkan.h for all type and enum definitions
[4] The stock Vulkan header provides only function pointer typedefs, not actual functions, so can’t be used as-is. The vulkan.h header itself has only 79 lines, this counts lines of vulkan_core.h.
[5] vulkan.hpp, aiming to provide C++11 header-only Vulkan “bindings” with better type safety. But, look at those numbers, seriously, don’t use this thing. Please.

Compile times

I abused Corrade::TestSuite and std::system() a bit to benchmark how long it takes GCC to compile each case from the above table into an executable that creates the Vulkan instance and populates function pointers using the given loader. Only compilation of the actual main file is measured, excluding time needed to compile extra *.cpp, *.c or *.so files, because their cost is usually amortized in the project. Here are the results:

Compile time:
flextVk minimal (1929 lines): 62.69 ± 0.84 ms
flextVk Magnum (3592 lines): 69.98 ± 2.04 ms
Volk (6352 lines): 74.78 ± 3.65 ms
vulkan.h (7363 lines): 76.76 ± 3.34 ms
vulkan.hpp (83530 lines): 719.71 ± 6.95 ms

As expected, vulkan.hpp takes an insane amount of time to compile: roughly ten times as much as the others, about 0.7 seconds, and this is for every file that (transitively) includes it! The compile time roughly corresponds to the preprocessed line count from the above table, with flextGL-generated headers being the smallest and fastest to compile.
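For a rough idea of how such a measurement can be taken with std::system(), here is a minimal sketch that times a single shell command using std::chrono. It is not the Corrade::TestSuite benchmark used for the numbers above; the timeCommandMs() helper is a hypothetical name, and a real benchmark would repeat the run many times to get a mean and a deviation.

```cpp
#include <chrono>
#include <cstdlib>
#include <string>

/* Runs a shell command and returns its wall-clock time in milliseconds, or
   a negative value if the command reported a failure. Hypothetical helper,
   not part of Corrade or flextGL. */
double timeCommandMs(const std::string& command) {
    auto begin = std::chrono::steady_clock::now();
    int status = std::system(command.c_str());
    auto end = std::chrono::steady_clock::now();

    if(status != 0) return -1.0;
    return std::chrono::duration<double, std::milli>(end - begin).count();
}
```

It could be called with a compiler invocation such as timeCommandMs("g++ -std=c++11 -c main.cpp -o main.o"), where the file names and flags are placeholders rather than the exact command line used here.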

As usual, these headers get transitively included into the majority of a project, so saving about 15 milliseconds per file when going from stock headers to flextGL-generated ones can save you 15 seconds in a moderately sized project with 1000 targets. And this gap will keep growing as more extensions get added to the stock headers.
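The estimate above is plain multiplication, sketched here so you can plug in your own numbers. The savedSeconds() helper is a hypothetical name, and it assumes a sequential full rebuild in which every file includes the header; a parallel build would divide the wall-clock saving by the number of jobs.

```cpp
/* Time saved over a sequential full rebuild, in seconds, given the per-file
   compile times of the two headers in milliseconds */
double savedSeconds(double stockMs, double generatedMs, int fileCount) {
    return (stockMs - generatedMs)*fileCount/1000.0;
}
```

With the measured 76.76 ms for vulkan.h, 62.69 ms for the minimal flextVk.h and 1000 files, savedSeconds(76.76, 62.69, 1000) comes out at about 14 seconds.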

Runtime cost

Because flextGL loads only the functions you actually requested instead of everything that anybody could ever need, it also has some impact on startup time. The following benchmark measures the time it takes to call loader-specific initialization functions. The vulkan.h and vulkan.hpp headers aren’t included, because these rely on external function pointer loading and don’t do any on their own.

Runtime cost:
flextVk minimal (49 ptrs): 15.09 ± 0.74 µs
flextVk Magnum (192 ptrs): 84.98 ± 3.65 µs
Volk (302 ptrs): 197.13 ± 9.45 µs
vkCreateInstance() (for comparison): 934.27 ± 25.66 µs

Again, the measured time corresponds to the actual number of loaded function pointers. The Vulkan Triangle needs just 49 function pointers, Magnum loads everything from Vulkan 1.1 together with command aliases from promoted extensions, while Volk also adds all known extensions. However, note that these are microseconds, and compared to the time needed to create a Vulkan instance (last measurement), the savings are only very minor.
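To see why the cost scales with the pointer count, consider what a loader’s initialization boils down to: one lookup by name per requested function. The sketch below assumes simplified stand-in types so it compiles without any Vulkan header. GetProcAddr mimics the shape of vkGetInstanceProcAddr() minus the instance argument, and the Entry type and loadPointers() function are hypothetical, not what flextGL or Volk actually generate.

```cpp
#include <cstddef>

/* Stand-ins for PFN_vkVoidFunction and a name-based lookup callback */
using VoidFunction = void(*)();
using GetProcAddr = VoidFunction(*)(const char* name);

/* One requested function: its name and where to store the pointer */
struct Entry {
    const char* name;
    VoidFunction* destination;
};

/* Performs one lookup per entry, so the work grows linearly with the number
   of requested functions. Returns how many lookups gave a non-null pointer. */
std::size_t loadPointers(GetProcAddr getProcAddr, const Entry* entries,
                         std::size_t count) {
    std::size_t loaded = 0;
    for(std::size_t i = 0; i != count; ++i) {
        *entries[i].destination = getProcAddr(entries[i].name);
        if(*entries[i].destination) ++loaded;
    }
    return loaded;
}
```

A whitelist of 49 functions means 49 such lookups, while loading every known extension means several hundred, which matches the ordering in the measurements above.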

Vulkan loading in Magnum

As of mosra/magnum@b137703, Magnum ships flextGL-generated Vulkan headers. To save on delegation overhead, the decision was to load per-device function pointers instead of going through per-instance function pointers for everything. That’s also what Volk does with great success, saving as much as 5% to 10% of driver overhead, depending on the workflow.

Besides that, loaded Vulkan functions are not global by default in order to support multiple coexisting Vulkan instances:

#include <MagnumExternal/Vulkan/flextVk.h>

int main() {
    /* Create an instance */
    VkInstance instance;
    {
        VkInstanceCreateInfo info{};
        info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        // ...
        vkCreateInstance(&info, nullptr, &instance);
    }

    /* Load per-instance function pointers */
    FlextVkInstance i;
    flextVkInitInstance(instance, &i);

    /* Create a device */
    VkPhysicalDevice physicalDevice;
    {
        uint32_t count = 1;
        i.EnumeratePhysicalDevices(instance, &count, &physicalDevice);
    }
    VkDevice device;
    {
        VkDeviceCreateInfo info{};
        info.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
        // ...
        i.CreateDevice(physicalDevice, &info, nullptr, &device);
    }

    /* Load per-device function pointers */
    FlextVkDevice d;
    flextVkInitDevice(device, &d, i.GetDeviceProcAddr);

    // ...
}

In the above snippet, the i and d structures contain all loaded function pointers. So instead of vkCreateBuffer(device, ...) you’d write d.CreateBuffer(device, ...), for example. While this is properly decoupled, it might get in the way when just playing around or adapting sample code. For that reason, Magnum provides opt-in global function pointers as well: just include flextVkGlobal.h instead of flextVk.h and load your pointers globally:

#include <MagnumExternal/Vulkan/flextVkGlobal.h>

int main() {
    /* Create an instance */
    VkInstance instance;
    {
        VkInstanceCreateInfo info{};
        info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        // ...
        vkCreateInstance(&info, nullptr, &instance);
    }

    /* Load per-instance function pointers globally */
    flextVkInitInstance(instance, &flextVkInstance);

    /* Create a device */
    VkPhysicalDevice physicalDevice;
    {
        uint32_t count = 1;
        vkEnumeratePhysicalDevices(instance, &count, &physicalDevice);
    }
    VkDevice device;
    {
        VkDeviceCreateInfo info{};
        info.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
        // ...
        vkCreateDevice(physicalDevice, &info, nullptr, &device);
    }

    /* Load per-device function pointers globally */
    flextVkInitDevice(device, &flextVkDevice, vkGetDeviceProcAddr);

    // ...
}

In this case flextVkInitInstance() and flextVkInitDevice() will load the pointers into the global flextVkInstance and flextVkDevice structures, and the global vk*() functions are then aliases to the members of these structures.
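One way such aliasing can work is a preprocessor define per function that expands the global name to a member of the global structure. The sketch below illustrates that mechanism with stand-in types so it compiles without any Vulkan header. Whether flextVkGlobal.h does exactly this is an assumption on my side, and all the example* names are hypothetical.

```cpp
/* Stand-in for VkDevice so the sketch compiles without any Vulkan header */
using ExampleDevice = int;

/* Plays the role of FlextVkDevice: one member per loaded function pointer */
struct ExampleDeviceFunctions {
    int(*CreateBuffer)(ExampleDevice device, int size);
};

/* Plays the role of the global flextVkDevice structure */
ExampleDeviceFunctions exampleDeviceFunctions{};

/* Assumed aliasing mechanism: the global name expands to the structure
   member, so calls through it always use whatever pointer was loaded last */
#define exampleCreateBuffer exampleDeviceFunctions.CreateBuffer
```

After the structure is filled, a call such as exampleCreateBuffer(device, size) is the same as exampleDeviceFunctions.CreateBuffer(device, size), which is why the global and the per-structure styles can be mixed.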

Both approaches can coexist; just be sure that you call instance- and device-specific functions on the instance or device that they were queried from and everything will work well.

~ ~ ~

And that’s it! Check out Vulkan support in flextGL and please report any bugs you find. Thanks for reading, I’ll be back soon!