Does it really make sense to have access to the entire C++ language to run on GPUs? Aren’t basic constructs that you take for granted in a complicated general purpose language like C++ super-expensive on a GPU? For example, my understanding is that whenever a GPU runs into a branch, it has to essentially run that part of the code twice: once for each branch.
I don’t think their “entire C++ language” wording also includes the standard library (except maybe for algorithms that take std::execution policies). I’ve never written Vulkan code, but I do a lot of CUDA (and now HIP) GPGPU work, and things like template metaprogramming are immensely useful for writing generic kernels. A compile-time strategy pattern avoids lots of mostly duplicate kernels, and unrolling and classical loop patterns like tiling can easily be refactored into nice-looking metafunctions with zero runtime overhead. Having access to modern C++ features is a game changer for GPU programming.
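To make the compile-time strategy idea concrete, here is a minimal host-side sketch in plain C++. The names `Scale2`, `Square`, and `transform` are made up for illustration; in real CUDA or HIP code the function body would sit inside a `__global__` or `__device__` function and be indexed by thread rather than looped over, but the template mechanics would look much the same.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical strategies: stateless types exposing a static apply().
struct Scale2 { static constexpr float apply(float x) { return 2.0f * x; } };
struct Square { static constexpr float apply(float x) { return x * x; } };

// Generic "kernel" body. The operation and the unroll factor are template
// parameters, so both are fixed at compile time and there is no runtime
// dispatch on which operation to run.
template <typename Op, std::size_t Unroll>
void transform(const float* in, float* out, std::size_t n) {
    static_assert(Unroll > 0, "Unroll must be positive");
    assert(n % Unroll == 0);  // assumption of this sketch: no remainder handling
    for (std::size_t base = 0; base < n; base += Unroll) {
        // The inner trip count is a compile-time constant, which lets the
        // compiler unroll this loop completely.
        for (std::size_t j = 0; j < Unroll; ++j) {
            out[base + j] = Op::apply(in[base + j]);
        }
    }
}
```

Calling `transform<Square, 4>(in, out, n)` and `transform<Scale2, 2>(in, out, n)` instantiates two specialized functions from a single source definition, which is the "avoid mostly duplicate kernels" point.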
I agree that having access to many of the compile-time computation features could be nice, but the language also lets you do things like loop over pointers to objects with virtual methods that might not be resolvable until runtime, which I believe would be a performance killer.
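For reference, the two styles being contrasted look roughly like this in plain C++. This is only a sketch with hypothetical names (`Shape`, `SquareShape`, `RectShape`, `total_area_virtual`, `total_area_static`), and it says nothing about how any particular GPU compiler actually handles virtual calls.

```cpp
#include <cassert>
#include <cstddef>

struct Shape {
    virtual ~Shape() = default;
    virtual float area() const = 0;
};

struct SquareShape final : Shape {
    float side;
    explicit SquareShape(float s) : side(s) {}
    float area() const override { return side * side; }
};

struct RectShape final : Shape {
    float w, h;
    RectShape(float w_, float h_) : w(w_), h(h_) {}
    float area() const override { return w * h; }
};

// Runtime dispatch: the concrete type behind each pointer is only known at
// runtime, so every area() call is an indirect call through the vtable.
float total_area_virtual(const Shape* const* shapes, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += shapes[i]->area();
    return sum;
}

// Static dispatch: T is fixed at compile time. Because the concrete classes
// are final, the compiler can resolve area() directly and inline it.
template <typename T>
float total_area_static(const T* shapes, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += shapes[i].area();
    return sum;
}
```

The first loop is the pattern this comment worries about. The second is what the template-heavy style in the comment above would typically use instead, at the cost of needing homogeneous arrays.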