An Opinionated Post on Modern Rendering Abstraction Layers



Overview

Recently I found myself fed up with my personal engine while trying to pivot to newer renderer methodologies, and realized the problem was almost entirely that interfacing with my abstraction layer had become too painful, especially compared to what I've been using elsewhere. Rendering abstraction layers are something I've spent quite a bit of time on lately, and for better or worse I've become more opinionated about what I consider to be components of strong abstraction layers versus weaker ones. I share those thoughts below. I'll throw in a massive disclaimer here: a "good" abstraction layer depends heavily on the context of the work and the goals of its usage. No abstraction design is good for everyone. However, I think there are some components that are definitely good for most.


Identify High Level Goals of the Abstraction Layer

Before starting your rendering abstraction, it's important to have some goals about what it needs to be able to accomplish in order to be successful. This way, when you find yourself asking "should I do it this way or that way?", you can look to these goals and pick whichever option aligns with them best, and you will build a more cohesive and consistent abstraction. They can also be a good lens through which to re-evaluate poorly conceived abstraction components. Here are some of my goals, from my own context, so you can see where I'm coming from:

-The abstraction layer should be as lightweight as possible. As many API-layer-specific concepts as possible should be hidden from the user, especially if the API-specific implementations of those concepts diverge heavily in usage or behaviour. This frees you to make each underlying API implementation tailored to best make use of the API, rather than force the API to best work with your abstraction layer. This ties into the next point.

-The abstraction layer should be trivial to debug and should load into Brain RAM(TM) quickly. Complex components are sometimes unavoidable, but they should be rare. If the graphics user needs to spend a lot of time to get to the meat of their calls to the device, this is almost always a red flag that you've been too "clever" with your abstractions.

-The abstraction layer should be easy to maintain. The above points lend themselves towards this goal of making an abstraction that doesn't get in your way when you need to modify, fix, or extend it. The more underlying APIs the abstraction covers, the more important this becomes.

That's really it. You'll see later on how much this ends up driving my opinions.


Identify Exactly The Needs of the Rendering Abstraction

An easy way to get too deep into building out an abstraction is to forget what it's intended for. So a good first step is to identify what it actually needs to do, and it's really not all that much.

-Create a device that can talk to a display surface.

-Create, upload, and destroy device resources (buffers, textures, shaders, pipelines, etc).

-Gather, submit, and wait on command work from various passes, in a multicore-compatible way.

Anything else is probably something that should sit on top of the abstraction at the system-level, not as part of the abstraction itself. It needs only enough to be able to do what you need collectively from your abstracted graphics APIs. Given the above, this leaves us with very few actual classes we need to define to fulfil this. We need a device, some kind of representation of a command context, and a representation for each type of resource.


The Resources

Contrary to one of my D3D11 to D3D12 learning posts (where I was just trying to translate concepts in isolation), do yourself a favor and only define a single class for all buffers (vertex, index, constant, storage), and a single class for all textures (2D, 3D, render targets, depth stencil, storage). It's unnecessary to define a class for each individual type and it will only complicate your design. The only meaningful differences between different buffers, for example, are the views into the buffer for binding purposes (SRV, UAV, VB, etc.) and whether or not the buffer is host-visible. To define a buffer for creation, you can just use a common description struct with the size, stride, etc., and some kind of usage flags or enum to define which views the underlying APIs should create for the buffer, and whether or not it can be mapped.
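As a rough illustration of the description struct idea, here is a minimal sketch. The flag names, the `HasUsage` helper, and `ValidateVertexBinding` are all hypothetical names invented for this example, not part of any real API; the exact set of usages would depend on what your renderer needs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical usage flags: each bit asks the API layer to create one kind of
// view, or (for HostVisible) to allocate memory the CPU can map.
enum class BufferUsage : uint32_t
{
    None        = 0,
    Vertex      = 1u << 0,
    Index       = 1u << 1,
    Constant    = 1u << 2,
    ShaderRead  = 1u << 3, // SRV-style read-only view
    ShaderWrite = 1u << 4, // UAV-style read/write view
    HostVisible = 1u << 5, // CPU-mappable
};

constexpr BufferUsage operator|(BufferUsage a, BufferUsage b)
{
    return static_cast<BufferUsage>(static_cast<uint32_t>(a) | static_cast<uint32_t>(b));
}

constexpr bool HasUsage(BufferUsage value, BufferUsage flag)
{
    return (static_cast<uint32_t>(value) & static_cast<uint32_t>(flag)) != 0;
}

// One description struct covers every kind of buffer.
struct BufferDescription
{
    uint32_t    mSize   = 0; // total size in bytes
    uint32_t    mStride = 0; // bytes per element (0 for raw buffers)
    BufferUsage mUsage  = BufferUsage::None;

    uint32_t GetNumElements() const
    {
        return mStride != 0 ? mSize / mStride : 0;
    }
};

// Example of a check the API layer could run when a buffer is bound as a
// vertex buffer: the usage flags tell it whether the needed view exists.
inline void ValidateVertexBinding(const BufferDescription& desc)
{
    assert(HasUsage(desc.mUsage, BufferUsage::Vertex) && "buffer was not created with Vertex usage");
}
```

A vertex buffer that is also mappable would then be described with `BufferUsage::Vertex | BufferUsage::HostVisible`, and each API implementation decides on its own which views and memory type that implies.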

Speaking of views, there's really no need to define a common "view" interface for these resource classes. Views work quite differently in different APIs (D3D12 vs Vulkan for example), and the only parts of your renderer that need to know about the specifics of the views should be in the API-specific portions anyway, so there's no reason to expose that as part of the abstraction. When you go to bind resources to your contexts, just pass the buffer or texture for X shader slot and let the underlying API worry about how to grab the views and bind them. If you're worried about validation you can just throw assertions up based on your usage enums and available views and whether or not that lines up with how it's being used by the API.

Edit: To clear up what I mean by this, my intention is to say that you don't need to try to create some common high-level representation of an individual "view" of your resources. What would be better is to expose a structure that effectively represents a binding table of buffers and textures, but under the hood (at the API level) compiles that table into a table of views that fits the API. This way, the API-specific layer can make the most of the binding style that best fits the API, rather than trying to fit itself to a common binding style.
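To make the binding table idea a bit more concrete, here is a minimal sketch with mock types. `BindingTable`, `CompiledBindingTable`, `PlaceView`, and `Compile` are hypothetical names, and the "compiled" form here is just a slot-indexed array of handles; a real backend would compile into whatever its API prefers (descriptor tables, descriptor sets, and so on).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins for the API-specific resource classes. A real one would own
// whatever view objects the API needs.
struct DescriptorHandle { uint64_t mValue = 0; };
struct Buffer  { DescriptorHandle mReadView; };
struct Texture { DescriptorHandle mReadView; };

// Common-layer structure: the user names resources and slots, never views.
struct BindingTable
{
    struct BufferBinding  { uint32_t mSlot; const Buffer*  mBuffer;  };
    struct TextureBinding { uint32_t mSlot; const Texture* mTexture; };

    void SetBuffer(uint32_t slot, const Buffer& buffer)    { mBuffers.push_back({slot, &buffer}); }
    void SetTexture(uint32_t slot, const Texture& texture) { mTextures.push_back({slot, &texture}); }

    std::vector<BufferBinding>  mBuffers;
    std::vector<TextureBinding> mTextures;
};

// API-layer structure: what this (hypothetical) backend actually binds.
struct CompiledBindingTable
{
    std::vector<DescriptorHandle> mBufferViews;
    std::vector<DescriptorHandle> mTextureViews;
};

// Places a view at its slot index, growing the array as needed.
// Unused slots keep a default (zero) handle.
static void PlaceView(std::vector<DescriptorHandle>& views, uint32_t slot, DescriptorHandle view)
{
    if (views.size() <= slot)
        views.resize(slot + 1);
    views[slot] = view;
}

// Lives in the API-specific layer: only here do views get looked up.
CompiledBindingTable Compile(const BindingTable& table)
{
    CompiledBindingTable compiled;
    for (const auto& binding : table.mBuffers)
        PlaceView(compiled.mBufferViews, binding.mSlot, binding.mBuffer->mReadView);
    for (const auto& binding : table.mTextures)
        PlaceView(compiled.mTextureViews, binding.mSlot, binding.mTexture->mReadView);
    return compiled;
}
```

The common layer only ever sees `BindingTable`; how and when `Compile` runs, and what it produces, stays a private decision of each API implementation.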

Now let's take it one step further: you don't even need to virtualize these classes for use by different APIs, or go all pimpl on it either. A simple approach is to just create a Buffer and Texture class in each API with the same name, and use some common Resource.h header that includes the correct implementation based on the defined platform macro. No virtualization, no pimpl indirection needed. This requires only a small amount of code duplication for passing back common data like the resource's description struct, but then leaves these classes open-ended to slap in your ID3D12Resource, or your VkBuffer, and your API-specific view setup (e.g. VkBufferView/VkImageView for Vulkan, and a flat map of view types to SRV/UAV/CBVs in D3D12). Really simple pseudo-code example:

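//D3D12
class Buffer
{
public:
    explicit Buffer(const BufferDescription& desc);

    // Common interface: every API implementation provides these.
    uint32_t GetStride() const;
    uint32_t GetSize() const;
    uint32_t GetNumElements() const;

    // API-specific: only called from D3D12 files.
    void SetAPIResource(ID3D12Resource* resource);
    ID3D12Resource* GetAPIResource() const;
    DescriptorHandle GetView(ViewEnum view) const;

private:
    BufferDescription mDesc;
    ID3D12Resource* mAPIResource = nullptr;
    FlatMap<ViewEnum, DescriptorHandle> mViews;
};

//Vulkan
class Buffer
{
public:
    explicit Buffer(const BufferDescription& desc);

    // Common interface: every API implementation provides these.
    uint32_t GetStride() const;
    uint32_t GetSize() const;
    uint32_t GetNumElements() const;

    // API-specific: only called from Vulkan files.
    void SetAPIResource(VkBuffer resource);
    VkBuffer GetAPIResource() const;

private:
    BufferDescription mDesc;
    VkBuffer mAPIResource = VK_NULL_HANDLE;
};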

Sure, it's a little extra code to write GetStride() per implementation, but once you've paid that up-front cost, the advantage of this approach is that your common interface needs are fleshed out naturally as needed by the abstraction level. If you miss implementing a common function in one API, the compiler will complain about it just the same as it would for a virtual. API-specific functions (like GetAPIResource) are only used in API-specific files, so you don't have to worry about them differing completely from one API to the next. You're free to set them up however you like, because you know the common abstraction layer doesn't need to know about them. At the API-specific level, no casting to the API-derived type is needed either, because we didn't need to virtualize/derive in the first place. You might balk at the duplication, but this kind of methodology keeps everything simple and lets you make full use of each underlying API.
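For the header selection itself, here is a minimal single-file sketch. The macro names `RENDER_API_D3D12` and `RENDER_API_VULKAN` are hypothetical, and the two classes are inlined (with plain stand-in handle types instead of `ID3D12Resource*` and `VkBuffer`) so the example compiles without any graphics SDK. In a real project each class would live in its own file and Resource.h would contain only the `#if` and `#include` lines.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical platform macros; a build system would normally define exactly one.
#if !defined(RENDER_API_D3D12) && !defined(RENDER_API_VULKAN)
#define RENDER_API_VULKAN 1 // default so this sketch compiles without extra flags
#endif

// Shared by every implementation (would live in a common header).
struct BufferDescription
{
    uint32_t mSize   = 0; // bytes
    uint32_t mStride = 0; // bytes per element
};

// ---- What Resource.h would do: pick one implementation by macro ----
#if defined(RENDER_API_D3D12)

// Would be: #include "D3D12/Buffer.h"
class Buffer
{
public:
    explicit Buffer(const BufferDescription& desc) : mDesc(desc) {}
    uint32_t GetStride() const { return mDesc.mStride; }
    uint32_t GetSize() const { return mDesc.mSize; }
    void* GetAPIResource() const { return mAPIResource; } // stand-in for ID3D12Resource*
private:
    BufferDescription mDesc;
    void* mAPIResource = nullptr;
};

#elif defined(RENDER_API_VULKAN)

// Would be: #include "Vulkan/Buffer.h"
class Buffer
{
public:
    explicit Buffer(const BufferDescription& desc) : mDesc(desc) {}
    uint32_t GetStride() const { return mDesc.mStride; }
    uint32_t GetSize() const { return mDesc.mSize; }
    uint64_t GetAPIResource() const { return mAPIResource; } // stand-in for VkBuffer
private:
    BufferDescription mDesc;
    uint64_t mAPIResource = 0;
};

#endif

// Common renderer code sees one non-virtual Buffer type and uses only the
// functions both implementations share. If one implementation forgot
// GetStride(), this would fail to compile on that platform.
inline uint32_t GetNumElements(const Buffer& buffer)
{
    return buffer.GetStride() != 0 ? buffer.GetSize() / buffer.GetStride() : 0;
}
```

Note that `GetAPIResource()` returns a different type in each branch, which is fine precisely because only API-specific files ever call it.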


The Context

This is basically your wrapper for command buffers/lists and whatever state you want to manage with them. You can define your command buffer similarly to how I defined buffers above - an implementation for each API with common and diverging functions in each, and store it as a member of the context. The context interface should serve as a divider between functionality specific to certain queues or workloads. As such, I think it only makes sense to create GraphicsContext (capable of graphics and compute work), ComputeContext (async compute work), UploadContext (copy work), and if available a RayTracingContext for raytracing.

class Context
{
public:
    void Begin();
    void End();
    void ResourceBarrier(const BarrierDescription& barrier);
    // ...

protected:
    CommandBuffer mCommandBuffer; // API-specific command list/buffer wrapper
};

class GraphicsContext : public Context
{
public:
    void SetPipeline(const Pipeline& pipeline);
    void SetVertexBuffer(const Buffer& buffer);
    void SetIndexBuffer(const Buffer& buffer);
    void Draw(/* ... */);
    // ...
};

class ComputeContext : public Context
{
public:
    void SetPipeline(const Pipeline& pipeline);
    void Dispatch(/* ... */);
    // ...
};

class UploadContext : public Context
{
public:
    void UploadBuffer(Buffer& destination, const Data& data);
    void UploadTexture(Texture& destination, const Data& data);
    // ...
};


These provide good separators for threading command work as well. A dedicated context each for uploading and async compute work means we can gather the work for those contexts asynchronously to other work, and can split up the graphics work across multiple GraphicsContexts, as many as appropriate for the threading approach. Submission itself can be done in a number of ways. I am partial to submitting contexts themselves to the device interface once recording has completed, to handle the work submission behind the scenes at the API level.
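Here is a small sketch of that threading pattern using mock types: each pass records into its own context on its own thread, and the contexts are then handed to the device in a fixed order. The mock `GraphicsContext` just stores strings instead of issuing API calls, and `RecordAndSubmit` is a hypothetical helper; a real engine would more likely use a job system than raw `std::thread`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Mock context: records command names instead of real API calls.
class GraphicsContext
{
public:
    void Begin() { mCommands.clear(); mRecording = true; }
    void Draw(const std::string& passName)
    {
        assert(mRecording && "Draw called outside Begin/End");
        mCommands.push_back("Draw:" + passName);
    }
    void End() { mRecording = false; }
    const std::vector<std::string>& GetCommands() const { return mCommands; }

private:
    std::vector<std::string> mCommands;
    bool mRecording = false;
};

// Mock device: "submits" by appending commands to one ordered timeline.
class Device
{
public:
    void SubmitWork(const GraphicsContext& context)
    {
        for (const auto& command : context.GetCommands())
            mSubmitted.push_back(command);
    }
    std::vector<std::string> mSubmitted;
};

// Records every pass in parallel, then submits in pass order.
std::vector<std::string> RecordAndSubmit(const std::vector<std::string>& passNames)
{
    // One context per pass. The vector is sized up front and never resized,
    // and each thread touches only its own element, so no locking is needed.
    std::vector<GraphicsContext> contexts(passNames.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < passNames.size(); ++i)
    {
        workers.emplace_back([&contexts, &passNames, i]
        {
            contexts[i].Begin();
            contexts[i].Draw(passNames[i]);
            contexts[i].End();
        });
    }
    for (auto& worker : workers)
        worker.join();

    // Submission order is decided here, independent of recording order.
    Device device;
    for (const auto& context : contexts)
        device.SubmitWork(context);
    return device.mSubmitted;
}
```

Recording happens in whatever order the threads run, but because submission is a separate step the final GPU order stays deterministic.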


The Device

The device abstraction class should handle communicating with the display and managing resources: creation, upload, and destruction. This is an example of what the interface for each API might look like:

class Device
{
public:
    void ProcessWindowChanges(const Window& window);
    Buffer* CreateBuffer(const BufferDescription& desc);
    Texture* CreateTexture(const TextureDescription& desc);
    Shader* CreateShader(const ShaderDescription& desc);
    Pipeline* CreatePipeline(const PipelineDescription& desc);
    // ...

    Receipt SubmitWork(Context& context); // returns a handle to wait on later
    void WaitOnWork(Receipt receipt);
    void Present();
};


I've also found this interface to be a good place to submit context workloads. I don't find it particularly useful to expose the queues themselves; I think it makes more sense to hide them behind the device interface. I also do not find it useful to expose GPU/GPU and GPU/CPU synchronization primitives. At the end of the day you're either syncing frames with the CPU or syncing GPU work with other GPU work at a coarse level. Rather than try to create abstract wrapper classes for concepts that diverge rather significantly (ID3D12Fence, VkSemaphore, VkFence), it makes more sense to wrap that in a simpler way. The "Receipt" concept I introduce here is one such way that can be done. The idea is to return some simple struct that can be used as a lookup for the actual API synchronization at the API-device level, where it is free to handle waiting however best makes sense for each API. I never find it helpful to try to mash together disparate concepts of different graphics APIs into a singular, confused virtualized interface. More often than not, it can be obscured differently to avoid that problem altogether.
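One possible shape for the Receipt idea, sketched with a mock device. Everything here is an assumption made for illustration: the `QueueType` enum, the per-queue counter, the `IsComplete` query, and the fact that `SubmitWork` takes a queue type rather than a context. A real D3D12 backend might map `mValue` to an ID3D12Fence value, and a Vulkan backend to a timeline semaphore value or a pooled VkFence, but that mapping is left to each API layer.

```cpp
#include <cassert>
#include <cstdint>

enum class QueueType : uint32_t { Graphics = 0, Compute, Copy, Count };

// Common-layer handle: just enough to look up the real sync object later.
struct Receipt
{
    QueueType mQueue = QueueType::Graphics;
    uint64_t  mValue = 0; // monotonically increasing per queue
};

// Mock API-level device. The "GPU" here completes work only when waited on,
// which is enough to show the bookkeeping.
class Device
{
public:
    Receipt SubmitWork(QueueType queue)
    {
        QueueTimeline& timeline = mTimelines[static_cast<uint32_t>(queue)];
        Receipt receipt;
        receipt.mQueue = queue;
        receipt.mValue = ++timeline.mLastSubmitted;
        return receipt;
    }

    bool IsComplete(Receipt receipt) const
    {
        return mTimelines[static_cast<uint32_t>(receipt.mQueue)].mLastCompleted >= receipt.mValue;
    }

    void WaitOnWork(Receipt receipt)
    {
        // A real backend would block on its fence/semaphore here. Waiting on a
        // later receipt implies all earlier work on the same queue is done.
        QueueTimeline& timeline = mTimelines[static_cast<uint32_t>(receipt.mQueue)];
        if (timeline.mLastCompleted < receipt.mValue)
            timeline.mLastCompleted = receipt.mValue;
    }

private:
    struct QueueTimeline
    {
        uint64_t mLastSubmitted = 0;
        uint64_t mLastCompleted = 0;
    };
    QueueTimeline mTimelines[static_cast<uint32_t>(QueueType::Count)];
};
```

The user-facing side never sees a fence or semaphore type, only a small copyable struct, so each API is free to implement waiting in whatever way suits it.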


Wrapping Up (Hah) And Advice

The above is more or less my ideal kind of abstraction to interface with. Simple, small, yet providing plenty of room to take full advantage of API specifics underneath it. You can lay other systems on top of it, of course, but the fewer layers you put in between your abstraction and the API specifics, the better. And if you can do all that while limiting the number of concepts you need to abstract, you end up with something that's easy to work with, modify, and debug. I place a very high value on that.

Your values may differ, which will lead to different decisions. Lay out your own goals early on, and stick with them (but of course, rework it all if your foundations change). Everyone has their preferences, but what can ruin a rendering abstraction is a wishy-washy architecture with no thoughtful foundation. Your architecture must be opinionated to be successful. Otherwise, you end up with a mix of different ideals driving what should be a unified structure. As a developer using another person's tech, I would choose to follow a consistent abstraction that I completely disagree with rather than an inconsistent one that I partially agree with.


Other Render Architecture Papers/Posts/Code

Graham Wihlidal's "Halcyon Architecture" is one of my favorite presentations on rendering abstractions and architecture. Some of what I say in this post directly conflicts with topics put forward in this presentation (which is totally fine!), but it wins big-time by being consistent and clear in the implementation of its goals. I take a lot of inspiration from this.

Egor Yusov's "Designing a Modern Cross-Platform Low-Level Graphics Library" is another good post on abstractions, and covers older API support as well. The Diligent Engine is well proven out and well documented if you follow the links.

If you want to take a look at some other strong publicly available and proven out abstractions, I'd recommend looking at bgfx and The Forge.

Gijs Kaerts wrote a couple of thought-provoking pieces on cross-platform graphics engine architectures.


Super Spicy Hot Takes


Render Graphs Mostly Suck

Render Graphs are mostly overly complicated technical debt magnets. Managing/aliasing shared resources like render targets? Sure. Intelligent automatic resource barriers? Sounds good. Pass compilation, reordering, queue submission decision making? Hard pass, and an easy road to bugs and debugging pain. There's maybe a small handful of functionality that I want render graphs to automatically manage for me; the rest just makes an awful, undebuggable mess.


Meta Command Buffers Are Terrible To Debug

Why, in the age of modern API deferred command lists/buffers, do we still write command buffer abstractions on top of them? If you're storing off command packets into your own command lists and then later executing those commands to actual API command lists, I'd reconsider if you actually need that. If you're doing this to support an older non-deferred command API like OpenGL, then just use that as a sub for that specific API where it's needed and not for all of them. Why? Because when you need to debug your calls to the device, having a stack that isn't MyCommandListToAPICommandListJob::Execute() is incredibly helpful. It's the difference between losing your entire draw call context, data, etc. and having to painfully retrace to find out what was going on (if you can even do that at all), and having a full render pass stack with locals. Not to mention that your meta command buffers add memory and perf overhead just by existing and needing to be executed on actual API command lists, which already have perf and memory implications of their own. It's also just another magnet for bugs when you go to translate to API calls.


Leave HLSL Alone

Hey, that meta-shader language you wrote over HLSL is pretty cool, but probably just don't. HLSL is objectively the perfect programming language and is completely beyond reproach, so if you're adding meta languages on top of it, you are by definition making it worse. But really, stick to native HLSL as much as you possibly can, because the more you deviate from it, the more inaccessible you make it to developers who just want to write shaders and not have to trudge through "how do I do this HLSL stuff with this meta language?" and all the inevitable bugs and debt that come with that. Bonus points if you can make use of the great features DXC has, like the -fvk-*-shift options (-fvk-b-shift, -fvk-t-shift, -fvk-s-shift, -fvk-u-shift), to write HLSL that can generate SPIR-V without needing to decorate any of it with Vulkan-specific binding code.

