Channel: Timothy Lottes

Re: Things that drive me nuts about OpenGL

This post is a reply to Rich Geldreich's Blog : Things that drive me nuts about OpenGL.

"Mantle and D3D12 are going to thoroughly leave GL behind (again!) on the performance and ..." --- Are they? I'm not sure. There is now a NVIDIA beta driver for DX11 that has a DX11 780ti going faster than a Mantle 290x in BF4, StarSwarm, and Theif. Is the DX12 "single view type per table" design really the best mapping to modern bindless hardware (see slide 22)? Multiple view types times multiple descriptor tables (for engines which update descriptor tables with different per-draw, per-material, per-mesh frequency): sounds like lots of extra loads for different table base addresses. OpenGL's bindless might actually be a better design. Even on GCN, OpenGL's joint {texture, sampler} might provide better performance than separate textures and samplers (it saves a scalar load, important for low wave occupancy cases). Of the exception of parallel command buffer generation, GL currently has bindless and persistent mapped buffers (arguably the two most important features for performance). Of the rest of the non-bindless cases {for example constant buffers and shaders}, can a single thread with an optimized driver easily reach the point at which the GPU starts to loose performance do to fixed function state changes introducing GPU internal pipeline bubbles (answer here I believe is easily yes)? Will all engines go wide across many threads to leverage parallel command buffer generation (likely yes), but will random OS thread preemption cause random latency problems? Will DX12's HLSL actually expose the important ISA features of the vendors (features that are currently exposed in GL via extensions)? Etc..

"The DSA (direct state access)-style API should be standard/required" --- Agreed. Can be used on AMD and NVIDIA now.

"The thread's current context may be an implied "this" pointer" --- Yeah using TLS for an implied context was a serious fail. Would like context to be passed into each function (C API), would like MakeCurrent() to go away.

"glGet() API deficiencies ... glGetError() ... Can't query key things such as texture targets" --- IMO GL should be split into two APIs, the low-level driver API (simple to the metal) and an open-source high-level API which provides optional features for debug, safety and state shadowing, etc. This reduces the amount of work all vendors need to do and amortizes cost across anyone who is able to contribute a patch to the high-level API source.

"Documentation hell" --- Many of the docs are actually quite good. For example the OpenGL Quick Reference Card for 4.4 (everything in one simple PDF). What probably is not as well documented are all the fast paths, but various people are attempting to do this in side channels.

""Forward compatible", "compatibility" vs. "core" profiles etc. etc. etc." --- Just use core profiles. AMD's drivers are faster with core only. Games on DX require driver updates to for bugs and performance issues, time for GL games to do the same and stick with the up-to-date API.

"Reliably locking a buffer with DISCARD-semantics on all drivers without stalling the pipeline: ... BufferSubData() stalls when called with "too much" data on threaded drivers" --- Just use ARB_buffer_storage instead. There is no need for discard or glBufferSubData() any more.

"Difficult API to trace, replay, and snapshot/restore" --- Looking forward this will only get worse as developers begin to leverage more CPU and GPU task parallelism with persistent mapped buffers. I know personally no debug tool could ever get a useful trace of the style of workloads I do outside of work regardless of the API design (one would need to replay CPU work during the frame too).

"Driver Quality Issues" --- The route to solving this problem is by building more GL applications, and doing this in a smart way. Best if developers require up-to-date drivers on GL desktop and use the modern GL fast paths (discard legacy paths). This reduces the amount of work IHVs need to do. Test on both AMD and NVIDIA, report bugs as early as possible with repro cases. Even for DX, applications are a huge amount of the coverage testing.
