This is what I personally like to do for modern native GL engines ($ stands for a function-name or typecast variant, such as 2D),
(1.) Use glGenTextures() to allocate large blocks of texture handles.
(2.) Use glTextureStorage$EXT() to allocate storage for textures.
(3.) Use glGenBuffers() and glNamedBufferStorageEXT() to allocate buffers used for TBOs.
(4.) Use glTextureView() with glTextureParameteriEXT() to build a texture sampler pair.
(5.) Use glTextureBufferEXT() to set up views of buffers (TBOs).
(6.) Remember to call glBindMultiTextureEXT() once after glTextureView() or glTextureBufferEXT().
(7.) After that, only glBindTextures() needs to be called to bind texture or buffer state at runtime.
Step (4.) can be used to create different texture handles which alias the same texture or parts of the same texture, with different compatible formats, and also have different sampler state. Likewise step (5.) can be used to create different texture handles which alias the same buffer with different formats.
Step (6.) sets up the "target" per texture (like GL_TEXTURE_2D) so that during regular rendering, no target information is passed to GL. The glBindTextures() call does not use any "target" parameters.
This is the base path for all DX11 level hardware.
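As a rough sketch of what step (7.) can look like on the CPU side, here is a small table that tracks one texture name per unit and flushes the changed range with a single call shaped like glBindTextures(first, count, textures). Everything named here (SlotTable, slot_table_*, BindTexturesFn, SLOT_MAX, record_bind) is hypothetical and not from any real engine. record_bind is only a stand-in that records its arguments so the sketch runs without a GL context.

```c
#include <assert.h>

/* Same parameter shape as glBindTextures(GLuint first, GLsizei count,
   const GLuint *textures). Assumption: GLuint is unsigned int and GLsizei
   is int; some platforms also need a calling-convention macro here. */
typedef void (*BindTexturesFn)(unsigned int first, int count,
                               const unsigned int *textures);

enum { SLOT_MAX = 32 }; /* hypothetical cap on texture units in use */

typedef struct {
    unsigned int names[SLOT_MAX]; /* texture name per unit, 0 = nothing bound */
    int lo, hi;                   /* dirty range [lo, hi), empty when lo >= hi */
} SlotTable;

static void slot_table_init(SlotTable *t) {
    for (int i = 0; i < SLOT_MAX; ++i) t->names[i] = 0;
    t->lo = SLOT_MAX;
    t->hi = 0;
}

/* Record a texture name for a unit and widen the dirty range if it changed. */
static void slot_table_set(SlotTable *t, int unit, unsigned int name) {
    assert(unit >= 0 && unit < SLOT_MAX);
    if (t->names[unit] == name) return;
    t->names[unit] = name;
    if (unit < t->lo) t->lo = unit;
    if (unit + 1 > t->hi) t->hi = unit + 1;
}

/* Issue one bind call covering the dirty range. Returns units bound. */
static int slot_table_flush(SlotTable *t, BindTexturesFn bind) {
    int count = t->hi - t->lo;
    if (count <= 0) return 0;
    bind((unsigned int)t->lo, count, &t->names[t->lo]);
    t->lo = SLOT_MAX;
    t->hi = 0;
    return count;
}

/* Demo stand-in for the real GL entry point: it only records the call. */
static unsigned int g_first, g_names[SLOT_MAX];
static int g_count, g_calls;

static void record_bind(unsigned int first, int count,
                        const unsigned int *textures) {
    g_first = first;
    g_count = count;
    ++g_calls;
    for (int i = 0; i < count; ++i) g_names[i] = textures[i];
}
```

In a real engine the loaded glBindTextures pointer would be passed to slot_table_flush() instead of record_bind. Note that no target is passed anywhere, which only works because each texture name already had its target established during setup.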
Moving to the Future: Bindless
Starting with NVIDIA's Kepler and AMD's GCN there is an even easier way to manage textures in GL: bindless. This is something I'm looking forward to leveraging when I eventually upgrade to the Kepler/GCN feature level, now that I understand the fast path. Bindless requires an extra indirection, but the cost of that indirection can be minimal when the bindless sampler handles are read from immediate-indexed uniform buffers.
In theory, on GCN the cost can be bounded to a total of just 2 extra scalar registers per shader, plus one extra scalar 64-bit load per sequence of fetches from a resource. This 64-bit load comes from the scalar data cache and is co-issued with regular vector ALU instructions and vector buffer or image accesses. So as long as the shader is not scalar instruction bound, in theory bindless has little overhead.
In theory, on Kepler the cost can be bounded to a total of just 2 extra registers per shader, plus one extra 64-bit uniform load per sequence of fetches from a resource. Immediate-indexed uniform buffer reads are a fast path, so again there is little overhead.
Leveraging bindless removes step (7.) from the list above. Instead,
(7.) Use glGetTextureHandleARB() to get a 64-bit handle for textures or TBOs.
(8.) Use glMakeTextureHandleResidentARB() to make the texture resident on the GPU.
(9.) Then add these 64-bit handles to a uniform buffer used in the shader.
(10.) In GLSL, use the sampler$(uvec2 handle) typecast to turn the handle into a sampler.
No more binding textures or sampler state.
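For step (9.), here is a minimal sketch of packing the 64-bit handles into the CPU-side copy of the uniform buffer. It assumes a std140 block declared with a uvec4 array, with two handles per uvec4. A std140 array of uvec2 would pad every element out to 16 bytes, so pairing them avoids wasting half the buffer. The low 32 bits go in the first component of each pair and the high 32 bits in the second, which is the order the uvec2 sampler constructor expects. pack_handles_std140 is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack `count` 64-bit texture handles into `dst` as (low, high) 32-bit
   pairs, two handles per uvec4. An odd count is padded with one zero pair
   so the result is a whole number of uvec4 elements.
   `dst` must have room for ((count + 1) / 2) * 4 words.
   Returns the number of 32-bit words written. */
static size_t pack_handles_std140(const uint64_t *handles, size_t count,
                                  uint32_t *dst) {
    size_t words = 0;
    for (size_t i = 0; i < count; ++i) {
        dst[words++] = (uint32_t)(handles[i] & 0xFFFFFFFFu); /* .x or .z */
        dst[words++] = (uint32_t)(handles[i] >> 32);         /* .y or .w */
    }
    if (words & 3u) { /* odd handle count: fill the unused .zw pair */
        dst[words++] = 0;
        dst[words++] = 0;
    }
    return words;
}
```

On the shader side, handle i would then be read as the .xy pair (even i) or the .zw pair (odd i) of element i/2 before being passed to the sampler typecast. The handles themselves would come from glGetTextureHandleARB(); here they are just opaque 64-bit values.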