Improve material system data packing #1473
Is there any way to split this up into multiple PRs?
I believe so, yes.
Thought I fixed those, gonna need to recheck it then.
Never mind, that does not fix it. It's an issue with dynamic lights though.
Both issues fixed now.
That is very good, thanks.
for ( shaderStage_t* pStage : stages ) {
	Material& material = materialPacks[pStage->materialPackID].materials[pStage->materialID];

	/* Stage variants are essentially copies of the same stage with slightly different values that
So what's the ultimate purpose of variants? If I trace e.g. mayUseVertexOverbright/VERTEX_OVERBRIGHT, I can't find any decisions ever being made based on the variant, other than segregating it from other variants.
Once some of the pStage members are set in MaterialSystem::GenerateMaterialsBuffer(), they are then used in pStage->surfaceDataUpdater() to set the correct surface data. Essentially, for each unique surface data all of the active variants are stored consecutively, with the differences dictated by variant.
Seems wrong to store them on pStage since if there is more than one draw surf using the same shader (which is the whole point of having variant bit masks, right?) the different values would clobber each other.
The offset stored in drawSurf_t includes the offset to the specific variant used for it (
Daemon/src/engine/renderer/Material.cpp
Line 452 in 685684f
ProcessStage() and AddStage().
The mapping between drawSurfs, shader stages, and material buffer can be thought of like this:
shaderStage_t* pStage = drawSurf->shader->stages[stage];
uint32_t baseSurfaceDataIndex = pStage->bufferOffset;
uint32_t variant = drawSurf->shaderVariant[stage]; // Always in the range of [0; active stage variants]
uint32_t variantOffset = pStage->variantOffsets[variant];
void* material = materials + baseSurfaceDataIndex + variantOffset; // materials is mapped to the start of the range for either static or dynamic stages
uint32_t materialOffset = pStage->materialOffset // Different from bufferOffset for dynamic stages
+ variantOffset; // materialOffset is what's then used in the GLSL shaders to index the material array
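The "active variants are stored consecutively" layout discussed in this thread could be sketched roughly as below. This is only an illustration, not the code from the PR: `StageVariants`, `AssignVariantOffsets`, `MaterialIndex`, and the `MAX_VARIANTS` cap are hypothetical names introduced here, and the real engine may track used variants differently.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical cap on the number of variants a stage can have (assumption, not from the PR).
constexpr uint32_t MAX_VARIANTS = 8;

// Hypothetical per-stage bookkeeping: which variants are used by at least one
// surface, and the slot assigned to each used variant.
struct StageVariants {
	std::array<bool, MAX_VARIANTS> used{};
	std::array<uint32_t, MAX_VARIANTS> offsets{};
	uint32_t activeCount = 0;
};

// Give every used variant the next consecutive slot, so unused variants
// take up no space in the buffer.
inline void AssignVariantOffsets( StageVariants& v ) {
	v.activeCount = 0;
	for ( uint32_t i = 0; i < MAX_VARIANTS; i++ ) {
		if ( v.used[i] ) {
			v.offsets[i] = v.activeCount++;
		}
	}
}

// Index of a surface's material entry: the stage's base index plus the slot
// of the variant that this surface uses.
inline uint32_t MaterialIndex( uint32_t stageBase, const StageVariants& v, uint32_t variant ) {
	assert( variant < MAX_VARIANTS && v.used[variant] );
	return stageBase + v.offsets[variant];
}
```

With variants 0 and 5 marked as used, a stage based at index 10 would place variant 0 at index 10 and variant 5 at index 11, so two draw surfaces sharing the stage read different entries without overwriting each other.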
src/engine/renderer/Material.cpp
Outdated
lightMode_t lightMode;
deluxeMode_t deluxeMode;
SetLightDeluxeMode( drawSurf, pStage->type, lightMode, deluxeMode );
pStage->mayUseVertexOverbright = pStage->type == stageType_t::ST_COLORMAP && drawSurf->bspSurface && pStage->shaderBinder == BindShaderGeneric3D;
This data should be stored somewhere else. It's hacky to store drawsurf-specific data on the shader.
Perhaps, though I don't really see where else it could be stored. There could be a wrapper around shaderStage_t passed to GenerateMaterialsBuffer instead, and it could store all the variants, I suppose.
Well, it's not truly stored in the shader, the shaderStage_t is just used as a proxy, so that surfaceDataUpdater() no longer needs a drawSurf_t as input. When the materials are initially generated, these use drawSurf to set the value to avoid using up memory and doing processing for stage variants that are never used.
OK I see that they are just temporary; used during one iteration of GenerateMaterialsBuffer.
If it's too burdensome to add an argument to those "updater" functions, it would still be better to store them in a global variable, instead of misleadingly associating them with shader stages.
Made it work with propagating it through UpdateSurfaceData*().
Allocate memory for each stage of used shaders rather than for each stage in all used shaders for each surface that uses them. Change some of the material system functions to use `shaderStage_t*` instead of `drawSurf_t*` etc as input. Textures are put into a different storage buffer, following the structure of `textureBundle_t`. This allows decreasing the total amount of memory used, and changing the material buffer to be a UBO instead. The lightmap and deluxemap handles are put into another UBO, because they're per-surface, rather than per-shader-stage. Also moved some duplicate code into a function.
Only these 6 components (0, 1, 4, 5, 12, 13) actually have an effect. This greatly reduces the amount of data being transferred for texture matrices, and removes some useless computations.
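To illustrate why only components 0, 1, 4, 5, 12, and 13 matter, here is a minimal sketch assuming a column-major 4x4 matrix (the OpenGL convention) applied to a 2D texture coordinate extended to (s, t, 0, 1). The type and function names (`Mat4`, `Mat3x2`, `PackTextureMatrix`, `TransformFull`, `TransformPacked`) are hypothetical and not taken from the engine.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;  // column-major 4x4 (assumed layout)
using Mat3x2 = std::array<float, 6>; // 3 columns of 2 rows each

struct Vec2 { float s, t; };

// Keep only the components that can influence a (s, t, 0, 1) texture coordinate.
inline Mat3x2 PackTextureMatrix( const Mat4& m ) {
	return { m[0], m[1], m[4], m[5], m[12], m[13] };
}

// Full 4x4 transform of (s, t, 0, 1), keeping the first two output rows.
// The z column (m[8], m[9]) is multiplied by 0, so it never contributes.
inline Vec2 TransformFull( const Mat4& m, Vec2 tc ) {
	return {
		m[0] * tc.s + m[4] * tc.t + m[8] * 0.0f + m[12] * 1.0f,
		m[1] * tc.s + m[5] * tc.t + m[9] * 0.0f + m[13] * 1.0f
	};
}

// The same transform using only the 6 packed components.
inline Vec2 TransformPacked( const Mat3x2& m, Vec2 tc ) {
	return {
		m[0] * tc.s + m[2] * tc.t + m[4],
		m[1] * tc.s + m[3] * tc.t + m[5]
	};
}
```

For any matrix, `TransformPacked( PackTextureMatrix( m ), tc )` yields the same s and t as `TransformFull( m, tc )`, while the packed form stores 6 floats instead of 16.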
Also removed some now useless code.
Also fixes an incorrect matrix constructor in shaders.
The UBO bindings were conflicting with light UBO.
Also make it in line with other usages of `lightMap`.
LGTM


Partially implements #1448.
Stage data is allocated for stages rather than surface * stage, which greatly reduces the amount of required data. Texture handles and texture matrix are moved into a different buffer, since stage data ends up being often under 1kb this way. Lightmap/deluxemap handles are moved into a different UBO, because they depend on the `drawSurf` rather than on shader stage. All 3 are encoded into `baseInstance` (12 bits for shader stage, 12 bits for textures, and 8 more for lightmaps).

Optimised `u_Color` by changing it from a `vec4` to a `uint` (the colors we use only have 8-bit precision in engine).

Also added `u_ColorModulateColorGen` for use with shaders other than `cameraEffects`: these only use 12 distinct states, encoded as 5 bits, which also allows using a `uint` here instead of a `vec4`. Added `unpackUnorm4x8()` define for devices that don't support `ARB_gpu_shader5`.

Changed `u_TextureMatrix` from a `mat4` to `mat3x2`, reducing its size from 16 to just 6 components since only these 6 components have any actual effect, which allows making the `TexData` struct be exactly 64 bytes.

All of the above combined reduces the memory usage of material system by ~50-100x, and improves its performance and load times.
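As a rough illustration of the 12/12/8-bit `baseInstance` encoding mentioned in the description, the sketch below packs three indices into one 32-bit value. The bit order (shader stage in the low bits, textures next, lightmaps in the top 8 bits) is an assumption, as are the names `SurfaceIDs`, `PackBaseInstance`, and `UnpackBaseInstance`. The PR text only gives the field widths.

```cpp
#include <cassert>
#include <cstdint>

// Field widths from the PR description; the field order is an assumption.
constexpr uint32_t STAGE_BITS = 12;
constexpr uint32_t TEXTURE_BITS = 12;
constexpr uint32_t LIGHTMAP_BITS = 8;
static_assert( STAGE_BITS + TEXTURE_BITS + LIGHTMAP_BITS == 32, "must fill a 32-bit baseInstance" );

// Hypothetical container for the three indices.
struct SurfaceIDs {
	uint32_t stage;     // index into the shader stage data
	uint32_t textures;  // index into the texture data
	uint32_t lightmaps; // index into the lightmap/deluxemap data
};

inline uint32_t PackBaseInstance( const SurfaceIDs& ids ) {
	// Each index must fit into its bit field.
	assert( ids.stage < ( 1u << STAGE_BITS ) );
	assert( ids.textures < ( 1u << TEXTURE_BITS ) );
	assert( ids.lightmaps < ( 1u << LIGHTMAP_BITS ) );
	return ids.stage
		| ( ids.textures << STAGE_BITS )
		| ( ids.lightmaps << ( STAGE_BITS + TEXTURE_BITS ) );
}

inline SurfaceIDs UnpackBaseInstance( uint32_t packed ) {
	return {
		packed & ( ( 1u << STAGE_BITS ) - 1 ),
		( packed >> STAGE_BITS ) & ( ( 1u << TEXTURE_BITS ) - 1 ),
		packed >> ( STAGE_BITS + TEXTURE_BITS )
	};
}
```

One consequence of these widths is that a single draw can address at most 4096 stage entries, 4096 texture entries, and 256 lightmap entries.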
Also added `r_materialSystemSkip` cvar, which allows instantaneously switching between core and material renderer (as long as the latter is enabled in the first place), for much faster debugging and performance comparison.
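The `u_Color` change from a `vec4` to a `uint` described above relies on the colors having only 8 bits of precision per channel. A minimal CPU-side sketch of that packing, mirroring the semantics of GLSL's `packUnorm4x8()` / `unpackUnorm4x8()` (first component in the least significant byte), might look like this. The C++ names `Color`, `PackUnorm4x8`, and `UnpackUnorm4x8` are hypothetical, and whether the engine uses exactly this byte order is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

using Color = std::array<float, 4>; // RGBA, each channel nominally in [0, 1]

// Quantise each channel to 8 bits and store channel i in byte i of the result.
inline uint32_t PackUnorm4x8( const Color& c ) {
	uint32_t packed = 0;
	for ( int i = 0; i < 4; i++ ) {
		const float clamped = std::clamp( c[i], 0.0f, 1.0f );
		const uint32_t byte = static_cast<uint32_t>( std::lround( clamped * 255.0f ) );
		packed |= byte << ( 8 * i );
	}
	return packed;
}

// Inverse operation: extract byte i and map it back to [0, 1].
inline Color UnpackUnorm4x8( uint32_t packed ) {
	Color c{};
	for ( int i = 0; i < 4; i++ ) {
		c[i] = static_cast<float>( ( packed >> ( 8 * i ) ) & 0xFFu ) / 255.0f;
	}
	return c;
}
```

This stores a color in 4 bytes instead of 16, and the round trip is lossless for any color that already has 8-bit precision per channel.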