In the previous chapter, Vertex Shader of Depth Prepass, we discussed the transformation of vertex positions into final clip-space vertex positions via the vertex shader.
Two key components in this process are:
- The LocalToWorld matrix, which is derived from the instance data buffer.
- The WorldToClip matrix, which is accessed via the view uniform buffer.
In this chapter, we will examine both of these buffers in detail.
- View Data
- Instance Data
- Instance Id
- Instance Scene Data
- Primitive Scene Data
- Difference between two cubes
- Instanced static mesh component
- Small things
- Tile
View Data
View data is the simpler of the two: it is stored in a uniform buffer.
This data is mapped from the FViewUniformShaderParameters struct in the C++ code.
This macro is used in:
FViewUniformShaderParameters is then referenced in:
BEGIN_SHADER_PARAMETER_STRUCT(FViewShaderParameters, )
SHADER_PARAMETER_STRUCT_REF(FViewUniformShaderParameters, View) //<-- Here!
SHADER_PARAMETER_STRUCT_REF(FInstancedViewUniformShaderParameters, InstancedView)
END_SHADER_PARAMETER_STRUCT()
Finally, in the pass parameters:
BEGIN_SHADER_PARAMETER_STRUCT(FDepthPassParameters, )
SHADER_PARAMETER_STRUCT_INCLUDE(FViewShaderParameters, View) //<-- Here!
SHADER_PARAMETER_STRUCT_INCLUDE(FInstanceCullingDrawParams, InstanceCullingDrawParams)
RENDER_TARGET_BINDING_SLOTS()
END_SHADER_PARAMETER_STRUCT()
When is this data updated? Recall this line from our exam, "Try to draw a cube by yourself":
FRedCubePassParameters* PassParameters = GetRedCubePassParameters(GraphBuilder, View, RenderTargetTexture);
And this is the function:
FRedCubePassParameters* GetRedCubePassParameters(FRDGBuilder& GraphBuilder, const FViewInfo& View, FRDGTextureRef RenderTarget)
{
auto* PassParameters = GraphBuilder.AllocParameters<FRedCubePassParameters>();
PassParameters->View = View.GetShaderParameters(); //<-- Here!
PassParameters->RenderTargets[0] = FRenderTargetBinding(RenderTarget, ERenderTargetLoadAction::ELoad);
return PassParameters;
}
Instance Data
This is the complex part. To make it easier to understand, let's duplicate the cube. Having only one cube doesn't make much sense for instancing.
Now we have two cubes:
The transform matrices are:
These two cubes are rendered together by dynamic instancing, just like we said before:
The main steps to follow are:
- Use SV_InstanceID to obtain the actual InstanceId, which can be used as an index to retrieve instance data from the instance data buffer.
- Retrieve the FInstanceSceneData from GPUScene.InstanceData (View_InstanceSceneData).
- Obtain the PrimitiveId from the retrieved FInstanceSceneData.
- Use the PrimitiveId to fetch data from GPUScene.PrimitiveSceneData (View_PrimitiveSceneData).
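The chain of lookups in the steps above can be sketched on the CPU side. This is only a minimal model: all the type and function names below are hypothetical stand-ins I made up for illustration, the real GPU records are packed and carry many more fields, and the remap buffer here is assumed to hold plain instance ids with no view bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the GPU-side records.
struct InstanceSceneDataSketch {
    std::uint32_t PrimitiveId;   // index into the primitive scene data buffer
};

struct PrimitiveSceneDataSketch {
    float LocalToWorld[3][4];    // rows of a 3x4 affine transform
};

struct GpuSceneSketch {
    std::vector<std::uint32_t> InstanceIds;               // SV_InstanceID -> InstanceId
    std::vector<InstanceSceneDataSketch> InstanceData;    // indexed by InstanceId
    std::vector<PrimitiveSceneDataSketch> PrimitiveData;  // indexed by PrimitiveId
};

// Follows the four steps: SV_InstanceID -> InstanceId -> instance record
// -> PrimitiveId -> primitive record.
inline const PrimitiveSceneDataSketch& FetchPrimitive(const GpuSceneSketch& Scene,
                                                      std::uint32_t SvInstanceId)
{
    assert(SvInstanceId < Scene.InstanceIds.size());
    const std::uint32_t InstanceId = Scene.InstanceIds[SvInstanceId];

    assert(InstanceId < Scene.InstanceData.size());
    const InstanceSceneDataSketch& Instance = Scene.InstanceData[InstanceId];

    assert(Instance.PrimitiveId < Scene.PrimitiveData.size());
    return Scene.PrimitiveData[Instance.PrimitiveId];
}
```

The point of the sketch is that the draw never reads a transform directly from SV_InstanceID; every fetch goes through two levels of indirection first.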
Instance Id
The input parameter provided by the GPU already contains the Instance Id. So why do we need another buffer? This is because Unreal Engine has a compute shader-based instance culling system that may remove some instances. Therefore, we need an indirect buffer to map the render instance id to the original pre-culling instance id. We then use the mapped id for data fetching.
For our cube example, it’s OK to ignore this.
The instance id buffer’s format is:
Each element is represented in 32 bits. The first 4 bits represent the view ID (which typically does not have many views), while the remaining bits represent the instance ID.
In my two cubes test, the instance id buffer is inverted, which means the first element has an instance id of 1 and the second element has an instance id of 0. I haven't investigated to figure out the reason.
In short, treat this buffer as an indirect mapping from SV_InstanceID to the InstanceId used for fetching data from the instance data buffer.
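To make the 32-bit element format concrete, here is a small sketch of packing and unpacking such a value. I am assuming that "the first 4 bits" means the top 4 bits, which leaves the low 28 bits for the instance id. I have not confirmed the exact bit positions against the engine source, and the names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for illustration only: view id in the top 4 bits,
// instance id in the low 28 bits.
constexpr std::uint32_t kViewIdBits = 4;
constexpr std::uint32_t kInstanceIdBits = 32 - kViewIdBits;
constexpr std::uint32_t kInstanceIdMask = (1u << kInstanceIdBits) - 1u;

struct UnpackedInstanceId {
    std::uint32_t ViewId;
    std::uint32_t InstanceId;
};

inline std::uint32_t PackInstanceId(std::uint32_t ViewId, std::uint32_t InstanceId)
{
    assert(ViewId < (1u << kViewIdBits));   // must fit in 4 bits
    assert(InstanceId <= kInstanceIdMask);  // must fit in 28 bits
    return (ViewId << kInstanceIdBits) | InstanceId;
}

inline UnpackedInstanceId UnpackInstanceId(std::uint32_t Packed)
{
    return UnpackedInstanceId{ Packed >> kInstanceIdBits, Packed & kInstanceIdMask };
}
```

With a single view (view id 0) the packed value equals the instance id, which is consistent with the two-cube buffer simply holding 1 and 0.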
Instance Scene Data
This data buffer contains packed instance data. But there is one thing I need to point out:
The final instance data structure is:
This image shows how some important members are loaded from the buffer.
Variable-sized data is placed in the payload buffer by Unreal Engine and loaded based on flags.
We apologize for skipping the payload data loading and decoding/calculation portions, but this image provides enough information for us to continue.
Primitive Scene Data
The primitive scene data structure is much more complex.
- The data is carefully encoded into the buffer. It is packed on the CPU side (FPrimitiveSceneShaderData::Setup) and unpacked on the GPU side (GetPrimitiveData(uint PrimitiveId)).
There is little point in going through the unpacking in detail, so here is part of the decoded data:
Primitive.LocalToWorld:
And this is what it looks like in the original raw data buffer:
This is the transform of our left cube.
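The pack-on-CPU, unpack-by-PrimitiveId idea can be sketched as follows. This is not the engine's layout: the stride, the slot offset, and every name here are hypothetical, chosen only to show how a fixed stride per primitive lets a shader find a record from its id. The real layout is much larger and lives in FPrimitiveSceneShaderData::Setup and GetPrimitiveData.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Float4 = std::array<float, 4>;

// Hypothetical numbers for this sketch only.
constexpr std::size_t kSketchStrideInFloat4s = 4;  // Float4 slots per primitive
constexpr std::size_t kLocalToWorldOffset = 0;     // slot of the first matrix row

struct PrimitiveSketch {
    std::array<Float4, 3> LocalToWorldRows;
};

// "CPU side": write one primitive's rows at PrimitiveId * stride,
// growing the buffer with zeroed slots if needed.
inline void PackPrimitive(std::vector<Float4>& Buffer, std::size_t PrimitiveId,
                          const PrimitiveSketch& Primitive)
{
    const std::size_t Base = PrimitiveId * kSketchStrideInFloat4s;
    if (Buffer.size() < Base + kSketchStrideInFloat4s) {
        Buffer.resize(Base + kSketchStrideInFloat4s, Float4{});
    }
    for (std::size_t Row = 0; Row < 3; ++Row) {
        Buffer[Base + kLocalToWorldOffset + Row] = Primitive.LocalToWorldRows[Row];
    }
}

// "GPU side": read the rows back using only the PrimitiveId.
inline PrimitiveSketch UnpackPrimitive(const std::vector<Float4>& Buffer,
                                       std::size_t PrimitiveId)
{
    const std::size_t Base = PrimitiveId * kSketchStrideInFloat4s;
    assert(Base + kSketchStrideInFloat4s <= Buffer.size());
    PrimitiveSketch Result{};
    for (std::size_t Row = 0; Row < 3; ++Row) {
        Result.LocalToWorldRows[Row] = Buffer[Base + kLocalToWorldOffset + Row];
    }
    return Result;
}
```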
Difference between two cubes
In the instance scene data buffer:
| | Cube 0 | Cube 1 |
|---|---|---|
| SV_InstanceId | 0 | 1 |
| PrimitiveId | 6 | 5 |
| LocalToWorld | [1 0 0 300; 0 1 0 110; 0 0 1 80] | [1 0 0 300; 0 1 0 0; 0 0 1 80] |
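As a quick check of what these two matrices mean, the sketch below applies each LocalToWorld from the table to the local origin. I am reading the table with the translation in the fourth column, which is an assumption about how the values were printed. The types and names are hypothetical helpers, not engine code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec3 { float X, Y, Z; };

// 3x4 affine transform stored row by row, translation in the fourth column.
struct Mat3x4 {
    std::array<float, 12> M;

    float At(std::size_t RowIndex, std::size_t ColumnIndex) const
    {
        assert(RowIndex < 3 && ColumnIndex < 4);
        return M[RowIndex * 4 + ColumnIndex];
    }
};

// Applies rotation/scale (first three columns) plus translation (fourth).
inline Vec3 TransformPoint(const Mat3x4& T, const Vec3& P)
{
    return Vec3{
        T.At(0, 0) * P.X + T.At(0, 1) * P.Y + T.At(0, 2) * P.Z + T.At(0, 3),
        T.At(1, 0) * P.X + T.At(1, 1) * P.Y + T.At(1, 2) * P.Z + T.At(1, 3),
        T.At(2, 0) * P.X + T.At(2, 1) * P.Y + T.At(2, 2) * P.Z + T.At(2, 3)
    };
}

// Values copied from the table.
const Mat3x4 Cube0LocalToWorld = {{ 1.0f, 0.0f, 0.0f, 300.0f,
                                    0.0f, 1.0f, 0.0f, 110.0f,
                                    0.0f, 0.0f, 1.0f, 80.0f }};
const Mat3x4 Cube1LocalToWorld = {{ 1.0f, 0.0f, 0.0f, 300.0f,
                                    0.0f, 1.0f, 0.0f, 0.0f,
                                    0.0f, 0.0f, 1.0f, 80.0f }};
```

Under that reading, the local origin of cube 0 lands at (300, 110, 80) and that of cube 1 at (300, 0, 80), so the two cubes differ only by 110 units along Y.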
You may be wondering if these two instances can reference a single primitive instead of two.
I don't have the answer, but I suspect it's because they are instanced dynamically.
Instanced static mesh component
I conducted another test in which, instead of using two static mesh components, I created a single instanced static mesh component and added two instances to it.
The actor looks like this:
The data for the two instances is:
| | Cube 0 Instance | Cube 1 Instance |
|---|---|---|
| SV_InstanceId | 0 | 1 |
| PrimitiveId | 0 | 0 |
| InstanceId | 0 | 1 |
| LocalToWorld | [1 0 0 320; 0 1 0 -100; 0 0 1 80] | [1 0 0 320; 0 1 0 100; 0 0 1 80] |
Small things
Tile
It looks like Unreal Engine divides the instances into tiles. The tile size is 2097152 (2^21). The reason seems to be LargeWorldCoordinates.
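To illustrate why a tile size helps with large coordinates, here is one plausible way to split a double-precision world coordinate into a tile index and a float offset inside that tile. I am assuming the tile size is a world-space length along each axis. The split below uses a plain floor, which may not match the engine's exact formula, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

constexpr double kTileSize = 2097152.0;  // value from the text, equal to 2^21

struct TiledCoord {
    double Tile;    // whole-number tile index
    float  Offset;  // position inside the tile, small enough for float precision
};

inline TiledCoord SplitIntoTile(double WorldCoord)
{
    const double Tile = std::floor(WorldCoord / kTileSize);
    const float Offset = static_cast<float>(WorldCoord - Tile * kTileSize);
    assert(Offset >= 0.0f && Offset <= static_cast<float>(kTileSize));
    return TiledCoord{ Tile, Offset };
}

inline double Recombine(const TiledCoord& Coord)
{
    return Coord.Tile * kTileSize + static_cast<double>(Coord.Offset);
}
```

For example, a coordinate of 5000000 splits into tile 2 with an offset of 805696, and a coordinate of 300 stays in tile 0 with an offset of 300. The offset always stays below 2^21, so a 32-bit float can hold it with sub-unit precision no matter how far the tile is from the origin.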