- View
- Primitive Culling
- Data Structure and Relationship
- Occlusion Culling
- Latency
- Mesh Draw Command Culling (Relevance)
- Additional Claim
Do we need to render everything in the scene?
No, because:
- Some objects are not inside the frustum, which can be addressed through primitive culling.
- Some objects are occluded by others, which can be addressed through occlusion culling.
- Some mesh draw commands of an object that passed the culling tests may not need to be rendered in the current pass, which can be addressed through mesh draw command culling.
So let's talk about culling.
In Unreal Engine, Epic Games actually chose the word Relevance, but to keep the title easy to understand I still use 'Culling'.
View
Before we talk about culling, we need to ask what culling is based on. If we have two camera views, culling has to be executed for each camera view.
So the culling procedure is view-based. Let's look at FViewInfo.
The scene renderer may need to render more than one view in one frame (for example, split screen), so it contains an array of FViewInfo.
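To make "view-based" concrete, here is a minimal sketch of running culling once per view. ViewSketch, its 1D "frustum", and CullAllViews are hypothetical names made up for illustration; they are not the engine's FViewInfo or any real Unreal API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for FViewInfo: a camera range plus its own visibility result.
struct ViewSketch {
    float MinX;                          // toy 1D "frustum": visible range on the X axis
    float MaxX;
    std::vector<bool> PrimitiveVisible;  // one flag per primitive, owned by this view
};

// Run the culling test once per view. Each view gets an independent result,
// so a primitive can be visible in one view and culled in another.
inline void CullAllViews(std::vector<ViewSketch>& Views,
                         const std::vector<float>& PrimitiveX) {
    for (ViewSketch& View : Views) {
        assert(View.MinX <= View.MaxX);
        View.PrimitiveVisible.assign(PrimitiveX.size(), false);
        for (std::size_t Index = 0; Index < PrimitiveX.size(); ++Index) {
            View.PrimitiveVisible[Index] =
                PrimitiveX[Index] >= View.MinX && PrimitiveX[Index] <= View.MaxX;
        }
    }
}
```

With two split-screen views looking at different ranges, the same primitive list produces two different visibility results, which is why the result has to live on the view.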
Primitive Culling
This step performs frustum culling and distance-based culling in parallel.
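As a rough illustration of the two tests, the sketch below checks a bounding sphere against six frustum planes and against a maximum draw distance. All types and function names here (Vec3, Plane, Sphere, IntersectsFrustum, WithinDrawDistance, IsPrimitiveVisible) are assumptions for this example and not engine code. The test for one primitive does not depend on any other primitive, which is what makes it easy to spread across parallel tasks.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float X, Y, Z; };

// Plane stored as dot(Normal, P) + D = 0, with Normal pointing into the frustum.
struct Plane { Vec3 Normal; float D; };

// Bounding sphere of a primitive.
struct Sphere { Vec3 Center; float Radius; };

inline float Dot(const Vec3& A, const Vec3& B) {
    return A.X * B.X + A.Y * B.Y + A.Z * B.Z;
}

// Frustum culling: the sphere is rejected if it lies entirely behind any plane.
inline bool IntersectsFrustum(const std::array<Plane, 6>& Frustum, const Sphere& Bounds) {
    assert(Bounds.Radius >= 0.f);
    for (const Plane& P : Frustum) {
        if (Dot(P.Normal, Bounds.Center) + P.D < -Bounds.Radius) {
            return false;
        }
    }
    return true;
}

// Distance-based culling: compare squared distances to avoid a square root.
inline bool WithinDrawDistance(const Vec3& ViewOrigin, const Sphere& Bounds,
                               float MaxDrawDistance) {
    const Vec3 Delta{Bounds.Center.X - ViewOrigin.X,
                     Bounds.Center.Y - ViewOrigin.Y,
                     Bounds.Center.Z - ViewOrigin.Z};
    return Dot(Delta, Delta) <= MaxDrawDistance * MaxDrawDistance;
}

// A primitive survives primitive culling only if it passes both tests.
inline bool IsPrimitiveVisible(const std::array<Plane, 6>& Frustum, const Vec3& ViewOrigin,
                               const Sphere& Bounds, float MaxDrawDistance) {
    return IntersectsFrustum(Frustum, Bounds) &&
           WithinDrawDistance(ViewOrigin, Bounds, MaxDrawDistance);
}
```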
Data Structure and Relationship
The result is written into View.PrimitiveVisibilityMap.
- PrimitiveVisibilityMap is an FSceneBitArray; you can treat it as an optimized version of a TArray<bool>.
- The index into PrimitiveVisibilityMap is the PrimitiveIndex, which is stored in the FPrimitiveSceneInfo. If you forget what this is, check
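The relationship between the two structures can be sketched as a packed bit array that is addressed by a primitive's index. BitArraySketch and PrimitiveSceneInfoSketch are hypothetical stand-ins written for this post, assuming one bit per primitive packed into 32-bit words; the real FSceneBitArray has its own implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for FSceneBitArray: one bit per primitive, 32 bits per word.
class BitArraySketch {
public:
    explicit BitArraySketch(std::size_t NumBits)
        : NumBits_(NumBits), Words_((NumBits + 31) / 32, 0u) {}

    void Set(std::size_t Index, bool bValue) {
        assert(Index < NumBits_);
        const std::uint32_t Mask = 1u << (Index % 32);
        if (bValue) {
            Words_[Index / 32] |= Mask;
        } else {
            Words_[Index / 32] &= ~Mask;
        }
    }

    bool Get(std::size_t Index) const {
        assert(Index < NumBits_);
        return ((Words_[Index / 32] >> (Index % 32)) & 1u) != 0u;
    }

    std::size_t Num() const { return NumBits_; }

private:
    std::size_t NumBits_;
    std::vector<std::uint32_t> Words_;
};

// Hypothetical stand-in for FPrimitiveSceneInfo: it only carries the index
// that is used to address the visibility map.
struct PrimitiveSceneInfoSketch {
    std::size_t PrimitiveIndex;
};
```

A culling step would then write its verdict with something like Map.Set(Info.PrimitiveIndex, bVisible), and later stages read it back with Map.Get(Info.PrimitiveIndex).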
Occlusion Culling
In my environment, Unreal Engine does not use Hierarchical-Z-Buffer (HZB) culling. Instead, it uses hardware queries for culling, which means it has to deal with latency issues.
The process works as follows:
- First, send a bounding box to the GPU and request that it be rendered. Also request that the number of pixels rendered be returned and stored in a variable.
- Later, check the variable and if it contains 0, it means the object was occluded in the last frame.
- To help with understanding, here are two frames of an occlusion query:
- In the first frame, a query is added to the occlusion query history. This query is actually a render request for the bounding box.
- The query data is then batched into BatchOcclusionQueries.
- After that, it is sent to the GPU for execution, and the number of pixels that are actually rendered is counted.
- The result is saved back.
- In the next frame, the query result is picked up and the number of rendered pixels is checked.
- If the number of rendered pixels is 0, then the primitive is totally occluded and bIsOccluded is set to true.
The final result is saved into PrimitiveVisibilityMap (if bIsOccluded is true) or bOcclusionStateIsDefinite (if bIsOccluded is false).
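The one-frame delay in the steps above can be sketched on the CPU with a tiny history object. OcclusionHistorySketch and UpdateOcclusion are hypothetical names; the GPU is replaced by a pixel count passed in by the caller, and treating a primitive as visible when no result exists yet is an assumption of this sketch, not a statement about the engine.

```cpp
#include <cassert>
#include <optional>

// Hypothetical per-primitive history: holds the result of the query issued last frame.
struct OcclusionHistorySketch {
    std::optional<unsigned> PendingPixelCount;  // empty until the first query returns
    bool bIsOccluded = false;
};

// One frame of the process:
//  1) read back the pixel count of the query issued in the previous frame,
//  2) issue a new query whose result only becomes readable in the next frame.
// PixelsVisibleThisFrame simulates what the GPU would count for the bounding box.
// Returns true if the primitive should be rendered this frame.
inline bool UpdateOcclusion(OcclusionHistorySketch& History, unsigned PixelsVisibleThisFrame) {
    if (History.PendingPixelCount.has_value()) {
        // Zero rendered pixels means the bounding box was fully hidden last frame.
        History.bIsOccluded = (*History.PendingPixelCount == 0u);
    } else {
        // Assumption: with no result yet, treat the primitive as visible.
        History.bIsOccluded = false;
    }
    History.PendingPixelCount = PixelsVisibleThisFrame;
    assert(History.PendingPixelCount.has_value());
    return !History.bIsOccluded;
}
```

Because the decision always uses last frame's count, a primitive that becomes visible in frame N is still treated as occluded in frame N and only shows up in frame N + 1.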
Latency
By the way, this latency may cause some problems. I'm not sure if this is the case, but it serves as a possible example.
Mesh Draw Command Culling (Relevance)
Actually, this part is not about culling mesh draw commands, since only visible primitives can reach this stage. So let's switch to the word Relevance.
Let's start by defining two concepts:
- The core data structure in this context is FRelevancePacket. It contains the data for both the input and the output of the relevance calculation.
- For visibility checks and dynamic instance merging, Unreal Engine separates visibility out of FMeshDrawCommand, calling it FVisibleMeshDrawCommand.
- The idea here is that the original mesh draw command is heavy and can be cached somewhere. The visibility information of the mesh draw command can be checked and passed around.
- After the check is finished, Unreal Engine writes the necessary mesh draw commands back and uses them in the render pass functions.
The process can be described as follows:
- After primitive culling and occlusion culling, the necessary primitives are identified.
- Create FRelevancePacket and execute it in parallel.
- If a primitive is relevant, add it to RelevantStaticPrimitives.
- Pick a primitive from RelevantStaticPrimitives.
- For each mesh pass:
- If this primitive passes a set of flag checks and should be rendered in this pass, call AddCommandsForMesh.
- This function takes the cached mesh command from StaticMeshCommandInfos. If necessary, this is where it comes from
- Build a new FVisibleMeshDrawCommand, add a reference to the cached mesh draw command, and add it to VisibleCachedDrawCommands for later use.
- Wait for all parallel jobs to finish.
- Merge the results of each job into the mesh commands array, indexed by the mesh pass.
- This image may help you better understand the meaning of merge
- Later, these commands will be used in SetupMeshPass. Finally, they are used in the render pass functions, which we talked about in From Mesh Draw Commands to RHI Commands.
Now, let me show a detailed image of this process:
You may have noticed that I put some numbers in this image. Let me talk about them now.
- This is the function we talked about in . In GetViewRelevance, you can configure whether this render proxy is bStaticRelevance or bDynamicRelevance.
Some of them will be used later.
Additional Claim
In fact, after the culling process, Unreal Engine includes a dynamic instance merging system. However, I have chosen to skip this topic for now for two reasons: 1) it is not directly related to the culling process, and 2) we only have one cube in this case.