Lumen Scene Update

ℹ️
How can the LumenScene cache be updated efficiently in response to scene changes caused by gameplay and the physics engine, as well as changes in the camera position?

Lumen Scene Update Trigger

Before analyzing the execution flow, we need to list the situations that can trigger an update of LumenScene.

  1. Movement of Primitives: When a Primitive moves into the current viewport or capture range from outside it, the LumenScene must be updated.

    image

    This also includes real-time addition and deletion of Primitives.

  2. Movement and orientation of the camera: When the area seen by the camera changes, LumenScene needs to be updated accordingly.

    image

    This includes the addition, deletion, and resolution adjustment (MipMap level) of Cards.

  3. Material updates: A material may contain nodes that change over time. Lumen periodically recaptures the result of these changes, though not necessarily in real time.

    image

All of these changes are sorted by priority, and only a portion of them is processed each frame.

Surface Cache Texture

As a result of this update, the necessary material information of the relevant Cards is captured into the Albedo, Normal, Emissive, Depth, Stencil, and other buffers, and then copied to the Surface Cache texture.

Albedo Texture
Normal Texture

This gif shows how Albedo texture is rendered:

image

From this, we can see that LumenSurfaceCache also uses a VirtualTexture approach: all Surface Cache content is mapped to specific regions of the physical Texture through page tables.

Let's organize the data structure hierarchy here.

  • Primitive: Mesh.
    • Cards: Data structures stored in advance with the Mesh and used for capturing and caching. A Mesh can have multiple Cards.
    • Page: The basic unit managed in the physical cache Texture where a Card is cached. A Card can cover multiple Pages. Conversely, when the resolution of a Card is small enough, Lumen packs multiple Cards into one Page.

Later, we will see how these data structures are gradually updated.
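
To make the hierarchy concrete, here is a minimal sketch of how the three levels nest. All type and field names are hypothetical simplifications for illustration, not the engine's real structures. The page size of 128 texels is the value used in the allocation example in the Card Placement section.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified types; the engine's real structures hold far more state.
constexpr std::uint32_t kPageSize = 128; // texels per page side (value used in this article's example)

struct CardPage {
    std::uint32_t physicalPageX; // page coordinate in the physical cache texture
    std::uint32_t physicalPageY;
};

struct Card {
    std::uint32_t resolutionX;   // requested capture resolution in texels
    std::uint32_t resolutionY;
    std::vector<CardPage> pages; // a large Card spans several pages
};

struct Primitive {
    std::vector<Card> cards;     // a Mesh can own multiple Cards
};

// Number of pages needed to cover a Card along one axis, rounding up.
// A Card smaller than a page still reports 1 here; in that case it only
// occupies a sub-rectangle of a page that it shares with other Cards.
constexpr std::uint32_t PagesAlongAxis(std::uint32_t cardResolution) {
    return (cardResolution + kPageSize - 1) / kPageSize;
}
```

Under these assumptions, a 256 x 128 Card covers 2 x 1 pages, while a 16 x 16 Card fits inside a fraction of a single page.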

GPU Passes

Lumen Scene Update on the CPU side

The Lumen Scene update is initiated on the CPU side, and the most important part runs in the BeginUpdateLumenSceneTasks function. The call stack is as follows:

  • FDeferredShadingSceneRenderer::Render
    • FDeferredShadingSceneRenderer::EndInitViews
      • FDeferredShadingSceneRenderer::BeginUpdateLumenSceneTasks

For the processing flow of this section, you can refer to this conceptual diagram.

  • Primitives: Delete or add a new primitive.
  • MeshCards: Handle the addition and deletion of mesh cards. For example, as your camera moves, a new Mesh enters the capture range.
  • SurfaceCacheRequests: Organize SurfaceCache drawing requests based on the visibility and resolution requirements of existing MeshCards, as well as data from GPU readbacks.
  • CardPagesToRender: Further analyze SurfaceCacheRequest and organize requests for drawing pages according to MipMap levels, etc.
  • MeshDrawCommands: This is the familiar MeshDrawCommand layer from the three-layer rendering architecture. From here on, the flow is consistent with the One Pass process we talked about before.

The following image shows a more detailed process:

image

Please pay attention to the following aspects:

  • The update process of LumenScene starts from Primitive and refines to Cards. This part uses ParallelFor for parallelization.
  • Differential updates are used for MeshCards.
  • In addition to updates directly on the CPU based on distance and resolution, Mesh Card's Page will also be updated based on feedback data from the GPU.
  • After this, SurfaceCacheRequest will be analyzed to sort out the CardPages that need to be rendered.
    • Note that the number of updates will be constrained here to avoid significant fluctuations in frame rates.

Card Relevance

ℹ️
LumenSceneRendering.cpp: FLumenSurfaceCacheUpdateMeshCardsTask::AnyThreadTask()

As mentioned in the "Importance" principle, we only need to capture the Mesh within a close range into the Card Cache. From this Gif, we can see that as the distance increases, the Card information gradually disappears.

image
image
image

The distance to MeshCard and the resolution of MeshCard in the current camera (screen space size) determine whether this MeshCard needs to be rendered, i.e., whether an FSurfaceCacheRequest needs to be generated.

Please note that the MeshCard of the floor in the above picture disappears more slowly than that of the Cube. This is because it covers more pixels and has a larger screen space size, so it is only excluded at a greater distance.
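
The relevance test described above can be sketched as follows. This is only an illustration: the formula, the constants, and all names are assumptions made for this example, not the logic in FLumenSurfaceCacheUpdateMeshCardsTask. It only shows why a larger Card stays relevant at a greater distance.

```cpp
#include <algorithm>

// Hypothetical view of a MeshCard as seen from the camera.
struct CardView {
    float worldSize; // longest side of the Card in world units
    float distance;  // distance from the camera to the Card
};

// Hypothetical tuning values, not engine defaults.
constexpr float kMaxCaptureDistance = 20000.0f;
constexpr float kTexelScale = 100.0f;
constexpr float kMinResolution = 8.0f;

// Rough screen-space size estimate: bigger and closer Cards want more texels.
inline float DesiredResolution(const CardView& card) {
    return card.worldSize * kTexelScale / std::max(card.distance, 1.0f);
}

// A surface cache request is generated only if the Card is inside the capture
// range and still large enough on screen to be worth capturing.
inline bool NeedsSurfaceCacheRequest(const CardView& card) {
    return card.distance <= kMaxCaptureDistance &&
           DesiredResolution(card) >= kMinResolution;
}
```

With these made-up numbers, a floor Card of size 2000 at distance 5000 still produces a request, while a cube Card of size 100 at the same distance does not.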

Card Placement

ℹ️
LumenSceneRendering.cpp: FLumenSceneData::ProcessLumenSurfaceCacheRequests

As we previously discussed, Lumen will compactly allocate the Cards' positions in the SurfaceCache physical texture.

ℹ️
LumenScene.cpp: FLumenSurfaceCacheAllocator::Allocate

Using our current scenario as an example:

  • There is only one MeshCard that belongs to the floor, with a resolution of 128 x 128, therefore occupying a complete page.
image
  • Next, it is the Cube's turn to be allocated. The Cube has 6 MeshCards, each requesting a resolution of 16 x 16.
    • Since the floor has already used one page, the coordinates of the next page are (1, 0), which moves one physical address page to the right.
    • image
    • According to the request of each Card, as the resolution is lower than the resolution of a single page (128), the allocation within the page begins.
    • Each Card is placed from right to left: the 6 Cards are laid out towards the left, starting from the bottom right corner of the physical page.
    • image
      image

      Note that the value of X for Min decreases gradually.

The final result of the above content reflected in the rendering page layout is as follows:

image

Here we are only talking about the CPU-side work, and this Gif is only used to demonstrate the allocation process. The real rendering will be explained later.
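
The Cube example can be reproduced with a small sketch. It assumes a 128-texel page, a Y axis that grows downward, and a single bottom row filled from right to left. The function and type names are hypothetical, and the real FLumenSurfaceCacheAllocator handles many more cases.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint32_t kPageSize = 128; // texels per page side

struct Rect {
    std::uint32_t minX, minY, maxX, maxY; // texel coordinates in the physical texture
};

// Places `count` square sub-allocations of `size` texels inside the physical
// page at (pageX, pageY), filling the bottom row from right to left.
// Returns fewer rectangles than requested if the row runs out of space.
std::vector<Rect> AllocateRightToLeft(std::uint32_t pageX, std::uint32_t pageY,
                                      std::uint32_t size, std::uint32_t count) {
    std::vector<Rect> result;
    if (size == 0 || size > kPageSize) {
        return result; // this sketch only covers sub-page allocations
    }
    const std::uint32_t originX = pageX * kPageSize;
    const std::uint32_t originY = pageY * kPageSize;
    const std::uint32_t perRow = kPageSize / size;
    for (std::uint32_t i = 0; i < count && i < perRow; ++i) {
        const std::uint32_t maxX = originX + kPageSize - i * size;
        const std::uint32_t maxY = originY + kPageSize;
        result.push_back(Rect{maxX - size, maxY - size, maxX, maxY});
    }
    return result;
}
```

Calling `AllocateRightToLeft(1, 0, 16, 6)` for the six Cube Cards yields rectangles whose Min X starts at 240 and decreases by 16 per Card, which matches the observation that the value of X for Min decreases gradually.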

Map Between Card and Page

So now we need to discuss a question:

If we want to sample information from a certain position in the Card, how do we calculate the actual coordinates in the AtlasPageTexture?

image

There are three data structures available to help you complete this task: Cards, PageBuffer, and PageTable.

image

These three Buffers provide two different ways to accomplish the mapping task:

  1. By using PageTable, which requires more computation to determine the coordinates in the physical texture but uses less graphics memory bandwidth.
  2. By using PageBuffer, which requires more data to be loaded but involves less computation.

The flowchart for these two methods is as follows:

image
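
As a rough illustration of the PageTable route, the sketch below converts a Card UV into a texel in the physical texture. The layout is an assumption made for this example: each Card records its size in pages and the index of its first PageTable entry, entries are stored row by row, and each entry holds the physical page coordinates in two uint32 values. The real buffers are packed differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kPageSize = 128; // texels per page side

// Hypothetical page table entry: two uint32 values per entry.
struct PageTableEntry {
    std::uint32_t physicalPageX;
    std::uint32_t physicalPageY;
};

// Hypothetical subset of the per-Card data needed for the lookup.
struct CardDesc {
    std::uint32_t sizeInPagesX;
    std::uint32_t sizeInPagesY;
    std::uint32_t firstPageTableIndex;
};

struct Texel {
    std::uint32_t x, y;
};

// Maps a Card UV in [0, 1] to a texel of the physical texture.
Texel CardUVToAtlasTexel(const CardDesc& card,
                         const std::vector<PageTableEntry>& pageTable,
                         float u, float v) {
    const std::uint32_t resX = card.sizeInPagesX * kPageSize;
    const std::uint32_t resY = card.sizeInPagesY * kPageSize;
    // Texel inside the Card; clamp so that u == 1 or v == 1 stays in range.
    const std::uint32_t tx = std::min(static_cast<std::uint32_t>(u * resX), resX - 1);
    const std::uint32_t ty = std::min(static_cast<std::uint32_t>(v * resY), resY - 1);
    // Which page of the Card holds this texel, and where that page lives physically.
    const std::uint32_t pageIndex =
        card.firstPageTableIndex + (ty / kPageSize) * card.sizeInPagesX + tx / kPageSize;
    const PageTableEntry& entry = pageTable[pageIndex];
    return Texel{entry.physicalPageX * kPageSize + tx % kPageSize,
                 entry.physicalPageY * kPageSize + ty % kPageSize};
}
```

The extra divisions and modulo operations are the "more computation" cost of this route. The PageBuffer route would instead load precomputed per-page data and skip most of this arithmetic.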

Data Compactness

FLumenCardData requires 9 float4s for compressed storage.

FLumenCardPageData requires 5 float4s for compressed storage.

An item in the PageTable only needs 2 uint32s for storage.
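
To put these numbers in bytes, assuming 4-byte floats so that one float4 is 16 bytes, the arithmetic works out as follows. The constant names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Assumes a 4-byte float, which the static_assert below checks.
static_assert(sizeof(float) == 4, "this arithmetic assumes 4-byte floats");

constexpr std::size_t kFloat4Bytes = 4 * sizeof(float);                     // 16 bytes
constexpr std::size_t kCardDataBytes = 9 * kFloat4Bytes;                    // FLumenCardData
constexpr std::size_t kCardPageDataBytes = 5 * kFloat4Bytes;                // FLumenCardPageData
constexpr std::size_t kPageTableEntryBytes = 2 * sizeof(std::uint32_t);     // one PageTable item

static_assert(kCardDataBytes == 144, "9 float4s");
static_assert(kCardPageDataBytes == 80, "5 float4s");
static_assert(kPageTableEntryBytes == 8, "2 uint32s");
```

A PageTable item is therefore 10 times smaller than a PageBuffer item (8 bytes versus 80 bytes), which is consistent with the bandwidth trade-off between the two mapping methods.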

Card Capture Pass

How the pass looks in the capture

The Card Capture Pass is similar to the familiar BasePass, except that instead of writing a full, more complex GBuffer, it writes only four buffers: Albedo, Normal, Emissive, and DepthStencil.

The other three are relatively easy to understand, so we will focus on how Albedo Buffer is calculated.

ℹ️
/Engine/Private/Lumen/LumenCardPixelShader.usf

Albedo Buffer still needs to calculate the nodes in the Material Graph. Therefore, it can be seen that the "CalcMaterialParametersEx" function is still called during the calculation process. This explains why Lumen can capture the surface color written in the Material Graph that changes over time.

However, after the calculation is completed, it will no longer perform complex calculations such as specular reflection (because it does not have the Camera information required for specular reflection). Instead, it assumes that the surface is a completely diffuse surface, calculates the final color, and then outputs it.

Copy

image

The final result will be copied to the SurfaceCache texture according to the type of Buffer.

Why do we need an additional copy?

  • For compression
  • For material merging

Why not write directly to the Surface Cache during the Capture stage, instead of outputting to the Capture Atlas and then executing CopyToSurfaceCache? This is mainly because BC compression is used, and writing directly would add compression instructions, leading to decreased write performance. Additionally, a separate Copy Pass can merge the data of different material properties into multiple instances and draw the Quads all at once, resulting in better performance. Ref: https://zhuanlan.zhihu.com/p/516141543

Determination of Card Update List

ℹ️
LumenSceneLighting.usf : BuildPageUpdatePriorityHistogramCS

Now it's time to start calculating the lighting of the Card. But before we start calculating the lighting for a specific Card, we must choose the small part of the Cards that needs to be updated in this frame.

We need a reasonable priority system that meets the following requirements:

  • Able to perform calculations on a large number of CardPages in parallel
  • Able to take into account priority factors such as distance from the camera
  • Able to dynamically adjust priorities based on the most recent update time
image

Lumen's approach is to first calculate the priority of each Card, and then place it in a histogram based on the calculated priority. Finally, a portion of the Cards within the histogram that does not exceed the budget is selected, and these Cards are updated.

image

Some details:

  • The histogram has 128 priority levels in total. This number is controlled by the macro PRIORITY_HISTOGRAM_SIZE.
  • The histograms of Direct Lighting and Indirect Lighting are separate, so the priority of the same card is calculated twice.
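
The histogram-based selection can be sketched on the CPU as follows. This is a simplified illustration, not the compute shader itself: it assumes each page has already been assigned a bucket index where 0 is the most urgent, and it selects whole buckets in priority order as long as the running total stays within the budget. The function name and the bucket convention are assumptions made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kHistogramSize = 128; // mirrors PRIORITY_HISTOGRAM_SIZE

// pageBuckets[i] is the priority bucket of page i (0 = most urgent, by assumption).
// Returns the indices of the pages selected for update this frame.
std::vector<std::size_t> SelectPagesToUpdate(const std::vector<std::uint32_t>& pageBuckets,
                                             std::uint32_t budget) {
    // Step 1: count how many pages fall into each priority bucket.
    std::array<std::uint32_t, kHistogramSize> histogram{};
    for (std::uint32_t bucket : pageBuckets) {
        ++histogram[std::min(bucket, kHistogramSize - 1)];
    }
    // Step 2: walk buckets from most to least urgent until the budget would be exceeded.
    std::uint32_t used = 0;
    std::uint32_t endBucket = 0; // buckets in [0, endBucket) are selected
    while (endBucket < kHistogramSize && used + histogram[endBucket] <= budget) {
        used += histogram[endBucket];
        ++endBucket;
    }
    // Step 3: gather every page whose bucket made the cut.
    std::vector<std::size_t> selected;
    for (std::size_t i = 0; i < pageBuckets.size(); ++i) {
        if (std::min(pageBuckets[i], kHistogramSize - 1) < endBucket) {
            selected.push_back(i);
        }
    }
    return selected;
}
```

Steps 1 and 3 are independent per page, which is what makes this scheme easy to run in parallel over a large number of CardPages. Only the short walk over 128 buckets is sequential.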