Screen, Mesh, Voxel Trace

Tracing with progressive fallback

image

Note that Lumen's tracing is a process of progressive fallback:

  • Screen space already holds fairly complete information, so Lumen prefers to sample what already exists there whenever possible. The sampling target is the previous frame's SceneColorTexture.
  • If the screen-space trace misses, fall back to Mesh SDF Trace, which uses the higher-precision MeshSDF data to compute the ray intersection. The sampling target at this stage is the Mesh Card.
  • If MeshSDF also misses, fall back to a Global SDF ray intersection. The sampling target at this stage is again the Mesh Card.
  • If Global SDF still misses, fall back to the Radiance Cache.
  • If the Radiance Cache has no valid information either (the worst case), sample the skylight.
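
The fallback order above can be sketched as a simple chain of stages. This is only an illustration of the control flow: the function `trace_with_fallback`, the stage names, and the lambda stand-ins are hypothetical and are not taken from the engine source.

```python
from typing import Callable, Optional, Sequence, Tuple

Radiance = Tuple[float, float, float]
# Each stage takes a ray and returns radiance on a hit, or None on a miss.
TraceStage = Callable[[object], Optional[Radiance]]


def trace_with_fallback(ray, stages: Sequence[Tuple[str, TraceStage]], sky: Radiance):
    """Try each stage in order and return (stage_name, radiance) for the first hit."""
    for name, stage in stages:
        radiance = stage(ray)
        if radiance is not None:
            return name, radiance
    # Worst case: no stage produced valid lighting, so use the skylight.
    return "skylight", sky


# Hypothetical stand-ins for the real passes: here only the Global SDF stage hits.
stages = [
    ("screen", lambda ray: None),
    ("mesh_sdf", lambda ray: None),
    ("global_sdf", lambda ray: (0.2, 0.3, 0.4)),
    ("radiance_cache", lambda ray: (9.0, 9.0, 9.0)),
]
result = trace_with_fallback(ray=None, stages=stages, sky=(0.5, 0.7, 1.0))
```

Later stages are never evaluated once an earlier one hits, which is why the cheaper, more accurate sources are placed first.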

Cone Trace

ℹ️
Engine\Shaders\Private\Lumen\LumenScreenProbeTracingCommon.ush: GetScreenProbeTexelRay

First, a brief look at how Cone Trace constructs a cone, since this step is shared by the next three tracing passes.

To construct a cone, we need to know the following information:

  • Where to start tracing: TranslatedWorldPosition
  • Which direction to trace: RayDirection
  • How wide is the cone: ConeHalfAngle
image

TranslatedWorldPosition can be reconstructed from the depth value and the projection matrix, which is straightforward.

RayDirection is the result computed in the previous section.

ConeHalfAngle is calculated from the texel count of the current Mip, which yields uniformly distributed, non-overlapping cone angles.
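
As a rough sketch of how a texel count can translate into a cone angle: assume a probe covers a hemisphere (solid angle 2π) and that each of its `mip_size * mip_size` texels gets an equal share. A cone with half-angle θ subtends 2π(1 − cos θ), so cos θ = 1 − 1/N. The function names and the hemisphere assumption are mine for illustration and are not quoted from `GetScreenProbeTexelRay`.

```python
import math


def cone_half_angle(mip_size: int) -> float:
    """Half-angle (radians) of one cone when mip_size * mip_size equal cones share a hemisphere."""
    texel_count = mip_size * mip_size
    # Solve 2*pi*(1 - cos(theta)) == 2*pi / texel_count for theta.
    return math.acos(1.0 - 1.0 / texel_count)


def cone_solid_angle(half_angle: float) -> float:
    """Solid angle (steradians) subtended by a cone with the given half-angle."""
    return 2.0 * math.pi * (1.0 - math.cos(half_angle))
```

With this choice the solid angles of all cones add up to exactly one hemisphere, and a finer Mip (more texels) gives narrower cones.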

About Mip levels:

As described in the "Subdivision" section of the previous chapter, a Ray written after subdivision carries an increased subdivision level. That level is used here to determine the size of the cone angle.

Screen Trace

ℹ️
Engine\Shaders\Private\Lumen\LumenScreenTracing.ush: TraceScreen

Screen Trace is a ray trace rather than a cone trace, and each thread maps to one trace ray. Screen Trace still calls GetScreenProbeTexelRay to obtain the ray, but it ignores the ConeHalfAngle output.

The implementation appears to be inspired by the stochastic screen-space reflections technique from the Frostbite engine. It achieves very fast screen-space tracing by using the HZB (hierarchical Z-buffer) texture built from screen depth.

ℹ️
This part is closely related to Screen Space Reflections (SSR). Readers who want to dig deeper should consult SSR-related resources. I highly recommend taking a look at this slide deck.
image
image
image
image
image
image

The rules of the algorithm are very simple, but the ideas behind it are very interesting:

image
  • Each HZB texel stores the minimum Z value of the area it covers in the previous (finer) level, so it can serve as a "conservative" depth test. If the endpoint of the current ray step is above the Z value of the current level, the step is guaranteed not to collide with any pixel in the covered range (left image).
  • Conversely, if the endpoint is below the Z value of the current level, the ray may collide with a pixel in a finer level, so the test has to be refined at that finer level.
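
The two rules can be sketched in one dimension. This is a simplified illustration, not the `TraceScreen` implementation: it assumes depth grows with distance from the camera, a power-of-two width, a ray that moves in +x, and no thickness test. The names `build_hzb` and `trace_hzb` are hypothetical.

```python
def build_hzb(depth):
    """Build a min-depth pyramid; level 0 is the full-resolution depth row."""
    levels = [list(depth)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([min(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def trace_hzb(levels, x0, z0, dx, dz, max_steps=64):
    """March the ray (x0 + t*dx, z0 + t*dz); return the hit pixel index or None."""
    if dx <= 0:
        raise ValueError("this sketch only handles rays moving in +x")
    width = len(levels[0])
    top = len(levels) - 1
    level, x, z = 0, x0, z0
    for _ in range(max_steps):
        if not 0 <= x < width:
            return None  # the ray left the screen
        cell = int(x) >> level
        exit_x = (cell + 1) << level  # right edge of the current cell
        z_exit = z + (exit_x - x) / dx * dz
        if max(z, z_exit) < levels[level][cell]:
            # Entirely in front of the closest surface in this cell:
            # skip the whole cell and try a coarser level next.
            x, z = exit_x, z_exit
            level = min(level + 1, top)
        elif level == 0:
            return int(x)  # cannot refine further: report a hit
        else:
            level -= 1  # possible hit: refine at a finer level
    return None


hzb = build_hzb([10, 10, 10, 10, 10, 3, 10, 10])
hit_pixel = trace_hzb(hzb, x0=0.5, z0=1.0, dx=1.0, dz=0.5)
```

In empty regions the ray climbs to coarse levels and crosses many pixels per step, and it only drops back to fine levels near potential occluders.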

After determining the intersection points of screen space sampling through HZB, the target of sampling is the SceneColorTexture of the previous frame:

image

The output of this step contains two parts:

  • Whether the Trace hit or not
    1. Do you see our cube?
    2. A small detail: the TraceHit buffer stores more than a hit flag. It also encodes the hit distance and whether the hit object is moving at high speed.
  • The Radiance of the Trace
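
To illustrate how a single value can carry the hit distance plus the two flags, here is a minimal packing sketch. The bit layout (30 bits of quantized distance, one hit bit, one moving bit), the `max_distance` range, and the function names are all assumptions made for this example, not the engine's actual encoding.

```python
HIT_BIT = 1 << 30
MOVING_BIT = 1 << 31
DISTANCE_MASK = HIT_BIT - 1  # the low 30 bits hold the quantized distance


def encode_trace_hit(distance: float, hit: bool, moving: bool, max_distance: float = 1000.0) -> int:
    """Pack distance and flags into one 32-bit unsigned integer (hypothetical layout)."""
    clamped = min(max(distance, 0.0), max_distance)
    packed = round(clamped / max_distance * DISTANCE_MASK)
    if hit:
        packed |= HIT_BIT
    if moving:
        packed |= MOVING_BIT
    return packed


def decode_trace_hit(packed: int, max_distance: float = 1000.0):
    """Unpack (distance, hit, moving) from a value written by encode_trace_hit."""
    distance = (packed & DISTANCE_MASK) / DISTANCE_MASK * max_distance
    return distance, bool(packed & HIT_BIT), bool(packed & MOVING_BIT)
```

Storing everything in one value lets later passes read a single texel to decide whether a ray still needs tracing.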

Mesh Trace

image

Mesh SDF Culling

image

Because Mesh SDFs are stored as a flat list of separate objects, tracing them directly would require one huge for loop:

    for each Mesh SDF:
        trace the Mesh SDF
    sort the results and keep the nearest hit

This is obviously extremely inefficient.

To optimize this, Lumen uses several approaches:

  • Only consider Mesh SDFs within a specific range.
  • Bin the Mesh SDFs into the cells of a grid and use the grid as an acceleration structure.
image
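
A minimal sketch of the grid idea, assuming each Mesh SDF is represented only by an axis-aligned bounding box and the grid is a uniform world-space grid. Lumen's actual culling grid is laid out differently; `build_grid` and `objects_near` are hypothetical names.

```python
import math
from collections import defaultdict


def build_grid(objects, cell_size):
    """Map each grid cell (ix, iy, iz) to the indices of objects whose bounds overlap it.

    objects: list of (min_corner, max_corner) axis-aligned bounds.
    """
    grid = defaultdict(list)
    for index, (lo, hi) in enumerate(objects):
        lo_cell = [math.floor(c / cell_size) for c in lo]
        hi_cell = [math.floor(c / cell_size) for c in hi]
        for ix in range(lo_cell[0], hi_cell[0] + 1):
            for iy in range(lo_cell[1], hi_cell[1] + 1):
                for iz in range(lo_cell[2], hi_cell[2] + 1):
                    grid[(ix, iy, iz)].append(index)
    return grid


def objects_near(grid, point, cell_size):
    """Return only the objects registered in the cell that contains the point."""
    cell = tuple(math.floor(c / cell_size) for c in point)
    return grid.get(cell, [])


bounds = [((0, 0, 0), (1, 1, 1)), ((5, 5, 5), (6, 6, 6)), ((0.5, 0.5, 0.5), (2.5, 1, 1))]
grid = build_grid(bounds, cell_size=2)
```

A trace that starts in a given cell then loops over a handful of candidates instead of every Mesh SDF in the scene.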

Compact

image

Mesh Trace only runs for the rays that Screen Trace failed to hit. These missed rays are scattered all over the screen, so running Mesh Trace as a full-resolution pass would waste a large number of threads.

Therefore, this step gathers the rays that still need tracing and writes them into a linear buffer, which improves parallelism.
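
The compaction step can be sketched with an exclusive prefix sum, a common way to do stream compaction on parallel hardware. This is a generic illustration on the CPU; how Lumen allocates slots in its compacted buffer is not stated here, and `compact_missed_rays` is a hypothetical name.

```python
from itertools import accumulate


def compact_missed_rays(hit_mask):
    """Return the indices of rays that missed, packed contiguously in original order."""
    needs_trace = [0 if hit else 1 for hit in hit_mask]
    # The exclusive prefix sum gives each surviving ray its slot in the compacted buffer.
    offsets = [0] + list(accumulate(needs_trace))[:-1]
    compacted = [0] * sum(needs_trace)
    for ray_index, (flag, slot) in enumerate(zip(needs_trace, offsets)):
        if flag:
            compacted[slot] = ray_index
    return compacted
```

The next pass can then launch one thread per entry of the compacted buffer, so no thread is spent on a ray that has already hit.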

image

According to the official slides, this can yield a speedup of up to 50%.

Trace

ℹ️
Engine\Shaders\Private\Lumen\LumenReflectionTracing.usf:TraceMeshSDFs

Mesh SDF tracing targets the MeshSDFs in the grid Cell that contains the current ScreenProbe.

The basic process involves iterating through all the MeshSDFs in the Cell, calling the RayTraceSingleMeshSDF function to Trace each one, and then obtaining the MeshCards information corresponding to the closest Hit MeshSDF.

Then, the SampleLumenMeshCards function is called to read the information in MeshCards, which serves as the final result for Radiance.
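
The iterate-and-keep-closest loop can be sketched with sphere tracing, the standard way to intersect a ray with a signed distance field. Analytic spheres stand in for the Mesh SDFs here, and `ray_trace_single_sdf` and `trace_cell` are hypothetical simplifications, not the engine functions named above.

```python
import math


def sphere_sdf(center, radius):
    """Signed distance function of a sphere, standing in for one Mesh SDF."""
    def sdf(p):
        return math.dist(p, center) - radius
    return sdf


def ray_trace_single_sdf(sdf, origin, direction, max_t=100.0, eps=1e-4, max_steps=128):
    """Sphere-trace one SDF; return the hit distance along the ray, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = [o + t * d for o, d in zip(origin, direction)]
        dist = sdf(p)
        if dist < eps:
            return t
        t += dist  # the SDF value is a safe step size
        if t > max_t:
            return None
    return None


def trace_cell(sdfs, origin, direction):
    """Trace every SDF in the cell and keep the closest hit as (index, distance)."""
    best = None
    for index, sdf in enumerate(sdfs):
        t = ray_trace_single_sdf(sdf, origin, direction)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best


cell_sdfs = [sphere_sdf((0, 0, 10), 1), sphere_sdf((0, 0, 5), 1), sphere_sdf((5, 0, 5), 1)]
closest = trace_cell(cell_sdfs, origin=(0, 0, 0), direction=(0, 0, 1))
```

The index of the closest hit is what a real implementation would use to look up the corresponding MeshCards.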

Voxel Trace

As in the previous steps, the rays that still miss after Mesh Trace are first compacted to improve parallelism.

The actual tracing follows. It calls the ConeTraceLumenSceneVoxels function, which in turn calls the RayTraceGlobalDistanceField function explained earlier to trace the GlobalDistanceField and determine whether the ray hits.

image

If the Global Distance Field trace also misses, Lumen falls back further:

  • Sample the RadianceCache (described later in the text).
  • If the RadianceCache does not contain the necessary information either, sample the skylight.