Monday, 25 May 2020

Reverse engineering the rendering of The Witcher 3, part 19 - portals

This post is a part of the series "Reverse engineering the rendering of The Witcher 3".


If you have played The Witcher 3 for long enough, you know that Geralt is not a huge fan of portals. Let's find out if they are really that scary.

There are two types of portals in the game:
Blue portal
Fire portal

I will explain how the fire one is built, mostly because its code is simpler than the blue one's :)

Here is how the fire portal looks in the game:


The most important part of course is fire rotating towards the centre, but there is more than meets the eye. More about it later.

The plan for today is pretty standard: I will describe geometry first, the vertex and the pixel shaders later. Quite a few screenshots and videos incoming.

In terms of general rendering details, the portals are drawn in the forward pass with blending enabled - a pretty widespread approach in the game; check the shooting stars article for more info.

Let's get going.


1. Geometry

Here's what the portal mesh looks like:
Local space - Front view

Local space - Side view

The mesh resembles Gabriel's Horn. The vertex shader squeezes it along one axis; here's the same mesh afterwards as seen from the side (in world space):
The portal mesh after vertex shader (side view)

Besides position, each vertex has extra data associated with it. The relevant channels are listed below; for now I'll only show their visualization from RenderDoc and describe them in more detail later:

Texcoords (float2):


Tangent (float3):



Color (float3):


All of them will be used later, but already at this point there is too much data for an .obj file, so exporting this mesh can be problematic. What I did was export every channel to a separate .csv file; my C++ application then loads all the .csv files and assembles the mesh at runtime from the loaded data.
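
The per-channel approach can be sketched roughly like this (Python for brevity rather than C++). The CSV layout (a header row, then one row per vertex) and all function names are assumptions made for illustration; they are not the exact format RenderDoc exports.

```python
import csv
import io

def load_channel(text, width):
    """Parse one CSV channel: a header row, then `width` floats per vertex."""
    rows = list(csv.reader(io.StringIO(text)))
    return [tuple(float(v) for v in row[:width]) for row in rows[1:] if row]

def assemble_vertices(positions, texcoords, tangents, colors):
    """Zip the per-channel lists into one record per vertex."""
    counts = {len(positions), len(texcoords), len(tangents), len(colors)}
    if len(counts) != 1:
        raise ValueError("all channels must have the same vertex count")
    return [
        {"position": p, "texcoord": uv, "tangent": t, "color": c}
        for p, uv, t, c in zip(positions, texcoords, tangents, colors)
    ]

# Tiny in-memory strings standing in for the exported .csv files.
POS_CSV = "x,y,z\n0,0,0\n1,0,0\n"
UV_CSV = "u,v\n0,0\n1,0.5\n"
TAN_CSV = "x,y,z\n0.5,0.5,1\n0.5,0.5,1\n"
COL_CSV = "r,g,b\n1,0,0\n0,1,0\n"

vertices = assemble_vertices(
    load_channel(POS_CSV, 3),
    load_channel(UV_CSV, 2),
    load_channel(TAN_CSV, 3),
    load_channel(COL_CSV, 3),
)
```

Keeping one list per channel until the very end makes it easy to spot a channel that was exported with the wrong vertex count.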



2. Vertex shader

The vertex shader is not particularly interesting, but let's have a quick look at the relevant fragment anyway:
 vs_5_0  
    dcl_globalFlags refactoringAllowed  
    dcl_constantbuffer cb1[7], immediateIndexed  
    dcl_constantbuffer cb2[6], immediateIndexed  
    dcl_input v0.xyz  
    dcl_input v1.xy  
    dcl_input v3.xyz  
    dcl_input v4.xyzw  
    dcl_input v6.xyzw  
    dcl_input v7.xyzw  
    dcl_input v8.xyzw  
    dcl_output o0.xyz  
    dcl_output o1.xyzw  
    dcl_output o2.xyz  
    dcl_output o3.xyz  
    dcl_output_siv o4.xyzw, position  
    dcl_temps 3  
   0: mov o0.xy, v1.xyxx  
   1: mul r0.xyzw, v7.xyzw, cb1[6].yyyy  
   2: mad r0.xyzw, v6.xyzw, cb1[6].xxxx, r0.xyzw  
   3: mad r0.xyzw, v8.xyzw, cb1[6].zzzz, r0.xyzw  
   4: mad r0.xyzw, cb1[6].wwww, l(0.000000, 0.000000, 0.000000, 1.000000), r0.xyzw  
   5: mad r1.xyz, v0.xyzx, cb2[4].xyzx, cb2[5].xyzx  
   6: mov r1.w, l(1.000000)  
   7: dp4 o0.z, r1.xyzw, r0.xyzw  
   8: mov o1.xyzw, v4.xyzw  
   9: dp4 o2.x, r1.xyzw, v6.xyzw  
  10: dp4 o2.y, r1.xyzw, v7.xyzw  
  11: dp4 o2.z, r1.xyzw, v8.xyzw  
  12: mad r0.xyz, v3.xyzx, l(2.000000, 2.000000, 2.000000, 0.000000), l(-1.000000, -1.000000, -1.000000, 0.000000)  
  13: dp3 r2.x, r0.xyzx, v6.xyzx  
  14: dp3 r2.y, r0.xyzx, v7.xyzx  
  15: dp3 r2.z, r0.xyzx, v8.xyzx  
  16: dp3 r0.x, r2.xyzx, r2.xyzx  
  17: rsq r0.x, r0.x  
  18: mul o3.xyz, r0.xxxx, r2.xyzx  

The vertex shader looks pretty similar to the other ones we've seen in this series.
After a quick analysis and comparing with input layout, the output struct can be written like so:
 struct VS_OUTPUT  
 {  
      float3 TexcoordAndViewSpaceDepth : TEXCOORD0;  
      float3 Color : TEXCOORD1;  
      float3 WorldSpacePosition : TEXCOORD2;  
      float3 Tangent : TEXCOORD3;  
      float4 PositionH : SV_Position;  
 };  


One thing I wanted to point out is how the shader retrieves view-space depth (o0.z): it's just the .w component of SV_Position.

There is a thread on gamedev.net which explains it in a bit more detail.
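
Why this works can be checked with a few lines of arithmetic. The sketch below assumes a standard left-handed, D3D-style perspective projection used with row vectors; the matrix the game actually uses is not visible in the disassembly, so treat the layout and the sample values as assumptions.

```python
import math

def perspective_lh(fov_y, aspect, z_near, z_far):
    """Left-handed D3D-style perspective matrix for row vectors (v * M)."""
    y_scale = 1.0 / math.tan(fov_y * 0.5)
    x_scale = y_scale / aspect
    q = z_far / (z_far - z_near)
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],
        [0.0, 0.0, -z_near * q, 0.0],
    ]

def transform(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[i] * m[i][c] for i in range(4)) for c in range(4)]

# A view-space point 12 units in front of the camera.
view_pos = [1.5, -0.7, 12.0, 1.0]
proj = perspective_lh(math.radians(60.0), 16.0 / 9.0, 0.2, 1000.0)
clip = transform(view_pos, proj)
```

The last column of such a matrix is (0, 0, 1, 0), so `clip[3]` is simply the view-space z of the point, which is why the shader can take depth from the .w component.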



3. Pixel shader

Here is an example scene just before drawing a portal...:


...and after:


Also, there is a useful "Clear Before Draw" overlay option in the RenderDoc texture viewer, so we can see precisely what the portal draws:

The first observation is that the actual fire layer is drawn only in the central area of the mesh.

The pixel shader is 186 lines long; I put it here for convenience and reference. As usual, I will be showing relevant assembly fragments while explaining things.

It's also worth noting that 100 of the 186 lines are related to fog calculations.

To start, there are 4 textures attached as input: fire (t0), noise/smoke (t1), scene color (t6) and scene depth (t15):

Fire texture
Noise/smoke texture
Scene color
Scene depth
There is also a dedicated constant buffer with 14 params which control the effect:

While the inputs: position, tangent and texcoords are quite simple concepts, let's take a closer look at the "Color" channel. After a few experiments it seems this is not a color per se but rather three different masks which the shader uses to distinguish between individual layers and where to apply certain effects:

Color.r - heat haze mask. As the name implies, it's used for heat haze effect (more about it later):


Color.g - inner mask. Used mostly for the fire effect


Color.b - back mask. Used to determine where the "back" of the portal is.


For effects like this I think it's better to describe the particular layers individually instead of analyzing the assembly from the very start to the very end like I used to do a long time ago.

So, here we go:



3.1. Fire layer

First, let's investigate the most important bit: the fire layer. Here is a video of it:


The basic idea behind such an effect is to take the static texcoords from the per-vertex data and animate them using the elapsed time variable from a constant buffer. With such animated texcoords, we sample a texture (fire in this case) using a wrap/repeat sampler.

Interestingly, in this particular effect actually only the .r channel of the fire texture is sampled. To make the effect more convincing two layers of fire are obtained this way and then they are modulated together.

Alright, alright... let's see some code finally!

We start with making the texcoords more dynamic as they reach the center of the mesh:
   const float2 texcoords = Input.TextureUV;  
   const float uvSquash = cb4_v4.x; // 2.50  
   ...      
 
   const float y_cutoff = 0.2;  
   const float y_offset = pow(texcoords.y - y_cutoff, uvSquash);  

Here is the same, but in assembly:
  21: add r1.z, v0.y, l(-0.200000)  
  22: log r1.z, r1.z  
  23: mul r1.z, r1.z, cb4[4].x  
  24: exp r1.z, r1.z  
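
The log/mul/exp triple is how the compiler expands pow(x, y): both instructions work in base 2, so pow(x, y) = exp2(y * log2(x)). Here is a small Python sketch of that identity; the function name is made up for illustration and the texcoords.y value is an arbitrary sample. Note that this form gives NaN for a negative base, which is the case whenever texcoords.y is below y_cutoff.

```python
import math

def pow_via_log_exp(x, y):
    """Mimic the log/mul/exp expansion of pow(x, y) with base-2 log and exp."""
    if x < 0.0:
        return math.nan  # log of a negative number is NaN on the GPU
    log2_x = -math.inf if x == 0.0 else math.log2(x)
    return 2.0 ** (y * log2_x)

uv_squash = 2.5  # cb4_v4.x as captured in this frame
y_cutoff = 0.2
texcoord_y = 0.7  # arbitrary sample value

y_offset = pow_via_log_exp(texcoord_y - y_cutoff, uv_squash)
```

With an exponent above 1 the result grows slowly just past the cutoff and much faster further on, so the texcoords get stretched more and more along that axis.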


Then, the shader obtains texcoords for the first fire layer and samples the fire texture:
   const float elapsedTimeSeconds = cb0_v0.x;  
   const float uvScaleGlobal1 = cb4_v2.x; // 1.00  
   const float uvScale1 = cb4_v3.x;    // 0.15  
   ...  

   // Sample fire1 - the first fire layer  
   float fire1; // r1.w  
   {
     float2 fire1Uv;  
     fire1Uv.x = texcoords.x;  
     fire1Uv.y = uvScale1 * elapsedTimeSeconds + y_offset;  
        
     const float scaleGlobal = floor(uvScaleGlobal1); // 1.0
     fire1Uv *= scaleGlobal;  
       
     fire1 = texFire.Sample(samplerLinearWrap, fire1Uv).x;  
   }  
   

The corresponding assembly snippet is:
  25: round_ni r1.w, cb4[2].x  
  26: mad r2.y, cb4[3].x, cb0[0].x, r1.z  
  27: mov r2.x, v0.x  
  28: mul r2.xy, r1.wwww, r2.xyxx  
  29: sample_indexable(texture2d)(float,float,float,float) r1.w, r2.xyxx, t0.yzwx, s0  


Here's what the first layer looks like for elapsedTimeSeconds = 50.0:



And to show what y_cutoff actually does, here is the same scene but y_cutoff = 0.5:



This way we have obtained the first layer. Now, the shader obtains the second one:
   const float uvScale2 = cb4_v6.x;       // 0.06  
   const float uvScaleGlobal2 = cb4_v7.x; // 1.00  
   ...  
   
   // Sample fire2 - the second fire layer  
   float fire2; // r1.z  
   {            
     float2 fire2Uv;  
     fire2Uv.x = texcoords.x - uvScale2 * elapsedTimeSeconds;  
     fire2Uv.y = uvScale2 * elapsedTimeSeconds + y_offset;  
     
     const float fire2_scale = floor(uvScaleGlobal2);  
     fire2Uv *= fire2_scale;  
     
     fire2 = texFire.Sample(samplerLinearWrap, fire2Uv).x;  
   }  

And the assembly snippet responsible for it:
  144: mad r2.x, -cb0[0].x, cb4[6].x, v0.x  
  145: mad r2.y, cb0[0].x, cb4[6].x, r1.z  
  146: round_ni r1.z, cb4[7].x  
  147: mul r2.xy, r1.zzzz, r2.xyxx  
  148: sample_indexable(texture2d)(float,float,float,float) r1.z, r2.xyxx, t0.yzxw, s0  
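
Both sets of UVs are easy to reproduce on the CPU. The sketch below is plain Python with made-up function names; the default constants are the values captured from this frame, and `wrap` only approximates what a wrap/repeat address mode does to a coordinate.

```python
import math

def wrap(t):
    """Fractional part, roughly how a wrap/repeat sampler sees a coordinate."""
    return t - math.floor(t)

def fire_uvs(u, y_offset, time_s,
             uv_scale1=0.15, uv_scale_global1=1.0,
             uv_scale2=0.06, uv_scale_global2=1.0):
    """Return (fire1_uv, fire2_uv) before wrapping."""
    s1 = math.floor(uv_scale_global1)  # round_ni rounds towards -infinity
    s2 = math.floor(uv_scale_global2)
    fire1 = (u * s1, (uv_scale1 * time_s + y_offset) * s1)
    fire2 = ((u - uv_scale2 * time_s) * s2,
             (uv_scale2 * time_s + y_offset) * s2)
    return fire1, fire2

uv1, uv2 = fire_uvs(u=0.25, y_offset=0.1, time_s=50.0)
wrapped_uv2 = (wrap(uv2[0]), wrap(uv2[1]))
```

The first layer only scrolls along Y, while the second one scrolls along both axes at a lower speed, so the two samples never stay aligned for long.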

So, as you can see, the only difference is the UVs: now X is animated as well.

The second layer looks like so:


Once we have the two layers of inner fire, it's time to modulate them. This is a bit more complicated than a simple multiplication though, as the inner mask is involved:
   const float innerMask = Input.Color.y;  
   const float portalInnerColorSqueeze = cb4_v8.x; // 3.00  
   const float portalInnerColorBoost = cb4_v9.x; // 188.00  
   ...  
        
   // Calculate inner fire influence  
   float inner_influence;  // r1.z
   {  
     // innerMask and "-1.0" are used here to control where the inner part of a portal is.  
     inner_influence = fire1 * fire2 + innerMask;  
     inner_influence = saturate(inner_influence - 1.0);  
       
     // Exponentiation to hide less luminous elements of inner portal  
     inner_influence = pow(inner_influence, portalInnerColorSqueeze);  
       
     // Boost the intensity  
     inner_influence *= portalInnerColorBoost;  
   }  

And corresponding assembly:
  149: mad r1.z, r1.w, r1.z, v1.y  
  150: add_sat r1.z, r1.z, l(-1.000000)  
  151: log r1.z, r1.z  
  152: mul r1.z, r1.z, cb4[8].x  
  153: exp r1.z, r1.z  
  154: mul r1.z, r1.z, cb4[9].x  
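
To get a feel for the numbers, here is the same formula as a Python sketch. The function names are mine and the default constants are the ones captured from this frame; the input values are arbitrary samples.

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)

def inner_influence(fire1, fire2, inner_mask, squeeze=3.0, boost=188.0):
    """fire1 * fire2 survives only where inner_mask is close to 1."""
    m = saturate(fire1 * fire2 + inner_mask - 1.0)
    return (m ** squeeze) * boost

centre = inner_influence(0.8, 0.5, inner_mask=1.0)  # 0.4^3 * 188
edge = inner_influence(0.8, 0.5, inner_mask=0.5)    # clamped to 0
```

Because the product of two values in [0, 1] never exceeds 1, subtracting 1.0 removes the fire completely wherever the inner mask is too low, and the cube pushes the dim parts down further before the boost is applied.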

Once we have inner_influence, which is nothing more than just a mask for inner fire, all we have to do is to multiply the mask with the inner fire color:

   // Calculate portal color  
   const float3 colorPortalInner = cb4_v5.rgb; // (1.00, 0.60, 0.21961)  
   ...  
   
   const float3 portal_inner_final = pow(colorPortalInner, 2.2) * inner_influence;  

The assembly:
  155: log r2.xyz, cb4[5].xyzx  
  156: mul r2.xyz, r2.xyzx, l(2.200000, 2.200000, 2.200000, 0.000000)  
  157: exp r2.xyz, r2.xyzx  
  ...  
  170: mad r2.xyz, r2.xyzx, r1.zzzz, r3.xyzx  


Here is a video which shows the particular layers of the inner fire in action. The order: the first layer, the second layer, the inner influence and the final inner color:



3.2. Glow

Once we have the inner fire, let's take a look at the second layer: glow. Here is a video which shows the inner fire only, then the glow only and then their sum - the final fire effect:



Here's how the shader calculates the glow. Similar to the inner fire, at first a mask is generated and then multiplied with glow color from the constant buffer.
   const float portalOuterGlowAttenuation = cb4_v10.x; // 0.30  
   const float portalOuterColorBoost = cb4_v11.x; // 1.50
   const float3 colorPortalOuterGlow = cb4_v12.rgb; // (1.00, 0.61961, 0.30196)  
   ...  
  
   // Calculate outer portal glow  
   float outer_glow_influence;  
   {    
     float outer_mask = (1.0 - backMask) * innerMask;  
       
     const float perturbParam = fire1*fire1;  
     float outer_mask_perturb = lerp( 1.0 - portalOuterGlowAttenuation, 1.0, perturbParam );  
       
     outer_mask *= outer_mask_perturb;  
     outer_glow_influence = outer_mask * portalOuterColorBoost;  
   }  
     
   // the final glow color  
   const float3 portal_outer_final = pow(colorPortalOuterGlow, 2.2) * outer_glow_influence; 
 
   // and the portal color, the sum of fire and glow
   float3 portal_final = portal_inner_final + portal_outer_final;


Here's how the outer_mask looks:

 (1.0 - backMask) * innerMask


The glow is not a constant color. To make it more interesting, it uses the animated first fire layer (squared), so wobbles moving towards the centre can be noticed:



And the assembly responsible for the glow:
  158: add r2.w, -v1.z, l(1.000000)  
  159: mul r2.w, r2.w, v1.y  
  160: mul r1.w, r1.w, r1.w  
  161: add r3.x, l(1.000000), -cb4[10].x  
  162: add r3.y, -r3.x, l(1.000000)  
  163: mad r1.w, r1.w, r3.y, r3.x  
  164: mul r1.w, r1.w, r2.w  
  165: mul r1.w, r1.w, cb4[11].x  
  166: log r3.xyz, cb4[12].xyzx  
  167: mul r3.xyz, r3.xyzx, l(2.200000, 2.200000, 2.200000, 0.000000)  
  168: exp r3.xyz, r3.xyzx  
  169: mul r3.xyz, r1.wwww, r3.xyzx  
  170: mad r2.xyz, r2.xyzx, r1.zzzz, r3.xyzx  
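
The glow mask can be checked on the CPU as well. This is a Python sketch with hypothetical function names; the defaults are the constants captured from this frame and the mask values are arbitrary samples.

```python
def lerp(a, b, t):
    """Linear interpolation, like the HLSL intrinsic."""
    return a + (b - a) * t

def outer_glow_influence(fire1, inner_mask, back_mask,
                         attenuation=0.30, boost=1.50):
    outer_mask = (1.0 - back_mask) * inner_mask
    # fire1 squared moves the factor between (1 - attenuation) and 1
    perturb = lerp(1.0 - attenuation, 1.0, fire1 * fire1)
    return outer_mask * perturb * boost

dim = outer_glow_influence(fire1=0.0, inner_mask=1.0, back_mask=0.0)
bright = outer_glow_influence(fire1=1.0, inner_mask=1.0, back_mask=0.0)
```

With the captured constants the animated fire can only pull the glow down to 70% of its full strength, which matches the subtle wobble rather than a strong flicker.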



3.3. Heat haze

When I started analyzing how the portal shader actually works, I was wondering why exactly it needs the scene color without the portal as one of input textures. My main point was "hey, we are using blending here, so it's enough to return a pixel with zero alpha to keep the background color".

The shader has a subtle yet nice effect of heat haze - heat and energy are coming from it so the background is distorted.

The idea is to offset the pixel texcoords and sample the background color texture with the new coordinates - an operation which is impossible with simple blending.

Here is a video which demonstrates how this works. The order: the full effect first, then the heat haze as in the shader; at the end I'm multiplying the offset by 10 to exaggerate the effect.


Let's see how the offset is actually calculated.
   const float ViewSpaceDepth = Input.ViewSpaceDepth;  
   const float3 Tangent = Input.Tangent;  
   const float backgroundDistortionStrength = cb4_v1.x; // 0.40  

   // Fades smoothly from the outer edges to the back of a portal
   const float heatHazeMask = Input.Color.x;
   ...  
     
   // The heat haze effect is view dependent thanks to tangent vectors in view space.  
   float2 heatHazeOffset = mul( normalize(Tangent), (float3x4)g_mtxView);  
   heatHazeOffset *= float2(-1, 1);  
     
   // Fade the effect as camera is further from a portal  
   const float heatHazeDistanceFade = backgroundDistortionStrength / ViewSpaceDepth;  
   heatHazeOffset *= heatHazeDistanceFade;  
        
   heatHazeOffset *= heatHazeMask;  
   
   // this is what animates the heat haze effect  
   heatHazeOffset *= pow(fire1, 0.2);  
        
   // Actually I don't know what this is :)  
   // It was 1.0 usually so I won't bother discussing this.  
   heatHazeOffset *= vsDepth2;  


The relevant assembly is a bit scattered throughout the code, here it is:
  11: dp3 r1.x, v3.xyzx, v3.xyzx  
  12: rsq r1.x, r1.x  
  13: mul r1.xyz, r1.xxxx, v3.xyzx  
  14: mul r1.yw, r1.yyyy, cb12[2].xxxy  
  15: mad r1.xy, cb12[1].xyxx, r1.xxxx, r1.ywyy  
  16: mad r1.xy, cb12[3].xyxx, r1.zzzz, r1.xyxx  
  17: mul r1.xy, r1.xyxx, l(-1.000000, 1.000000, 0.000000, 0.000000)  
  18: div r1.z, cb4[1].x, v0.z  
  19: mul r1.xy, r1.zzzz, r1.xyxx  
  20: mul r1.xy, r1.xyxx, v1.xxxx  
  ...  
  33: mul r1.xy, r1.xyxx, r2.xxxx  
  34: mul r1.xy, r0.zzzz, r1.xyxx  


Once we have the offset calculated, let's use it!
   const float2 backgroundSceneMaxUv = cb0_v2.zw; // (1.0, 1.0)  
   const float2 invViewportSize = cb0_v1.zw; // (1.0 / 1920.0, 1.0 / 1080.0 )
        
   // Obtain background scene color - we need to obtain it from texture  
   // for distortion effect  
   float3 sceneColor;  
   {  
     const float2 sceneUv_0 = pixelUv + backgroundSceneMaxUv*heatHazeOffset;  
     const float2 sceneUv_1 = backgroundSceneMaxUv - 0.5*invViewportSize;  
             
     const float2 sceneUv = min(sceneUv_0, sceneUv_1);  
       
     sceneColor = texScene.SampleLevel(sampler6, sceneUv, 0).rgb;  
   }  


  175: mad r0.xy, cb0[2].zwzz, r1.xyxx, r0.xyxx  
  176: mad r1.xy, -cb0[1].zwzz, l(0.500000, 0.500000, 0.000000, 0.000000), cb0[2].zwzz  
  177: min r0.xy, r0.xyxx, r1.xyxx  
  178: sample_l(texture2d)(float,float,float,float) r1.xyz, r0.xyxx, t6.xyzw, s6, l(0)  
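
The clamp in this snippet is worth a quick numeric check. Below is a Python sketch of the UV calculation with a hypothetical function name; the defaults are the values captured for a 1920x1080 frame.

```python
def scene_uv(pixel_uv, haze_offset,
             max_uv=(1.0, 1.0),
             inv_viewport=(1.0 / 1920.0, 1.0 / 1080.0)):
    """Offset the pixel UV and clamp it to the centre of the last texel."""
    out = []
    for p, o, m, inv in zip(pixel_uv, haze_offset, max_uv, inv_viewport):
        shifted = p + m * o
        limit = m - 0.5 * inv  # centre of the last valid texel
        out.append(min(shifted, limit))
    return tuple(out)

inside = scene_uv((0.5, 0.5), (0.01, -0.02))
clamped = scene_uv((0.999, 0.5), (0.01, 0.0))
```

Only the upper bound is enforced, exactly as in the assembly (a single min and no max), so the offset can never read past the right or bottom edge of the valid scene area.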

So, in the end we have sceneColor.



3.4. "Destination" color

By "destination" color I refer to the central part of the portal:



Unfortunately, this is all black. And the reason for that is fog.

I have already explored the fog solution, more or less, in part 15 of the series. In the portal shader, the fog calculations are in lines 35-135 of the source assembly.

HLSL:
 struct FogResult  
 {  
   float4 paramsFog;  
   float4 paramsAerial;  
 };  
   
 ...  
   
 FogResult fog;  
 {  
   const float3 CameraPosition = cb12_v0.xyz;  
   const float fogStart = cb12_v22.z; // near plane  
     
   fog = CalculateFog( WSPosition, CameraPosition, fogStart, false );   
 }  
   
 ...  
   
 const float3 destination_color = fog.paramsFog.a * fog.paramsFog.rgb;  

So this is what brings us the final scene:

The thing is, in this frame the camera is so close to the portal that the estimated destination_color is equal to zero, so the black center of the portal is actually fog! (Or lack of fog, technically.)

Since we are allowed to inject shaders into the game via RenderDoc, let's try to manually offset the camera:
  const float3 CameraPosition = cb12_v0.xyz + float3(100, 100, 0);  

And here's the result:

Ha!

So, while it makes very little sense to use fog calculations in this particular scenario, in theory there is nothing that stops us from using, for instance, a landscape from another world as the destination_color (maybe an extra pair of texcoords would be needed but still, this is perfectly doable).

Using fog could be helpful in the case of a huge portal which the player can see from a great distance.


3.5. Mixing (heat hazed) scene color with destination

I was wondering where to put this section - under "destination color" or maybe under "putting all this together" - but I decided to make a new subsection instead :)

At this point we have sceneColor described in 3.3 which already contains heat haze effect and we also have destination_color from 3.4. 

They are interpolated with lerp:
  178: sample_l(texture2d)(float,float,float,float) r1.xyz, r0.xyxx, t6.xyzw, s6, l(0)  
  179: mad r3.xyz, r4.wwww, r4.xyzx, -r1.xyzx  
  180: mad r0.xyw, r0.wwww, r3.xyxz, r1.xyxz  
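
The two mad instructions are a lerp written out by hand: lerp(a, b, t) = a + t * (b - a). A Python sketch of lines 179-180, with made-up function names and arbitrary sample colors:

```python
def mad(a, b, c):
    """Multiply-add, like the assembly instruction: a * b + c."""
    return a * b + c

def lerp_via_mad(scene, dest_rgb, dest_a, mask):
    """lerp(scene, dest_a * dest_rgb, mask), channel by channel."""
    out = []
    for s, d in zip(scene, dest_rgb):
        diff = mad(dest_a, d, -s)       # line 179: destination - scene
        out.append(mad(mask, diff, s))  # line 180: scene + mask * diff
    return tuple(out)

scene = (0.2, 0.4, 0.6)
mixed = lerp_via_mad(scene, (1.0, 0.5, 0.25), 0.5, 0.25)
```

Note that line 179 also folds in the multiplication of the fog color by its alpha (r4.w * r4.xyz), so destination_color never needs a register of its own.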

What is the value that interpolates them (r0.w)?
This is where the noise/smoke texture is actually used.

It's used to produce, as I called it, "portal destination mask".



And a video (first the full effect, then the destination mask, then the interpolated heat hazed scene color with destination color):


Take a look at this HLSL snippet:
   // Determines the back part of a portal  
   const float backMask = Input.Color.z;  
   
   const float ViewSpaceDepth = Input.TexcoordAndViewSpaceDepth.z;  
   const float viewSpaceDepthScale = cb4_v0.x; // 0.50    
   ...  
   
   // Load depth from texture  
   float hardwareDepth = texDepth.SampleLevel(sampler15, pixelUv, 0).x;  
   float linearDepth = getDepth(hardwareDepth);  
     
   // cb4_v0.x = 0.5  
   float vsDepthScale = saturate( (linearDepth - ViewSpaceDepth) * viewSpaceDepthScale );  
     
   float vsDepth1 = 2*vsDepthScale;
   
   ....  
   
   // Calculate 'portal destination' mask - maybe we would like see a glimpse of where a portal leads  
   // like landscape from another planet - the shader allows for it.  
   float portal_destination_mask;  
   {    
     const float region_mask = dot(backMask.xx, vsDepth1.xx);  
     
     const float2 _UVScale = float2(4.0, 1.0);  
     const float2 _TimeScale = float2(0.0, 0.2);  
     const float2 _UV = texcoords * _UVScale + elapsedTimeSeconds * _TimeScale;  
       
     portal_destination_mask = texNoise.Sample(sampler0, _UV).x;  
     portal_destination_mask = saturate(portal_destination_mask + region_mask - 1.0);  
     portal_destination_mask *= portal_destination_mask; // line 143, r0.w  
   }  

The portal destination mask is mostly obtained the same way as fire - using animated texture coordinates. It uses "region_mask" variable to adjust where the effect is placed.

To obtain region_mask, another variable called vsDepth1 is used. I will describe it a bit in the next section. It has only a marginal effect on the destination mask, though.

The corresponding assembly for the destination mask is:
  137: dp2 r0.w, v1.zzzz, r0.zzzz  
  138: mul r2.xy, cb0[0].xxxx, l(0.000000, 0.200000, 0.000000, 0.000000)  
  139: mad r2.xy, v0.xyxx, l(4.000000, 1.000000, 0.000000, 0.000000), r2.xyxx  
  140: sample_indexable(texture2d)(float,float,float,float) r2.x, r2.xyxx, t1.xyzw, s0  
  141: add r0.w, r0.w, r2.x  
  142: add_sat r0.w, r0.w, l(-1.000000)  
  143: mul r0.w, r0.w, r0.w  
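
The mask arithmetic can be replayed in Python. The function names are mine, the UV and time scales are the literals from the assembly, and the input values are arbitrary samples.

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)

def region_mask(back_mask, vs_depth1):
    # dp2 of (b, b) with (d, d) is simply 2 * b * d
    return 2.0 * back_mask * vs_depth1

def noise_uv(texcoords, time_s):
    """UVs for the noise/smoke texture: only Y scrolls over time."""
    return (texcoords[0] * 4.0 + time_s * 0.0,
            texcoords[1] * 1.0 + time_s * 0.2)

def destination_mask(noise, back_mask, vs_depth1):
    m = saturate(noise + region_mask(back_mask, vs_depth1) - 1.0)
    return m * m

back = destination_mask(noise=0.5, back_mask=0.5, vs_depth1=1.0)
front = destination_mask(noise=0.5, back_mask=0.0, vs_depth1=1.0)
```

Where the back mask is zero the noise alone can never exceed 1.0, so the destination color only ever shows up towards the back of the portal.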



3.6. Putting all this together

Phew, we are almost done.

Let's obtain the portal color first:
 // Calculate portal color  
 float3 portal_final;  
 {  
   const float3 portal_inner_color = pow(colorPortalInner, 2.2) * inner_influence;  
   const float3 portal_outer_color = pow(colorPortalOuterGlow, 2.2) * outer_glow_influence;  
     
   portal_final = portal_inner_color + portal_outer_color;  
   portal_final *= vsDepth1; // fade the effect to avoid harsh artifacts due to depth test  
   portal_final *= portalFinalColorFilter; // this was (1,1,1) - so not relevant  
 }  
   

The only aspect I'd like to discuss here is vsDepth1.

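Here is what this mask looks like:

In the previous subsection I showed how it is obtained. It is basically a scaled difference between the linearized scene depth and the portal's own view-space depth, used to fade the portal's color where it gets close to the geometry behind it, so there is no harsh cutoff due to the depth test.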

Consider the final scene again, with and without the multiplication with vsDepth1.



Once we have portal_final, obtaining the final color is easy:
   const float finalPortalAmount = cb2_v0.x; // 0.99443  
   const float3 finalColorFilter = cb2_v2.rgb; // (1.0, 1.0, 1.0)  
   const float finalOpacityFilter = cb2_v2.a; // 1.0  
   ...  
   
   // Alpha component for blending  
   float opacity = saturate( lerp(cb2_v0.x, 1, cb4_v13.x) );  
   
   // Calculate the final color  
   float3 finalColor;  
   {  
     // Mix the scene color (with heat haze effect) with the 'destination color'.  
     // In this particular example fog is used as destination (which is black where camera is nearby)  
     // but in theory there is nothing which stops us from putting here a landscape from another world.  
     const float3 destination_color = fog.paramsFog.a * fog.paramsFog.rgb;      
     finalColor = lerp( sceneColor, destination_color, portal_destination_mask );  
       
     // Add the portal color  
     finalColor += portal_final * finalPortalAmount;  
       
     // Final filter  
     finalColor *= finalColorFilter;  
   }  
        
   opacity *= finalOpacityFilter;  
     
   return float4(finalColor * opacity, opacity);  
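
The whole composition fits in a few lines of Python. This is only a sketch of the snippet above: the function and parameter names are hypothetical, `opacity_param` stands for cb4_v13.x (whose captured value is not listed in this post), and the colors are arbitrary samples.

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)

def lerp(a, b, t):
    """Linear interpolation, like the HLSL intrinsic."""
    return a + (b - a) * t

def final_pixel(scene, destination, dest_mask, portal, portal_amount,
                opacity_param, color_filter=(1.0, 1.0, 1.0),
                opacity_filter=1.0):
    """Return (r, g, b, a) with the color already multiplied by opacity."""
    opacity = saturate(lerp(portal_amount, 1.0, opacity_param)) * opacity_filter
    rgb = []
    for s, d, p, f in zip(scene, destination, portal, color_filter):
        c = lerp(s, d, dest_mask)  # heat-hazed scene vs. destination
        c += p * portal_amount     # add fire and glow
        rgb.append(c * f * opacity)
    return (rgb[0], rgb[1], rgb[2], opacity)

pixel = final_pixel(scene=(0.2, 0.2, 0.2), destination=(0.0, 0.0, 0.0),
                    dest_mask=0.5, portal=(1.0, 0.5, 0.0),
                    portal_amount=1.0, opacity_param=0.0)
```

The same portal_amount value feeds both the additive portal color and the opacity, so lowering it fades the fire and the blended pixel together.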

So this is it. There is an extra finalPortalAmount variable which decides how much of the fire you actually see. I haven't tested it in such detail, but I imagine it's used when the portal appears and disappears - for a brief amount of time you don't see fire, but the whole rest instead - glow, the destination color etc.



4. Summary

The final HLSL shader is here if you are interested. I had to reorder a few lines in order to get the same assembly as the original one, but it doesn't interrupt the general flow. The shader is RenderDoc ready, all cbuffers are there etc, so you can inject it and experiment on your own.

Hope you enjoyed it - thanks for reading!
