<h1>
Pseudo Lens Flare</h1>
<i>John Chapman, 2013-02-22</i><br />
Lens flare is a photographic artefact, caused by various
interactions between a lens and the light passing through it. Although it is an
artefact, there are a number of motives for simulating lens flare for use in
computer graphics:<br />
<ul>
<li>it increases
the perceived brightness and the apparent dynamic range of an image</li>
<li>lens
flare is ubiquitous in photography, hence its absence from computer generated
images can be conspicuous</li>
<li>it can
play an important stylistic or dramatic role, or work as part of the gameplay
mechanics for video games (think of glare blinding the player)</li>
</ul>
For real time lens flares, sprite-based techniques have
traditionally been the most common approach. Although sprites produce easily
controllable and largely realistic results, they have to be placed explicitly
and require occlusion data to be displayed correctly.
Here I'll describe a simple and relatively cheap screen
space process which produces a "pseudo" lens flare from an input
colour buffer. It is not physically based, so it strays somewhat from photorealism,
but can be used as an addition to (or enhancement of) traditional sprite-based
effects.
<br />
<h2>
Algorithm</h2>
The approach consists of 4 stages:
<br />
<ol>
<li>Downsample/threshold.</li>
<li>Generate
lens flare features.</li>
<li>Blur.</li>
<li>Upscale/blend
with original image.</li>
</ol>
<h3>
1. Downsample/Threshold</h3>
Downsampling is key to reducing the cost of
subsequent stages. Additionally, we want to select a subset of the brightest
pixels in the source image to participate in the lens flare. Using a scale/bias
provides a flexible way to achieve this:
<br />
<div class="listing">
<pre class="sh_glsl"> uniform sampler2D uInputTex;
uniform vec4 uScale;
uniform vec4 uBias;
noperspective in vec2 vTexcoord;
out vec4 fResult;
void main() {
fResult = max(vec4(0.0), texture(uInputTex, vTexcoord) + uBias) * uScale;
}</pre>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRmCNvOiUua0sIjuWUVdQuVw7ypPB3y8jguFQlmpf6gkJT6Q8rYuzQVI1TS4FCOox9EE2IMoyPajfcXPZGJIwzIe64ezXWVLdsKl9DwKvtbCMJteTWCoQmf7rPthU4nYzwCyKuB1Ct5Os/s1600/fig1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRmCNvOiUua0sIjuWUVdQuVw7ypPB3y8jguFQlmpf6gkJT6Q8rYuzQVI1TS4FCOox9EE2IMoyPajfcXPZGJIwzIe64ezXWVLdsKl9DwKvtbCMJteTWCoQmf7rPthU4nYzwCyKuB1Ct5Os/s1600/fig1.jpg" /></a></div>
Adjusting the scale/bias is the main way to
tweak the effect; the best settings will be dependent on the dynamic range of
the input as well as how subtle you want the result to look. Because of the
approximate nature of this technique, subtle is probably better.<br />
<h3>
2. Feature Generation</h3>
Lens flare features tend to pivot around the
image centre. To mimic this, we can just flip the result of the previous stage
horizontally/vertically. This is easily done at the feature generation stage by
flipping the texture coordinates:
<br />
<div class="listing">
<pre class="sh_glsl"> vec2 texcoord = -vTexcoord + vec2(1.0);</pre>
</div>
Doing this isn't strictly necessary; the rest of
the feature generation works perfectly well with or without it. However, the
result of flipping the texture coordinates helps to visually separate the lens
flare effect from the source image.<br />
<br />
<b>GHOSTS</b><br />
<br />
"Ghosts" are the repetitious blobs which mirror
bright spots in the input, pivoting around the image centre. The approach I've
taken to generate these is to get a
vector from the current pixel to the centre of the screen, then take a number
of samples along this vector.
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGwdmQ2Hf3VcqfDPEnDnbSjzkvzPek64stNWp0xV5kfUC88Nl9PRhVwHT0JA5G1sYlb2yuMbj-NtbdRFt-wV97qLEBK3G8IPe5TNyhf-3P6EalQnb8DzvxsE_tk2W7V5EVMxUw9FujihM/s1600/fig2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGwdmQ2Hf3VcqfDPEnDnbSjzkvzPek64stNWp0xV5kfUC88Nl9PRhVwHT0JA5G1sYlb2yuMbj-NtbdRFt-wV97qLEBK3G8IPe5TNyhf-3P6EalQnb8DzvxsE_tk2W7V5EVMxUw9FujihM/s1600/fig2.jpg" /></a></div>
<div class="listing">
<pre class="sh_glsl"> uniform sampler2D uInputTex;
uniform int uGhosts; // number of ghost samples
uniform float uGhostDispersal; // dispersion factor
noperspective in vec2 vTexcoord;
out vec4 fResult;
void main() {
vec2 texcoord = -vTexcoord + vec2(1.0);
vec2 texelSize = 1.0 / vec2(textureSize(uInputTex, 0));
// ghost vector to image centre:
vec2 ghostVec = (vec2(0.5) - texcoord) * uGhostDispersal;
// sample ghosts:
vec4 result = vec4(0.0);
for (int i = 0; i < uGhosts; ++i) {
vec2 offset = fract(texcoord + ghostVec * float(i));
result += texture(uInputTex, offset);
}
fResult = result;
}</pre>
</div>
Note that I use <code>fract()</code> to ensure that the texture
coordinates wrap around; you could equally use <code>GL_REPEAT</code> as the texture's/sampler's wrap
mode.<br />
<br />
Here's the result:
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEeH3jOPUBL2hLkY2Py6YarvqxkG7HMXu3jfqoDc_UMAv7fzN7K4hfYWBPwb2vO9-UifgQ7zTy_eOwbHI40y9H8vTAI9AVfGNivDBwdf2qxIMXUON0rQ20HQjpBpxwHaERUZqiDmJhIWY/s1600/fig3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEeH3jOPUBL2hLkY2Py6YarvqxkG7HMXu3jfqoDc_UMAv7fzN7K4hfYWBPwb2vO9-UifgQ7zTy_eOwbHI40y9H8vTAI9AVfGNivDBwdf2qxIMXUON0rQ20HQjpBpxwHaERUZqiDmJhIWY/s1600/fig3.jpg" /></a></div>
We can improve this by allowing only bright
spots from the centre of the source image to generate ghosts. We do this by
weighting samples by a falloff from the image centre:<br />
<div class="listing">
<pre class="sh_glsl"> vec4 result = vec4(0.0);
for (int i = 0; i < uGhosts; ++i) {
vec2 offset = fract(texcoord + ghostVec * float(i));
<em>float weight = length(vec2(0.5) - offset) / length(vec2(0.5));</em>
<em>weight = pow(1.0 - weight, 10.0);</em>
result += texture(uInputTex, offset) * weight;
}
</pre>
</div>
The weight function is about as simple as it gets: a falloff based on the distance from the image centre, sharpened by raising it to a power. The reason we perform the weighting inside the sampling loop is so that bright spots in the centre of the input image can 'cast' ghosts to the edges, but bright spots at the edges can't cast ghosts to the centre.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEIBExRWlgBjnquHlSQ_yEr7h_t1a1N3khdS1TFaC-KoETWo9KF-GNocw0RlhZaf7RVF6RLzH-cvkT2mq_kym08ln8OXmbTDbYzUuAnO0B76o2cJrDugQ6Cd1GzNzzeV2PAx6tc3mRDn0/s1600/fig4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEIBExRWlgBjnquHlSQ_yEr7h_t1a1N3khdS1TFaC-KoETWo9KF-GNocw0RlhZaf7RVF6RLzH-cvkT2mq_kym08ln8OXmbTDbYzUuAnO0B76o2cJrDugQ6Cd1GzNzzeV2PAx6tc3mRDn0/s1600/fig4.jpg" /></a></div>
A final improvement can be made by modulating the ghost colour radially according to a 1D texture:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEfmHiu-QZfofA24K8GPXnZjVwyASBSzHXc2a_PJtuGpHbLSU8RUs1X8PpTvEoO0dScSUY5Rc6NC3_KFEeziVH2BENeUnVvNTOzYThglPidDFqsPifR_nR7DHmcwXgKzvnhH4jlB0Z_aU/s1600/fig5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEfmHiu-QZfofA24K8GPXnZjVwyASBSzHXc2a_PJtuGpHbLSU8RUs1X8PpTvEoO0dScSUY5Rc6NC3_KFEeziVH2BENeUnVvNTOzYThglPidDFqsPifR_nR7DHmcwXgKzvnhH4jlB0Z_aU/s1600/fig5.jpg" /></a></div>
This is applied <i>after </i>the ghost sampling loop so as to affect the final ghost colour:<br />
<div class="listing">
<pre class="sh_glsl"> result *= texture(uLensColor, length(vec2(0.5) - texcoord) / length(vec2(0.5)));</pre>
</div>
<b><br /></b>
<b>HALOS</b><br />
<b><br /></b>
If we take a vector to the centre of the image, as for the ghost sampling, but fix the vector length, we get a different effect: the source image is radially warped:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdzoTXTTqptgfKQBZFo7zuOvKzkAewXTEngb7-o-fMBxcphBdkNQ0ycTq6XlVj1PjkOOftKFaSQo04CFhAZi4khAytqRJSAz9zZ5w4YQSjPpgLU9356YYc3BC6WOKNXzwmhUVdG9MGgVg/s1600/fig6.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdzoTXTTqptgfKQBZFo7zuOvKzkAewXTEngb7-o-fMBxcphBdkNQ0ycTq6XlVj1PjkOOftKFaSQo04CFhAZi4khAytqRJSAz9zZ5w4YQSjPpgLU9356YYc3BC6WOKNXzwmhUVdG9MGgVg/s1600/fig6.jpg" /></a></div>
We can use this to produce a "halo", weighting the sample to restrict the contribution of the warped image to a ring, the radius of which is controlled by <code>uHaloWidth</code>:<br />
<div class="listing">
<pre class="sh_glsl"> // sample halo:
vec2 haloVec = normalize(ghostVec) * uHaloWidth;
float weight = length(vec2(0.5) - fract(texcoord + haloVec)) / length(vec2(0.5));
weight = pow(1.0 - weight, 5.0);
result += texture(uInputTex, texcoord + haloVec) * weight;</pre>
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOA_ALzDt6rv3u-ZTakhdAGlIwhcie1FCugBFgpVHXlI3AOQUPDLXOI0qe3bImg8qOt9wiA82nUyamj7ndFnbWQMy0o6CatSKjgylmqsd2bfECB76_VI5wHWBFm1emSK-Q1oQK94bB8BM/s1600/fig7.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOA_ALzDt6rv3u-ZTakhdAGlIwhcie1FCugBFgpVHXlI3AOQUPDLXOI0qe3bImg8qOt9wiA82nUyamj7ndFnbWQMy0o6CatSKjgylmqsd2bfECB76_VI5wHWBFm1emSK-Q1oQK94bB8BM/s1600/fig7.jpg" /></a></div>
<b>CHROMATIC DISTORTION</b><br />
<b><br /></b>
Some lens flares exhibit chromatic distortion, caused by the varying refraction of different wavelengths of light. We can simulate this by creating a texture lookup function which fetches the red, green and blue channels separately at slightly different offsets along the sampling vector:<br />
<br />
<div class="listing">
<pre class="sh_glsl"> vec3 textureDistorted(
in sampler2D tex,
in vec2 texcoord,
in vec2 direction, // direction of distortion
in vec3 distortion // per-channel distortion factor
) {
return vec3(
texture(tex, texcoord + direction * distortion.r).r,
texture(tex, texcoord + direction * distortion.g).g,
texture(tex, texcoord + direction * distortion.b).b
);
}</pre>
</div>
This can be used as a direct replacement for the calls to <code>texture()</code> in the previous listings. I use the following for the <code>direction</code> and <code>distortion</code> parameters:<br />
<br />
<div class="listing">
<pre class="sh_glsl"> vec2 texelSize = 1.0 / vec2(textureSize(uInputTex, 0));
vec3 distortion = vec3(-texelSize.x * uDistortion, 0.0, texelSize.x * uDistortion);
vec3 direction = normalize(ghostVec);</pre>
</div>
Although this is simple, it does cost 3x as many texture fetches; they should all be cache-friendly, however, unless you set <code>uDistortion</code> to some huge value.<br />
<br />
That's it for feature generation. Here's the result:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGUOWjaXp_rZgpLIsXFAsnzRJ53C42BfRHbiDX8sLZi1RbUCLw3_Sm9cdxT-I_6BeIKyrknqeZ1SHukw8qJPLVh4vRRKsQ0RtbTKjmQopWOGc5EDrR2LjMli4D26Nv5dzeGNvCI-jAKWA/s1600/fig8.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGUOWjaXp_rZgpLIsXFAsnzRJ53C42BfRHbiDX8sLZi1RbUCLw3_Sm9cdxT-I_6BeIKyrknqeZ1SHukw8qJPLVh4vRRKsQ0RtbTKjmQopWOGc5EDrR2LjMli4D26Nv5dzeGNvCI-jAKWA/s1600/fig8.jpg" /></a></div>
<h3>
3. Blur</h3>
Without applying a blur, the lens flare features (in particular, the ghosts) tend to retain the appearance of the source image. By applying a blur to the lens flare features we attenuate high frequencies and in doing so reduce the coherence with the input image, which helps to sell the effect.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiX6tPtn_PLTz1f_yeSOB9ZfxwgMrdNPGxl_xLcGCoyGnhGGh9zVEmQw-uESslGGYJ8K__tw7Sb5O0gchsiBl4O3Ui0N7Q5uc6_SFAPiCSKAjt-_bbBhgTV0xj_fOvHeezYEhBUck-E1XU/s1600/fig9.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiX6tPtn_PLTz1f_yeSOB9ZfxwgMrdNPGxl_xLcGCoyGnhGGh9zVEmQw-uESslGGYJ8K__tw7Sb5O0gchsiBl4O3Ui0N7Q5uc6_SFAPiCSKAjt-_bbBhgTV0xj_fOvHeezYEhBUck-E1XU/s1600/fig9.jpg" /></a></div>
I'll not cover how to achieve the blur here; there are plenty of <a href="http://rastergrid.com/blog/2010/09/efficient-gaussian-blur-with-linear-sampling/" target="_blank">resources on the web</a>.<br />
<h3>
4. Upscale/Blend</h3>
So now we have our lens flare features, nicely blurred. How do we combine this with the original source image? There are a couple of important considerations to make regarding the overall rendering pipeline:<br />
<br />
<ul>
<li>Any post process motion blur or depth of field effect must be applied <i>prior </i>to combining the lens flare, so that the lens flare features don't participate in those effects. Technically the lens flare features would exhibit some motion blur, however it's incompatible with post process motion techniques. As a compromise, you could implement the lens flare using an accumulation buffer.</li>
<li>The lens flare should be applied before any tonemapping operation. This makes physical sense, as tonemapping simulates the reaction of the film/CMOS to the incoming light, of which the lens flare is a constituent part.</li>
</ul>
<div>
With this in mind, there are a couple of things we can do at this stage to improve the result:</div>
<div>
<br /></div>
<div>
<b>LENS DIRT</b></div>
<div>
<b><br /></b></div>
<div>
The first is to modulate the lens flare features by a full-resolution "dirt" texture (as used <a href="http://www.sliceofthe.net/wp-content/uploads/2011/10/bf3-2011-09-29-01-14-32-75.jpg" target="_blank">heavily</a> in <i>Battlefield 3</i>):</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglY6PScg5yVtK4XUeI5k0jokRUSxiRoXvZ2AZ36O9P7eCBdnq8i6Rahnj8hJKqTbuhISe92RgkEoKq1wj3tplw_HNsUujsSNiHBwIMMEpUejcIysfcgW3KbJv2XgxzbtMM2L66MwHelHc/s1600/fig10.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglY6PScg5yVtK4XUeI5k0jokRUSxiRoXvZ2AZ36O9P7eCBdnq8i6Rahnj8hJKqTbuhISe92RgkEoKq1wj3tplw_HNsUujsSNiHBwIMMEpUejcIysfcgW3KbJv2XgxzbtMM2L66MwHelHc/s1600/fig10.jpg" /></a></div>
<div class="listing">
<pre class="sh_glsl"> uniform sampler2D uInputTex; // source image
uniform sampler2D uLensFlareTex; // input from the blur stage
uniform sampler2D uLensDirtTex; // full resolution dirt texture
noperspective in vec2 vTexcoord;
out vec4 fResult;
void main() {
vec4 lensMod = texture(uLensDirtTex, vTexcoord);
vec4 lensFlare = texture(uLensFlareTex, vTexcoord) * lensMod;
 fResult = texture(uInputTex, vTexcoord) + lensFlare;
}</pre>
</div>
<div>
The key to this is the lens dirt texture itself. If the contrast is low, the shapes of the lens flare features tend to dominate the result. As the contrast increases, the lens flare features are subdued, giving a different aesthetic appearance, as well as hiding a few of the imperfections.</div>
<div>
<br /></div>
<div>
<b>DIFFRACTION STARBURST</b><br />
<br />
As a further enhancement, we can use a starburst texture in addition to the lens dirt:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipqz7yQwYB4fXQzrNtaAj6xJAYOT0XXp3KR3mxGAhyphenhyphenfxrSX_Y6sftyNivSKu9C5uLGFEgc_Gna7tRxeCWkXXcztmisMHjqNjQgPGbDyFWQNcPyOxq-BcDaltMHm0kAA4CIdaSvjFHXRlA/s1600/fig11.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipqz7yQwYB4fXQzrNtaAj6xJAYOT0XXp3KR3mxGAhyphenhyphenfxrSX_Y6sftyNivSKu9C5uLGFEgc_Gna7tRxeCWkXXcztmisMHjqNjQgPGbDyFWQNcPyOxq-BcDaltMHm0kAA4CIdaSvjFHXRlA/s1600/fig11.jpg" /></a></div>
As a static texture, the starburst doesn't look very good. We can, however, provide a transformation matrix to the shader which allows us to spin/warp it per frame and produce the dynamic effect we want:<br />
<br />
<div class="listing">
<pre class="sh_glsl"> uniform sampler2D uInputTex; // source image
uniform sampler2D uLensFlareTex; // input from the blur stage
uniform sampler2D uLensDirtTex; // full resolution dirt texture
<em>uniform sampler2D uLensStarTex; // diffraction starburst texture</em>
<em>uniform mat3 uLensStarMatrix; // transforms texcoords</em>
noperspective in vec2 vTexcoord;
out vec4 fResult;
void main() {
vec4 lensMod = texture(uLensDirtTex, vTexcoord);
<em>vec2 lensStarTexcoord = (uLensStarMatrix * vec3(vTexcoord, 1.0)).xy;</em>
<em>lensMod += texture(uLensStarTex, lensStarTexcoord);</em>
vec4 lensFlare = texture(uLensFlareTex, vTexcoord) * lensMod;
 fResult = texture(uInputTex, vTexcoord) + lensFlare;
}</pre>
</div>
The transformation matrix <code>uLensStarMatrix</code> is based on a value derived from the camera's orientation as follows:<br />
<div class="listing">
<pre class="sh_cpp"> vec3 camx = cam.getViewMatrix().col(0); // camera x (left) vector
 vec3 camz = cam.getViewMatrix().col(2); // camera z (forward) vector
float camrot = dot(camx, vec3(0,0,1)) + dot(camz, vec3(0,1,0));</pre>
</div>
There are other ways of obtaining the <code>camrot</code> value; it just needs to change continuously as the camera rotates. The matrix itself is constructed as follows:<br />
<br />
<div class="listing">
<pre class="sh_glsl"> mat3 scaleBias1 = mat3(
 2.0f, 0.0f, -1.0f,
 0.0f, 2.0f, -1.0f,
 0.0f, 0.0f, 1.0f
 );
 mat3 rotation = mat3(
 cos(camrot), -sin(camrot), 0.0f,
 sin(camrot), cos(camrot), 0.0f,
 0.0f, 0.0f, 1.0f
 );
 mat3 scaleBias2 = mat3(
 0.5f, 0.0f, 0.5f,
 0.0f, 0.5f, 0.5f,
 0.0f, 0.0f, 1.0f
 );
 mat3 uLensStarMatrix = scaleBias2 * rotation * scaleBias1;</pre>
</div>
The scale and bias matrices are required in order to shift the texture coordinate origin so that we can rotate the starburst around the image centre.<br />
<h3>
Conclusion</h3>
</div>
<div>
So, that's it! This method demonstrates how a relatively simple, image-based post process can produce a decent looking lens flare. It's not quite photorealistic, but when applied subtly can give some lovely results. I've provided a <a href="http://www.john-chapman.net/demos/lensflare.zip">demo implementation</a>.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyGqLSCAfAdT59iFB1yg-NAmwcPdqPBvduGGoU2uSvP9cjM3LxVF0z4YeOxps5t3goOebl13xb0W-WFh6UZb_cMFSGdLnEyE-pbKhyphenhyphenk_Glb5r5_tk1JMVoWNA7urZH_x8P3ZZTUv6BS4A/s1600/fig12.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyGqLSCAfAdT59iFB1yg-NAmwcPdqPBvduGGoU2uSvP9cjM3LxVF0z4YeOxps5t3goOebl13xb0W-WFh6UZb_cMFSGdLnEyE-pbKhyphenhyphenk_Glb5r5_tk1JMVoWNA7urZH_x8P3ZZTUv6BS4A/s1600/fig12.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='640' height='480' src='https://www.youtube.com/embed/AjSr0zLBnx8?feature=player_embedded' frameborder='0'></iframe></div>
<div>
<br /></div>
<h1>
Per-Object Motion Blur</h1>
<i>Originally posted on 24/09/2012</i><br />
<br />
A while back I published a <a href="http://john-chapman-graphics.blogspot.co.uk/2013/01/what-is-motion-blur-motion-pictures-are.html">tutorial</a> describing a screen space technique for approximating motion blur in realtime. The effect was simplistic; it took into account the movement of a camera through the scene, but not the movement of individual objects in the scene. Here I'm going to describe a technique which addresses both types of motion. But let's begin with a brief recap:<br />
<h2>
A Brief Recap</h2>
Motion pictures are made up of a series of still images displayed in quick succession. Each image is captured by briefly opening a shutter to expose a piece of film/electronic sensor. If an object in the scene (or the camera itself) moves during this exposure, the result is blurred along the direction of motion, hence <em>motion blur</em>.<br />
<br />
The previous tutorial dealt only with motion blur caused by camera movement, which is very simple and cheap to achieve, but ultimately less realistic than 'full' motion blur. <br />
<br />
For full motion blur, the approach I'll describe here goes like this: render the velocity at every pixel to a <em>velocity buffer</em>, then subsequently use this to apply a post process directional blur at each pixel to the rendered scene. This isn't the only approach, but it's one of the simplest to implement and has been used effectively in a number of games.<br />
<h2>
Velocity Buffer</h2>
In order to calculate the velocity of a point moving through space we need at least two pieces of information:<br />
<ul>
<li><em>where</em> is the point right now (<strong>a</strong>)?</li>
<li><em>where</em> was the point <em>t</em> seconds ago (<strong>b</strong>)?</li>
</ul>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkSkDZk4s887-6oeCW9MeqUtIf6BPJpzT7-AxkoqjPEIAIuyzT-QRRtCNbK73hXb9yAecqFwLOdUhhlGdpvHgREC4zOFlgegd3hvhvk5mFFypGQYn3nLcGSUxmXCgvfDbzz73kufndqdY/s1600/fig1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkSkDZk4s887-6oeCW9MeqUtIf6BPJpzT7-AxkoqjPEIAIuyzT-QRRtCNbK73hXb9yAecqFwLOdUhhlGdpvHgREC4zOFlgegd3hvhvk5mFFypGQYn3nLcGSUxmXCgvfDbzz73kufndqdY/s1600/fig1.jpg" /></a></div>
<br />
Technically the velocity is <em>(a - b) / t</em>; however, for our purposes we don't need to use <em>t</em>, at least not when writing to the velocity buffer.<br />
<br />
Since we'll be applying the blur as a post process in image space, we may as well calculate our velocities in image space. This means that our positions (<em>a</em> and <em>b</em>) should undergo the model-view-projection transformation, perspective divide and then a scale/bias. The result can be used to generate texture coordinates directly, as we'll see.<br />
<br />
To actually generate the velocity buffer we render the geometry, transforming every vertex by both the <em>current</em> model-view-projection matrix as well as the <em>previous</em> model-view-projection matrix. In the vertex shader we do the following:<br />
<br />
<div class="listing">
<pre class="sh_glsl"> uniform mat4 uModelViewProjectionMat;
uniform mat4 uPrevModelViewProjectionMat;
smooth out vec4 vPosition;
smooth out vec4 vPrevPosition;
void main(void) {
vPosition = uModelViewProjectionMat * gl_Vertex;
vPrevPosition = uPrevModelViewProjectionMat * gl_Vertex;
gl_Position = vPosition;
}
</pre>
</div>
And in the fragment shader:
<br />
<div class="listing">
<pre class="sh_glsl"> smooth in vec4 vPosition;
smooth in vec4 vPrevPosition;
out vec2 oVelocity;
void main(void) {
vec2 a = (vPosition.xy / vPosition.w) * 0.5 + 0.5;
vec2 b = (vPrevPosition.xy / vPrevPosition.w) * 0.5 + 0.5;
oVelocity = a - b;
}
</pre>
</div>
You may be wondering why we can't just calculate velocity directly in the vertex shader and just pick up an interpolated velocity in the fragment shader. The reason is that, because of the perspective divide, the velocity is nonlinear. This can be a problem if polygons are clipped; the resulting interpolated velocity is incorrect for any given pixel:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPb8rTn3ukcXkRvzO4-Q5ZaPD7G3J8aG794XpuZ442ZcU3uqmb-2sdq7v8aKeRFB9vYplcd87Qob1dNCBhlNWbudHY9BSNb7z7Q8f23Uu6BJkWk4FFZJDzqvA5CfBIDzRQBr8sQZ45wuQ/s1600/fig2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPb8rTn3ukcXkRvzO4-Q5ZaPD7G3J8aG794XpuZ442ZcU3uqmb-2sdq7v8aKeRFB9vYplcd87Qob1dNCBhlNWbudHY9BSNb7z7Q8f23Uu6BJkWk4FFZJDzqvA5CfBIDzRQBr8sQZ45wuQ/s1600/fig2.jpg" /></a></div>
<br />
For now, I'm assuming you've got a floating point texture handy to store the velocity result (e.g. <code>GL_RG16F</code>). I'll discuss velocity buffer formats and the associated precision implications later.<br />
<br />
So at this stage we have a per-pixel, image space velocity incorporating both camera and object motion.<br />
<h2>
Blur</h2>
Now we have a snapshot of the per-pixel motion in the scene, as well as the rendered image that we're going to blur. If you're rendering HDR, the blur should (ideally) be done prior to tone mapping. Here are the beginnings of the blur shader:<br />
<div class="listing">
<pre class="sh_glsl"> uniform sampler2D uTexInput; // texture we're blurring
uniform sampler2D uTexVelocity; // velocity buffer
uniform float uVelocityScale;
out vec4 oResult;
void main(void) {
vec2 texelSize = 1.0 / vec2(textureSize(uTexInput, 0));
vec2 screenTexCoords = gl_FragCoord.xy * texelSize;
 vec2 velocity = texture(uTexVelocity, screenTexCoords).rg;
velocity *= uVelocityScale;
// blur code will go here...
}</pre>
</div>
Pretty straightforward so far. Notice that I generate the texture coordinates inside the fragment shader; you could use a varying instead, it makes no difference. We will, however, be needing <code>texelSize</code> later on.<br />
<br />
What's <code>uVelocityScale</code>? It's used to address the following problem: if the framerate is very high, <code>velocity</code> will be very small as the amount of motion in between frames will be low. Correspondingly, if the framerate is very low, the motion between frames will be high and <code>velocity</code> will be much larger. This ties the blur size to the framerate, which is technically correct if you equate framerate with shutter speed, but undesirable for realtime rendering where the framerate can vary. To fix it we need to cancel out the framerate:<br />
<div class="listing">
<pre class="sh_cpp"> uVelocityScale = currentFps / targetFps;</pre>
</div>
Dividing by a 'target' framerate (shutter speed) seems to me to be an intuitive way of controlling how the motion blur looks; a high target framerate (high shutter speed) will result in less blur, a low target framerate (low shutter speed) will result in more blur, much like a real camera.<br />
<br />
The next step is to work out how many samples we're going to take for the blur. Rather than use a fixed number of samples, we can improve performance by adapting the number of samples according to the velocity:<br />
<div class="listing">
<pre class="sh_glsl"> float speed = length(velocity / texelSize);
 int nSamples = clamp(int(speed), 1, MAX_SAMPLES);</pre>
</div>
By dividing <code>velocity</code> by <code>texelSize</code> we can get the speed in texels. This needs to be clamped: we want to take at least 1 sample but no more than <code>MAX_SAMPLES</code>.<br />
<br />
Now for the actual blur itself:<br />
<div class="listing">
<pre class="sh_glsl"> oResult = texture(uTexInput, screenTexCoords);
for (int i = 1; i < nSamples; ++i) {
vec2 offset = velocity * (float(i) / float(nSamples - 1) - 0.5);
oResult += texture(uTexInput, screenTexCoords + offset);
}
oResult /= float(nSamples);</pre>
</div>
Note that the sampling is centred around the current texture coordinate. This is in order to reduce the appearance of artefacts caused by discontinuities in the velocity map:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZoRU1R0F7ATOjvsZL2f7yGSfyihBlcB3MpiauF8iZyW8132sKKZ9LEnsSFJRUdWgfGUzQeHaX_d1OYlnaQN7qGcFuRckaQ7Kr7HzFs58fjU1vmnMQ5zindv2v18tsfe1voEA81bQgc_M/s1600/fig3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZoRU1R0F7ATOjvsZL2f7yGSfyihBlcB3MpiauF8iZyW8132sKKZ9LEnsSFJRUdWgfGUzQeHaX_d1OYlnaQN7qGcFuRckaQ7Kr7HzFs58fjU1vmnMQ5zindv2v18tsfe1voEA81bQgc_M/s1600/fig3.jpg" /></a></div>
<br />
That's it! This is about as basic as it gets for this type of post process motion blur. It works, but it's far from perfect.<br />
<h2>
Far From Perfect</h2>
I'm going to spend the remainder of the tutorial talking about some issues along with potential solutions, as well as some of the limitations of this class of techniques.<br />
<h3>
Silhouettes</h3>
The velocity map contains discontinuities which correspond with the silhouettes of the rendered geometry. These silhouettes transfer directly to the final result and are most noticeable when things are moving fast (i.e. when there's lots of blur).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrj3FQPRgzpYW6JT9HDa6F-YgyMWA9A7NyTWxNywLLiBRnRsSsE7f16QkyGLT9aZZrhFBOVkCKIsnQZjbO11pSlCmn5wurq_i5oXBAnf3jX0lrNHDg1GxUsnocOgJb9XjbFxQH5mhahfA/s1600/fig4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrj3FQPRgzpYW6JT9HDa6F-YgyMWA9A7NyTWxNywLLiBRnRsSsE7f16QkyGLT9aZZrhFBOVkCKIsnQZjbO11pSlCmn5wurq_i5oXBAnf3jX0lrNHDg1GxUsnocOgJb9XjbFxQH5mhahfA/s1600/fig4.jpg" /></a></div>
<br />
One solution as outlined <a href="http://www.slideshare.net/ozlael/motionblur-3252650">here</a> is to do away with the velocity map and instead render all of the geometry a second time, stretching the geometry along the direction of motion in order to dilate each object's silhouette for rendering the blur.<br />
<br />
Another approach is to perform dilation on the velocity buffer, either in a separate processing step or on the fly when performing the blur. <a href="http://graphics.cs.williams.edu/papers/MotionBlurI3D12/">This paper</a> outlines such an approach.<br />
<h3>
Background Bleeding</h3>
Another problem occurs when a fast moving object is behind a slow moving or stationary object. Colour from the foreground object bleeds into the background:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZsZ1nQRDNs3zMBj_1SO3dmr2hJYJSrE1yI3LMnegJr4zpJA0vArtx9blaf6udWHDyUCHUrlHl9w39NIInlh_WxsNc1rWryjYF8R2j0S8ySGy9KDWRjyxcqK0kUyyGQMSrTHEDSlHClX0/s1600/fig5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZsZ1nQRDNs3zMBj_1SO3dmr2hJYJSrE1yI3LMnegJr4zpJA0vArtx9blaf6udWHDyUCHUrlHl9w39NIInlh_WxsNc1rWryjYF8R2j0S8ySGy9KDWRjyxcqK0kUyyGQMSrTHEDSlHClX0/s1600/fig5.jpg" /></a></div>
<br />
A possible solution is to use the depth buffer, if available, to weight samples based on their relative depth. The weights need to be tweaked such that valid samples are not excluded.<br />
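<br />
As a sketch of this kind of weighting (assuming linear depths are available to the blur shader; the names here are illustrative):<br />
<div class="listing">
<pre class="sh_glsl">// Weight a blur sample by its depth relative to the centre fragment:
// samples significantly nearer the camera than the centre fragment
// contribute less. uDepthScale controls the falloff and needs tweaking
// so that valid samples aren't excluded.
float sampleWeight(float centreDepth, float sampleDepth) {
   return clamp(1.0 - (centreDepth - sampleDepth) * uDepthScale, 0.0, 1.0);
}</pre>
</div>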
<h3>
Format & Precision</h3>
For the sake of simplicity I assumed a floating point texture for the velocity buffer; however, the reality may be different, particularly for a deferred renderer where you might have to squeeze the velocity into as few as two bytes. Using an unsigned normalized texture format, writing to and reading from the velocity buffer requires a scale/bias:<br />
<div class="listing">
<pre class="sh_glsl">// writing:
oVelocity = (a - b) * 0.5 + 0.5;
// reading:
vec2 velocity = texture(uTexMotion, screenTexCoords).rg * 2.0 - 1.0;</pre>
</div>
Using such a low precision velocity buffer causes some artifacts, most noticeably excess blur when the velocity is very small or zero.<br />
<br />
The solution to this is to use the <code>pow()</code> function to control how precision in the velocity buffer is distributed. We want to increase precision for small velocities at the cost of worse precision for high velocities.<br />
<br />
Writing/reading the velocity buffer now looks like this:<br />
<div class="listing">
<pre class="sh_glsl">// writing:
oVelocity = (a - b) * 0.5 + 0.5;
oVelocity = pow(oVelocity, vec2(3.0)); // note: pow() requires matching vector types
// reading:
vec2 velocity = texture(uTexMotion, screenTexCoords).rg;
velocity = pow(velocity, vec2(1.0 / 3.0)); // invert the exponent applied on write
velocity = velocity * 2.0 - 1.0;</pre>
</div>
<br />
<h3>
Transparency</h3>
Transparency presents similar difficulties with this technique as with deferred rendering: since the velocity buffer only contains information for the nearest pixels we can't correctly apply a post process blur when pixels at different depths all contribute to the result. In practice this results in 'background' pixels (whatever is visible through the transparent surface) being blurred (or not blurred) incorrectly.<br />
<br />
The simplest solution to this is to prevent transparent objects from writing to the velocity buffer. Whether this improves the result depends largely on the number of transparent objects in the scene.<br />
<br />
Another idea might be to use blending when writing to the velocity buffer for transparent objects, using the transparent material's opacity to control the contribution to the velocity buffer. Theoretically this could produce an acceptable compromise although in practice it may not be possible depending on how the velocity buffer is set up.<br />
<br />
A correct, but much more expensive approach would be to render and blur each transparent object separately and then recombine with the original image.<br />
<h2>
Conclusions</h2>
It's fairly cheap, very simple and looks pretty good in a broad range of situations. Once you've successfully implemented this, however, I'd recommend stepping up to a more sophisticated approach as described <a href="http://graphics.cs.williams.edu/papers/MotionBlurI3D12/">here</a>.<br />
<br />
I've provided a <a href="http://www.john-chapman.net/demos/mblur.zip">demo implementation</a>.
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='100' height='480' src='https://www.youtube.com/embed/8PUzfBxzaYw?feature=player_embedded' frameborder='0'></iframe></div>
<br />
<h2>
Shadow Map Allocation</h2>
<i>Originally posted on 15/08/2011</i><br />
<br />
When I first implemented shadow mapping I began by allocating a shadow map texture to each shadow casting light. As the number of shadow casting lights grew, however, I realized that this wasn't an adequate solution. Allocating a single shadow map per-light is a bad idea for three main reasons:<br />
<ol>
<li>Shadow maps have a definite memory cost, so in order to keep the texture memory requirement constant as more shadow casting lights are added, shadow map size would need to be reduced proportionally. This ultimately has a negative impact on shadow quality.</li>
<li>Rendering a shadow map can be skipped if the associated light volume doesn't intersect the view frustum, so any texture memory allocated for shadow maps which aren't rendered is wasted.</li>
<li>Shadow lights whose influence on the final image is small (i.e. lights covering a smaller area or lights which are far away) require fewer shadow map texels to produce the same quality of shadow; rendering a fixed-size shadow map can therefore be a waste of both texture space <em>and</em> rendering time.</li>
</ol>
Issue #1 can simply be solved by allocating a fixed number of shadow maps up front, and using these as a shadow map 'pool', or by allocating a single shadow map texture and rendering to/reading from portions of it as if they were separate textures.<br />
<br />
Issues #2 and #3 are related in that they affect the amount of shadow map space that's actually required on a per-frame basis. Shadow maps which don't need to be rendered don't require any space (obviously), shadow maps which <em>do</em> need to be rendered require different amounts of shadow map space, depending on how they influence the final image.<br />
<br />
This all points the way to a solution in which the available shadow map space can be allocated from a shadow map 'pool' per-frame and per-light, based on a couple of criteria:<br />
<ol>
<li>how much space is actually available</li>
<li>how much space each light requires to get good (or good enough) quality results</li>
</ol>
The first criterion is simple enough; I divide a single shadow texture up into a number of fixed-size subsections, like this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTfR6AFWV87smQOr2cKZxQXwe58b7OjqtT9vqeFCKRS5OvJOcW224Ny6TZD6KtPpMVtGf3SqrCBgczcQ0qA_jqJaz-uySHwsvSQaa1yJdgYCF2CPtnEFHnfPVvZNJa5C9RCyRSgOUM2sE/s1600/fig1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTfR6AFWV87smQOr2cKZxQXwe58b7OjqtT9vqeFCKRS5OvJOcW224Ny6TZD6KtPpMVtGf3SqrCBgczcQ0qA_jqJaz-uySHwsvSQaa1yJdgYCF2CPtnEFHnfPVvZNJa5C9RCyRSgOUM2sE/s1600/fig1.jpg" /></a></div>
<br />
So for a 2048<sup>2</sup> texture this gives me 2x1024<sup>2</sup>, 6x512<sup>2</sup> and 8x256<sup>2</sup> individual shadow maps, for a maximum of 16 shadow casting lights. These are indexed in order according to their relative size, as shown in the diagram. Even though there is a hard limit on the number of shadow maps, the simplicity of this scheme makes it attractive.<br />
<br />
The second criterion is a little more complex; for each light there needs to be a way of judging its 'importance' <em>relative to the other shadow casting lights</em> so that an appropriate shadow map can be assigned from the pool. This 'importance' metric needs to incorporate the radius and distance of a given light volume: <a href="http://en.wikipedia.org/wiki/Angular_diameter">angular diameter</a> is perfect for this. The actual calculation of angular diameter is done using trig:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLEwW8Cy3MV5zA_PYARRjqf7y9gPCytCO5w4u5D5gALgWsqb4lHEGemMQZ0Ps_vrV0MnIJbqHoCSBaT7a6s9Vw1P0s9Grds7uP1phSsFzlO65xGYUusNhoIpI3KDUyXxiQrrt9sM6nuAc/s1600/exp1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLEwW8Cy3MV5zA_PYARRjqf7y9gPCytCO5w4u5D5gALgWsqb4lHEGemMQZ0Ps_vrV0MnIJbqHoCSBaT7a6s9Vw1P0s9Grds7uP1phSsFzlO65xGYUusNhoIpI3KDUyXxiQrrt9sM6nuAc/s1600/exp1.jpg" /></a></div>
In practice the <em>actual</em> angular diameter isn't needed, since all we want to know is whether the angular diameter of one light's volume is bigger or smaller than that of another, so we can use a cheaper trig-less formula:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEic8v1TR-LzXIccTN6rFQhjrsiNtjYK7AnhuuMpq2rGbEwcubOFW9BOGCvQ6Qmdwaj6essW6MQftO9dY8aY39XYqYLzDh8BbCekuL_iYVmYKKa_YEyH_LvpISYx7Xfd2trMvUIUma28m5E/s1600/exp2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEic8v1TR-LzXIccTN6rFQhjrsiNtjYK7AnhuuMpq2rGbEwcubOFW9BOGCvQ6Qmdwaj6essW6MQftO9dY8aY39XYqYLzDh8BbCekuL_iYVmYKKa_YEyH_LvpISYx7Xfd2trMvUIUma28m5E/s1600/exp2.jpg" /></a></div>
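In code the comparison boils down to a single ratio. This runs on the CPU in practice, but it's written here in GLSL-style syntax to match the other listings (the parameter names are illustrative):<br />
<div class="listing">
<pre class="sh_glsl">// Relative 'importance' of a light volume: proportional to its
// angular diameter as seen from the camera. Only meaningful for
// comparing/sorting lights, not as an actual angle.
float importance(vec3 eyePos, vec3 lightPos, float lightRadius) {
   return lightRadius / distance(eyePos, lightPos);
}</pre>
</div>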
Once every frame, we calculate this 'importance' value for each <i>visible</i> light, then sort the lights into importance order and assign each a shadow map from the pool. The most important lights get the biggest, the least important get the smallest. Here's the whole process in action:<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.youtube.com/embed/fSvCNachleI?feature=player_embedded' frameborder='0'></iframe></div>
<br />
This technique works best if the lights are spread apart, otherwise the discrepancies in shadow quality become more obvious and 'popping' (as individual lights skip between shadow map resolutions) becomes more noticeable. The worst case is to have lots of nearby lights of similar size being allocated different shadow map resolutions; it can be very easy to spot which light is getting assigned the bigger shadow map.<br />
<br />
Another drawback appears when rendering from multiple POVs (e.g. for split-screen multiplayer). Since the importance metric is POV-dependent, the shadow maps may be valid for one view and not for another. You could use a separate shadow map pool per-view, or re-render all of the shadow maps prior to rendering each view.<br />
<br />
On the plus side this technique makes it very easy to add lots of shadow casting lights to a scene without too badly denting the available texture resources. It also helps to maximize performance, since rendering time and texture space get spent in the places they're needed most. By using portions of a single shadow map, scaling the quality becomes as simple as using a larger or smaller texture.<br />
<br />
An additional idea would be to dynamically tessellate the main shadow map at runtime, based on the number of shadow lights and their importance. This may result in more popping, however, as the frequency of resolution changes for each light could be as high as once per frame.<br />
<br />
The importance metric can also be used to determine how to filter a shadow map more efficiently (e.g. whether to spend time doing multisampled/stochastic lookups).<br />
<h2>
Update (26/06/2012)</h2>
I've been asked a couple of times about how to go about using a single shadow texture in the way I've indicated here, so I thought I'd patch this blog post with the requested info. It's pretty simple and can be used any time you want to render to/render from a texture sub-region.<br />
<ol>
<li>The first step is writing to the texture; bind it to the framebuffer and set the viewport to render into the texture sub-region.</li>
<li>When accessing the sub-region, scale/bias the texture coordinates as follows: <code><b>scale</b> = region size / texture size</code>, <code><b>bias</b> = region offset / texture size</code>.</li>
</ol>
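As a sketch, step 2 might look like this in a shader, with the region metrics supplied in texels via illustrative uniforms:<br />
<div class="listing">
<pre class="sh_glsl">uniform vec2 uTexSize; // whole shadow texture size (texels)
uniform vec2 uRegionSize; // sub-region size (texels)
uniform vec2 uRegionOffset; // sub-region offset (texels)

// map texture coordinates in [0,1] into the sub-region
vec2 regionTexCoords(vec2 uv) {
   vec2 scale = uRegionSize / uTexSize;
   vec2 bias = uRegionOffset / uTexSize;
   return uv * scale + bias;
}</pre>
</div>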
The downside is that hardware texture filtering can cause texels to 'bleed' between adjacent sub-regions if you're not careful. Edge-handling (wrap, repeat, etc.) needs to be performed manually in the shader. This isn't too much of a problem with shadow maps.<br />
<br />
I recently had another idea (which I've not played around with yet - let me know if you try this out) to spread the cost of shadow map rendering across multiple frames. This could be achieved by incorporating each shadow map's age (or <em>frames since rendered</em>) into the metric, such that <code>importance = radius / distance * (age + 1)</code>. Age gets incremented every frame until the shadow map gets rendered, in which case it gets reset to 0 (or to 1, in which case you can remove the '<code>+1</code>' from the importance calculation).<br />
<br />
In theory this will work because, as the shadow map gets older, it gets more 'important' that rendering occurs. Whether the linear combination above will work well enough in practice is something to be tested; it may be that age needs to become the dominant term more quickly.<br />
<br />
Integrating this temporal method with the above spatial method is made tricky by the fact that, in the temporal approach, shadow maps need to persist. Even if a shadow map wasn't updated in this frame we still need it for rendering (if the light is visible), so we can't allow it to be overwritten with another shadow map. Allowing a shadow map to be 'locked' may appear to solve this issue; however, the circumstances under which a shadow map can be 'unlocked' aren't really clear: you can safely overwrite a shadow map if it isn't needed this frame. But what if it's needed next frame?<br />
<br />
<h2>
Gamma Correction Overview</h2>
<i>Originally posted on 20/04/2012</i><br />
<br />
Rendering in linear space is good because it is simple; lighting contributions sum, material reflectance values multiply. Everything in the linear world is simple and intuitive - inhabiting the linear world is a sure way of preventing your brain from squirting out through your nose.<br />
<br />
If the output of our display monitors was linear then this would be the end of the story. Alas, this is not the case...<br />
<h2>
Non-linear Outputs</h2>
The graph below shows how the output intensity of a typical monitor looks (the orange line) compared to linear intensity (the blue line). A monitor's response curve sags away from linear such that a pixel with a linear intensity of <em>0.5</em> appears about one fifth as bright as a pixel with a linear intensity of <em>1</em> (not half as bright, as we might have expected).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEN0ATc4eymwaT5njSxaaWUQiJqS9ofD6JIfp6CctxsbLneXyR1-CVzO1D526scAQAqLzVaPKx0ANMUUVSwePGjQUm7UNTuPHQ0asknxaVouhNVUdHMsbfycDqs9wQGJz33o3nY7_a_lA/s1600/fig1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEN0ATc4eymwaT5njSxaaWUQiJqS9ofD6JIfp6CctxsbLneXyR1-CVzO1D526scAQAqLzVaPKx0ANMUUVSwePGjQUm7UNTuPHQ0asknxaVouhNVUdHMsbfycDqs9wQGJz33o3nY7_a_lA/s1600/fig1.jpg" /></a></div>
<br />
The result of this sag is that any uncorrected linear output from the display will appear much darker than it should.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivdN55ZdnzFy_RHkTAaGBotB-nLBPM4oIMAHZ_oZZxBejNuHJAXhI5Fw5VDJWafmXsYMPAcAV68WLVPNd-xyBx4FIaFWahqz3dQ9LBPkWi7UIc71dvEqoLZ2kCiVkjtNvRJYrMYH0ge3w/s1600/fig2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivdN55ZdnzFy_RHkTAaGBotB-nLBPM4oIMAHZ_oZZxBejNuHJAXhI5Fw5VDJWafmXsYMPAcAV68WLVPNd-xyBx4FIaFWahqz3dQ9LBPkWi7UIc71dvEqoLZ2kCiVkjtNvRJYrMYH0ge3w/s1600/fig2.jpg" /></a></div>
The solution is to 'pre-correct' the output intensity immediately prior to displaying it. Ideally for this we need some information about the response curve of the monitor in use. To this end we could provide a calibration option which allows users to select a gamma correction exponent that 'looks right' for their monitor. <em>Or</em> we could take the easy route and just assume an exponent of 2.2 (which is good enough for the majority of cases). However we choose the exponent, to pre-correct the output we simply raise to the power of <em>1/exponent</em> (the green line on the graph below).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdkR7yE_cCSTo_a3L0giMYNlMnMhniiGYD4NuNWEHGO_TzRVKOVWZOL-FYFFXBe2lsEUPzGa56INk0uzeUK6XNx3UMDTh-xKToSO5_jmx-6uamhB7J3heHRCwDc16Y-ZF_jF1L2gHzazc/s1600/fig3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdkR7yE_cCSTo_a3L0giMYNlMnMhniiGYD4NuNWEHGO_TzRVKOVWZOL-FYFFXBe2lsEUPzGa56INk0uzeUK6XNx3UMDTh-xKToSO5_jmx-6uamhB7J3heHRCwDc16Y-ZF_jF1L2gHzazc/s1600/fig3.jpg" /></a></div>
<br />
This effectively cancels out the display's response curve to maintain a linear relationship between different intensities in the output. Problem solved. Well, not quite...<br />
<h2>
Non-linear Inputs</h2>
It is highly likely that some of the inputs to our linear rendering will be textures and that those textures will have been created from non-linear photographs and/or manipulated to look 'right' on a non-linear monitor. Hence these input textures are themselves non-linear; they are innately 'pre-corrected' for the display which was used to create them. This actually turns out to be a good thing (especially if we're using an 8 bit-per-channel format) as it increases precision at lower intensities to which the human eye is more sensitive.<br />
<br />
We can't use these non-linear textures directly as inputs to a linear shading function (e.g. lighting) - the results would simply be incorrect. Instead we need to linearize texels as they are fetched using the same method as above. This can be done manually in a shader or have the graphics driver do it automagically for us by using an <a href="http://en.wikipedia.org/wiki/SRGB">sRGB</a> format texture.<br />
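<br />
Done manually inside a shader, both directions look like this (assuming an exponent of 2.2; <code>uTexDiffuse</code> and <code>linearResult</code> are illustrative names):<br />
<div class="listing">
<pre class="sh_glsl">const float kGamma = 2.2;

// reading: linearize a non-linear input texel before using it in lighting
vec3 albedo = pow(texture(uTexDiffuse, vTexCoord).rgb, vec3(kGamma));

// writing: 'pre-correct' the final linear result for display
oColor.rgb = pow(linearResult, vec3(1.0 / kGamma));</pre>
</div>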
<br />
End of story? Not quite...<br />
<h2>
Precision</h2>
For a deferred renderer there is a pitfall which programmers should be aware of. If we linearize a non-linear input texture, then store the linear result in a g-buffer prior to the lighting stage we will lose all of the low-intensity precision benefits of having non-linear data in the first place. The result of this is just horrible - take a look at the low-intensity ends of the gradients in the left image below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW72kGLryBH-LCtC_44zp-HXgrr2-EZb6IZOS2oCjL9_zJE0rpoa0MyaCVZkGWJkn4k2rXFwhKgQ3jI_5zSD5E0skEegFllzuIYuDinnmrg9oqaL8sUadsA8cQ90NTnI1kNMmvwd2iZrE/s1600/fig4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW72kGLryBH-LCtC_44zp-HXgrr2-EZb6IZOS2oCjL9_zJE0rpoa0MyaCVZkGWJkn4k2rXFwhKgQ3jI_5zSD5E0skEegFllzuIYuDinnmrg9oqaL8sUadsA8cQ90NTnI1kNMmvwd2iZrE/s1600/fig4.jpg" /></a></div>
Clearly we need to delay the gamma correction of input textures right up until we need them to be linear. In practice this means writing non-linear texels to the g-buffer, then gamma correcting the g-buffer as it is read at the lighting stage. As before, the driver can do the work for us if we use an sRGB format for the appropriate g-buffer targets, or we can correct them manually.<br />
<br />
What do I mean by 'appropriate'?<br />
<h2>
To Be (Linear), Or Not To Be (Linear)?</h2>
Which parts of the g-buffer require this treatment? It depends on the g-buffer organisation, but in general I'd say that any colour information (diffuse albedo/specular colour) should be treated as non-linear; it was probably prepared (pre-corrected) to 'look right' on a non-linear display. Any geometric or other non-colour information (normals/material properties) should be treated as linear; they don't encode 'intensity' as colour textures do.<br />
<br />
Think of this post as a sort of quick-reference card; for more in-depth information take a look at the following resources:<br />
<br />
<a href="http://http.developer.nvidia.com/GPUGems3/gpugems3_ch24.html">"The Importance of Being Linear"</a> Larry Gritz/Eugene d'Eon, GPU Gems 3<br />
<br />
<a href="http://www.slideshare.net/ozlael/hable-john-uncharted2-hdr-lighting">"Uncharted 2: HDR Lighting"</a> John Hable's <em>must-read</em> GDC presentation<br />
<br />
<a href="http://en.wikipedia.org/wiki/Gamma_correction">Wikipedia's gamma correction entry</a> (and <a href="http://wikimediafoundation.org/">donate some money</a> to Wikipedia while you're at it)<br />
<br />
<h2>
"Good Enough" Volumetrics for Spotlights</h2>
<i>Originally posted on 06/01/2012</i><br />
<br />
Volumetric effects are one of the perennially tricky problems in realtime graphics. They effectively simulate the scattering of light by particles suspended in the air.
Since these effects can enhance both the realism and aesthetic appearance of a rendered scene, it would be nice to have a method which can produce "good enough" results cheaply and simply. As the title implies, "good enough" is the main criterion here; we're not looking for absolute photorealism, just something passable which adds to the aesthetic or the mood of a scene without costing the Earth to render.<br />
<br />
I'll be describing a volumetric effect for spot lights, although the same ideas will apply to other types of light with different volume geometries.<br />
<h2>
Coneheads</h2>
The volume affected by a spotlight is a cone, so that's what we'll use as the basis for the technique.<br />
How you generate the cone is up to you, but it must have per-vertex normals (they'll make life easier later on), no duplicated vertices except at the cone's tip and no base. I've found that having plenty of height segments is good for the normal interpolation and well worth the extra triangles.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPh-0exlqbul-vdnQ7BaZsXfnm4cWzmpBkCO6aeiU1PYs2F0zNGNBsv0w3XOzhqcL3D7UE0s_23nuSn1JZXFdI1hJJ0wrXdJZjNirlQ7C8zw-ooMnU2HnDADQIcuVkicIBV6Z-w82pndY/s1600/fig2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPh-0exlqbul-vdnQ7BaZsXfnm4cWzmpBkCO6aeiU1PYs2F0zNGNBsv0w3XOzhqcL3D7UE0s_23nuSn1JZXFdI1hJJ0wrXdJZjNirlQ7C8zw-ooMnU2HnDADQIcuVkicIBV6Z-w82pndY/s1600/fig2.jpg" /></a></div>
<br />
The basic idea is to render this cone in an additive blending pass with no face culling (we want to see the inside and outside of the cone together), with depth writes <em>disabled</em> but the depth test <em>enabled</em>. As the screenshot below shows, on its own this looks pretty terrible:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn17741gVoHq_BChTIjUhg4_qc7VvxtIQGzbWtzwfv-_0Ae6r0ruqGWaoZ_eXRhFer3Th98SUJ-ECB0ncjFUkkOUzhWAGdtrh9uHhmd82nlFHrVCko8vIqhxoBSy3OobUKD8XDP1Ie7mM/s1600/fig3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn17741gVoHq_BChTIjUhg4_qc7VvxtIQGzbWtzwfv-_0Ae6r0ruqGWaoZ_eXRhFer3Th98SUJ-ECB0ncjFUkkOUzhWAGdtrh9uHhmd82nlFHrVCko8vIqhxoBSy3OobUKD8XDP1Ie7mM/s1600/fig3.jpg" /></a></div>
<br />
<h2>
Attenuation</h2>
To begin to improve things we need to at least attenuate the effect along the length of the cone. This can be done per-fragment as a simple function of the distance from the cone's tip (<em>d</em>) and some maximum distance (<em>dmax</em>):<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3Ukddtauxjdl31_n7_h1lfSdWkC89S_9sMWCY72rYRD0K-K32KJVAwJJ_Lniq8syn3diUlEsdlWIx6SeCG0WTFbD6aBvOGebI1wxrlKJxpg4TF4kforUH7LnsHFBNOecCdrj5Sqz6hEU/s1600/fig4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3Ukddtauxjdl31_n7_h1lfSdWkC89S_9sMWCY72rYRD0K-K32KJVAwJJ_Lniq8syn3diUlEsdlWIx6SeCG0WTFbD6aBvOGebI1wxrlKJxpg4TF4kforUH7LnsHFBNOecCdrj5Sqz6hEU/s1600/fig4.jpg" /></a></div>
<br />
Already things are looking a lot better:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_-24E3abCjzXjsKJMk7bp_eQ8Flqrta9Vy5sbhrugQx6846HW0MW4xYpSeuyX0MaluiZuOeFnABe2vkvb9qLBQPLIcKP5FA3j23n7rh17ejAPD4UQBWfif8hILmbrhyphenhyphenEJerGMAtebMjw/s1600/fig5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_-24E3abCjzXjsKJMk7bp_eQ8Flqrta9Vy5sbhrugQx6846HW0MW4xYpSeuyX0MaluiZuOeFnABe2vkvb9qLBQPLIcKP5FA3j23n7rh17ejAPD4UQBWfif8hILmbrhyphenhyphenEJerGMAtebMjw/s1600/fig5.jpg" /></a></div>
<br />
<h2>
Soft Edges</h2>
The edges of the cone need to be softened somehow, and that's where the vertex normals come in. We can use the dot product of the view space normal (<em>cnorm</em>) with the view vector (the normalised fragment position, <em>cpos</em>) as a metric describing how near to the edge of the cone the current fragment is.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgs0hE3yPCGUzfcUJjdczj_ZLhZr8_P099y7p-yb4UaSS6p_LdnQF1mWXYr3dk3su4V91aMSsLoWC75HAjOcigluXoXcmEu-mcs8dkT6VBtkGVvjgkCdrluCHNCNRRDsiPF3xMGehsNBvI/s1600/fig6.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgs0hE3yPCGUzfcUJjdczj_ZLhZr8_P099y7p-yb4UaSS6p_LdnQF1mWXYr3dk3su4V91aMSsLoWC75HAjOcigluXoXcmEu-mcs8dkT6VBtkGVvjgkCdrluCHNCNRRDsiPF3xMGehsNBvI/s1600/fig6.jpg" /></a></div>
<br />
Normalising the fragment position gives us a vector from the eye to the point on the cone (<em>cpos</em>) with which we're dealing. We take the absolute value of the result because the back faces of the cone will be pointing away but still need to contribute to the final result in the same way as the front faces. For added control over the edge attenuation it's useful to be able to raise the result to the power <em>n</em>.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK4nrOI-VFXLPVf3909cZZS909Pr8Qse-5hUIQUYu2Z3jGpQc8sOpi3abUGgybX6vUu0k3a4_xJtrrKUQsC-luNMoeCduUoEzbPNb5lLuLZiVT-Y-NXRFaDYndz1UTYLJP7u14Ci8l2_I/s1600/fig7.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK4nrOI-VFXLPVf3909cZZS909Pr8Qse-5hUIQUYu2Z3jGpQc8sOpi3abUGgybX6vUu0k3a4_xJtrrKUQsC-luNMoeCduUoEzbPNb5lLuLZiVT-Y-NXRFaDYndz1UTYLJP7u14Ci8l2_I/s1600/fig7.jpg" /></a></div>
<br />
Using per-vertex normals like this is simple, but requires that the cone geometry be set up such that there won't be any 'seams' in the normal data, hence my previous note about not having any duplicate vertices except at the cone's tip.<br />
<br />
One issue with this method is that when inside the cone looking up towards the tip the normals will tend to be perpendicular to the view direction, resulting in a blank spot. This can be remedied by applying a separate glow sprite at the light source position. <br />
<h2>
Soft Intersections</h2>
As you can see in the previous screenshot there is a problem where the cone geometry intersects with other geometry in the scene, including the floor. Remedying this requires access to the depth buffer from within the shader. As the cone's fragments get closer to fragments already in the buffer (i.e. as the difference between the depth buffer value and the cone fragment's depth approaches 0) we want the result to 'fade out':<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq08V1EHY_FjH2E7VuB-VoqcUdNQH_KZJ1aN8fHba-IrJPKiy1OrxHH_0t4CUaKDVrC6W-KR3GJ9t9ULuZVwNAbqRkW9UAuSxQlPvvOz0-LMSMGCVJ9Sj4MloFjdfczF3_cx2-Yi0zXhc/s1600/fig8.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq08V1EHY_FjH2E7VuB-VoqcUdNQH_KZJ1aN8fHba-IrJPKiy1OrxHH_0t4CUaKDVrC6W-KR3GJ9t9ULuZVwNAbqRkW9UAuSxQlPvvOz0-LMSMGCVJ9Sj4MloFjdfczF3_cx2-Yi0zXhc/s1600/fig8.jpg" /></a></div>
<br />
The result should be clamped in <em>[0, 1]</em>. The <em>radius</em> can be set to make the edges softer or harder, depending on the desired effect and the scale of the intersecting geometry compared with the cone's size.<br />
This does produce a slightly unusual fogging effect around the cone's boundary, but to my eye it meets the "good enough" criterion.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbpeVGKcMn01f2dHyHsJqF3SmUCUL-p_1r10mvLqTQxSayZlqSA5cyFOv1Mo8BEiMcl6NoSsmTtN2Jez3TrdY2EjsKkpW1YQ9uNZYZPrTrJ-eppZkwIL41nmxGRT1tEgwANBOsB3A_4xI/s1600/fig9.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbpeVGKcMn01f2dHyHsJqF3SmUCUL-p_1r10mvLqTQxSayZlqSA5cyFOv1Mo8BEiMcl6NoSsmTtN2Jez3TrdY2EjsKkpW1YQ9uNZYZPrTrJ-eppZkwIL41nmxGRT1tEgwANBOsB3A_4xI/s1600/fig9.jpg" /></a></div>
<br />
Another issue is that the cone geometry can intersect with the camera's near clipping plane. This results in the effect 'popping' as the camera moves across the cone boundary. We can solve this in exactly the same way as for geometry intersections; as the cone fragment's depth approaches the near plane we fade out the result.<br />
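<br />
Putting the pieces together, the cone's fragment shader might look like the following sketch. All names are illustrative, the depth texture is assumed to contain positive linear view space depths, and the linear distance falloff stands in for whatever attenuation function you prefer:<br />
<div class="listing">
<pre class="sh_glsl">uniform vec3 uLightColor;
uniform vec3 uConeTipView; // cone tip position in view space
uniform float uDistMax; // attenuation distance along the cone
uniform float uEdgePower; // edge softness exponent (n)
uniform float uFadeRadius; // intersection/near plane fade radius
uniform sampler2D uTexDepth; // linear view space scene depth

in vec3 vPosView; // view space fragment position (cpos)
in vec3 vNormView; // view space normal (cnorm)

out vec4 oColor;

void main() {
   // attenuate along the cone's length
   float d = distance(vPosView, uConeTipView);
   float att = clamp(1.0 - d / uDistMax, 0.0, 1.0);

   // soften the cone edges; at a silhouette the normal is
   // perpendicular to the view vector, so the dot product tends to 0
   vec3 view = normalize(vPosView);
   att *= pow(abs(dot(view, normalize(vNormView))), uEdgePower);

   // fade out near intersections with scene geometry (view space z
   // is negative in front of the camera, hence the addition)
   float sceneDepth = texelFetch(uTexDepth, ivec2(gl_FragCoord.xy), 0).r;
   att *= clamp((sceneDepth + vPosView.z) / uFadeRadius, 0.0, 1.0);

   // fade out as the fragment approaches the near clipping plane
   att *= clamp(-vPosView.z / uFadeRadius, 0.0, 1.0);

   oColor = vec4(uLightColor * att, 1.0);
}</pre>
</div>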
<br />
That's it!<br />
<br />
<h2>
Motion Blur Tutorial</h2>
<i>Originally posted on 21/04/2011</i><br />
<h2>
What is motion blur?</h2>
Motion pictures are made up of a series of still images displayed in quick succession. These images are captured by briefly opening a shutter to expose a piece of film/<a href="http://en.wikipedia.org/wiki/Charge-coupled_device">electronic sensor</a> to light (via a lens system), then closing the shutter and advancing the film/saving the data. Motion blur occurs when an object in the scene (or the camera itself) moves while the shutter is open during the exposure, causing the resulting image to streak along the direction of motion. It is an artifact which the image-viewing populace has grown so used to that its absence is conspicuous; adding it to a simulated image enhances the realism to a large degree.<br />
<br />
Later we'll look at a screen space technique for simulating motion blur caused only by movement of the camera. Approaches to object motion blur are a little more complicated and worth a <a href="http://www.john-chapman.net/content.php?id=21">separate tutorial</a>. First, though, let's examine a 'perfect' (full camera <em>and</em> object motion blur) solution which is very simple but not really efficient enough for realtime use.<br />
<h2>
Perfect solution</h2>
This is a naive approach which has the benefit of producing completely realistic full motion blur, incorporating both the camera movement <em>and</em> movement of the objects in the scene relative to the camera. The technique works like this: for each frame, render the scene multiple times at different temporal offsets, then blend together the results:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTzW18HhF4FncbPi1D47HINNIrlGdTKGsiurW8_fxFnClcLNWxZXK4lMO9I4zTHdLXPrMFlTYIiAEGoptN80WfLAXn63aanR1zUpRl8uk6W7rHf5R8cdUOBQp2jKWk43yh6gbkwSCvWhY/s1600/fig2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTzW18HhF4FncbPi1D47HINNIrlGdTKGsiurW8_fxFnClcLNWxZXK4lMO9I4zTHdLXPrMFlTYIiAEGoptN80WfLAXn63aanR1zUpRl8uk6W7rHf5R8cdUOBQp2jKWk43yh6gbkwSCvWhY/s1600/fig2.jpg" /></a></div>
<br />
This technique is actually described in the <a href="http://fly.cc.fer.hr/~unreal/theredbook/chapter10.html">red book (chapter 10)</a>. Unfortunately it requires rendering the scene at <em>samples * framerate</em> frames per second, which is either impossible or impractical for most realtime applications. And don't think about just using the previous <em>samples</em> frames - this will give you trippy trails (and nausea) but definitely not motion blur. So how do we go about doing it quick n' cheap?<br />
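As a sketch, the accumulation approach boils down to averaging several subframe renders taken at different temporal offsets within one frame's exposure interval. Here `renderAtTime` and the flat `Frame` buffer are hypothetical stand-ins for a real renderer:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Each 'frame' is just a flat buffer of pixel intensities for illustration.
using Frame = std::vector<float>;

// Average nSamples renders taken at evenly spaced temporal offsets within
// a single frame's exposure interval [t, t + dt).
Frame renderWithMotionBlur(const std::function<Frame(float)>& renderAtTime,
                           float t, float dt, int nSamples)
{
    Frame result = renderAtTime(t);
    for (int i = 1; i < nSamples; ++i) {
        Frame sub = renderAtTime(t + dt * float(i) / float(nSamples));
        for (size_t p = 0; p < result.size(); ++p) {
            result[p] += sub[p];
        }
    }
    for (float& px : result) {
        px /= float(nSamples); // blend all subframes with equal weight
    }
    return result;
}
```

Note the cost: one call to `renderAtTime` per sample, per frame - which is exactly why this is impractical for realtime use.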
<h2>
Screen space to the rescue!</h2>
The idea is simple: each rendered pixel represents a point in the scene at the current frame. If we know where it was in the <em>previous</em> frame, we can apply a blur along a vector between the two points in screen space. This vector represents the size and direction of the motion of that point between the previous frame and the current one, hence we can use it to approximate the motion of a point during the intervening time, directly analogous to a single <i>exposure </i>in the real world.<br />
<br />
The crux of this method is calculating a previous screen space position for each pixel. Since we're only going to implement motion blur caused by motion of the camera, this is very simple: each frame, store the camera's model-view-projection matrix so that in the next frame we'll have access to it. Since this is all done on the CPU the details will vary; I'll just assume that you can supply the following to the fragment shader: the <em>previous</em> model-view-projection matrix and the <em>inverse</em> of the <em>current</em> model-view matrix.<br />
<br />
<h3>
Computing the blur vector</h3>
In order to compute the blur vector we take the following steps within our fragment shader:<br />
<ol>
<li>Get the pixel's <em>current</em> view space position. There are a number of equally good methods for extracting this from an existing depth buffer, see <a href="http://mynameismjp.wordpress.com/2009/03/10/reconstructing-position-from-depth/">Matt Pettineo's blog</a> for a good overview. In the example shader I use a per-pixel ray to the far plane, multiplied by a per-pixel linear depth.</li>
<li>From this, compute the pixel's <em>current</em> world space position using the inverse of the <em>current</em> model-view matrix.</li>
<li>From this, compute the pixel's <em>previous</em> <a href="http://omega.di.unipi.it/web/IUM/Waterloo/node15.html">normalized device coordinates</a> using the <em>previous</em> model-view-projection matrix and a perspective divide.</li>
<li>Scale and bias the result to get texture coordinates.</li>
<li>Our blur vector is the current pixel's texture coordinates minus the coordinates we just calculated.</li>
</ol>
The eagle-eyed reader may have already spotted that this can be optimized, but for now we'll do it long-hand for the purposes of clarity. Here's the fragment program:<br />
<div class="listing">
<pre class="sh_glsl">uniform sampler2D uTexLinearDepth;
uniform mat4 uInverseModelViewMat; // inverse of the current model->view matrix
uniform mat4 uPrevModelViewProj; // previous model->view->projection matrix
noperspective in vec2 vTexcoord;
noperspective in vec3 vViewRay; // for extracting current view space position
void main() {
// get current world space position:
   vec4 current = vec4(vViewRay * texture(uTexLinearDepth, vTexcoord).r, 1.0);
   current = uInverseModelViewMat * current;
// get previous screen space position:
   vec4 previous = uPrevModelViewProj * current;
   previous.xyz /= previous.w;
   previous.xy = previous.xy * 0.5 + 0.5;
   vec2 blurVec = previous.xy - vTexcoord;
}</pre>
</div>
<br />
<h3>
Using the blur vector</h3>
So what do we do with this blur vector? We might try stepping for <em>n</em> samples along the vector, starting at <code>previous.xy</code> and ending at <code>vTexcoord</code>. However this produces ugly discontinuities in the effect:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRwXlyJbKl-MiI-gIbXBKGxjSX1sjI6owTBVp0zLQpAW1bJTLCce0r3CL1k7MjPPhvzPZkILvfqukBe3bqBDG0zV3ontDf34ce04n9ZHDxBGAVq-8yAmMccQ1JmXafoRAN8efeCYS_5ao/s1600/fig3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRwXlyJbKl-MiI-gIbXBKGxjSX1sjI6owTBVp0zLQpAW1bJTLCce0r3CL1k7MjPPhvzPZkILvfqukBe3bqBDG0zV3ontDf34ce04n9ZHDxBGAVq-8yAmMccQ1JmXafoRAN8efeCYS_5ao/s1600/fig3.jpg" /></a></div>
<br />
To fix this we can center the blur vector on <code>vTexcoord</code>, thereby blurring across these velocity boundaries:<br />
Here's the rest of the fragment program (<code>uTexInput</code> is the texture we're blurring):<br />
<div class="listing">
<pre class="sh_glsl">// perform blur:
vec4 result = texture(uTexInput, vTexcoord);
for (int i = 1; i < nSamples; ++i) {
// get offset in range [-0.5, 0.5]:
vec2 offset = blurVec * (float(i) / float(nSamples - 1) - 0.5);
// sample & add to result:
result += texture(uTexInput, vTexcoord + offset);
}
result /= float(nSamples);</pre>
</div>
<br />
<h3>
A sly problem</h3>
There is a potential issue around framerate: if it is very high our blur will be barely visible as the amount of motion between frames will be small, hence <code>blurVec</code> will be short. If the framerate is very low our blur will be exaggerated, as the amount of motion between frames will be high, hence <code>blurVec</code> will be long.<br />
<br />
While this is physically realistic (higher fps = shorter exposure, lower fps = longer exposure) it might not be aesthetically desirable. This is especially true for variable-framerate games which need to maintain playability as the framerate drops without the entire image becoming a smear. At the other extreme, for displays with high refresh rates (or vsync disabled) the blur lengths end up being so short that the result will be pretty much unnoticeable. What we want in these situations is for each frame to <em>look</em> as though it was rendered at a particular framerate (which we'll call the 'target framerate') regardless of the <em>actual</em> framerate.<br />
<br />
The solution is to scale <code>blurVec</code> according to the current <em>actual</em> fps; if the framerate goes up we increase the blur length, if it goes down we decrease the blur length. When I say "goes up" or "goes down" I mean "changes relative to the target framerate." This scale factor is easily calculated:<br />
<br />
<code> mblurScale = currentFps / targetFps</code>
<br />
<br />
So if our target fps is 60 but the actual fps is 30, we halve our blur length. Remember that this is not physically realistic - we're fiddling the result in order to compensate for a variable framerate.<br />
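As a small sketch, the scale computation might look like this; the clamp bounds are an arbitrary choice of mine (not from the article) to keep extreme framerate spikes from producing degenerate blur lengths:

```cpp
#include <algorithm>
#include <cassert>

// Scale factor for blurVec: makes each frame *look* as though it was
// rendered at targetFps, regardless of the actual framerate.
float motionBlurScale(float currentFps, float targetFps)
{
    float scale = currentFps / targetFps;
    // optional: clamp to avoid degenerate blur at framerate spikes/dips
    return std::min(std::max(scale, 0.0f), 4.0f);
}
```

In the shader you'd then simply multiply: <code>blurVec *= uMotionBlurScale;</code>, with the scale computed once per frame on the CPU.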
<br />
<h3>
Optimization</h3>
The simplest way to improve the performance of this method is to reduce the number of blur samples. I've found it looks okay down to about 8 samples, where 'banding' artifacts start to become apparent.<br />
<br />
As I hinted before, computing the blur vector can be streamlined. Notice that, in the first part of the fragment shader, we did two matrix multiplications:<br />
<div class="listing">
<pre class="sh_glsl">// get current world space position:
vec4 current = vec4(vViewRay * texture(uTexLinearDepth, vTexcoord).r, 1.0);
current = <em>uInverseModelViewMat * current;</em>
// get previous screen space position:
vec4 previous = <em>uPrevModelViewProj * current;</em>
previous.xyz /= previous.w;
previous.xy = previous.xy * 0.5 + 0.5;</pre>
</div>
These can be combined into a single transformation by constructing a current-to-previous matrix:<br />
<br />
<code>mat4 currentToPrevious = uPrevModelViewProj * uInverseModelViewMat;</code><br />
<br />
If we do this on the CPU we only have to do a single matrix multiplication per fragment in the shader. Also, this reduces the amount of data we upload to the GPU (always a good thing). The relevant part of the fragment program now looks like this:<br />
<div class="listing">
<pre class="sh_glsl"> vec3 current = vViewRay * texture(uTexLinearDepth, vTexcoord).r;
<em>vec4 previous = uCurrentToPreviousMat * vec4(current, 1.0);</em>
previous.xyz /= previous.w;
previous.xy = <em>previous.xy * 0.5 + 0.5;</em></pre>
</div>
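A quick CPU-side sanity check confirms the combination is valid: transforming a point by the combined matrix gives the same result as applying the two matrices in sequence. The plain column-major arrays here are stand-ins, not the article's maths library:

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<float, 16>; // column-major, as in OpenGL
using Vec4 = std::array<float, 4>;

// r = a * b (column-major 4x4 matrix product)
Mat4 mul(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int c = 0; c < 4; ++c)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[c * 4 + row] += a[k * 4 + row] * b[c * 4 + k];
    return r;
}

// r = m * v
Vec4 mul(const Mat4& m, const Vec4& v)
{
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int k = 0; k < 4; ++k)
            r[row] += m[k * 4 + row] * v[k];
    return r;
}
```

Since matrix multiplication is associative, <code>(prevMVP * invMV) * p == prevMVP * (invMV * p)</code>, which is exactly what lets us fold the two per-fragment multiplications into one.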
<h2>
Conclusion</h2>
Even this limited form of motion blur makes a big improvement to the appearance of a rendered scene; moving around looks generally smoother and more realistic. At lower framerates (~30fps) the effect produces a filmic appearance, hiding some of the temporal aliasing that makes rendering (and stop-motion animation) 'look fake'.<br />
<br />
If that wasn't enough, head over to the <a href="http://www.john-chapman.net/content.php?id=21">object motion blur tutorial</a>, otherwise have some links:<br />
<br />
<a href="http://jinsu0000.springnote.com/pages/6456715/attachments/3962623">"Stupid OpenGL Shader Tricks"</a> Simon Green, NVIDIA<br />
<br />
<a href="http://http.developer.nvidia.com/GPUGems3/gpugems3_ch27.html">"Motion Blur as a Post Processing Effect"</a> Gilberto Rosado, GPU Gems 3<br />
<br />
<a href="http://www.youtube.com/watch?v=uEK9mitagS8">Dinoooossaaaaaaurs!</a><br />
<br />
<h2>
SSAO Tutorial</h2>
<i>Originally posted on 05/01/2011</i><br />
<h2>
Background</h2>
Ambient occlusion is an approximation of the amount by which a point on a surface is occluded by the surrounding geometry, which affects the accessibility of that point by incoming light. In effect, ambient occlusion techniques allow the simulation of proximity shadows - the soft shadows that you see in the corners of rooms and the narrow spaces between objects. Ambient occlusion is often subtle, but will dramatically improve the visual realism of a computer-generated scene:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPMTOOF3sGCascpGObqPuKl-ubmH-cDOvVbklSqMrSZxZqlgMQ0zX_susYv2S55AV0eAh-qE9F2hufrj3KM7ue3ohDKCoRcAnPNVIarTeK1Mt_ec13MKQfvp4xiNP9NpfRQzcwv-3YAbo/s1600/fig1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPMTOOF3sGCascpGObqPuKl-ubmH-cDOvVbklSqMrSZxZqlgMQ0zX_susYv2S55AV0eAh-qE9F2hufrj3KM7ue3ohDKCoRcAnPNVIarTeK1Mt_ec13MKQfvp4xiNP9NpfRQzcwv-3YAbo/s1600/fig1.jpg" /></a></div>
The basic idea is to compute an <em>occlusion factor</em> for each point on a surface and incorporate this into the lighting model, usually by modulating the ambient term such that more occlusion = less light, less occlusion = more light. Computing the occlusion factor can be expensive; offline renderers typically do it by casting a large number of rays in a normal-oriented hemisphere to sample the occluding geometry around a point. In general this isn't practical for realtime rendering.<br />
<br />
To achieve interactive frame rates, computing the occlusion factor needs to be optimized as far as possible. One option is to pre-calculate it, but this limits how dynamic a scene can be (the lights can move around, but the geometry can't).<br />
<br />
Way back in 2007, Crytek implemented a realtime solution for <em><a href="http://en.wikipedia.org/wiki/Crysis">Crysis</a>, </em>which quickly became the yardstick for game graphics. The idea is simple: use per-fragment depth information as an approximation of the scene geometry and calculate the occlusion factor in <em>screen space</em>. This means that the whole process can be done on the GPU, is 100% dynamic and completely independent of scene complexity. Here we'll take a quick look at how the <i>Crysis </i>method works, then look at some enhancements.<br />
<h2>
Crysis Method</h2>
Rather than cast rays in a hemisphere, Crysis samples the depth buffer at points derived from samples in a sphere:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-VKifqEqZKSXGR64QyRVSxBJkGM1_K88SZ5sR5ELYKS13ctnUi5jBI1FnJknOKie5sU9H2PVBjL_Kl5BsqQsUl21Mgu00TikY9DzeIolM1xYnvTI2syJNDYkFUPaLHClZB-kJdglibpo/s1600/fig2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-VKifqEqZKSXGR64QyRVSxBJkGM1_K88SZ5sR5ELYKS13ctnUi5jBI1FnJknOKie5sU9H2PVBjL_Kl5BsqQsUl21Mgu00TikY9DzeIolM1xYnvTI2syJNDYkFUPaLHClZB-kJdglibpo/s1600/fig2.jpg" /></a></div>
<br />
This works in the following way:<br />
<ul>
<li>project each sample point into screen space to get the coordinates into the depth buffer</li>
<li>sample the depth buffer</li>
<li>if the sample position is behind the sampled depth (i.e. inside geometry), it contributes to the occlusion factor</li>
</ul>
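The steps above can be sketched on the CPU. Everything here is a stand-in: `depthAt` plays the role of the depth buffer lookup, and the projection into screen space is elided (the sample's x/y are used directly):

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Vec3 { float x, y, z; };

// Return the fraction of sphere samples that fall behind the stored depth.
// 'depthAt' stands in for a depth buffer lookup at position (x, y);
// depth increases away from the camera.
float occlusionFactor(const Vec3& point,
                      const std::vector<Vec3>& sphereSamples,
                      float radius,
                      const std::function<float(float, float)>& depthAt)
{
    int occluded = 0;
    for (const Vec3& s : sphereSamples) {
        Vec3 p = { point.x + s.x * radius,
                   point.y + s.y * radius,
                   point.z + s.z * radius };
        // look up scene depth at the sample position (projection to
        // screen space elided in this sketch); if the sample lies behind
        // the stored depth, it is inside geometry and occludes:
        if (p.z > depthAt(p.x, p.y)) {
            ++occluded;
        }
    }
    return float(occluded) / float(sphereSamples.size());
}
```

Note how a point on a flat wall ends up roughly 50% occluded with a spherical kernel - the artifact discussed below.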
Clearly the quality of the result is directly proportional to the number of samples, which needs to be minimized in order to achieve decent performance. Reducing the number of samples, however, produces ugly 'banding' artifacts in the result. This problem is remedied by randomly rotating the sample kernel at each pixel, trading banding for high frequency noise which can be removed by blurring the result.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0qldscr1VmQeJ3vLgU0tSOobhyphenhyphen0JA4hs-7H1PqKxhJGcCTf_diFSmiA9SghczJKrZQR1dEun71uiDt_gH5l3PEmokLwanijOI8LLVcZn8KvdxQ9pOWsnOR-fQlnYNGfzEFI6lxo3wspw/s1600/fig3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0qldscr1VmQeJ3vLgU0tSOobhyphenhyphen0JA4hs-7H1PqKxhJGcCTf_diFSmiA9SghczJKrZQR1dEun71uiDt_gH5l3PEmokLwanijOI8LLVcZn8KvdxQ9pOWsnOR-fQlnYNGfzEFI6lxo3wspw/s1600/fig3.jpg" /></a></div>
The <i>Crysis </i>method produces occlusion factors with a particular 'look' - because the sample kernel is a sphere, flat walls end up looking grey because ~50% of the samples end up being inside the surrounding geometry. Concave corners darken as expected, but convex ones appear lighter since fewer samples fall inside geometry. Although these artifacts are visually acceptable, they produce a stylistic effect which strays somewhat from photorealism.<br />
<h2>
Normal-oriented Hemisphere</h2>
Rather than sample a spherical kernel at each pixel, we can sample within a hemisphere, oriented along the surface normal at that pixel. This improves the look of the effect with the penalty of requiring per-fragment normal data. For a deferred renderer, however, this is probably already available, so the cost is minimal (especially when compared with the improved quality of the result).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeKg1k29SbO_dKk9cUXIwk9_-WWHhNVmN1sP_jUB0NaXDNM6DNAjGiUREf1XQS1i7mpPxbdIfODoEhD4bJKIJ1doQbk_8wSJFh-Gcd1awl_fdES2NrDajZhYx5pfaYOqiuK99bDnOki4c/s1600/fig4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeKg1k29SbO_dKk9cUXIwk9_-WWHhNVmN1sP_jUB0NaXDNM6DNAjGiUREf1XQS1i7mpPxbdIfODoEhD4bJKIJ1doQbk_8wSJFh-Gcd1awl_fdES2NrDajZhYx5pfaYOqiuK99bDnOki4c/s1600/fig4.jpg" /></a></div>
<h3>
Generating the Sample Kernel</h3>
The first step is to generate the sample kernel itself. The requirements are that<br />
<ul>
<li>sample positions fall within the unit hemisphere</li>
<li>sample positions are more densely clustered towards the origin. This effectively attenuates the occlusion contribution according to distance from the kernel centre - samples closer to a point occlude it more than samples further away</li>
</ul>
Generating the hemisphere is easy:<br />
<div class="listing">
<pre class="sh_cpp">for (int i = 0; i < kernelSize; ++i) {
   kernel[i] = vec3(
      random(-1.0f, 1.0f),
      random(-1.0f, 1.0f),
      random(0.0f, 1.0f)
   );
   kernel[i].normalize();
}
</pre>
</div>
This creates sample points on the surface of a hemisphere oriented along the z axis. The choice of orientation is arbitrary - it will only affect the way we reorient the kernel in the shader. The next step is to scale each of the sample positions to distribute them within the hemisphere. This is most simply done as:<br />
<div class="listing">
<pre class="sh_cpp"> kernel[i] *= random(0.0f, 1.0f);
</pre>
</div>
which will produce an evenly distributed set of points. What we actually want is for the distance from the origin to fall off as we generate more points, according to a curve like this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIyD8FUS1Mg8tI_vhtxnSO0aj8VQ4hPCrbBQAU5xbElHx7rR3SWU7DiyEJGDSf0huqU2MGpFnQUCJuwu2R8MaC3NlUBr6UhJdrw18KJoC2oF9dewpVqUKYI-hu262XwaZ6yH1XfBeQiXA/s1600/fig5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIyD8FUS1Mg8tI_vhtxnSO0aj8VQ4hPCrbBQAU5xbElHx7rR3SWU7DiyEJGDSf0huqU2MGpFnQUCJuwu2R8MaC3NlUBr6UhJdrw18KJoC2oF9dewpVqUKYI-hu262XwaZ6yH1XfBeQiXA/s1600/fig5.jpg" /></a></div>
<br />
We can use an accelerating interpolation function to achieve this:<br />
<div class="listing">
<pre class="sh_cpp"> float scale = float(i) / float(kernelSize);
scale = lerp(0.1f, 1.0f, scale * scale);
kernel[i] *= scale;
</pre>
</div>
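Putting the generation and falloff together, a self-contained version might look like this. The `Vec3` type and `randomRange` helper are stand-ins for whatever maths library you use:

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

struct Vec3 { float x, y, z; };

// uniform random float in [lo, hi]
float randomRange(float lo, float hi)
{
    return lo + (hi - lo) * (float(std::rand()) / float(RAND_MAX));
}

float lerp(float a, float b, float t) { return a + (b - a) * t; }

// Generate kernelSize sample points inside the unit hemisphere (z >= 0),
// clustered towards the origin by the accelerating scale curve.
std::vector<Vec3> generateKernel(int kernelSize)
{
    std::vector<Vec3> kernel(kernelSize);
    for (int i = 0; i < kernelSize; ++i) {
        Vec3 v = { randomRange(-1.0f, 1.0f),
                   randomRange(-1.0f, 1.0f),
                   randomRange( 0.0f, 1.0f) };
        float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        v.x /= len; v.y /= len; v.z /= len; // point on the hemisphere surface

        float scale = float(i) / float(kernelSize);
        scale = lerp(0.1f, 1.0f, scale * scale); // accelerate towards 1
        kernel[i] = { v.x * scale, v.y * scale, v.z * scale };
    }
    return kernel;
}
```

The resulting points all satisfy the two requirements listed above: they lie within the unit hemisphere, and their distance from the origin follows the falloff curve.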
<h3>
Generating the Noise Texture</h3>
Next we need to generate a set of random values used to rotate the sample kernel, which will effectively increase the sample count and minimize the 'banding' artefacts mentioned previously.<br />
<div class="listing">
<pre class="sh_cpp">for (int i = 0; i < noiseSize; ++i) {
noise[i] = vec3(
random(-1.0f, 1.0f),
random(-1.0f, 1.0f),
0.0f
);
noise[i].normalize();
}
</pre>
</div>
Note that the <em>z</em> component is zero; since our kernel is oriented along the <em>z</em>-axis, we want the random rotation to occur around that axis.<br />
<br />
These random values are stored in a texture and tiled over the screen. The tiling of the texture causes the orientation of the kernel to be repeated and introduces regularity into the result. By keeping the texture size small we can make this regularity occur at a high frequency, which can then be removed with a blur step that preserves the low-frequency detail of the image. Using a 4x4 texture and blur kernel produces excellent results at minimal cost. This is the same approach as used in <i>Crysis</i>.<br />
<h3>
The SSAO Shader</h3>
With all the prep work done, we come to the meat of the implementation: the shader itself. There are actually two passes: calculating the occlusion factor, then blurring the result.<br />
<br />
Calculating the occlusion factor requires first obtaining the fragment's view space position and normal:<br />
<div class="listing">
<pre class="sh_glsl"> vec3 origin = vViewRay * texture(uTexLinearDepth, vTexcoord).r;
</pre>
</div>
I reconstruct the view space position by combining the fragment's linear depth with the interpolated <code>vViewRay</code>. See <a href="http://mynameismjp.wordpress.com/2009/03/10/reconstructing-position-from-depth/">Matt Pettineo's blog</a> for a discussion of other methods for reconstructing position from depth. The important thing is that <code>origin</code> ends up being the fragment's view space position.<br />
Retrieving the fragment's normal is a little more straightforward; the scale/bias and normalization steps are necessary unless you're using some high precision format to store the normals:<br />
<div class="listing">
<pre class="sh_glsl"> vec3 normal = texture(uTexNormals, vTexcoord).xyz * 2.0 - 1.0;
normal = normalize(normal);
</pre>
</div>
Next we need to construct a change-of-basis matrix to reorient our sample kernel along the origin's normal. We can cunningly incorporate the random rotation here, as well:<br />
<div class="listing">
<pre class="sh_glsl"> vec3 rvec = texture(uTexRandom, vTexcoord * uNoiseScale).xyz * 2.0 - 1.0;
vec3 tangent = normalize(rvec - normal * dot(rvec, normal));
vec3 bitangent = cross(normal, tangent);
mat3 tbn = mat3(tangent, bitangent, normal);
</pre>
</div>
The first line retrieves a random vector <code>rvec</code> from our noise texture. <code>uNoiseScale</code> is a <code>vec2</code> which scales <code>vTexcoord</code> to tile the noise texture. So if our render target is 1024x768 and our noise texture is 4x4, <code>uNoiseScale</code> would be (1024 / 4, 768 / 4). (This can just be calculated once when initialising the noise texture and passed in as a uniform).<br />
<br />
The next three lines use the <a href="http://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process">Gram-Schmidt process</a> to compute an orthogonal basis, incorporating our random rotation vector <code>rvec</code>.<br />
<br />
The last line constructs the transformation matrix from our <code>tangent</code>, <code>bitangent</code> and <code>normal</code> vectors. The <code>normal</code> vector fills the <em>z</em> component of our matrix because that is the axis along which the base kernel is oriented.<br />
<br />
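The same Gram-Schmidt construction can be checked on the CPU with minimal vector helpers (these are illustrative, not from the demo code):

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

Vec3 normalize(const Vec3& v)
{
    float len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}

// Build a tangent orthogonal to 'normal' by removing the component of
// 'rvec' that lies along 'normal' (Gram-Schmidt).
Vec3 makeTangent(const Vec3& rvec, const Vec3& normal)
{
    float d = dot(rvec, normal);
    return normalize({ rvec.x - normal.x * d,
                       rvec.y - normal.y * d,
                       rvec.z - normal.z * d });
}
```

Whatever random <code>rvec</code> we feed in, the resulting tangent/bitangent/normal triple is orthonormal - so each pixel gets a differently rotated but always valid basis.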
Next we loop through the sample kernel (passed in as an array of <code>vec3</code>, <code>uSampleKernel</code>), sample the depth buffer and accumulate the occlusion factor:<br />
<div class="listing">
<pre class="sh_glsl">float occlusion = 0.0;
for (int i = 0; i < uSampleKernelSize; ++i) {
// get sample position:
vec3 sample = tbn * uSampleKernel[i];
sample = sample * uRadius + origin;
// project sample position:
vec4 offset = vec4(sample, 1.0);
offset = uProjectionMat * offset;
offset.xy /= offset.w;
offset.xy = offset.xy * 0.5 + 0.5;
// get sample depth:
float sampleDepth = texture(uTexLinearDepth, offset.xy).r;
// range check & accumulate:
float rangeCheck= abs(origin.z - sampleDepth) < uRadius ? 1.0 : 0.0;
occlusion += (sampleDepth <= sample.z ? 1.0 : 0.0) * rangeCheck;
}
</pre>
</div>
Getting the view space sample position is simple; we multiply by our orientation matrix <code>tbn</code>, then scale the sample by <code>uRadius</code> (a nice artist-adjustable factor, passed in as a uniform) then add the fragment's view space position <code>origin</code>.<br />
We now need to project <code>sample</code> (which is in view space) back into screen space to get the texture coordinates with which we sample the depth buffer. This step follows the usual process - multiply by the current projection matrix (<code>uProjectionMat</code>), perform <i>w</i>-divide then scale and bias to get our texture coordinate: <code>offset.xy</code>.<br />
<br />
Next we read <code>sampleDepth</code> out of the depth buffer (<code>uTexLinearDepth</code>). If this is in front of the sample position, the sample is 'inside' geometry and contributes to occlusion. If <code>sampleDepth</code> is behind the sample position, the sample doesn't contribute to the occlusion factor. Introducing a <code>rangeCheck</code> helps to prevent erroneous occlusion between large depth discontinuities:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPKxx5T6erL1q6jL5w49svUy-ks3mrHFElxR3RbTH80NWk4-a3eO0o5YnTeEFlz-HzBKobJ2sS8gfpkjs1dlcfwyUCSjsQak3xepaVmNItkSaWdZOylSzA5Az-yZrfa5FQlXCeRde7sTk/s1600/fig6.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPKxx5T6erL1q6jL5w49svUy-ks3mrHFElxR3RbTH80NWk4-a3eO0o5YnTeEFlz-HzBKobJ2sS8gfpkjs1dlcfwyUCSjsQak3xepaVmNItkSaWdZOylSzA5Az-yZrfa5FQlXCeRde7sTk/s1600/fig6.jpg" /></a></div>
<br />
As you can see, <code>rangeCheck</code> works by zeroing any contribution from outside the sampling radius.<br />
<br />
The final step is to normalize the occlusion factor and invert it, in order to produce a value that can be used to directly scale the light contribution.<br />
<div class="listing">
<pre class="sh_glsl"> occlusion = 1.0 - (occlusion / uSampleKernelSize);
</pre>
</div>
<h3>
The Blur Shader</h3>
The blur shader is very simple: all we want to do is average a 4x4 rectangle around each pixel to remove the 4x4 noise pattern:<br />
<div class="listing">
<pre class="sh_glsl">uniform sampler2D uTexInput;
uniform int uBlurSize = 4; // use size of noise texture
noperspective in vec2 vTexcoord; // input from vertex shader
out float fResult;
void main() {
   vec2 texelSize = 1.0 / vec2(textureSize(uTexInput, 0));
   float result = 0.0;
   vec2 hlim = vec2(float(-uBlurSize) * 0.5 + 0.5);
   for (int i = 0; i < uBlurSize; ++i) {
      for (int j = 0; j < uBlurSize; ++j) {
         vec2 offset = (hlim + vec2(float(i), float(j))) * texelSize;
         result += texture(uTexInput, vTexcoord + offset).r;
      }
   }
   fResult = result / float(uBlurSize * uBlurSize);
}</pre>
</div>
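The offset pattern the loop generates is easy to verify on the CPU: for a blur size of 4, `hlim` works out to -1.5, so the taps land at -1.5, -0.5, 0.5 and 1.5 texels along each axis, centred on the current pixel. A minimal one-dimensional sketch (not the demo code):

```cpp
#include <cassert>
#include <vector>

// Reproduce the blur shader's tap offsets along one axis, in texels.
std::vector<float> blurOffsets(int blurSize)
{
    float hlim = float(-blurSize) * 0.5f + 0.5f;
    std::vector<float> offsets(blurSize);
    for (int i = 0; i < blurSize; ++i) {
        offsets[i] = hlim + float(i);
    }
    return offsets;
}
```

Because the taps straddle texel boundaries symmetrically, the average is taken over a square exactly centred on the pixel being blurred.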
The only thing to note in this shader is <code>texelSize</code>, which allows us to accurately sample texel centres based on the resolution of the AO render target.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcbbIS1yl1p84qLGvKrIg-fibogT579o9gFTWn_3b_MtOD-OEn5A3IXRgy_ZUNxveNK2KrCfz6VMXbtVtyjG4eDjdenEM3lpMu0PrvYcKdGuD_O7aKcxBEywmdihtvK-eBQqnOg3rEjbI/s1600/fig7.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcbbIS1yl1p84qLGvKrIg-fibogT579o9gFTWn_3b_MtOD-OEn5A3IXRgy_ZUNxveNK2KrCfz6VMXbtVtyjG4eDjdenEM3lpMu0PrvYcKdGuD_O7aKcxBEywmdihtvK-eBQqnOg3rEjbI/s1600/fig7.jpg" /></a></div>
<br />
<h2>
Conclusion</h2>
The normal-oriented hemisphere method produces a more realistic-looking result than the basic <em>Crysis</em> method, without much extra cost, especially when implemented as part of a deferred renderer where the extra per-fragment data is readily available. It's pretty scalable, too - the main performance bottleneck is the size of the sample kernel, so you can either go for fewer samples or have a lower resolution AO target.<br />
<br />
A demo implementation is available <a href="http://www.john-chapman.net/demos/ssao.zip">here</a>.
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='100' height='480' src='https://www.youtube.com/embed/-IFxjKT7MXA?feature=player_embedded' frameborder='0'></iframe></div>
<br />
The <a href="http://en.wikipedia.org/wiki/Screen_Space_Ambient_Occlusion">Wikipedia article on SSAO</a> has a good set of external links and references for information on other techniques for achieving real time ambient occlusion.