flipcode IOTD - Scali (01-16-2004)

Submitted by Scali, posted on 16 January 2004

Image Description, by Scali

What you see here is the rather well-known car mesh, using the rather well-known Blinn-Phong lighting equation. The mesh is low-poly, and a normal map provides detailed info for per-pixel lighting. So far nothing special.

But it is rendered on my trusty GeForce2, not on a modern 3D card (if anyone wants to donate such a card, feel free though). And that's the thing I would like to talk about. When I wanted to implement the pow() for the specular light, I ran into the problem that the alpha blend ops are rather limited, so a conventional multipass technique would not work well. It would be much better if you could just read back the framebuffer as a texture, and pass it through the entire shading pipeline again. So this is what I did. And since I don't think this alternative method of multipass rendering is very common, I thought I'd write something on it. I think it can be very useful in general, not just for old hardware like my GeForce2, but perhaps also for more modern hardware, when you run into the instruction limit for pixel shaders, for example.

Rendering the screen to a texture is easy, I suppose. The tricky part is getting the right texture coordinates at each pixel, so you can sample the previous pass 1:1. I did this by having the card generate texture coordinates from the camera-space position automatically, then applying a texture matrix (basically a projection matrix), and finally projecting the homogeneous coordinates to 2D screen coordinates (in the 0..1 range). You should be careful here, and make sure that you align pixel and texel centers properly. Both pixels and texels get sampled at their center, but while the top-left corner of a texture is (0,0), the top-left corner of the screen is actually (-0.5,-0.5). In our case we have normalized coordinates, so instead of the screen going from (0,0) to (width,height), it goes from (0,0) to (1,1). So effectively the point we need to map to (0,0) is not (-0.5,-0.5), but (-0.5/width,-0.5/height). Another thing is that the y axis usually points up in camera space, while you want it pointing down in screen/texture space, so you need to flip that too when mapping y to v.
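To make the offset concrete, here is a small Python sketch of the arithmetic. It is not the code from the demo, and the function names are made up for illustration. It assumes the convention implied above: integer screen coordinates are pixel centers, so after projection pixel i sits at x = -1 + 2i/width, and row j (counted from the top) sits at y = 1 - 2j/height.

```python
def pixel_center_to_ndc(i, j, width, height):
    """Projected (-1..1) position of the center of pixel (i, j).

    Assumes integer screen coordinates are pixel centers and that
    j counts rows downward from the top of the rendertarget.
    """
    x = -1.0 + 2.0 * i / width
    y = 1.0 - 2.0 * j / height
    return x, y


def ndc_to_uv(x, y, width, height, align=True):
    """Scale x, flip and scale y, and optionally add the half-texel shift."""
    shift_u = 0.5 / width if align else 0.0
    shift_v = 0.5 / height if align else 0.0
    u = 0.5 * x + 0.5 + shift_u
    v = -0.5 * y + 0.5 + shift_v
    return u, v


def texel_center(i, j, width, height):
    """Texture coordinate of the center of texel (i, j)."""
    return (i + 0.5) / width, (j + 0.5) / height


if __name__ == "__main__":
    w, h = 640, 480
    x, y = pixel_center_to_ndc(10, 20, w, h)
    print("without shift:", ndc_to_uv(x, y, w, h, align=False))
    print("with shift:   ", ndc_to_uv(x, y, w, h))
    print("texel center: ", texel_center(10, 20, w, h))
```

Without the shift, the coordinate lands on the edge between two texels rather than on a texel center, which is where a bilinear filter starts mixing in the neighbours.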

To perform all this, I take my projection matrix, and multiply it by the following matrix:

| 0.5             0                0 0 |
| 0               -0.5             0 0 |
| 0               0                0 1 |
| 0.5+(0.5/width) 0.5+(0.5/height) 1 0 |

As you can see, it scales and translates x and y from the (-1,1) range to the (0,1) range required for the texture, and it does the translation to align the texel and pixel centers (note that width and height are the dimensions of the current rendertarget, not the texture itself!). I also swap z and w around, because I don't need z, and this way I can pass 3D homogeneous coordinates that get projected to 2D. So I then set this matrix, set the states to inform the rasterizer to expect 3D coordinates, which it should project to 2D, and we're all set.

If I now render my geometry again, I can sample pixels from the previous pass directly from a texture, and pass them through the entire pipeline. In my case this meant that I could use 2 modulation operations, which gives me specular^4 in one pass. With 2 passes I get the specular^16 that I want. You can also use smaller rendertargets to speed the operations up, at the cost of some accuracy, of course. In the case of the specular highlight it actually works out nicely, because the upsampling eliminates some of the aliasing that you get from the low precision of 8-bit processing, and it's cheaper than applying an actual blur filter, of course.
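The two-pass exponent can be sketched in a few lines of Python. This is only a model of the idea, with hypothetical function names: each pass squares its input twice, and the second pass takes the first pass's output as its input, the way the rendertarget is read back as a texture. The rounding to 8 bits after every modulate is an assumption made here to show the precision loss; the real hardware may keep more bits between stages.

```python
def quantize8(x):
    """Clamp to 0..1 and round to the nearest 8-bit level (assumed storage)."""
    clamped = max(0.0, min(1.0, x))
    return round(clamped * 255) / 255


def modulate(a, b):
    """One modulate stage: multiply, then store at 8-bit precision."""
    return quantize8(a * b)


def specular_pass(x):
    """Two chained modulations in a single pass: x -> x^2 -> x^4."""
    squared = modulate(x, x)
    return modulate(squared, squared)


def specular16(n_dot_h):
    """Two passes: the first writes x^4, the second reads it back and
    raises it to the 4th power again, giving x^16 overall."""
    first = specular_pass(quantize8(n_dot_h))
    return specular_pass(first)


if __name__ == "__main__":
    for value in (1.0, 0.98, 0.95, 0.9, 0.8):
        print(value, specular16(value), value ** 16)
```

Running it shows the quantized result tracking the exact x^16 near the peak of the highlight and stepping down in visible bands as the values get small, which is the kind of aliasing that the upsampling from a smaller rendertarget helps to hide.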

(Note, by the way, that the mapping is not 100% exact. I tried rendering an image and subtracting it from itself, and here and there some pixels were not entirely black, but I'm not sure if that is due to inaccuracy in the hardware, or if the matrix itself is not accurate enough. However, it was good enough for my needs, and I think it should work out fine in most cases. Perhaps it can be completely eliminated if the bilinear filter is disabled (using textures of the same size as the backbuffer, of course), but I forgot to test it at the time.)
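One thing that can be checked offline is the matrix itself. The Python sketch below (hypothetical helper names, not the demo's code) applies the matrix from above to a clip-space position using row vectors, divides by the third component as a projected 3D texture coordinate would be, and compares the result with the texel center for every pixel. It starts from coordinates that have already been through the projection matrix, and it assumes pixel centers sit at integer screen coordinates. It runs in double precision, so it says nothing about what the hardware does at its own precision.

```python
def screen_texture_matrix(width, height):
    """The scale/flip/offset matrix from the text, with z and w swapped."""
    return [
        [0.5, 0.0, 0.0, 0.0],
        [0.0, -0.5, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.5 + 0.5 / width, 0.5 + 0.5 / height, 1.0, 0.0],
    ]


def vec_mul(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][c] for k in range(4)) for c in range(4)]


def project_uv(clip, width, height):
    """Transform a clip-space position, then divide by the third component."""
    s, t, q, _ = vec_mul(clip, screen_texture_matrix(width, height))
    return s / q, t / q


def max_center_error(width, height, w=3.7):
    """Largest distance between projected uv and the true texel center."""
    worst = 0.0
    for j in range(height):
        for i in range(width):
            x = -1.0 + 2.0 * i / width   # pixel center after the divide by w
            y = 1.0 - 2.0 * j / height   # row j counted from the top
            clip = [x * w, y * w, 0.25 * w, w]
            u, v = project_uv(clip, width, height)
            worst = max(worst,
                        abs(u - (i + 0.5) / width),
                        abs(v - (j + 0.5) / height))
    return worst


if __name__ == "__main__":
    print(max_center_error(64, 48))
```

If the printed error is down at the level of floating-point rounding, the matrix as written is at least consistent with the centers it is meant to hit, under the stated assumptions.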

Anyway, I found this method interesting and useful, and I hope that some of you do as well.

If you have a PC with Windows and DirectX 9.0b installed, you can see it live here:
http://scali.eu.org/~bohemiq/PhongCar.zip
http://scali.eu.org/~bohemiq/PhongCar2.zip

You will need a card with dot3 support, 2 textures per pass and cubemap support. Original GeForce/Radeon or higher should work, in general.

Scali
