
Steep Parallax Mapping
Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.

March 03, 2005, 02:07 PM

Max McGuire & I released a new technique that modifies Parallax Mapping so that it can handle steep edges (high frequencies). The results are good enough that it can even render Lengyel's Shell-style fur in a single pass. This is hard because fur is the ultimate in high frequencies-- the bump map might alternate from 0 to 255 between individual texels.



March 03, 2005, 05:28 PM

Is this any different from relief mapping?


March 04, 2005, 10:41 AM

There are a couple of techniques named Relief Mapping. Oliveira and Bishop's Relief Texture Mapping from SIGGRAPH pre-transforms textures in a way that is clever but can't run on today's GPUs. Steep Parallax Mapping was designed to remove the "low frequency" restriction from Parallax Mapping without requiring new assets or a significant change to the shader.

Fabio Policarpo's Relief Mapping (the one you linked) from the ShaderTech website uses the same ray-tracing idea as we do (as does the GPU Gems 2 chapter we cite; I wasn't aware of Policarpo's shader until now). The differences are subtle, and all are useful techniques.

GPU Gems 2 needs an extra data structure and 3D textures; it may be the right answer for some future generation of hardware but I think the data structures are suboptimal for today's art pipelines and hardware.

The ShaderTech one uses a binary search for the surface intersection after the linear search (a good but slow idea; we use mip-map biasing to fix the sampling issue instead), but it *always* pays the worst-case ray march. I think it also has a bug-- he fails to normalize the st stepping direction by the z coordinate (interestingly, normalization by length or by w turns out to always be unnecessary, since x, y, and z are all scaled equally).
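
The contrast between the two search strategies can be sketched on the CPU. This is illustrative Python over a toy 1-D heightfield slice; the `height` function, the bump shape, and the step counts are all made up for illustration, not either shader's actual code:

```python
def height(x):
    # toy heightfield in [0, 1]: a single steep bump
    return 0.8 if 0.4 <= x <= 0.6 else 0.2

def march_fixed_plus_binary(x0, x1, steps=8, refinements=5):
    """Always-worst-case linear search, then binary refinement
    (the ShaderTech-style approach)."""
    prev, hit = x0, x1
    for i in range(1, steps + 1):
        x = x0 + (x1 - x0) * i / steps
        ray_h = 1.0 - i / steps        # ray drops from 1 at x0 to 0 at x1
        if ray_h <= height(x):
            hit = x
            break
        prev = x
    # binary search between the last sample above and the first below
    lo, hi = prev, hit
    for _ in range(refinements):
        mid = 0.5 * (lo + hi)
        ray_h = 1.0 - (mid - x0) / (x1 - x0)
        if ray_h <= height(mid):
            hi = mid
        else:
            lo = mid
    return hi

def march_early_out(x0, x1, steps=8):
    """Early-out linear march (the Steep Parallax style on PS3.0):
    terminates as soon as the ray drops below the surface."""
    for i in range(1, steps + 1):
        x = x0 + (x1 - x0) * i / steps
        ray_h = 1.0 - i / steps
        if ray_h <= height(x):
            return x, i                # hit after i texture reads
    return x1, steps                   # ray exits without hitting
```

The early-out version pays only as many texture reads as it takes to find the surface, which is the performance difference being discussed.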

I think we're the first to demonstrate fur and self-shadowing, a simple but useful extension. In the case of fur, our faster texture lookups and early-out (on PS3.0 hardware; on PS2.0 we use a hard-coded 9-step ray march) really shine. On average the ray cast terminates after only one or two texture reads, so the performance is much better than Shells (or the ShaderTech method) and only slightly worse than pure parallax mapping.



March 04, 2005, 02:23 PM

Is there a working demo of this technique available anywhere? If so, what are the minimum hardware requirements?

I'd like to try this out, but I have a Radeon 9600, which I believe doesn't support PS3.0. The shadertech relief-mapping demo didn't run on it. Also, I suspect that your method would perform poorly on it since there would be no early-out.

This isn't a criticism of your method, but it's a bit frustrating that it doesn't work as intended on anything but the currently higher-end cards. (I can't upgrade my video card, as my computer is a notebook.)

"...without requiring new assets or a significant change to the shader."

Just quibbling now, but I'd say replacing a parallax mapping shader with a ray tracing shader is a pretty significant change. =)

Also, the shadertech demo has self-shadowing. But not fur. =)


March 04, 2005, 03:38 PM

Really, it is just a 3-line change to the shader for the minimal implementation. I put the critical inner loop code on the PDF to show this. The art pipeline is unchanged and the C++ code is unchanged from a regular bump-map implementation. This is cool because you could add it to a mod without access to the original code (e.g., you could probably hack this into HL2 and Doom3 if you snuck a bump map into the alpha channel of the normal map). Of course, adding self-shadowing and some of the optimizations take more than those three lines, but the core idea is a small implementation difference from Parallax Mapping.
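
A minimal CPU-side sketch of that inner-loop change (Python; the `bump` function, scale, and step count here are illustrative stand-ins, not the authors' shader code). Ordinary parallax mapping applies one offset; the steep variant steps the texture coordinate along the eye ray until the ray drops below the stored height:

```python
def steep_parallax(uv, eye_xy, bump, scale=0.1, steps=9):
    """uv: (u, v) texcoord; eye_xy: tangent-space eye direction (x, y)
    already divided by z; bump(u, v): stored height in [0, 1]."""
    du = -eye_xy[0] * scale / steps
    dv = -eye_xy[1] * scale / steps
    ray_h = 1.0                        # start at the top of the bump volume
    u, v = uv
    while ray_h > bump(u, v):          # march until the ray is below the surface
        ray_h -= 1.0 / steps
        u += du
        v += dv
    return u, v                        # offset coordinate to shade with
```

The loop always terminates because `ray_h` reaches zero after at most `steps` iterations while the stored height is non-negative.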

We'll have some demos and a fuller tech report out after the presentation at I3D in April. The PS3.0 version is GLSL and can shade every pixel of 800 x 600 at 50 fps on a GeForce 6800. I expect better rates with newer drivers, though-- NVIDIA found a bug in their driver under our demo and is patching it shortly. We have a PS2.0 version in DirectX that would run on your card, but it hasn't been fully optimized (or debugged, for that matter).

I didn't see the self-shadowing code in the ShaderTech .cg file, but I believe you. The idea of ray-tracing in a pixel shader has been out there for some time (there were two different SIGGRAPH papers on the topic in previous years). The only real issues are how you handle the sampling and optimizing-- pixel shader cycles aren't cheap. This seems like such a powerful primitive that I hope that future hardware will support it more directly.



March 04, 2005, 05:55 PM

Reedbeta wrote: This isn't a criticism of your method, but it's a bit frustrating that it doesn't work as intended on anything but the currently higher-end cards. (I can't upgrade my video card, as my computer is a notebook.)

You can do this with PS 2.0, but you are limited in the number of steps you can take because of the instruction count.



March 04, 2005, 06:42 PM

Good work!
Although Max smells like Turnips.



March 04, 2005, 08:20 PM

And more importantly, there's no early out, so performance will likely be awful.

I didn't say it doesn't work on PS2.0, only that it doesn't work "as intended".


March 05, 2005, 12:59 PM

Performance isn't actually that bad. If you compare it to something like rendering 16 alpha blended shells on a character for fur, you're doing pretty comparable per-pixel work.



March 05, 2005, 04:47 PM

Hey. I should mention off the top that I wrote the gpu gems article you cited in your poster, so I may be a little biased.

I am curious, how do you handle aliasing in this technique? This is a problem common to techniques based on sampling. For example, it would be interesting to see how the algorithm looks on textured slopes of varying steepness. It would seem the way to do this would be with the "LOD" parameter, but you have omitted its computation. The problem then becomes choice of LOD, because for surfaces such as hair, filtering depth values is not necessarily the desired behaviour.

Also, do you have any performance tests that back up your claim that there is a performance increase vs. distance mapping? I would be interested in seeing comparisons, since I didn't have time to include these in my chapter.

Anyhow, very nice results and congratulations on the poster.

PS. You may want to correct your poster, my last name has two N's.


March 05, 2005, 08:22 PM

Hi William. Your article was very nice and it was especially good to have posted publicly on the NVIDIA site. The citation I gave is:

[1] Donnelly, Per-Pixel Displacement Mapping with Distance Functions, to appear in GPU Gems 2, 2005

That's the way your name appears in the table of contents. Maybe your PDF viewer hid the second 'n'? E-mail me to be sure I get it correct on the poster.

The fur image in the extended abstract uses LOD bias (1.5) applied in the OpenGL code, outside the shader. Since we compute the detail level in the shader, it is possible to simply apply this constant at that location as well. At 8x super-sampling no LOD bias is necessary unless the bump height is extreme, but I took those shots with FSAA at 2x and 4x to give a fair comparison. To avoid aliasing & z-fighting in the shadows I also bump the shadow ray. I've experimented with other forms of filtering the reconstructed heightmap inside the shader, but for games I think that LOD bias is best because it is very efficient.
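
The effect of a constant LOD bias can be illustrated with the standard mip-level formula (a Python sketch; the footprint values are made up, and in OpenGL the same constant can be applied through the texture LOD bias state outside the shader, as described above):

```python
import math

def mip_level(texels_per_pixel, bias=0.0, max_level=10.0):
    # standard mip selection: log2 of the screen-space texel footprint,
    # shifted by a constant bias and clamped to the mip chain
    level = math.log2(max(texels_per_pixel, 1.0)) + bias
    return min(max(level, 0.0), max_level)
```

A positive bias picks a blurrier mip than the footprint alone would, which is the cheap anti-aliasing trade being described.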

I haven't implemented distance mapping; however, we did implement Shells and Parallax Mapping. If the code is publicly available it will be easy to run tests against; otherwise, perhaps you can send me a demo.



March 05, 2005, 08:41 PM

I think he was referring to the sentence: "We simplify Donelly's method...."


March 07, 2005, 12:23 PM

Ah, LOD bias. In trying to be clever we sometimes forget the simplest solutions. :)

I will try to make the code from the gpu gems 2 cd available soon, since several people have asked for it.


March 07, 2005, 12:27 PM

There's also something very similar in ShaderX 3 iirc (but not quite so good results).


March 09, 2005, 12:08 AM

I've been doing some experimenting with hybrid approaches to several of these techniques, and I think I have some of the best results on PS2.0 hardware. My implementation uses the g and b channels of the heightmap to store the shortest distance from the top and the bottom of the height volume to the heightmap surface, respectively.

Several interesting things are done with this data. In the vertex shader, the tangent-space eye vector is computed such that the z component is -1.0 and the x and y components are scaled relative to the z component, taking the 'bump depth' into account. In the pixel shader, the distance/heightmap is sampled at the ray origin and at the ray origin plus the tangent-space eye vector (this texel position is at the 'bottom' of the imaginary bounding box containing the displaced surface). At this point, the tangent-space eye vector's length is trimmed by the distance we now know contains no surface. A linear ray march is then performed, sampling the heightmap as we go; in my particular implementation on baseline PS2.0 hardware (I also have a Radeon 9600 Pro), I take 15 samples during the ray march. One last refinement intersects the eye vector with the plane defined by the xy positions and heights of the last samples found outside and inside the heightmap.

Performance is roughly 70 fps shading every pixel at 800x600. Here's a diagram illustrating the improvements.
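
The ray-trimming step can be sketched on the CPU (Python; the 1-D setup and channel semantics are my reading of the description above, not the poster's code). Two conservative empty-space distances, one measured from the top and one from the bottom of the height volume, shrink the segment the linear march has to cover:

```python
def trim_ray(h_entry, h_exit, dist_from_top, dist_from_bottom):
    """h_entry/h_exit: ray heights at the entry (top) and exit (bottom)
    texels; dist_from_*: empty-space distances sampled at those texels.
    Returns the fraction of the ray that still needs marching."""
    total = h_entry - h_exit                  # 1.0 for a full traversal
    skip_front = min(dist_from_top, total)    # safe skip from the top
    skip_back = min(dist_from_bottom, total - skip_front)
    return (total - skip_front - skip_back) / total
```

Concentrating the same 15 fixed samples on the shorter remaining segment is what buys the extra accuracy without extra texture reads.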


March 09, 2005, 02:06 AM

Just thought I'd mention my email is friesen@(remove this) in case you had any questions or comments.


March 09, 2005, 02:28 AM

I hate to post 3 times in a row, but I forgot that some people might like to see results!

Just a few notes: the perspective distortion weirdness in results2 has since been fixed. results3 shows off specular lighting, blending, and refraction used in conjunction with the displacement mapping shader to make a realistic-looking water pool effect (the real thing is animated and fairly neat to watch; the heightmap/normals are generated into a renderable texture in another shader, so in theory a physical water simulation could be done there). The regular displacement mapping shader works in one pass, but with the addition of the refractions this particular shot took two passes; performance decreased but was still very similar. And lastly, results4 shows off silhouetting, which is not completely working yet, but I'm still working on it.

Once again, questions/comments appreciated.


March 09, 2005, 03:34 PM

That silhouette shot is awesome. How do you achieve that? If you can do that sort of thing at 70fps on PS2.0 hardware, you are my new god =D ohh, and did I mention - LET ME DOWNLOAD YOUR SOURCE CODE!


March 09, 2005, 04:07 PM

I'm not ready to release code quite yet. The current implementation is written in GLSL (but in such a way that it is very explicit, translating directly to the intended assembly instructions; otherwise it would never work). There are a few refinements I want to implement for robustness before I release it: namely, the ability to produce accurate silhouettes for a displacement map across an arbitrary mesh by generating a tetrahedron for each triangle (it's not as bad as it sounds, because you can use _very_ low-detail base meshes and still get a very detailed-looking model thanks to the displacement mapping), as well as spherical interpolation of the normals across the 'top' side of each tetrahedron.

The reason it's so fast compared to the PS3.0 implementations of 'relief mapping' and other approaches is that there is a very low-order chain of dependent texture reads and no flow-control branches or loops (obviously, otherwise it wouldn't work in PS2.0 in one pass). Methods like binary search, although accurate, require an additional order of dependent texture reads for each successive iteration. This creates a bottleneck in the architecture, as GPUs process pixel shader instructions in chunks, and those chunks are bounded by dependent texture reads.


March 09, 2005, 04:57 PM

Fair enough. Do you mean that the displacement map has one texel for each triangle in the mesh, which controls the height of the tetrahedron? I'm not quite sure what you mean here. Does this require vertex program texture reads? Do you apply this to the whole mesh all the time, or do you detect where there are silhouettes and apply it only there?

Also, I just noticed that screenshot 1 appears to have a kind of odd seam running down the right side. What causes this?


March 09, 2005, 05:35 PM

Goodness no, not one tetrahedron per texel. There is one tetrahedron generated for each triangle in the base mesh. There are no vertex program texture reads. The seam you see in that shot has since been fixed; it had to do with silhouetting and limited floating-point precision.

Here's a larger version of results4.


March 27, 2005, 10:20 PM

I put the poster (which includes a larger source code snippet), some new results with performance numbers for Steep Parallax, and the bump maps used in the figures on the Brown games website.

Demo and full shaders will follow with a later publication.



March 27, 2005, 11:47 PM

Fixed your name, sorry for the original misspelling!



March 28, 2005, 12:01 AM

I understand you've got this working on PS2.0 hardware. Any chance I could see a code snippet for that? I was trying to implement your technique on my Radeon 9600, but I could only get it to work with 4 iterations and without LOD biasing, which looks very bad. It may just be the ATI drivers' shoddy GLSL support, but the moment I tried to use LOD biasing, the shader crashed the GPU.


March 28, 2005, 08:33 AM

I can't get GLSL programs to work reliably on ATI hardware, either. We used HLSL under DirectX for PS 2.0. Max packed nine iterations into PS 2.0. I'll ask him about releasing the code; we haven't polished the DirectX version very much compared to the GLSL one.



March 31, 2005, 08:44 PM

I'm not quite ready to release source yet...

The trick to getting a higher number of iterations in the pixel shader is maximizing the precalculation of values in each texture stage so that you use all of your temporary registers. ARB_FP1.0 specifies 32 four-component temporary registers, whereas DirectX 9 only includes 12 in the specification, making an equivalent DirectX PS2.0 implementation impossible.

After the initial calculations and samples that yield the ray origin and direction (modulated based on the known ray endpoint and number of samples), the coordinates for the heightmap samples can be generated for each sample in a unique temporary register to prevent texture sample indirection. It is important to emphasize that the 15-sample limit on PS2.0 hardware is not based on the instruction limit or texture instruction limit, but on the number of available temporary registers. At this point, the heightmap texture may be sampled into the remaining 15 registers and the height for the corresponding sample position subtracted from it. You might think extra samples could be gained by exploiting the fact that only one component of the heightmap is useful here, but unfortunately texture sampling instructions in PS2.0 must write out all register components.

The actual point of intersection is determined by performing a series of cmp instructions to determine the first sample under the heightmap, and then by refining the point of intersection as I mentioned earlier.
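
The selection-and-refinement step can be sketched like this (Python; the sample values are illustrative, and the secant-style intersection is my reading of the plane refinement described earlier in the thread, not the poster's shader):

```python
def refine_hit(ts, ray_h, surf_h):
    """ts: sample parameters along the ray; ray_h/surf_h: ray and
    surface heights at each sample. Returns a refined hit parameter."""
    for i in range(1, len(ts)):
        if ray_h[i] <= surf_h[i]:              # first sample under the surface
            # intersect the ray segment with the surface segment between
            # the last-above and first-below samples (the cmp chain picks
            # this pair in the actual shader)
            d0 = ray_h[i - 1] - surf_h[i - 1]  # above: positive
            d1 = ray_h[i] - surf_h[i]          # below: negative or zero
            t = d0 / (d0 - d1)                 # fraction between samples
            return ts[i - 1] + t * (ts[i] - ts[i - 1])
    return ts[-1]                              # no hit: use the ray end
```

The refinement recovers sub-sample precision from the same 15 fixed samples, which is why the fixed-step march holds up visually.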

It is important to note that I did not come up with this implementation entirely on my own; my contribution is the refinement I diagrammed, which considerably shortens the distance the ray must sample, increasing accuracy. The main idea was developed by a friend of mine, Keith Yerex, who works for BioWare.

The reason I'm not releasing code or a demo yet is because I am having difficulties implementing the dual distance map optimization with tetrahedron based geometry to displace arbitrary meshes. If I am not able to come up with a solution soon, I'll just release what I have, which works quite well for displacing locally planar meshes.


April 01, 2005, 04:56 AM

I was actually asking about Morgan's technique...but I'd be happy to see yours too, as soon as you have something you're willing to release =)
