 


Submitted by Henri, posted on July 17, 2001




Image Description, by Henri



Greetings - yep... it's an ATR - Another Terrain Renderer.

- this is my claim to fame :) my post-graduate research focusses on developing a new continuous level-of-detail algorithm for terrain - the pictures above demonstrate the algorithm in action.

Ignore the abysmal frame-rates... my PC and graphics hardware were new back in 1996. On more up-to-date PCs the frame rates average 60 to 120 fps. (The terrain data is a 1024x1024 heightfield; no far-clipping is performed.)

The algorithm utilizes split/merging (à la ROAM) - but ignores the idea of priority queues in favour of 4 constant-time LIFO queues; additionally the underlying mesh structure drops ROAM's RTIN (right-triangulated irregular network) approach in favour of a form of ETRN (equilateral triangle regular network). This mesh structure tessellates faster and requires less work from top-most to bottom-most LOD compared to other approaches.

Additionally the mesh structure varies more quickly from high to low detail. It is also extremely hardware friendly (in the sense that it can be used to generate long triangle strips).

Note from the screenshots: on average there are only about half as many vertices that require processing as there are tris in the scene.
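
The vertex-to-triangle ratio noted above is roughly what one expects of any large triangulated surface whose vertices are shared through an index buffer. A minimal sketch, assuming a plain square grid as a stand-in for the ETRN mesh (the name `build_indexed_grid` and the 65x65 size are illustrative and not taken from the Diamond source):

```python
def build_indexed_grid(n):
    """Return (vertices, indices) for an n x n vertex grid.

    Every grid cell is cut into two triangles, and each vertex is stored
    exactly once and referenced by index, so nothing is duplicated.
    """
    vertices = [(x, y) for y in range(n) for x in range(n)]
    indices = []
    for y in range(n - 1):
        for x in range(n - 1):
            a = y * n + x      # top-left corner of the cell
            b = a + 1          # top-right
            c = a + n          # bottom-left
            d = c + 1          # bottom-right
            indices += [a, b, c, b, d, c]
    return vertices, indices


vertices, indices = build_indexed_grid(65)
triangle_count = len(indices) // 3
ratio = len(vertices) / triangle_count
print(len(vertices), "vertices for", triangle_count, "triangles, ratio", round(ratio, 3))
```

As the grid grows, the ratio tends towards one half, which is in line with the figures in the screenshots.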

The executable and (surprisingly short) source and a basic description of the algorithm are available here:

http://www.cs.sun.ac.za/~henri/diamond.zip

Press "v" a couple of times when executing to texture the terrain.

The official theoretical paper will only come in a couple of months... so don't wait up for that one. ;)

I'd love some (constructive) feedback... so go and gimme some.



 
Message Center / Reader Comments:
 
Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.
 
Richard Garand

July 17, 2001, 01:03 PM

This looks great...is there any popping?

 
cubicle

July 17, 2001, 01:42 PM

nice screenies.. but the demo is a no go for my machine. it claims it cannot create a d3d screen on my geforce :(

 
Dan MacDonald

July 17, 2001, 01:42 PM

I'm curious about the wireframe images; they appear to be an overhead view of the polys in the terrain. Where is the viewing frustum in relation to the images? In the bottom left corner?

 
Rectilinear Cat

July 17, 2001, 01:55 PM

Cool! Haven't had a chance to run the demo yet....ROAM promises tri-strip lengths of around 4 triangles (crappy). What is your average triangle count per strip?

 
prolo

July 17, 2001, 02:00 PM

Pretty nice... I don't care for roam though. :P How do you exit the program?? Alt+Ctrl+Del? Not too elegant!! :P

 
Squint

July 17, 2001, 02:03 PM

Those look like triangle fans to me, not strips, in which case I think I can guess part of what you are doing ;)

Incidentally, you can change a fan into a strip by just alternating the vertices (if it's drawn with the first corner on the edge of a square like those).

ie, the fan (1, 2, 3, 4, 5) is the strip (1, 5, 2, 4, 3)
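
A minimal sketch of that reordering, assuming the fan's hub vertex comes first and the fan outlines a convex polygon (the function names and the sample coordinates are made up for illustration). The strip cuts the same outline into different triangles, so the check below compares covered area rather than the triangle lists themselves:

```python
def fan_to_strip(fan):
    """Reorder fan vertices (hub first) into a zigzag strip order."""
    hub, rim = fan[0], list(fan[1:])
    strip = [hub]
    lo, hi = 0, len(rim) - 1
    take_from_end = True
    while lo <= hi:
        if take_from_end:
            strip.append(rim[hi])
            hi -= 1
        else:
            strip.append(rim[lo])
            lo += 1
        take_from_end = not take_from_end
    return strip


def fan_triangles(fan):
    return [(fan[0], fan[i], fan[i + 1]) for i in range(1, len(fan) - 1)]


def strip_triangles(strip):
    return [tuple(strip[i:i + 3]) for i in range(len(strip) - 2)]


def total_area(triangles, pos):
    """Sum of unsigned 2D triangle areas (winding order is ignored)."""
    area = 0.0
    for a, b, c in triangles:
        (ax, ay), (bx, by), (cx, cy) = pos[a], pos[b], pos[c]
        area += abs((bx - ax) * (cy - ay) - (by - ay) * (cx - ax)) / 2.0
    return area


# Hypothetical convex outline with vertex 1 on a corner.
pos = {1: (0, 0), 2: (2, 0), 3: (3, 1), 4: (2, 2), 5: (0, 2)}
fan = [1, 2, 3, 4, 5]
strip = fan_to_strip(fan)
print(strip)
print(total_area(fan_triangles(fan), pos), total_area(strip_triangles(strip), pos))
```

On a heightfield the two triangulations only describe the same surface where the vertices are coplanar, so the swap can shift the terrain shape slightly.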

dunno if that makes for any speed increase, but it opens up the possibility of linking them to the next one along to make longer strips... (I'm trying to do that myself, only I always find more interesting distractions whenever I start to think about it)

Besides, my algorithm wouldn't get 16 FPS on 1996 hardware yet.....

 
Allanon

July 17, 2001, 02:11 PM

ALT+F4 will exit the program.

 
ecko_53

July 17, 2001, 02:25 PM

As soon as I can run fullscreen apps in low resolutions I'll let ya know what I think :) I skimmed over your article and it looks promising. I suggest you keep up the good work!

 
Pseudo_Me

July 17, 2001, 02:41 PM

In fact Alt-F4 exits just about every Windows program. I never bother making a specific way to exit my demos either. Alt-F4 is just as easy as anything else.

 
Pseudo_Me

July 17, 2001, 02:44 PM

It looks nice, other than the popping. Have you considered trying to minimize the appearance of the popping by interpolating the new vertices' positions?
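
That interpolation (usually called geomorphing) could be sketched as follows, assuming a split inserts a vertex at the midpoint of an edge and that a blend factor `t` ramps from 0 to 1 over a few frames. The function name and parameters are hypothetical and not part of the posted demo:

```python
def geomorph_height(h_left, h_right, h_true, t):
    """Height of a freshly split edge midpoint during a morph.

    At t = 0 the vertex sits on the old edge (no visible change);
    at t = 1 it has reached its true heightfield value.
    """
    t = max(0.0, min(1.0, t))              # clamp the blend factor
    h_on_edge = 0.5 * (h_left + h_right)   # where the coarse mesh had it
    return h_on_edge + t * (h_true - h_on_edge)


for t in (0.0, 0.5, 1.0):
    print(t, geomorph_height(10.0, 20.0, 18.0, t))
```

A merge would run the same blend in reverse before removing the vertex.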

Also, I got about 30-40 on my GeForce2 with an 866 PIII. Not horrible, but still kinda slow for a game.

 
Beast

July 17, 2001, 03:09 PM

"Also, I got about 30-40 on my GeForce2 with an 866 PIII. Not horrible, but still kinda slow for a game."

That's something I don't really understand; I get about 50 to 60 and my computer is a P2 at 450 MHz with a TNT graphics card (and I mean TNT, not TNT2!).

 
Kurt Miller

July 17, 2001, 03:13 PM

">>>>I got about 30-40 on my GeForce2 with an 866 PIII
>>That's somthing I don't really understand, I've got about 50 to 60 and my computer is P2"

Same here. According to the fps display, I got a consistent 50 - 60 with a p2-350 / TNT2.

 
Sebastian Sylvan

July 17, 2001, 03:17 PM

If you plan on using this for a game you should give up any thought on per-vertex manipulation and go for something more hardware-friendly.

Geomipmaps spring to mind.

 
lunar

July 17, 2001, 03:27 PM

I tested your demo and it looks pretty good. A few things I think are worth noting:

1: You need to implement occlusion culling. :)
2: I don't get very good framerates. ~60fps on average with a GeForce3
which leads me to believe that the software is doing a lot more
work than the hardware. ~40 fps for 5000 tris isn't the greatest
considering the hardware.

I think the reason we're seeing slower speeds is because the algorithm is software bound like ROAM. I haven't looked at the code or anything, but I think occlusion culling would speed things up a lot for you. You will run into problems with speed if things like trees and water are added to the landscape. Also, if you plan on having multiple textures it will bog down a lot. It looks really good so far, and I really didn't notice much popping at all. I would say try to limit the amount of work the software does though. Have a good one.

 
Nick Van Tassell

July 17, 2001, 03:47 PM

There's a lot of popping and the frame rate does not seem to be as high as that of a similar ROAM implementation. Here's the ROAM PDF: http://www.llnl.gov/graphics/ROAM/roam.pdf, you may want to look at it so you can alleviate the popping artifacts.

Respectfully submitted, this is GREAT work, I just want to help make it better.

 
Lucid

July 17, 2001, 04:21 PM

I would like to remind people to turn V-Sync off while trying to get a real feel for speed (as with the hardware mentioned by others, better performance should be expected)...
On my system...
AMD 1.2 giga
GeForce2 Ultra
I got between 120-300+ fps average while viewing a decent amount of land... Think it averaged about 180 fps...

cool

 
xstreme2000

July 17, 2001, 04:42 PM

Nice ;)
Just thought I'd let you know: about 85 fps on a 1000MHz Athlon with an ATI Radeon 64DDR
All you gotta do now is finish it :)

 
LoreKeeper

July 17, 2001, 04:44 PM

Hey-Ho! It's me... Henri

thanks to all responses; I'll look into some of the problems/suggestions...

First-off... I'm kinda surprised at the 40fps on high-end machines - but the program doesn't feature elaborate card detection and hardware optimization (quick-n-dirty really). I did specify DirectX managed resources... so I kinda hoped that DirectX would've figured out the Hardware.

Also... the primary graphics adapter is selected... so if your kick-ass card is secondary, you won't benefit from anything.

I'll try to cover all the necessary points:

Vertex-Popping: there is some, but not a lot - by setting the target priority you can trade popping for tris. Personally I suggest adding geo-morphing; this will vastly reduce visual popping and you can get away with drawing even fewer tris.

Occlusion-Culling: you are right ofcoz - occlusion culling (or some other form of PVS) would greatly speed up the rendering. I should have made it more explicit that the implementation is a pure CLOD algorithm - it forms part of my research thesis for this year, as such I cannot water-down the results with additional optimizations that are in essence additional algorithms. WYSIWYG using purely the CLOD algorithm (not counting frustum culling).

Triangle-Strips: *GrinZ* - look at the bottom right image, those are positively long sessions of triangle-strips. I've not implemented a tri-strip version (see below) but strips of 50 and more tris aren't uncommon. I guess 20 tris/strip should be achievable.

ROAM: I've very carefully studied and implemented the ROAM algorithm, as a test bed to compare it with my algorithm; using my implementations Diamond beats ROAM. Obviously there exist ROAM implementations that could beat the Diamond implementation above - but whatever optimization and technique was applied in those implementations can also be used in Diamond. The point is that in my opinion Diamond's structure and queueing mechanisms are more efficient for terrain generation and that if you have similarly implemented Diamond and ROAM renderers, then Diamond will beat ROAM.

I greatly admire the ROAM algorithm and paper, and consider it the most important advance in terrain LOD (ROAM and the predecessor it is based on, by Lindstrom I think...) - however ROAM was designed with completely different requirements in mind than are viable to us with modern graphics hardware. Diamond can benefit from advances in graphics hardware much more readily than ROAM can.

Vertices: in fact the implementation above doesn't use triangle-stripping becoz the more efficient indexed vertexbuffers are used. This ensures that in the above implementation not a single vertex is duplicated and a vertex is transformed only once per frame.

prolo: have you tried [ALT]-[F4]?

Dan: yep... bottom left corner... facing more or less East-Northeast.

Squint: any triangle-fans are incidental - I use arbitrarily chosen tri structures that morph between the LODs. You can obviously look at those properly, and ensure that all LOD transitions are tri-strip friendly. All pure LOD sections are definitely tri-strip friendly.

Sebastian: per-vertex manipulation is expensive... admittedly, but the frame-coherency ensures that only about 5-30 splits/merges occur per frame. No additional per vertex manipulation occurs... so that should be fine. On the other hand... a continuous LOD like ROAM and Diamond can produce considerably lower tri counts whilst offering a more accurate depiction of the terrain. All in all the CPU workload is kept at a minimum - and the GPU should be gettin the brunt of the work (for all practical purposes the GPU [not the CPU] is the bottleneck in this algorithm).

Final: Seems that some high-end cards suffer a bit - the GeForce2 I tested on worked fine... delivering 60fps on average (on a standard CPU). My guess is that there is some other (implementation-specific) hiccup that makes the performance merely adequate on some machines... if anybody plays with the code and comes up with a suggestion - I'd love to hear about it. [Unfortunately I'm not familiar with any GeForceX specific optimizations or limitations... probably there is some dead-simple setting somewhere that I didn't implement.]

I may take the time to write a C++ version; although anybody with an off-night's worth of time should be able to make a quick conversion from the Delphi source.

Overall the provided source and article should be sufficient for anybody that is curious to implement the Diamond. I claim that the algorithm is suitable for very powerful terrain engines, particularly provided that additional supporting features such as occlusion culling, PVS and geo-morphing are implemented as well.

*grinZ* - okay... this got a bit long... hope you bore with me. ;)
Henri

 
Scythe

July 17, 2001, 10:45 PM

It's rather ironic that you would post this today; just last night I was looking at my two terrain engines (one ROAM and one similar to the one used in Soul Ride from the Gamasutra article) and thinking that neither of these methods makes use of any of the hardware acceleration available today.

So, this morning I set out to design something better that could make use of the hardware. Then around noon, I head over to flipcode and there it is :)
For once, I wish I could invent something but someone always beats me to it :p

Anyhow, this algorithm looks quite promising and I'm now in the process of learning Object Pascal (AARGGhhhh!!!) so I can read the code.
If you decide to write a C or C++ version, I'd love to see it.

 
Coriolis

July 18, 2001, 01:48 AM

I also apologize for the long post :)

While this is interesting work, I fail to see any compelling difference between this algorithm and ROAM. You have changed the shape of the base triangle, and have chosen split, merge, and T-junction elimination appropriate to this new base triangle, just as you mention in your description. This may or may not be more conducive to triangle strips than ROAM, but triangle strips are frequently less efficient than a triangle soup that you give to the card in draw buffers anyway, due to the added cost of extra communication with the card. You also mention this fact, and that you use vertex indices instead of strips.

Now, you claim that the primary advantage of your algorithm compared to ROAM is that diamonds do not need a priority queue. Well, actually, priority queues only appear in the ROAM paper in the optimization section as a method to avoid recalculating the weights for all potentially merged or split vertices every frame. If you were going to update all splittable and all mergeable triangles per-frame in ROAM, you could use the exact same four-queue approach as diamonds use. ROAM listed this as an optimization under the assumption that the cost of updating the priority queues is going to be less than the cost of re-evaluating every vertex that could potentially change; if this isn't true, just don't do the priority queue.

In summary... the only real distinction I see between this algorithm and ROAM is the shape of the basis triangle, and the consequential adjustments to the merge and split operations. This choice may or may not have advantages; I haven't looked at it enough to reach such a conclusion. It may adapt more quickly, or it may use fewer triangles, or it may be more efficient to render; I don't know. The only thing that appeared obvious to me aesthetically from the demo is that it tends to make sliver triangles in a lot of places, where ROAM guarantees all triangles are right triangles. This may be a non-issue, or it may just mean a different T-junction removal algorithm is needed.

 
shrike

July 18, 2001, 04:00 AM

The solid render looks like mud without lighting, or distinct lighting at any rate.

 
tcs

July 18, 2001, 04:43 AM

The point is that such complicated LOD algorithms are pretty useless today... ANY per-vertex algorithm is inferior to a simple sector-based LOD, stuff like geometric mipmaps etc. As one poster said, 5000 tris on a GF3 at 60FPS sucks; you can do the same on a TNT1. If you just render such simple case polygons, you should easily get a few million polys per second.

Tim

 
Johan Runeson

July 18, 2001, 05:06 AM

Now I've seen lots and lots of (C)LOD terrains here (and elsewhere), and
they all seem to have one thing in common: the triangles far away from
the viewer look smaller (on the screen) than the triangles nearby.
(The ROAM paper is a notable exception to this.) Shouldn't a proper
LOD algorithm strive to make all triangles occupy the same
screenspace? In some applications, I would even imagine that the
details of the nearby landscape would be more important than the
far-away stuff, and so should be rendered with even smaller (screen
space) triangles.

/Johan Runeson
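
The screen-space argument above can be made concrete with a standard perspective projection. A minimal sketch, assuming a symmetric vertical field of view and an edge that faces the viewer; the function names, the 60 degree field of view and the 480 pixel screen height are illustrative choices:

```python
import math


def projected_size_px(world_size, distance, fov_y_deg=60.0, screen_height_px=480):
    """Approximate on-screen height in pixels of an edge of length
    world_size seen face-on at the given distance."""
    half_fov = math.radians(fov_y_deg) / 2.0
    return world_size * screen_height_px / (2.0 * distance * math.tan(half_fov))


def edge_length_for_target(target_px, distance, fov_y_deg=60.0, screen_height_px=480):
    """World-space edge length that projects to target_px at this distance."""
    half_fov = math.radians(fov_y_deg) / 2.0
    return target_px * 2.0 * distance * math.tan(half_fov) / screen_height_px


for d in (10.0, 100.0, 1000.0):
    print(d, round(edge_length_for_target(8.0, d), 3))
```

To hold the on-screen triangle size constant, the world-space edge length has to grow linearly with distance; a mesh whose distant triangles look smaller on screen is spending detail there that the viewer can barely see.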

 
Rob James

July 18, 2001, 06:23 AM

"The point is that such complicated LOD algorithms are pretty useless today... "

That's a very sweeping statement and I have to disagree - in part.

The vast majority of the real-time terrain demos I've seen are based on restricted sets of terrain data - perhaps a million verts at most. This data is static, and I agree that complicated LOD is not the way to go in these cases. But I do get fed up with loading up the latest terrain demo and finding that all too quickly I fly off the end of the world like reaching the edge of a tabletop!

Look at the x-isle demo. It's a single, highly defined island. Total poly count around 200K with 60K rendered each view? It runs very well on my GF3 P900 :) It doesn't use LOD although it does do some occlusion culling. The point is that you are boxed in by the terrain. Without dynamic terrain you can't roam (no pun intended) across vast tracts of terrain. Taken to the extreme, how would you manage an entire planet down to meter resolution? You couldn't even store the dataset on a hard drive. What LOD is good for is managing these dynamic terrains. Of course there's always a hybrid approach with low-res tiles (no LOD) with added dynamic detail (LOD).

Rob James

 
Rob James

July 18, 2001, 06:27 AM

ooops.

Sorry tim I just spotted your sig line
"Webmaster of glvelocity.gamedev.net
Employee of www.crytek.de"

I mentioned x-isle without realising where you worked!

I guess you know more about x-isle than anyone :) My kids LOVE the dinos.

What 'is' the poly count of the isle by the way ?

Rob J

 
L.e.Denninger

July 18, 2001, 06:33 AM

A simple reminder for people with kickass graphics cards etc. :

the fact that you have a kickass system + a ruling videocard doesn't mean jack shit if it's not properly set up :)

When the first Geforce3 arrived here, we of course installed the latest NVidia-detonators, and benchmarked it with 3DMark2001.

Result : the Geforce3 was about just as fast as my own Geforce1 SDR.

After kicking in the latest chipset-drivers (VIA 4in1) and other blah that had nothing to do with the *videocard*-drivers, it suddenly went through the roof.
(Of course, tweaking your OpenGL / Direct3D-settings always helps too - but that didn't cause the major speed increase)

sequences that would do 0.8fps suddenly did 25fps.

stuff that did 25fps suddenly did 140fps :)

 
Sebastian Sylvan

July 18, 2001, 06:38 AM

It's not the number of per-vertex operations you do (most of the time the per-vertex operations in themselves are pretty cheap) it is THAT you do them. If you modify one vertex in a vertex buffer/VAR you might as well modify them all. Doesn't make too much of a difference.

 
Joachim Hofer

July 18, 2001, 07:41 AM

Man, you speak out of my soul. I suppose that I am very close to creating planets as a single truly curved surface (described by a function). And I will definitely need ROAM or something equivalent to display them, since if I stored the mesh as polygons, I would run out of hard disk space.
I am really bored by those statements like "Your engine is no good, it displays no more than 5000 tris. Mine displays 500000."
A polygon-pusher is always the fastest. But I believe that high polygon counts are not everything. And I think that this is the case especially on today's hardware.
I think that the faster the hardware gets, the more LOD we will need, because as soon as you can display so many polygons that each of them covers one pixel, there will be no more need to increase the polygon count. Got what I mean? And as soon as you have billions of polygons in a scene, you will have no choice but to use a decent LOD.
LOD is the future, I think. Especially on today's (and tomorrow's) hardware. I think we should concentrate on this topic much more.
btw. This is a really good IOTD. I will take a closer look at your diamonds, and possibly take them rather than my quadtree...

 
LoreKeeper

July 18, 2001, 12:59 PM

Hi... me again (Henri)

I agree that in some aspects using terrain block-LOD is viable and certainly is easy to program for. But continuous LOD has several distinct advantages and, if the vertex overhead is sufficiently small, it can beat a block-based LOD by orders of magnitude.

Okay... you say the vertex overhead cannot be made sufficiently small? Well, the frame-coherency already minimizes the work that needs to be done per frame update. And all the work that is required for the vertices can be deferred and handled on a need-to basis. (Refer to the ROAM paper for some details.)

Coriolis: your statements on the similarities between ROAM and Diamond are mostly valid. But then again, ROAM is only a relatively small step forward from the split/merge terrain solution proposed by Lindstrom. Diamond simply takes it another step forward.

Additionally - ROAM requires priority queueing not as an optimization, but to ensure that target-Tri-counts can be achieved, and that the most optimal sub-solution is maintained at every step (a greedy algorithm approach) - Diamond in turn ignores priorities and works on a LIFO system.
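
The difference between the two queueing styles can be sketched on a toy model. This is not Diamond's actual four-queue scheme (that is described in the zip), only the generic contrast, and it assumes a made-up error model in which each split yields two children with half the parent's error:

```python
import heapq


def refine_greedy(root_errors, budget, child_error=lambda e: e / 2.0):
    """Priority-queue refinement: always split the node with the largest
    error until `budget` leaves exist. Each push/pop costs O(log n)."""
    heap = [-e for e in root_errors]      # max-heap via negated errors
    heapq.heapify(heap)
    while len(heap) < budget:
        e = -heapq.heappop(heap)
        for _ in range(2):                # one split adds one net leaf
            heapq.heappush(heap, -child_error(e))
    return sorted((-e for e in heap), reverse=True)


def refine_lifo(root_errors, threshold, child_error=lambda e: e / 2.0):
    """Stack refinement: split whatever is on top if it exceeds the
    threshold. Push/pop are O(1), but the leaf count is not controlled."""
    if threshold <= 0:
        raise ValueError("threshold must be positive or refinement never ends")
    stack = list(root_errors)
    leaves = []
    while stack:
        e = stack.pop()
        if e > threshold:
            stack.extend([child_error(e)] * 2)
        else:
            leaves.append(e)
    return sorted(leaves, reverse=True)


greedy = refine_greedy([8.0, 2.0], budget=6)
lifo = refine_lifo([8.0, 2.0], threshold=2.0)
print(len(greedy), greedy)
print(len(lifo), lifo)
```

The heap version hits an exact leaf count by spending its splits where the error is largest; the stack version guarantees an error bound and lets the count fall where it may.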

The changes in the underlying structure have several advantages, apart from being considerably more strip-friendly (and strips can reduce the hardware overhead by up to two-thirds). Slivers as you mention exist, but only arbitrarily so - the Diamond is at its theoretical best using equilateral triangles as base structure, and right-triangles as transition tris.

Diamond is not supposed to represent a vast new leap in terrain algorithm technology, just simply a better way to handle terrain - I think that the split/merge approach is the theoretical optimal solution for CLOD handling, and any algorithm that produces CLOD meshes will incorporate merges and splits or some similar technique, if it wishes to achieve frame-coherency.

In my opinion the current measure of any terrain algorithm is whether the CPU or the GPU is the bottleneck. The only reason why block-LODs are considered "en vogue" is becoz they can be implemented to ensure that the GPU is the bottleneck. Well, I claim that Diamond is similarly efficient (potentially utilizing the GPU to its capacity). But additionally Diamond maintains the advantages of a CLOD renderer (lower tri counts, less popping, higher terrain accuracy).

One last note... - theoretically the Diamond can be implemented to maintain both frame-coherency and data-coherency. Data-coherency meaning that vertexbuffers are only updated with those vertices that are changed (typically maybe 5% of the current vertexbuffer size).
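
One way to picture data-coherency: collect the indices of the vertices touched by this frame's splits and merges, then coalesce them into contiguous ranges so that only those parts of the buffer are rewritten. A minimal sketch with a hypothetical helper name; the actual upload (locking and writing a Direct3D vertex buffer) is left out:

```python
def dirty_ranges(changed_indices):
    """Coalesce changed vertex indices into inclusive (start, end) ranges."""
    ranges = []
    for i in sorted(set(changed_indices)):
        if ranges and i == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], i)   # extend the current run
        else:
            ranges.append((i, i))             # start a new run
    return ranges


changed = [5, 3, 4, 10, 11, 20]
ranges = dirty_ranges(changed)
touched = sum(end - start + 1 for start, end in ranges)
print(ranges, "covering", touched, "vertices")
```

Whether a few small range updates beat one full rewrite depends on the driver and hardware, so this would need measuring.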

The implementation I've offered above does not use a number of potential optimizations - this is to ensure that the source reflects the theory easily, and that people can learn and understand the algorithm and source.

And then: note that some people with a good hardware setup achieve 180fps on average (Lucid, above). I think that should be sufficient for anybody's demands. ;)

 
tcs

July 18, 2001, 02:48 PM

lol, yes, pretty funny ;-)

Well, I can't tell you anything about the current state of the project etc., but when you take a look at the old demo and the MTri/s counter, you can see that we easily render a few million polygons per second. And even more on GF3 cards. And this was an old demo ;-)

In my opinion, all these CLOD algorithms are not suitable for HW T&L, since a simple LOD scene that is just targeted at pushing polys renders 10x the polygons with less CPU usage and less popping.

Tim

 
This thread contains 34 messages.