
Submitted by Tim C. Schröder, posted on August 18, 2001

Image Description, by Tim C. Schröder

Well, I thought it's time to share some new pics from my current engine. Basically it is a GeForce 3 optimized FPS engine, with 100% dynamic lighting as the main feature. All the shots you see here have a preprocessing time of 0ms, except the 500ms for the Octree during the level compilation;-) I tried to get the lighting model working on GF1/GF2 cards, but I failed. Not possible. At least it screams on the GeForce 3. Well, I guess you want to see the feature list, here it goes:
  • Vertex.. ahh... SMARTSHADER(TM) for basically every triangle on the screen
  • Pixel shaders for the entire lighting
  • DOT3 diffuse + specular per-pixel lighting on every surface (Well, not on the skybox...)
  • Per-pixel normalization cubemap or pixel shader normalization for every surface
  • Tangent space setup done by vertex shaders
  • DOT3 self-shadowing
  • PPA for every surface
  • Realtime general shadow solution, everything shadows on everything including on itself
  • Colored lights
  • Blinking, flickering and pulsating lights through a shader definition file
  • Lights can be assigned to splines
  • Detail texturing
  • Hyper texturing
  • Advanced vertex buffer optimization code to guarantee best T&L performance
  • Light flares + coronas, implemented through vertex shaders
  • Ellipsoid based collision detection / handling
  • Realtime in game light editor, modify every aspect of the lighting without any reloading
  • Configuration system allows you to change basically everything without any code rebuild
  • Basically any damn 3D feature in the world. If it is not supported yet, it will be in the future

    The engine is incredibly CPU limited at the moment, this is my main problem. Rendering brute force is sometimes even faster than performing HSR. I can render 4 quad texture passes without much drop in performance. The CPU code isn't sooo unoptimized, but the GF3 is a card that handles everything you throw at it and just screams for more, so my crappy 700MHz machine can't keep up.
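
For readers unfamiliar with the "DOT3 diffuse + specular" and "tangent space setup" items in the feature list, here is a minimal CPU-side sketch of the math involved. It is not the engine's shader code: the names `toTangentSpace`, `shadeDot3` and `LightTerms` are made up for illustration, and a Blinn-style half vector is assumed for the specular term.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Vec3 add(const Vec3& a, const Vec3& b) {
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

inline Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(dot(v, v));
    return len > 0.0f ? Vec3{v.x / len, v.y / len, v.z / len}
                      : Vec3{0.0f, 0.0f, 0.0f};
}

// Vertex stage: express a vector in the tangent/bitangent/normal basis
// of the surface (the basis vectors are assumed to be orthonormal).
inline Vec3 toTangentSpace(const Vec3& v, const Vec3& tangent,
                           const Vec3& bitangent, const Vec3& normal) {
    return Vec3{dot(v, tangent), dot(v, bitangent), dot(v, normal)};
}

struct LightTerms { float diffuse; float specular; };

// Pixel stage: the normal comes from the bump map, the light and view
// vectors are the interpolated tangent-space vectors from the vertex stage.
inline LightTerms shadeDot3(const Vec3& bumpNormal, const Vec3& lightTS,
                            const Vec3& viewTS, int specularPower) {
    assert(specularPower > 0);
    const Vec3 n = normalize(bumpNormal);
    const Vec3 l = normalize(lightTS);
    const Vec3 h = normalize(add(l, normalize(viewTS)));  // half vector (assumption)
    const float diffuse = std::max(0.0f, dot(n, l));
    const float nDotH = std::max(0.0f, dot(n, h));
    // No highlight on pixels that face away from the light.
    const float specular =
        diffuse > 0.0f ? std::pow(nDotH, static_cast<float>(specularPower)) : 0.0f;
    return LightTerms{diffuse, specular};
}
```

On real hardware the normalization would come from a cubemap lookup or pixel shader instructions, as the list says, and the results would be multiplied with the light colour and base texture.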

    If you are interested in a discussion about realtime lighting algorithms, you know my mail. Note: A discussion is a conversation between two similarly skilled people that learn from each other. So please no "Teach me how to do this !!!" mails. I'm quite busy with my work and writing my engine, so really no time for tutorials / explanations, sorry ;-(

    Anyway, comments welcome.

    Tim C. Schröder



    Message Center / Reader Comments:
    Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.

    August 18, 2001, 06:38 PM

    I think the actual shape of the level should just be 10K - 20K polys, so this is even small enough for brute force processing. 90% of the polys should be in detail objects and curved surfaces. You can process the high poly objects with different algorithms and cull them very fast. Also, you can write hyper-optimized code to generate shadow volumes from your curved surface primitives, this can also be done REALLY fast in a vertex shader.

    Just keep in mind that with today's HW we can write the *perfect* direct illumination model. Not global illumination, but I think we can create the ultimate direct illumination system in realtime.



    August 18, 2001, 06:41 PM

    like, DUH!


    August 18, 2001, 06:51 PM

    Ok, the warehouse and the edge, the two best maps ever. The next shots will use Q2DM1, ok ? ;-)



    August 18, 2001, 07:07 PM

    too bad that 95% of the gamers out there don't have "today's hardware" :(. .. But the time will come^^.

    i myself sit on a l33t4ss PII 350 with GF2MX...


    August 18, 2001, 07:42 PM

    Well, I think that even 10x 2Ghz could only reach a fraction of the performance of a GF3 ;-) You'll NEVER EVER beat a GF3, maybe many many years in the future. I would go so far and say that even a cluster out of multiprocessor P4 systems wouldn't stand a chance against a single GF3.

    A GF1 renders 15M specular shaded tris per second, please show me this on any CPU of your choice. Try to show me Quake3 in SW on any PC system of your choice at 2048x1536@32@24FPS. Something current gfx cards easily beat. Doing two 4 component dot products in a single cycle is something that current GPUs can easily do, how many CPUs do you need to do the same ?

    A GF3 has 76GFLOPs, how many does a P4 have, 1 ? 0.5 ? Less ? ;-))
    You are going to spend 500.000$ and you are going to need a second house just to get the same power as a GF3.

    With the current CPUs you can't even dream of reaching the power of a GF3. Oh well, did I mention the XBox GPU is even much more powerful ? ;-) And before you can even buy this 2GHz CPU, we are going to have Radeon2 and GF3Ultra, which surely add another 20GFLOPs.
    These GPUs are so fast at what they do, they could actually measure the speed of them by indicating how many times they are faster than a 2Ghz CPU ;-)

    Actually, I use anisotropic filtering. Also, the 6 textures are just for the single light case without shadows.


    August 18, 2001, 07:43 PM

    Why should they bother to buy ? I would not have a GF3 if I just wanted to play games... Most games barely use the features of a GF1.



    August 18, 2001, 07:45 PM

    Very stupid question, but is this using ogl or d3d8?


    August 18, 2001, 07:52 PM



    August 18, 2001, 08:48 PM

    I would also like to toss my bag of awe and respect into the pot:

    Tim, that looks really, really great. Can't wait for the 'DivX of the Day' to see the pulsing, flashing lights :) (Just kidding!). Anyways, keep up the great work and please post another pic once you have the shadows working in the levels!

    - Marco


    August 18, 2001, 08:58 PM

    Thanks a lot !

    In one or two months I'll probably have something new to show.


    August 18, 2001, 09:47 PM

    Call me crazy, but it looks like you're using bumpmapping in those images. If so, where did you find Quake 2 bump maps?


    August 19, 2001, 12:33 AM

    Not true. In my engine I only use the silhouette edges for the shadows and even on objects with a quite complex topology, the shadow volumes are correct.
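
    To make the term concrete: a silhouette edge, as seen from a light, is an edge shared by one triangle that faces the light and one that does not. The following is only a sketch of that test, not the code from my engine. It assumes a closed mesh with counter-clockwise triangles seen from outside and a point light, and `findSilhouetteEdges` is a name made up for this post.

```cpp
#include <algorithm>
#include <array>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
using Triangle = std::array<int, 3>;  // vertex indices, CCW seen from outside (assumption)
using Edge = std::pair<int, int>;     // undirected edge, smaller index first

static Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{a.x - b.x, a.y - b.y, a.z - b.z};
}

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns every edge that has at least one light-facing and one
// non-light-facing triangle attached to it, sorted by vertex index.
std::vector<Edge> findSilhouetteEdges(const std::vector<Vec3>& verts,
                                      const std::vector<Triangle>& tris,
                                      const Vec3& lightPos) {
    struct FaceCount { int lit = 0; int unlit = 0; };
    std::map<Edge, FaceCount> edgeFaces;

    for (const Triangle& t : tris) {
        const Vec3& a = verts[t[0]];
        const Vec3 n = cross(sub(verts[t[1]], a), sub(verts[t[2]], a));
        const bool facesLight = dot(n, sub(lightPos, a)) > 0.0f;
        for (int i = 0; i < 3; ++i) {
            const int v0 = t[i];
            const int v1 = t[(i + 1) % 3];
            FaceCount& fc = edgeFaces[Edge(std::min(v0, v1), std::max(v0, v1))];
            (facesLight ? fc.lit : fc.unlit) += 1;
        }
    }

    std::vector<Edge> silhouette;
    for (const auto& entry : edgeFaces) {
        if (entry.second.lit > 0 && entry.second.unlit > 0) {
            silhouette.push_back(entry.first);
        }
    }
    return silhouette;
}
```

    The shadow volume sides are then built by extruding these edges away from the light. A real implementation would precompute the edge list instead of rebuilding a map per light.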


    August 19, 2001, 02:37 AM

    First of all, great looking pictures!

    Did you read my post over at DirectXDev
    about the pipeline structure? Any
    constructive criticism of such an approach?
    (David Hallgren)

    I'm working on a similar approach to lights
    and shadows. Did you come up with the
    algorithm for the shadows yourself? I've
    seen the interviews with Carmack and get
    the impression that he first stencils out
    what's in shadow for the current light, then
    renders the tris lit by it. That saves the
    fillrate that it takes adding the shadows
    afterwards of course, but it also means that
    where lights don't shine, it's completely
    black, right?
    What I love is the attempts to make a
    general solution for everything, without

    How do you handle the depthsorting of
    transparent tris?

    Keep up the good work!

    zed zeek

    August 19, 2001, 03:34 AM

    i was reading the other day that some ppl were saying that with pixel shaders a fast athlon in software does about the same speed as a geforce3 in hardware, but unfortunately i don't have either an athlon or a geforce3 so i can't test myself


    August 19, 2001, 03:59 AM

    Hi Tim,

    Excellent work. I suppose you've implemented the techniques described by Ron Frazier at didn't you? Me thinks that Mr. Carmack also uses these techniques. Actually you don't really need a Geforce3 for the approach, just some register combiners & a dotproduct3 facility (so gf1 and gf2 can also do the trick). Why are you limiting to the geforce3? Due to fillrate issues (& the extra texture units)?

    Anyway, it looks absolutely amazing. I think this is the way to go in the future. As for the shadowvolumes: what algorithms do you use for hulldetermination? Are they applicable for GPU processing? Did you solve many of the shadow volumes problems? How many passes do you need for the full diffuse-specular-bumpmapped lighting rendering?

    That's a lot of questions, i know.

    With respect,



    August 19, 2001, 04:34 AM

    It's nice to note that, and carmack said this in the interview as well i believe, an xbox actually has 2 t&l units
    So on an xbox it might actually be faster..
    Of course this can only be confirmed if someone would test this on a dev machine

    Kieren Johnstone

    August 19, 2001, 05:25 AM

    Hey tim :)
    Update your site already! Thanks... I miss it *sob*

    Lars Birkemose

    August 19, 2001, 05:42 AM

    It looks good, but to be honest, isn't this what DX8 and GF3 are supposed to do.... out of the box ?
    Colored lighting and fancy textures sure as hell ain't gonna cut it for tomorrow's games.


    Fabian Giesen

    August 19, 2001, 05:53 AM

    Looks nice and is technically cool, *but* :) the problem is IMHO that it doesn't have the right "feeling" yet. It all looks too clean and sterile and sharp. Maybe that's just me, but as I see it this is a nice solution for dynamic lighting, but no match for the smooth feeling you can achieve using global illumination techniques.

    Your lighting works for dark, industrial-like sceneries, but I'm quite sure it just won't look right when used to light a "normal" room someone would actually live in :)

    Note that you don't exactly need global illumination to achieve this smooth feeling - for example, in movies, typically insane numbers of lights and direct illumination are used because global illumination algorithms slow down the rendering time far too much. Still, I think that even with a GF3, you don't have the GPU power (and precision) to lighten your environment that way.

    Which practically all boils down to the same old thing that's been used in FP shooters "since the beginning": mostly precalculated ambient lighting combined with a few dynamic lightsources, of course with the big difference that those few dynamic lightsources now look a lot better.


    August 19, 2001, 06:13 AM

    Obviously, you can't even program. You have probably never seen a C++ program. My engine is over 600K of code, if you would call this out of the box, well, yeah... If this is out of the box, why do people develop engines for years and still don't have PPA or bumpmapping ? Please, don't talk about things that you obviously don't understand a bit ;-)



    August 19, 2001, 06:14 AM

    Good joke ;-)

    Well, the D3D SW emulation makes 0.5FPS in scenes where my GF3 makes 2500FPS. No joke. It is completely unrealistic to assume that you can even reach 10% of the speed.



    August 19, 2001, 06:15 AM

    Well, I mentioned a dozen times that I'm using bumpmapping ;-)

    I drew them, took me 30 minutes ;-)



    August 19, 2001, 06:18 AM

    Well, I think this is the natural approach to implement shadows. Not a big deal.

    You need to render the scene in the depth buffer first, I see no other way ;-)



    August 19, 2001, 06:21 AM

    Well, you can pretty much kick it on GF1/2. Rendering 5 passes or so into the alpha buffer plus the stencil shadows and some other non lighting related stuff... 1FPS or so. Also, without vertex shaders, doing all the tangent space stuff etc on the CPU, well, forget it ;-)

    I think I solved all shadow volume problems, but maybe it can be done faster. As I said, I haven't settled down on a final solution...



    August 19, 2001, 06:22 AM

    Hey fluffy,

    Not true. In my engine I only use the silhouette edges for the shadows and even on objects with a quite complex topology, the shadow volumes are correct.

    I did not claim that silhouette-based extrusion does not work for complex topology, just that it does not work for some complex topology. Let me first make some definitions. Let P be a point on a silhouette extracted from the surface S. The shadow volume face generated from this silhouette has a normal N which is identical to the normal of S at P.

    Contrary to what some of you may think, the shadow surface orientation should sometimes be reversed (i.e. the normal should be reversed). The classic case of this is that of a torus (which has genus 1 topology) viewed at an oblique angle (I am paraphrasing [1]). The star shaped inner silhouette contains four cusps. A cusp in a silhouette can be formed at a point where the tangent of the silhouette curve (which is a line segment in our case) is collinear with the viewing direction. Inspecting the surface normals of the torus along each of the four regions we see that in two regions the normals point into the "silhouette loop" while in two others, the normals are pointing outward (see [1] for diagrams). Here it would be impossible to construct a single shadow volume for the star shaped silhouette curve which is orientable. Two of the shadow surface orientations must be reversed.

    Heflin and Elber provide solutions to this problem in [1]. They are discussing this in the context of free form surfaces but it also applies to discretized surfaces (e.g. polygonal meshes) under certain conditions. Also, while this example specifically concerned itself with the case of a torus I believe there are other objects of genus 1 and higher genus which have similar problems. Granted, this might be too much of a special case for a lot of you to bother but at least now you know a potential source of problems if the shadows start acting weird.

    While it will not solve this problem specifically, it makes sense to partition objects of genus higher than zero into objects of zero genus. If you have found just one silhouette of a polygonal mesh of genus zero then you can reach all other silhouettes by edge traversals with a floodfill-like approach.
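
    As a rough way to find out which meshes would need such partitioning, the genus of a closed, connected, orientable triangle mesh can be read off the Euler characteristic, V - E + F = 2 - 2g. The sketch below assumes exactly such a mesh given as an index list. The names `genusOfClosedMesh` and `makeTorusTopology` are invented for this example, and the latter only builds the connectivity of a torus, not its geometry.

```cpp
#include <algorithm>
#include <array>
#include <set>
#include <utility>
#include <vector>

using Triangle = std::array<int, 3>;  // vertex indices

// Genus g of a closed, connected, orientable triangle mesh,
// from the Euler characteristic V - E + F = 2 - 2g.
int genusOfClosedMesh(const std::vector<Triangle>& tris) {
    std::set<int> verts;
    std::set<std::pair<int, int>> edges;  // undirected, smaller index first
    for (const Triangle& t : tris) {
        for (int i = 0; i < 3; ++i) {
            const int v0 = t[i];
            const int v1 = t[(i + 1) % 3];
            verts.insert(v0);
            edges.insert(std::make_pair(std::min(v0, v1), std::max(v0, v1)));
        }
    }
    const int euler = static_cast<int>(verts.size())
                    - static_cast<int>(edges.size())
                    + static_cast<int>(tris.size());
    return (2 - euler) / 2;
}

// Hypothetical helper: connectivity of a torus built from a wrapped
// rings x sides quad grid, two triangles per quad (use rings, sides >= 3).
std::vector<Triangle> makeTorusTopology(int rings, int sides) {
    std::vector<Triangle> tris;
    for (int i = 0; i < rings; ++i) {
        for (int j = 0; j < sides; ++j) {
            const int i1 = (i + 1) % rings;
            const int j1 = (j + 1) % sides;
            const int a = i * sides + j;
            const int b = i1 * sides + j;
            const int c = i1 * sides + j1;
            const int d = i * sides + j1;
            tris.push_back(Triangle{a, b, c});
            tris.push_back(Triangle{a, c, d});
        }
    }
    return tris;
}
```

    A tetrahedron has V = 4, E = 6, F = 4 and therefore genus 0, while the wrapped grid has V - E + F = 0 and therefore genus 1, matching the torus case above.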

    [1] Heflin, Elber. Shadow Volume Generation from Free Form Surfaces


    August 19, 2001, 06:27 AM

    Well, I think it looks *absolutely not* sterile, but anyway.

    It is not the same old thing, and looks lightyears better. Lightmaps are just very very low res, here we have usually two texels for each pixel on the screen. The small details because of the bumpmaps, the sharp per-pixel highlights and the shadows make a major difference, even for static lights. Not to say that dynamic lights look much better.

    I think what we have now thanks to the GF3 is a million times better than the old lighting model, you can't compare it. Some time ago we had to wait 30min or so for lightmaps to compile, now we can render the same at 5x higher resolution at realtime FPS.

    It is not the final solution, but still a major step forward.


    The Legend

    August 19, 2001, 06:30 AM

    Now try to render a room in a normal house, where you can see a lot of the typically white walls, if you don't have smooth shadows the room will look *really* strange ...

    Mike Armstrong

    August 19, 2001, 06:33 AM

    nice shots, although how do you tie the attenuation of lights to the attenuation on the shadow volumes? My guess is that you don't and the shadows are of a constant multiplied blend colour. Now this is obviously not a bad thing, but if only certain lights are selected for a given scene, do you find that shadow volumes are suddenly cut off (e.g. as they go through walls and suddenly stop)?

    Additionally, how do you handle the case of lights being in two separate rooms, each casting shadow volumes into the other room? Now normally this would obviously cause shadows in both rooms where perhaps there shouldn't be. This leads to the question: are you using, or have you tried using, the opposite method, i.e. light volumes?

    I don't mean to criticise, however I know all the fun that shadow volumes generate :>



    August 19, 2001, 08:12 AM

    Hey man, don't laugh at people who can't afford a GeForce3 but have written MMX inner pixel loops that only take 10 clock cycles for 32-bit lightmapping with independent texture coordinates! ;)

    It's obvious that a GeForce3 GPU is better at rendering than any CPU, simply because it's the only thing it can do. A CPU is designed to be extremely versatile, and this poses a lot of limitations. It's not difficult to get 100 gigaflops with a processor that can only do floating-point instructions on fixed registers, but that's pointless. So these numbers don't mean a thing. Intel or AMD could probably add the vertex and pixel shader instructions to their instruction set at higher clock frequencies. Unfortunately they won't do this because even MMX, 3DNow! and SSE haven't been used a lot for rendering. That's probably because there are only a few professionals that understand how to use them to their full extent (e.g. Sree Kotay, Erik de Neve).

    For the price of a GeForce3 you can almost get a multiprocessor PIII system, but apparently people would rather spend their money on a card that can only show its mighty power in games and some other 3D applications. On modern CPUs you can play Unreal Tournament in software at 1024x768, but the gameplay is just as great on my PII 300 at 512x386!

    Your engine looks really, really good. You obviously have put a lot of effort into it and I admire that. But remember that there's more to programming a good game with great graphics than having an engine that only rocks on a GeForce3. Just my humble opinion...


    August 19, 2001, 08:48 AM

    the dx8 software emulation is c-code

    not assembler, not MMX, not 3dNow, not SSE, not even fast c-code

    it is simply code to show what it has to do ( means perfect to read ) for the driver developers so they have an easy code they can manipulate for them..

    oh, and, if you code right for today's cpu, you can get more out of it than you can out of a gf3, just because you can do effects your gpu has problems with.. oh, and perfect culling with c-buffers.. no pixel too much, no multiple passes.. all can be optimized

    i cant wait to have a programable pipeline for cpu's ( would be very nice for mp3-encoding, mpeg4-encoding and such, too.. )

    oh, and, with a athlon 1.4gig or something you can do damn nice stuff like beautiful realtime raytracing on 320x240.. you can never get this on your geforce.. perfect reflections from everything to everything ( with infinite-tracing, and all with float-precision.. ).. and there the shadows do really work.. everything does shadow everything;)

    can't wait to get the athlon myself.. i currently have a pentium3 500, where i have 15 fps on 160x120.. and NOT optimized with assembler yet..;) i will rewrite it for 3dnow then and then fuck gf3 ( which i will buy, too;) )

    oh, and a athlon 1.4gig is cheaper than your gf3;)

    if gf3 would use the same mem the athlon would use ( like in a nforce ) i could even hardware accelerate the raytracing a bit, which would be nice.. but for now..

    bugs your gf3 has:

    cubemap rendering is not at all fast
    very imprecise register combiner part ( 8bit [-1,1]-range ).. normally you use floats today for this on cpu's
    only triangles
    terrible to render volumetric objects ( texture3d )
    stupid architecture for perpixellighting
    32bit color, i use 128bit myself ( or 96 without alpha.. ;) )
    framebuffer access much too highlevel with far too few features
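
    to put a number on the "8bit [-1,1]-range" point, here is a small sketch. it assumes the usual DOT3 normal map convention where a byte b stands for 2*b/255 - 1; the real register combiner internals may well differ, and the function names are made up for this example.

```cpp
#include <cmath>
#include <cstdint>

// Assumed mapping: [-1, 1] is spread linearly over the byte range [0, 255].
inline std::uint8_t encodeSigned8(float x) {
    const float clamped = std::fmax(-1.0f, std::fmin(1.0f, x));
    return static_cast<std::uint8_t>(std::lround((clamped * 0.5f + 0.5f) * 255.0f));
}

// Inverse of the mapping above.
inline float decodeSigned8(std::uint8_t b) {
    return static_cast<float>(b) / 255.0f * 2.0f - 1.0f;
}

// Absolute error introduced by storing x in one byte and reading it back.
inline float roundTripError(float x) {
    return std::fabs(decodeSigned8(encodeSigned8(x)) - x);
}
```

    under this mapping the step between neighbouring values is 2/255, roughly 0.0078, so a stored component can be off by up to about 0.0039, and 0.0 itself is not representable at all ( byte 128 decodes to roughly 0.0039 ).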

    this for now..

    btw, it looks great;) (especially the big picture)

    This thread contains 140 messages.