Submitted by Tim C. Schröder, posted on August 18, 2001




Image Description, by Tim C. Schröder



Well, I thought it's time to share some new pics from my current engine. Basically it is a GeForce 3-optimized FPS engine, with 100% dynamic lighting as the main feature. All the shots you see here have a preprocessing time of 0 ms, except the 500 ms for the octree during the level compilation ;-) I tried to get the lighting model working on GF1/GF2 cards, but I failed. Not possible. At least it screams on the GeForce 3. Well, I guess you want to see the feature list, here it goes:
  • Vertex.. ahh... SMARTSHADER(TM) for basically every triangle on the screen
  • Pixel shaders for the entire lighting
  • DOT3 diffuse + specular per-pixel lighting on every surface (Well, not on the skybox...)
  • Per-pixel normalization cubemap or pixel shader normalization for every surface
  • Tangent space setup done by vertex shaders
  • DOT3 self-shadowing
  • PPA for every surface
  • Realtime general shadow solution, everything shadows on everything including on itself
  • Colored lights
  • Blinking, flickering and pulsating lights through a shader definition file
  • Lights can be assigned to splines
  • Detail texturing
  • Hyper texturing
  • Advanced vertex buffer optimization code to guarantee best T&L performance
  • Light flares + coronas, implemented through vertex shaders
  • Ellipsoid-based collision detection / handling
  • Realtime in game light editor, modify every aspect of the lighting without any reloading
  • Configuration system allows you to change basically everything without any code rebuild
  • Basically any damn 3D feature in the world. If it is not supported yet, it will be in the future
  • The engine is incredibly CPU-limited at the moment; this is my main problem. Rendering brute force is sometimes even faster than performing HSR. I can render 4 quad texture passes without much drop in performance. The CPU code isn't sooo unoptimized, but the GF3 is a card that handles everything you throw at it and just screams for more, so my crappy 700 MHz machine can't keep up.
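
For readers unfamiliar with the term, the "DOT3 diffuse + specular" item in the feature list boils down to a couple of dot products per pixel. The following is only a minimal sketch in Python, not Tim's shader code: the function name `dot3_lighting`, the `shininess` exponent and the use of a half-angle vector for the specular term are assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a non-zero 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def dot3_lighting(normal, to_light, to_eye, shininess=16.0):
    """Return (diffuse, specular) for one pixel; all vectors in tangent space.

    shininess is a hypothetical material parameter, not from the original post.
    """
    n = normalize(normal)            # normal fetched from the bump map
    l = normalize(to_light)          # surface point -> light
    diffuse = max(dot(n, l), 0.0)    # the DOT3 term: N . L, clamped at zero
    if diffuse == 0.0:
        return 0.0, 0.0              # surface faces away from the light
    e = normalize(to_eye)            # surface point -> viewer
    # Half-angle vector; assumes the light and eye directions are not opposite.
    h = normalize(tuple(a + b for a, b in zip(l, e)))
    specular = max(dot(n, h), 0.0) ** shininess
    return diffuse, specular
```

On the hardware discussed here the same arithmetic would run per pixel in the pixel shader or register combiners, with the normal coming from a normal map instead of a function argument.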

    If you are interested in a discussion about realtime lighting algorithms, you know my mail. Note: A discussion is a conversation between two similarly skilled people who learn from each other. So please no "Teach me how to do this !!!" mails. I'm quite busy with my work and writing my engine, so really no time for tutorials / explanations, sorry ;-(

    Anyway, comments welcome.

    Tim C. Schröder


    Image of the Day Gallery
    www.flipcode.com


     
    Message Center / Reader Comments:
     
    Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.
     
    Jordan

    August 18, 2001, 03:25 PM

    Amazing, purely amazing, keep up the good work.

     
    Kanda

    August 18, 2001, 04:14 PM

    impressive !

     
    MrDevlin

    August 18, 2001, 04:15 PM

    Now I know why a gf3 is so expensive
    It's really amazing you're all doing that realtime

     
    psykotic

    August 18, 2001, 04:15 PM

    Nice engine, Tim. Could you refresh my memory here: Are you only extruding shadow volumes based on silhouette edges? You realize that this only works in the general case for convex polyhedra? It might not be much of a problem in some cases but in others it will not look right. NVIDIA recently posted a DirectX 8 demo that extrudes triangles (not just silhouette edges) in vertex shaders; you might want to take a look. Secondly, what do you mean by hypertexturing? Surely not hypertexturing in the original sense as defined by Dr Ken Perlin? Thanks.

     
    BigB

    August 18, 2001, 04:24 PM

    Man.., impressive

    I would love to see it in real-time.., is it possible for you to get a running demo for download ?

    good work :)

    Bruno

     
    Nick

    August 18, 2001, 04:26 PM

    Wow Tim, this looks very cool!

    I especially like the zero preprocessing for the lighting. It looks almost like real-time radiosity. GeForce3 is a lot of power of course. Accurate lighting is probably not that important to create a great atmosphere...

    Teach me how to do this with my software engine! :P

     
    tcs

    August 18, 2001, 04:56 PM

    psykotic: I said general, and I mean general ;-) All I know is that what I do works in the general case = always ;-) I don't use vertex shaders for the shadows; I think I have a better way.

    Hypertexturing is using a low-frequency texture to avoid the look of tiling. Works quite well. Well, not in the sense of Perlin's idea ;-)
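
Tim's description of hypertexturing (a low-frequency texture laid over a tiled one) might be sketched like this. The 2x modulate blend, the nearest-neighbour lookups and all function names are assumptions for illustration; the post does not say how the two textures are combined.

```python
def tiled_texel(texture, u, v, repeats):
    """Nearest-neighbour lookup into a small texture repeated `repeats` times."""
    height, width = len(texture), len(texture[0])
    x = int(u * repeats * width) % width
    y = int(v * repeats * height) % height
    return texture[y][x]


def stretched_texel(texture, u, v):
    """Nearest-neighbour lookup with the texture stretched once over the surface."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def hypertextured(base, low_freq, u, v, repeats=8):
    """Tiled base texel modulated by the low-frequency texel.

    A low-frequency value of 0.5 leaves the base unchanged (2x modulate);
    the result is clamped to 1.0.
    """
    tiled = tiled_texel(base, u, v, repeats)
    low = stretched_texel(low_freq, u, v)
    return min(tiled * low * 2.0, 1.0)


# Tiny greyscale textures with values in [0, 1].
BASE = [[0.4, 0.6], [0.6, 0.4]]
LOW_FREQ = [[0.5, 0.25], [0.75, 0.5]]
```

With these textures, `hypertextured(BASE, LOW_FREQ, 0.0, 0.0)` and `hypertextured(BASE, LOW_FREQ, 0.5, 0.0)` hit the same texel of the tiled base but come out at different brightness, which is what breaks up the visible repetition.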

    Nick:
    3x overdraw, 6 textures per pixel, two dot products, several vector normalizations etc ? ;-) I promise you, after writing vertex and pixel shaders for a processor which is 10x more powerful than a P4 2 GHz you can't go back ! ;-)

    Ah, Kurt !!! Don't resize my pictures with such a crappy filter and compress them with JPEG turned up to max ! ;-) Well, here's the high quality version of the pic:

    http://glvelocity.gamedev.net/iotd.jpg

     
    Kurt Miller

    August 18, 2001, 05:01 PM

    ">>Ah, Kurt !!! Don't resize my pictures with such a crappy filter and compress them with JPEG turned up to max ! ;-) "

    Then next time resize it before sending. Width no greater than 640.

     
    Oscar

    August 18, 2001, 05:02 PM

    Nice work Tim, just some questions:

    Did you put volume shadows back? I think not, otherwise the 3 pictures of the level would have shadows... but in any case, very interesting stuff. I also want to ask you (like psykotic) what you mean by hypertexturing. And with "Tangent space setup done by vertex shaders", do you mean putting the light vector into tangent space?

    Bye!

     
    Oscar

    August 18, 2001, 05:03 PM

    LOL! He answered some questions while I was writing the post...

     
    Jordan

    August 18, 2001, 05:05 PM

    I just noticed he was using Quake2 maps. :)

     
    psykotic

    August 18, 2001, 05:06 PM

    So you are extruding every triangle of the shadow caster polyhedron (possibly simplified)? Would you mind explaining the algorithm? I have looked into a number of algorithms for silhouette extraction that exploit temporal and geometric coherence but as I mentioned, using silhouettes exclusively will not work in the general case. Also, an obvious argument for using vertex shaders for shadow volume extrusion is that it would lighten your CPU load which you say is very high at the moment. Thanks.

     
    SirKnight

    August 18, 2001, 05:15 PM

    That's some very nice work there. I am working on a similar engine myself with per-pixel lighting, stencil volumes, bump maps and such. Also, to let you know: if you do the vector normalization for your light vector and half-angle vector (if you're computing specular) in the register combiners (which takes two general combiners for one vector and 3 combiners for 2 vectors), it's faster on a GeForce 3 than using normalization cube maps. There is a presentation called "Bump Mapping with Register Combiners" on NVIDIA's web site. You should check that out.
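
Normalizing without a cube map is usually done with an approximation rather than a true square root. A commonly cited one is a single Newton-Raphson step, v * (3 - v.v) / 2; whether this is exactly the formula meant here is an assumption. A small Python sketch of the arithmetic (the function names are made up):

```python
import math


def approx_normalize(v):
    """One Newton-Raphson step towards unit length: v * (3 - v.v) / 2.

    Only accurate when v is already close to unit length, as an
    interpolated unit vector usually is.
    """
    d = sum(c * c for c in v)          # squared length, v . v
    scale = 0.5 * (3.0 - d)
    return tuple(c * scale for c in v)


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))
```

A vector of length 0.9 comes back with a length of roughly 0.986, and a unit vector is returned unchanged; the further the input is from unit length, the worse the result gets.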

    As for the shadow volumes, it's actually faster to calculate them on the CPU than the GPU because of some inefficiencies in the setup that is required. Hopefully the GeForce 4 and beyond will let shadow volumes run faster on the GPU than they do now.

    Also, one more thing: who made that map you're running there? It's very nice looking. :)

    Nice job.

    -SirKnight

     
    tcs

    August 18, 2001, 05:21 PM

    The shadow volumes are basically working, just not together with the new VSD pipeline, and I'm still not 100% sure about how to handle them. It mainly depends on the old speed vs quality issue...

    Well, maybe my explanation was a bit too fancy. I mean transforming the light vector, yes ;-)
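
Transforming the light vector into tangent space amounts to three dot products per vertex against the tangent, bitangent and normal. A minimal Python sketch, assuming an orthonormal basis is supplied with each vertex (the function name and argument layout are illustrative, not taken from the engine):

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def to_tangent_space(light_vec, tangent, bitangent, normal):
    """Express light_vec in the basis (tangent, bitangent, normal)."""
    return (dot(light_vec, tangent),
            dot(light_vec, bitangent),
            dot(light_vec, normal))


# A wall facing +Y, with the light straight in front of it.
TANGENT = (1.0, 0.0, 0.0)
BITANGENT = (0.0, 0.0, -1.0)
NORMAL = (0.0, 1.0, 0.0)
LIGHT_TS = to_tangent_space((0.0, 2.0, 0.0), TANGENT, BITANGENT, NORMAL)
```

`LIGHT_TS` points along +Z, which in tangent space means straight out of the surface, so it can be dotted directly with the normals stored in a bump map.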

    Tim

     
    psykotic

    August 18, 2001, 05:23 PM

    "As for the shadow volumes, it's actually faster to calculate them on the CPU than the GPU because of some inefficiencies in the setup that is required. Hopefully the GeForce 4 and beyond will let shadow volumes run faster on the GPU than they do now."

    That does not mean you should calculate them on the CPU, however. In the recent QuakeCon interview with John Carmack he talked about this issue, concluding that while shadow volume extrusion is currently slightly faster on the CPU than the GPU, his CPU usage went from nearly 100% to 5% after he delegated the extrusion process to the GPU. Enough said.

     
    tcs

    August 18, 2001, 05:26 PM

    Well, it's a very complex task to get the shadow volumes working in the general case. But the silhouette way works *always*, just some tricks required. I'm still not convinced that the vertex shader way is the correct one, but I'll evaluate it again now since I got my GF3 at home. I still haven't decided on the final solution, and other tasks prevent me from working on this for now...

    CPU load is 90% caused by the VSD pipeline and by the physics system. I'm currently rewriting my engine to a more COM-like model, and while doing this I'm also rewriting the VSD system to move some work from the CPU to the GPU.

    Tim

     
    Catfish

    August 18, 2001, 05:27 PM

    Blasphemy! How could you not recognise that map?!? :)

    It's Q2DM8, IIRC, from Quake2. Nice deathmatch map.

     
    tcs

    August 18, 2001, 05:29 PM

    It's hard to handle open geometry in a vertex shader, and if you want to have a volume for EVERY light, you'll have open geometry because you only take geometry into account which is in the attenuation range of the light.
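
The point about open geometry can be illustrated with a small Python sketch: keep only the triangles a light can reach, then look for edges that are left with a single triangle. The sphere-versus-bounding-box range test and all names are assumptions for illustration, not the engine's actual culling code.

```python
def in_light_range(tri, light_pos, radius):
    """Conservative test: light sphere against the triangle's bounding box."""
    dist_sq = 0.0
    for axis in range(3):
        lo = min(v[axis] for v in tri)
        hi = max(v[axis] for v in tri)
        nearest = min(max(light_pos[axis], lo), hi)  # clamp light onto the box
        dist_sq += (light_pos[axis] - nearest) ** 2
    return dist_sq <= radius * radius


def triangles_in_range(vertices, triangles, light_pos, radius):
    """Triangles whose bounding box touches the light's attenuation sphere."""
    return [t for t in triangles
            if in_light_range([vertices[i] for i in t], light_pos, radius)]


def open_edges(triangles):
    """Edges used by exactly one triangle, i.e. holes in the mesh."""
    counts = {}
    for i0, i1, i2 in triangles:
        for a, b in ((i0, i1), (i1, i2), (i2, i0)):
            key = (min(a, b), max(a, b))
            counts[key] = counts.get(key, 0) + 1
    return sorted(edge for edge, n in counts.items() if n == 1)


# A closed tetrahedron with a short-range light just above its tip.
VERTICES = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
TRIANGLES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
LIT = triangles_in_range(VERTICES, TRIANGLES, (0.0, 0.0, 12.0), 3.0)
```

The full mesh has no open edges, but `LIT` drops the bottom face, so `open_edges(LIT)` returns the three edges around the hole.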

    I'm aware of the normalization scheme with pixel shaders only, I'm using it.

    I'm still not sure that vertex shader shadows are the way to go...

    Tim

     
    tcs

    August 18, 2001, 05:36 PM

    Well, I would say it is the best deathmatch map of all time, period ;-)

    I show my love for it by taking its inner beauty and polishing up the appearance.

    Tim

     
    cgk

    August 18, 2001, 05:36 PM

    looks kinda cool.
    i'd also like to know more about the lighting, especially the shadow volume extraction.

    a few questions:
    1) will your system work with high polygon load? when your shadow volume extraction is CPU based and every polygon is considered as occluder (all general and stuff) and you do all that fancy fx i think the answer could be "no" :).
    2) do you plan a way to simulate diffuse lighting? sharp shadows everywhere on everything might not really look good in every case
    3) why does that left light (in the door) in the bottom right picture shine through the wall on the ground but does not light the wall it is behind? :)

     
    psykotic

    August 18, 2001, 05:40 PM

    Regarding silhouette-based shadow volumes, I guess it would work fine if you separated your objects into convex components and extruded shadow volumes for each convex component using their silhouette edges. Is this what you are doing?

     
    tcs

    August 18, 2001, 05:49 PM

    1.) It will. Actually the engine is anything but T&L limited, and more polys do little more than add T&L cost. The silhouette extraction in the final code will scale relative to the number of actual silhouette edges, not to the number of total tris.

    2.) I have diffuse DOT3... The shadows you see in the screenshot are from an old demo and they don't have the quality the new ones have. When I talk about shadows, I don't mean black stuff in the scene, I mean real shadows in the sense of occluding lights...

    3.) Easy, because this implementation has no shadows. All my shadows will do is fix these errors...

    Tim

     
    tcs

    August 18, 2001, 05:50 PM

    I have trouble understanding where you see all these problems; shadow volumes work fine with concave objects by default, without any special magic...

    Tim

     
    SirKnight

    August 18, 2001, 05:56 PM

    Yes, I am aware of what Carmack said about the volumes in the vertex program and about the CPU usage. I did not say anything about CPU usage. I know CPU usage will be lower if the volumes are done on the GPU; it makes sense, and I do experiment with these things. What I'm saying is that your framerate will be a bit higher (I'm not talking about major increases, of course) if the volumes are done on the CPU right now, until the programmable T&L pipeline matures more.

    "Enough Said."
    What was that supposed to mean? Was that supposed to be some kind of an insult or something?

    -SirKnight

     
    Nick

    August 18, 2001, 06:02 PM

    "3x overdraw, 6 textures per pixel, two dot products, several vector normalizations etc ? ;-) I promise you, after writing vertex and pixel shaders for a processor which is 10x more powerful than a P4 2 GHz you can't go back ! ;-)"

    Ok, let's see... 6 textures per pixel: it takes about 40 clock cycles for bilinear filtering with MMX instructions, so here we will need about 250 clock cycles. Two dot products, that can be done with SSE in about 10 clock cycles. Vector normalisation takes a reciprocal square root, but that's only a few clock cycles with the SSE approximation instructions. We'll need a lot of scanline setup, but MMX and SSE can execute in parallel, so with some well-written assembly we can keep the cycle count below 500 per pixel even with some extra stuff and cache misses, ok? With a c-buffer we can easily eliminate the 3x overdraw with very little extra computation. Now if we've got 2 GHz, in 640x480 we can get more than 13 FPS. That's playable, don't you think? ;) *starts saving for a P4 2 GHz right now*
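
The back-of-the-envelope figure above can be checked with a few lines of Python. The numbers are the ones from the post; the function name is made up, and the calculation assumes every pixel is shaded exactly once (no overdraw) with nothing else competing for the CPU.

```python
def software_fps(clock_hz, width, height, cycles_per_pixel):
    """Frames per second if every pixel is shaded exactly once."""
    cycles_per_frame = width * height * cycles_per_pixel
    return clock_hz / cycles_per_frame


texture_cycles = 6 * 40                      # 240, "about 250" in the post
fps = software_fps(2e9, 640, 480, 500)       # 2 GHz, 640x480, 500 cycles/pixel
print(f"{texture_cycles} cycles for texturing, {fps:.2f} FPS")
# prints "240 cycles for texturing, 13.02 FPS"
```

So "more than 13 FPS" holds, though only just, and only if the 500-cycle budget per pixel is actually met.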

    If only Intel or AMD would make some instructions especially for this kind of thing, they could probably easily beat the GeForce3 because of the higher clock speed. And of course we would have the freedom again to implement visibility algorithms that normally only work well with software engines. Unfortunately that's not going to happen because they won't waste silicon on instructions that are only used in a few scanline routines...

    "Ah, Kurt !!! Don't resize my pictures with such a crappy filter and compress them with JPEG turned up to max ! ;-)"

    You could have put them on top of each other to get a longer image, like the IOTD birthday ;)

    Cheers,

    Nick

     
    SirKnight

    August 18, 2001, 06:03 PM

    Oh and another thing psykotic. You are putting words into my mouth. I did NOT say the volumes should not be done on the GPU. I just stated how the two run in terms of speed on current GPUs.

    It is a good thing though that there is not a big speed difference between doing them on the GPU vs the CPU, because if your engine/game needs more CPU power and you're doing the volumes on the CPU, which is taking up a lot of its power, moving the volume calcs to the GPU would hardly hurt you at all. :)

    -SirKnight

     
    psykotic

    August 18, 2001, 06:07 PM

    "Yes, I am aware of what Carmack said about the volumes in the vertex program and about the CPU usage. I did not say anything about CPU usage. I know CPU usage will be lower if the volumes are done on the GPU; it makes sense, and I do experiment with these things. What I'm saying is that your framerate will be a bit higher (I'm not talking about major increases, of course) if the volumes are done on the CPU right now, until the programmable T&L pipeline matures more."

    Right. My point was that in a game you generally need to use the CPU for a lot of stuff (physics, inverse kinematics, artificial intelligence and so on) so that sacrificing some frames per second might be the way to go for some people.

    "'Enough Said.' What was that supposed to mean? Was that supposed to be some kind of an insult or something?"

    No, it was not meant to be an insult. I am sorry if I made it sound that way. Just wanted to bring up some issues you did not mention in your post for the benefit of other people reading this. A CPU usage difference of roughly 95 percentage points is very relevant if you ask me. Your mileage may vary of course.

     
    tcs

    August 18, 2001, 06:14 PM

    Well, let me just try it, and when I have my real-world results I'll share them and we accept this as the final truth, ok ? ;-)

    Tim

     
    Jordan

    August 18, 2001, 06:28 PM

    !!!!!! Ahem, Q2DM1 !!!!!!

     
    SirKnight

    August 18, 2001, 06:34 PM

    Ok, I just wanted to clarify some stuff. I didn't mean to sound like I was being defensive. So I am also sorry if I said anything that sounded defensive. :) But yes, I agree that high CPU usage is very relevant, and in a game one effect shouldn't take up most of the CPU's power. I think though that doing the volumes in the vertex programs would be the way to go if the CPU usage from the volume calcs is too high; losing a few fps is pretty much worth it, I would say. :)

    Welp, back to coding on my engine; if I code it any slower I might have something to show by the year 2034. ;)

    -SirKnight

     
    This thread contains 140 messages.
     
     