 


Submitted by Nygus, posted on October 16, 2001




Image Description, by Nygus



Here are some screenshots of my hardware-accelerated raytracing engine. The top four pictures are rendered with Phong shading only; the bottom ones have shadow testing and reflections enabled.

Features:
  • DirectX8 support,
  • ".x" file loader,
  • fast Phong shading,
  • texturing,
  • shadows, reflections,
  • 3Dnow! optimizations,
  • no spatial subdivision (yet!), so each ray is tested against all triangles...

Hardware acceleration does not run very fast because of slow AGP readback (20 MB/s on my system). People, use vector instructions, e.g. 3Dnow! The ray/triangle intersection function (a barycentric test) written in 3Dnow! tests two triangles at once:
  • best case, both triangles backfacing: only 39 cycles,
  • worst case, both triangles intersecting: 126 cycles (including function call & return from Visual C++).

By the way, my PC is quite old:
  • Duron 700,
  • Matrox G400.

I will try to release the program & source soon on my homepage:
    http://home.elka.pw.edu.pl/~mgalach/mycode.htm
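For readers curious what such a barycentric ray/triangle test looks like in scalar code, here is a minimal sketch (a standard Möller–Trumbore-style formulation with a backface early-out; the author's actual 3Dnow! routine, which tests two triangles at once, is not shown):

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b)  { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b)  { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Returns true and sets *t when ray orig + t*dir hits the front face of
// triangle (v0, v1, v2). Backfacing (or near-parallel) triangles are
// rejected after one cross and one dot product -- the cheap case the
// cycle counts above refer to.
bool intersect(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float* t) {
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (det < 1e-6f) return false;            // backfacing or parallel: early out
    Vec3 s = sub(orig, v0);
    float u = dot(s, p);
    if (u < 0.0f || u > det) return false;    // barycentric u outside [0, det]
    Vec3 q = cross(s, e1);
    float v = dot(dir, q);
    if (v < 0.0f || u + v > det) return false;
    *t = dot(e2, q) / det;                    // single division, done last
    return *t > 0.0f;
}
```

Deferring the division until a hit is confirmed is what keeps the rejected cases so cheap.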

    If you wish to contact me, write to: mgalach@elka.pw.edu.pl

    Marek Galach (aka Nygus)


    Image of the Day Gallery
    www.flipcode.com

    Message Center / Reader Comments:
    Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.
     
    Waramp

    October 16, 2001, 03:31 PM

How much of a speedup do you gain from the hardware vs. just an ordinary software renderer? What parts of the process does the hardware accelerate?

    How did you generate that tree?

    Waramp.

     
    XycsoscyX

    October 16, 2001, 03:33 PM

WooHoo, gratuitous first post (I hope). Nice job on it, I've been wanting to see more work on hardware accelerated raytracing. The reflection looks really nice too.

     
    edeltorus

    October 16, 2001, 03:37 PM

Hm. I also don't understand why you raytrace cubes and spheres that are built out of polygons.

Those objects should be primitives... and you get a perfect sphere cheaper than a polygon.

    Nils

     
    Gert van Valkenhoef

    October 16, 2001, 03:41 PM

    neat =)
    I would like to know how long this took to code...

     
    Louis Howe

    October 16, 2001, 03:47 PM

    Neat concept! This looks really promising, especially as hardware acceleration gets more and more advanced.

    Probably with current hardware, techniques like dynamic cube-mapping might be a little faster, even if they are not perfectly accurate.

    Keep up the good work. This is really cool.

     
    Jesse

    October 16, 2001, 03:54 PM

yes, but it's usually pointless just to trace spheres. It's better to be able to trace lots of polygons fast, which is what he is showing us he can do. BTW, it's a very nice image. Let us know how it goes on new hardware...
    www.laeuchli.com/jesse/

     
    davepermen

    October 16, 2001, 04:32 PM

    cool stuff to add:
    bumpmapping and like that showing this is the only way for real reflections
    selfreflecting objects
but i think then the engine gets terribly slow, because you can't render the secondary rays simply in a depthbuffer but have to really trace them against the triangular meshes..
but it's a nice idea anyway, well done

     
    Arath

    October 16, 2001, 04:34 PM

Nice work. I saw on your site that you're a regular assembler coder. Can you tell me how much time it takes to optimize intersection functions, and whether it is really worth optimizing such functions for a fully hardware-accelerated engine, I mean just for the collision/physics part?

     
    Nieuw

    October 16, 2001, 05:28 PM

    >How did you generate that tree?

    looks like a simple L-system to me
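For reference, the string-rewriting core of a simple L-system is tiny. This sketch uses a classic plant-like rule, which is only an illustrative guess, not the rule behind the tree in the image:

```cpp
#include <cassert>
#include <string>

// One parallel rewriting pass of a deterministic L-system. The rule
// F -> F[+F]F[-F]F is a textbook plant-like example; '[' / ']' push and
// pop turtle state, '+' / '-' turn the turtle when the string is drawn.
std::string rewrite(const std::string& s) {
    std::string out;
    for (char c : s)
        if (c == 'F') out += "F[+F]F[-F]F";   // the single rewriting rule
        else          out += c;               // all other symbols are constants
    return out;
}
```

Applying `rewrite` two or three times to the axiom "F" and drawing the result with a turtle already yields a convincing branching tree.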

     
    Jari Komppa

    October 16, 2001, 05:46 PM

I'm curious about what exactly is hardware accelerated here.
Do you mean you use the video card to accelerate raytracing? If so, then it's really exciting, especially since you don't have a bleeding-edge display adapter.
Or do you mean that using 3dnow! instructions is hardware acceleration?

     
    psykotic

    October 16, 2001, 06:00 PM

I'm curious about what exactly is hardware accelerated here.
Do you mean you use the video card to accelerate raytracing? If so, then it's really exciting, especially since you don't have a bleeding-edge display adapter.

From glancing over his homepage, I can see that he uses hardware acceleration for primary rays, i.e. first-hit optimization. He says he has problems with reading over the AGP bus, so I guess he uses colors for identifying objects, and the depth buffer can be used for selecting origins for secondary rays (inaccurate). Alternatively, you can simply use the color buffer to identify the object a ray through a fragment hits. The depth buffer can be used for making sure that the scanline renderer and the ray tracer agree.
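A sketch of the object-ID trick described above: each object is drawn flat-shaded with a unique 24-bit color, the frame buffer is read back over AGP, and each pixel's color is decoded to learn which object the primary ray through that pixel hit. The helper names are illustrative, not from the IOTD code; a D3D-style X8R8G8B8 channel order is assumed:

```cpp
#include <cassert>
#include <cstdint>

// Pack a 24-bit object ID into the R, G, B channels of a flat-shaded
// "material" color for the first-hit rendering pass.
void idToRGB(uint32_t id, uint8_t* r, uint8_t* g, uint8_t* b) {
    *r = (id >> 16) & 0xFF;   // high byte in red...
    *g = (id >> 8) & 0xFF;
    *b = id & 0xFF;           // ...low byte in blue (X8R8G8B8 order)
}

// Decode a read-back pixel color to recover the ID of the object the
// primary ray through that pixel hit.
uint32_t rgbToId(uint8_t r, uint8_t g, uint8_t b) {
    return (uint32_t(r) << 16) | (uint32_t(g) << 8) | uint32_t(b);
}
```

24 bits of ID space is far more than a scene of this size needs, so lighting and shading must happen in a second pass; the first pass exists only to answer "which triangle?" per pixel.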

     
    psykotic

    October 16, 2001, 06:03 PM

    bumpmapping and like that showing this is the only way for real reflections

Yeah, bumpmapping in ray tracers is a nice feature and is very easy to implement if it's procedurally generated.

    selfreflecting objects

    From what I can tell, he only uses hardware acceleration for first-hit optimization so self-reflection happens automatically.

but i think then the engine gets terribly slow, because you can't render the secondary rays simply in a depthbuffer but have to really trace them against the triangular meshes..

    Could you explain how you would use scanline rendering for speeding up n-ary rays? I've never seen it used for anything but first-hit optimization but I'd like to learn more.

     
    psykotic

    October 16, 2001, 06:42 PM

A hardware-accelerated ray tracing trick the IOTD author might consider is described by Thomas Ludwig (originally used in his real-time ray tracing tutorial for HUGI). The basic idea is to let the hardware do all the texture mapping and rasterization. First, you tile the screen with small triangles (the smaller the triangles, the more precision you get). For each visible vertex you trace a ray. When it intersects a surface, you change the depth, the texture coordinates and the lighting intensity of the vertex based on the intersection point. Then you set the proper textures and let the hardware do the rasterization and perspective interpolation. I don't think Thomas talks about changing the depth of the vertices, but it seems like a necessity if you want the texture mapping to look perspectively correct. This approach is basically a hardware-accelerated alternative to standard subsampling techniques and, as such, it suffers from the same problems: small objects might be neglected and things might look too jaggy. If the hardware supports pixel shaders, you could apply a smoothing filter to make the latter less of a problem. Just my two cents.
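To make the per-vertex tracing step concrete, here is a small sketch that maps a vertex of an n x n screen tiling to a primary-ray direction. A pinhole camera looking down -z with a 90-degree field of view is assumed for illustration; the names are not from Ludwig's tutorial:

```cpp
#include <cassert>
#include <cmath>

struct Dir { float x, y, z; };

// Map grid vertex (i, j), with 0 <= i, j <= n, of an n x n screen tiling
// to a normalized primary-ray direction. One such ray is traced per
// vertex; the hardware then interpolates across each small triangle.
Dir gridRayDir(int i, int j, int n) {
    float px = 2.0f * i / n - 1.0f;   // [-1, 1] left to right
    float py = 1.0f - 2.0f * j / n;   // [-1, 1] top to bottom
    float len = std::sqrt(px * px + py * py + 1.0f);
    return { px / len, py / len, -1.0f / len };
}
```

With, say, an 8x8-pixel tiling on a 640x480 screen, only 81x61 rays are traced per frame instead of 307,200, which is where the speedup comes from.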

     
    Hiro Protagonist

    October 16, 2001, 07:21 PM

I see a lot of strange discoloration around objects in your scenes. What are these artifacts, and what causes them?

     
    lycium

    October 16, 2001, 07:23 PM

    really neat, i wish you'd used sse, since i don't have a thunderbird (yet) ;)

    as psykotic mentioned, you could use hw accel as described in my article, that would give you a very nice speedup (anywhere from 64x to none, average case should be around 24x or something). i'm not sure what changing the "depth the vertices" means, but if it's subdividing the squares then i'm sure i mentioned that if the 4 corners of a square don't have the same ID tag, you subdivide it until you either have 1x1 squares or all the corners have the same ID. in any case, it's just an approximation, and it looks really good.

    if you added that to this engine, you might have one of the best rtrt engines available for consumer hardware. also be sure to check papers by ingo wald on rtrt (he's coming to afrigraph btw, www.afrigraph.org, i'm going to be there! :)

     
    psykotic

    October 16, 2001, 07:35 PM

    as psykotic mentioned, you could use hw accel as described in my article, that would give you a very nice speedup (anywhere from 64x to none, average case should be around 24x or something). i'm not sure what changing the "depth the vertices" means

I said changing the depth of vertices; by depth I mean z or 1/z. The artifacts of affine texture mapping (which leaving the depth alone effectively results in) are unlikely to be noticed if you subdivide aggressively, but if you use a coarser grid, I think they will unfortunately be quite obvious.
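The depth issue can be shown numerically: interpolating u/z and 1/z linearly in screen space and then dividing gives the perspective-correct texture coordinate, while plain (affine) interpolation of u does not. A minimal sketch:

```cpp
#include <cassert>
#include <cmath>

// Interpolate texture coordinate u along a screen-space edge from
// (u0, z0) to (u1, z1) at parameter t in [0, 1]. Because 1/z and u/z
// (not z and u) are linear in screen space, the perspective-correct
// result is the ratio of the two linear interpolants; skipping the
// depth terms degenerates to affine mapping.
float perspectiveU(float u0, float z0, float u1, float z1, float t) {
    float invZ   = (1.0f - t) / z0 + t / z1;
    float uOverZ = (1.0f - t) * u0 / z0 + t * u1 / z1;
    return uOverZ / invZ;
}
```

For example, with u0 = 0, z0 = 1, u1 = 1, z1 = 3, the screen-space midpoint t = 0.5 maps to u = 0.25, whereas affine interpolation would give 0.5; that gap is exactly the coarse-grid artifact described above.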

    but if it's subdividing the squares then i'm sure i mentioned that if the 4 corners of a square don't have the same ID tag, you subdivide it until you either have 1x1 squares or all the corners have the same ID. in any case, it's just an approximation, and it looks really good.

    For modern hardware, you probably just want to have a grid of squares of fixed granularity. You better leave the vertex and index buffers alone if at all possible. You can still effectively do the subdivision by only tracing rays for e.g. four parent vertices and propagating the vertex data to the child vertices (the vertices within the square defined by the four parent vertices) if applicable or trace rays for each child vertex (at the lowest level) if needed.
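A sketch of the corner-ID test on a fixed grid, as suggested above (the flat per-vertex object-ID array and function names are illustrative assumptions):

```cpp
#include <cassert>
#include <cstdint>

// ids is a row-major w-wide array of per-vertex object IDs from the
// first-hit pass. If the four corners of the size x size cell anchored
// at (x, y) hit the same object, interior vertices can reuse the corner
// data; otherwise each interior vertex traces its own ray.
bool cellUniform(const uint32_t* ids, int w, int x, int y, int size) {
    uint32_t a = ids[y * w + x];
    return ids[y * w + x + size] == a
        && ids[(y + size) * w + x] == a
        && ids[(y + size) * w + x + size] == a;
}
```

Keeping the grid at fixed granularity means the vertex and index buffers never change; only the per-vertex data is rewritten each frame, which suits the AGP-write-friendly, read-hostile hardware of the era.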

     
    Max

    October 16, 2001, 07:43 PM

    They look like JPEG artifacts to me.

    Max

     
    Frans Bouma

    October 17, 2001, 05:58 AM

    Why no cube envmapping tricks?

     
    nygus

    October 17, 2001, 08:59 AM

Because of the hardware acceleration (of primary rays), the basic primitive is the triangle.

     
    nygus

    October 17, 2001, 09:02 AM

I use hardware acceleration only for primary rays. For scenes with a small number of secondary rays the speedup is huge.
    I didn't create that tree...

     
    nygus

    October 17, 2001, 09:08 AM

Two days (because I didn't know 3dnow! very well before).
The ASM routine (using the FPU) gains only 12% (vs. the original C).
The 3Dnow! routine gains 68%.

I don't know if it is worth it. If the collision routine eats 20% or more of the CPU, I think yes.

     
    nygus

    October 17, 2001, 09:12 AM

    You are almost right.
On my Matrox G400 card I don't have access to the z-buffer, so secondary rays are calculated in the normal way.

     
    This thread contains 22 messages.
     
     