 

Help with 3D code optimizations
 
Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.
 
SLI9000

July 02, 1999, 08:04 AM

Sorry this message is so long; I just had lots of questions.


I'm a complete beginner programmer and am trying to write my very first 3D engine for self-education purposes.
(I did do some Direct3D programs in the past, though.)
I've been searching through books and the internet for information.
I have managed to get all the bare basic parts of an engine done (software Z-buffer, perspective texture mapping, Gouraud shading, culling, clipping, texture filtering, polygon scan-line conversion through an edge table and active edge table).

But there is one huge problem: it is way too slow, and I can't figure out why.
I'm kind of a newbie programmer (Visual C/C++ 5), so please tell me how I can optimize my code.

I am using a lot of global vars; are they slower than locals?
I've got all the globals in one *.cpp file, and a header called globals.h
that has an extern declaration for each global. It gets included in all the other source files.

Although my source files are all .cpp, they don't use any C++ capabilities, except maybe function overloading. I'm just using the .cpp extension so that if I decide to include DirectX in the future, I won't have to go through that lpVtbl thing on every single DirectX function call.

Frequently used functions are spread all over the place (different files),
and a header with all the function prototypes is included in every file.

Likewise, all my structs and typedefs are in a header file.

Is this file structure what is causing the slowness?
What if I merged everything into one file and made every single function static? :)
Also, if I set the window thread priority to Time Critical, will that help? (Please tell me how to do this.)

Currently I'm not using DirectX or OpenGL, so I can't use optimizations specific to those APIs; right now the program is completely software, through the Win32 API (CreateDIBSection).

Exactly how much of a speed gain can you expect from assembly? (It has to be inline; I don't have MASM or an equivalent.)

For texture mapping I am doing it like this:
Project view-space coords into screen coords and store texture coords along with vertex intensities and 1/z values (from the z component of the projected coords) with each polygon vertex.
Then, in the poly-filler function:
Generate edge table entries and interpolate u/z and v/z across the polygon along with 1/z and the intensities.
I've heard you can't just linearly interpolate intensity values because they're not linear in screen space, but oh well.
Perform a full divide of the texture coords per pixel by 1/z and plot on screen (after checking with the Z-buffer, of course).
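
The span step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual code: SpanEnd, clampInt and drawSpanPerspective are hypothetical names, the texture is assumed to be a row-major array of 32-bit texels, and u and v are assumed to be in texel units. Shading and the Z-buffer test are left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-endpoint values for one scanline span.
struct SpanEnd {
    float uOverZ;  // texture u divided by view-space z
    float vOverZ;  // texture v divided by view-space z
    float invZ;    // 1 / z
};

static int clampInt(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

// Writes `count` texels to dst using perspective-correct interpolation.
void drawSpanPerspective(std::uint32_t* dst, int count,
                         const SpanEnd& left, const SpanEnd& right,
                         const std::vector<std::uint32_t>& tex,
                         int texW, int texH)
{
    assert(count > 0 && texW > 0 && texH > 0);
    assert(left.invZ > 0.0f && right.invZ > 0.0f);

    // Gradients are computed once per span, not once per pixel.
    const float n = (count > 1) ? float(count - 1) : 1.0f;
    const float dUz = (right.uOverZ - left.uOverZ) / n;
    const float dVz = (right.vOverZ - left.vOverZ) / n;
    const float dIz = (right.invZ - left.invZ) / n;

    float uz = left.uOverZ, vz = left.vOverZ, iz = left.invZ;
    for (int x = 0; x < count; ++x) {
        const float z = 1.0f / iz;  // the single divide per pixel
        const int u = clampInt(int(uz * z), 0, texW - 1);
        const int v = clampInt(int(vz * z), 0, texH - 1);
        dst[x] = tex[std::size_t(v) * texW + u];
        uz += dUz;
        vz += dVz;
        iz += dIz;
    }
}
```

Note that u/z, v/z and 1/z are the quantities that are linear in screen space, which is why they are the ones stepped with plain adds; u and v are recovered by the multiply with z.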

I heard that a full divide per pixel is slow, but I took it out and noticed little difference in speed (still way too slow).

Is doing the full divide only at intervals across the scanline, and linearly interpolating between those intervals, going to increase the speed much?
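
For reference, the interval idea in the question above might look something like the sketch below. It only computes u for one span (v would be handled the same way) and returns the values in a vector so they can be inspected; subdividedU and the default step of 16 pixels are illustrative assumptions, not something taken from this thread. How much it helps depends on where the time is really going, which is why measuring first is worthwhile.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the texture u for each pixel of a span. The exact (divided) value
// is computed only every `step` pixels; pixels in between are interpolated
// linearly from the two surrounding exact values.
std::vector<float> subdividedU(int count,
                               float uOverZ0, float invZ0,
                               float uOverZ1, float invZ1,
                               int step = 16)
{
    assert(count > 0 && step > 0);
    assert(invZ0 > 0.0f && invZ1 > 0.0f);

    std::vector<float> u(count);
    const float n = (count > 1) ? float(count - 1) : 1.0f;
    const float dUz = (uOverZ1 - uOverZ0) / n;
    const float dIz = (invZ1 - invZ0) / n;

    float uStart = uOverZ0 / invZ0;  // exact u at the first pixel
    int x0 = 0;
    while (x0 < count - 1) {
        const int x1 = std::min(x0 + step, count - 1);
        // The only perspective divide for this block: exact u at pixel x1.
        const float uEnd = (uOverZ0 + dUz * x1) / (invZ0 + dIz * x1);
        const float dU = (uEnd - uStart) / float(x1 - x0);
        for (int x = x0; x < x1; ++x)
            u[x] = uStart + dU * float(x - x0);
        x0 = x1;
        uStart = uEnd;
    }
    u[count - 1] = uStart;  // exact value at the last pixel
    return u;
}
```

With a power-of-two step, the division by the block length can become a shift or a multiply in a fixed-point version.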

Also, I heard a software Z-buffer is slow. Is this true?
I'm currently doing one if() compare per pixel, plus some adds per pixel and per edge because of it, and only calculating the z increments once per edge and per polygon.
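
A depth test of the kind described above can be as small as the following sketch. It assumes the buffer stores 1/z (since 1/z is already being interpolated for texturing), so a larger value means nearer, and 0 stands for "nothing drawn yet". DepthBuffer and testAndSet are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A minimal 1/z depth buffer: larger stored value = closer to the viewer.
struct DepthBuffer {
    int width;
    int height;
    std::vector<float> invZ;  // one entry per pixel, 0 = empty

    DepthBuffer(int w, int h)
        : width(w), height(h), invZ(std::size_t(w) * h, 0.0f)
    {
        assert(w > 0 && h > 0);
    }

    // Returns true (and records the new depth) if the fragment is nearer
    // than what is already stored at (x, y); otherwise leaves it untouched.
    bool testAndSet(int x, int y, float fragInvZ) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        float& stored = invZ[std::size_t(y) * width + x];
        if (fragInvZ <= stored)
            return false;
        stored = fragInvZ;
        return true;
    }
};
```

In a real inner loop the address would be stepped along the scanline rather than recomputed from x and y for every pixel.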

If there are any free 3D modellers out there, please tell me.
I'm currently using the Caligari Truespace 2 demo with a converter I wrote that for some reason doesn't always work (I don't have enough money to buy it, and the version 3 and 4 demos don't let you save).

Also, how do you generate *correctly oriented* surface normals?
I tried taking the cross product of two edges, and also that signed polygon surface area thing,
but some normals come out facing the wrong direction, I think.
I think this because when I used to do culling through the dot product of the surface normal and the "camera" plane normal, some polygons disappeared.
(Now I'm using the clockwise test.)
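
Two things may be worth checking here, sketched below with hypothetical helper names (Vec3, faceNormal, orientOutward, isBackFacing). First, the cross product of two edges only points outward if every polygon lists its vertices in the same winding order; for a closed convex object, a normal can be flipped by comparing it with the direction from the object center to the face. Second, under perspective projection the culling test should use the vector from the eye to a point on the polygon rather than the camera plane normal, which may explain polygons disappearing even when the normals are fine.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

// Unnormalized face normal. Its direction depends on the vertex order
// a -> b -> c, so inconsistent winding gives inconsistent normals.
Vec3 faceNormal(Vec3 a, Vec3 b, Vec3 c) {
    return cross(sub(b, a), sub(c, a));
}

// Only valid for convex objects: flips the normal if it points toward
// the object center instead of away from it.
Vec3 orientOutward(Vec3 n, Vec3 pointOnFace, Vec3 center) {
    if (dot(n, sub(pointOnFace, center)) < 0.0f)
        n = Vec3{-n.x, -n.y, -n.z};
    return n;
}

// Back-face test for perspective views: compares the outward normal with
// the vector from the eye to the face, not with the view direction.
bool isBackFacing(Vec3 outwardNormal, Vec3 pointOnFace, Vec3 eye) {
    return dot(outwardNormal, sub(pointOnFace, eye)) >= 0.0f;
}
```

For non-convex objects the center trick does not hold, and consistent winding in the model file (or in the converter) is the usual requirement.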


Meanwhile, I'm calculating vertex normals by subtracting the object center from each vertex in the object and normalizing the resulting vector (the center is given to me in the file).
Truespace 2 doesn't store vertex normals.
Yes, I know this is a stupid way of doing it, but until I get those correct surface normals...
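
Once the face normals are consistently oriented, a common alternative to the center-based approximation is to average the normals of all faces that share a vertex. The sketch below assumes an indexed triangle list with consistent winding, which is an assumption about the data and not something stated in the post; Vec3, Triangle and computeVertexNormals are illustrative names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { int a, b, c; };  // indices into the vertex array

// Returns one unit normal per vertex, built by summing the (unnormalized)
// normals of every triangle that uses the vertex and normalizing the sum.
std::vector<Vec3> computeVertexNormals(const std::vector<Vec3>& verts,
                                       const std::vector<Triangle>& tris)
{
    const Vec3 zero = {0.0f, 0.0f, 0.0f};
    std::vector<Vec3> normals(verts.size(), zero);

    for (std::size_t t = 0; t < tris.size(); ++t) {
        const int idx[3] = { tris[t].a, tris[t].b, tris[t].c };
        for (int k = 0; k < 3; ++k)
            assert(idx[k] >= 0 && std::size_t(idx[k]) < verts.size());

        const Vec3& p0 = verts[idx[0]];
        const Vec3& p1 = verts[idx[1]];
        const Vec3& p2 = verts[idx[2]];
        const float e1x = p1.x - p0.x, e1y = p1.y - p0.y, e1z = p1.z - p0.z;
        const float e2x = p2.x - p0.x, e2y = p2.y - p0.y, e2z = p2.z - p0.z;

        // Cross product of two edges. Its length is twice the triangle
        // area, so larger faces contribute more to the average.
        const float nx = e1y * e2z - e1z * e2y;
        const float ny = e1z * e2x - e1x * e2z;
        const float nz = e1x * e2y - e1y * e2x;

        for (int k = 0; k < 3; ++k) {
            normals[idx[k]].x += nx;
            normals[idx[k]].y += ny;
            normals[idx[k]].z += nz;
        }
    }

    for (std::size_t i = 0; i < normals.size(); ++i) {
        Vec3& n = normals[i];
        const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
        if (len > 0.0f) {  // unused or degenerate vertices stay at zero
            n.x /= len;
            n.y /= len;
            n.z /= len;
        }
    }
    return normals;
}
```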

Programs like 3D Studio and Maya 2 are absolutely TOO expensive. I'd rather get a computer, or several computers.

Also a question about Light Maps. Are they completely static?
(like you can't even move the camera around?)

Also, while my program is running so slowly, things like WinAmp and Truespace are zipping along in the background. How can I get more resources and CPU time slices from the computer,
so that if my program is slowing to a crawl, so are the others?

Finally, I can't get the profiler of Visual C 5 to work; what's wrong with it? Please help with this too. (got Visual C from a friend)

If you have any additional info about 3D concepts, please tell me. I don't have easy access to a bookstore, and my small high school's computer classes only cover typing, word processing, etc.

 
dimitris

July 02, 1999, 09:13 AM


I have made a very fast software 3D engine (not released yet) and I think I can
give you some advice.

Optimizing with assembly will help a lot, especially for the drawing
algorithms. If you want, try using MMX; it will help a lot too if you decide
to implement bilinear filtering.
Use one procedure for bulk work. I mean, don't
call a procedure to rotate each vertex; rotate all the vertices of an object in
one optimized assembly procedure (projecting them in the same pass is even better).
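
The batching idea can be sketched in plain C++ rather than assembly (the same loop is what one would later hand-optimize). Everything here is an assumption made for illustration: the 3x4 matrix layout, the names Vec3, Mat3x4, ScreenPoint and transformAndProject, a simple focal-length projection, and the precondition that every vertex is already in front of the camera.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Rotation in columns 0..2, translation in column 3.
struct Mat3x4 { float m[3][4]; };

struct ScreenPoint { float x, y, invZ; };

// Transforms and projects a whole vertex array in one call, so the
// function-call overhead is paid once per object, not once per vertex.
void transformAndProject(const Vec3* in, ScreenPoint* out, int count,
                         const Mat3x4& mat,
                         float focal, float centerX, float centerY)
{
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        const Vec3& p = in[i];
        const float x = mat.m[0][0] * p.x + mat.m[0][1] * p.y + mat.m[0][2] * p.z + mat.m[0][3];
        const float y = mat.m[1][0] * p.x + mat.m[1][1] * p.y + mat.m[1][2] * p.z + mat.m[1][3];
        const float z = mat.m[2][0] * p.x + mat.m[2][1] * p.y + mat.m[2][2] * p.z + mat.m[2][3];

        assert(z > 0.0f);  // assumed: clipping/culling already handled this

        const float invZ = 1.0f / z;  // one divide, reused for x and y
        out[i].x = centerX + focal * x * invZ;
        out[i].y = centerY - focal * y * invZ;  // screen y grows downward
        out[i].invZ = invZ;
    }
}
```

The 1/z computed for the projection is the same value the texture mapper and Z-buffer need, so storing it avoids recomputing it later.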

Try simplifying things. There is a lot of dirty stuff that can go (e.g. in the triangle procs).
Monitor your program. That way you can see which procedure takes the most time in the
rendering pipeline and optimize it. You can also see whether your changes have any effect on the speed of the program.
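
If a profiler is not available, one simple way to monitor the pipeline is to time each stage by hand. The sketch below uses the standard std::chrono clock from later C++ standards (it would not compile on a 1999 compiler, where a Win32 timer would be the equivalent); StageTime and ScopedTimer are made-up names for illustration.

```cpp
#include <cassert>
#include <chrono>

// Accumulated time and call count for one stage of the pipeline.
struct StageTime {
    const char* name;
    double seconds;
    long calls;
    explicit StageTime(const char* n) : name(n), seconds(0.0), calls(0) {}
};

// Adds the lifetime of this object to a StageTime when it goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(StageTime& stage)
        : stage_(stage), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const std::chrono::duration<double> dt =
            std::chrono::steady_clock::now() - start_;
        stage_.seconds += dt.count();
        ++stage_.calls;
    }

private:
    StageTime& stage_;
    std::chrono::steady_clock::time_point start_;
};

// Usage sketch:
//   StageTime fill("polygon fill");
//   { ScopedTimer t(fill); /* draw all polygons for this frame */ }
//   // after some frames: fill.seconds / fill.calls = average time per frame
```

Comparing the totals for transform, fill and blit over a few hundred frames shows which stage is worth optimizing first, and whether a change actually made a difference.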

Don't clear the screen and then draw; try to use environments that cover the whole screen
so you don't need to clear. With a Z-buffer this is very easy, because you can check whether
each pixel has been drawn and, if not, fill it with the background you would have cleared to.
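
One possible reading of this tip, sketched under assumptions: the depth buffer stores 1/z with 0 meaning "not drawn this frame", and a single pass at the end of the frame both paints the background into untouched pixels and resets the depth of the drawn ones. finishFrame is a hypothetical name, and this is only one way to interpret the advice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Called once after all polygons of a frame are drawn. Pixels that no
// polygon touched get the background color; pixels that were drawn get
// their depth reset so the buffer is ready for the next frame.
void finishFrame(std::vector<std::uint32_t>& color,
                 std::vector<float>& invZ,
                 std::uint32_t background)
{
    assert(color.size() == invZ.size());
    for (std::size_t i = 0; i < color.size(); ++i) {
        if (invZ[i] == 0.0f)
            color[i] = background;  // nothing was drawn here
        else
            invZ[i] = 0.0f;         // drawn: only the depth needs resetting
    }
}
```

This replaces two full-buffer clears with one pass; whether it is a net win depends on how much of the screen the scene covers.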

Try using fixed point where you don't need much accuracy and you need maximum speed.
Fixed point is very suitable for sorting, comparing, and triangle algorithms (even in the outer loops).
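
As an illustration of the fixed-point suggestion, here is a minimal 16.16 sketch with an edge walk of the kind a polygon filler does per scanline. The 16.16 format and the names Fixed, fromInt, fromFloat, toInt, fixedMul and edgeWalk are choices made for this example. It assumes coordinates stay well inside the 16-bit integer range and that right-shifting a negative value is an arithmetic shift, as it is on mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int32_t Fixed;  // 16.16: 16 integer bits, 16 fraction bits
const Fixed kOne = 65536;

inline Fixed fromInt(int v)     { return Fixed(v) * kOne; }
inline Fixed fromFloat(float v) { return Fixed(v * 65536.0f); }
inline int   toInt(Fixed v)     { return int(v >> 16); }  // rounds down

// The product of two 16.16 values needs 64 bits before shifting back.
inline Fixed fixedMul(Fixed a, Fixed b) {
    return Fixed((std::int64_t(a) * b) >> 16);
}

// Steps the x coordinate of an edge from x0 toward x1 over `rows`
// scanlines: one divide per edge, then only integer adds and shifts.
void edgeWalk(int x0, int x1, int rows, int* outX) {
    assert(rows > 0);
    const Fixed dx = (fromInt(x1) - fromInt(x0)) / rows;
    Fixed x = fromInt(x0);
    for (int r = 0; r < rows; ++r) {
        outX[r] = toInt(x);
        x += dx;
    }
}
```

The same pattern applies to stepping u, v or intensity along edges and spans, as long as the ranges fit the chosen format.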

I hope I gave you a little help.

Greets
Dimitris


 
Jeroen

July 03, 1999, 07:23 AM

Hi,

If you profile your code (look at what percentage of total CPU time each function uses), you'll
find that the majority of CPU ticks are used by your DrawPolygon function. Try optimizing it,
or rewrite it in assembly.
Also, using a DIB to display your scene is very slow. When I used a DIB, I found that with
simple scenes, blitting the DIB took up half the CPU time (!). Using DirectX or a lib like
PTC should make a big difference.

The things you mentioned about global variables and multiple files don't have a significant
impact on performance.

Jeroen

 
john shield

July 03, 1999, 02:09 PM




I always say to myself: optimize last. If you find a good algorithm, then optimize it.
Few would dispute that assembler gives the best-optimized code. I find a lot of cool
info on optimization at www.intel.com

They have released "free" docs on Pentium optimizations and a complete developer's manual.
If you understand it, maybe you can help me!!!

cool

good luck man!



 
5Horse

July 06, 1999, 10:17 AM

The one thing I can say is that your method of windowing is incredibly important. Are you going full screen or windowed? Are you letting Windows handle the redraws of your screen by responding to WM_PAINT messages, or are you forcing your own redraws? You didn't talk about those things, so I just threw them in, as waiting on WM_PAINT messages is incredibly slow, and if your engine is astonishingly slow, that would be my first guess.


5horse

 
sli9000

July 06, 1999, 11:30 AM




Thanks for the message
Here is some additional info to clear things up.

First of all, I'm not waiting for WM_PAINT messages; that would be too slow, and not real-time.
What I'm doing now is first calling CreateDIBSection to get a pointer to the actual bits of an offscreen DC.
Then I modify those bits (like Bits[y*ScreenX+x]=0 or something),
then I call BitBlt to blit that DC to the main DC.
This is all inside a PeekMessage() loop (no SetTimer and WM_TIMER messages).

I tried using DirectDraw instead and it was *roughly* the same speed (exclusive full-screen mode).

thanks




 
This thread contains 6 messages.
 
 