flipcode - Building a 3D Portal Engine

Building a 3D Portal Engine - Issue 15 - Space Partitioning, Octrees, And BSPs
by (09 April 1999)

Introduction

This week: Spatial subdivisions. What are they, and how can you use them to enhance your rendering engine? Which structure should you use for what type of environment, and does 'the best structure' exist?

By the time you read this, every coder, and almost every gamer knows about at least some spatial subdivisions. And that's odd, since it's a rather technical topic. The reason that everyone seems to know about this stuff is probably mainly because software companies use it for marketing their games: If your game doesn't use portals, it must be 'dated'. 'BSP' is out, 'portals' are in, 'octrees' are not even 'discovered' yet. Of course, the average review writer doesn't really know what BSP's are used for: Usually, they say it has something to do with 'static scenes', because the Prey-guys have been ranting all over the place about their dynamic portal stuff. OK, I have been very vague so far. :) Time to get more clear.

Spatial subdivision is used to get some order in the set of polygons that you call 'world'. If you have a single spinning object on your screen, you can just store the polygons that represent it in an array, and draw them all. But, you learn a bit, and you decide to do hidden surface removal. Backfaces are culled. That means that if you walk your array, half of the polygons are ignored. No problem. But now you become even smarter, and you start skipping entire objects when they are behind the camera. You have just introduced a very basic spatial subdivision. How? Well, you organized your polygons in objects. And the world contains objects, so there's a hierarchy of data.

If you want to move your objects in a world consisting of relatively large polygons, this world is usually not broken up in objects. Consider a typical Quake level: You could threat the entire level as one giant object, or do something else with it. Anyway, you won't be able to test if this object is behind the camera or something, as the camera will usually be right in it. So, you want a different ordering to be able to discard parts of the world. And you want this ordering to be calculated automatically, if possible.

I will discuss three techniques to split the world in digestible chunks here. Two are well-known, but I will start with a subdivision that is rarely used (for a good reason, by the way :).

The Uniform Grid Approach

Imagine a world with 100.000 polygons, consisting of buildings, trees, bridges, roads and so on. There are a couple of large polygons, but most polygons are rather small, and more or less evenly distributed over the area. The scene has a size of 1000x1000x100 (units are not important, but you can call them miles or something), where '100' is the height of the tallest building. A very simple structure can help us to answer some basic questions about the dataset. Consider a grid of 100x100x10 cells, where each cell is a perfect cube with dimensions of 10x10x10 'miles' (or whatever). The world polygons are stored in these cells. Polygons that are partially in more than one cell, are stored in both, or splitted. Now, we want to render this scene. So, we determine in which cell the camera is (that's is rather easy). Now we also know which polygons are close to the camera. And when we have drawn these, we know where to find the next bunch of polygons. Of course, this does not tell us which polygons are visible or something, but knowing what polygons are where is a good thing. This approach has a couple of disadvantages. First, a cell can contain a lot of polygons, or we must generate a really detailed grid. Second, polygons are never evenly distributed. A city, for example, will not have a lot of polygons at greater altidudes, not every building is 10 miles tall. That means that some cubes are probably empty, while others contain hundreds of polygons. So, we might have a look at a better data structure.

The Octree

A better datastructure is the octree. It's almost as simple as a uniform grid, but it solves some of the shortcomings. Again, consider a scene with 100.000 polygons (the same city again). Now, place a huge cube around it, and call this cube the 'octree root node'. Now the interesting part: Split this cube in eight cubes. Divide the polygons of the parent cube (the root) over these cubes. When this is done, you have a uniform grid of 2x2x2 with all the world polygons. Now, count the polygons in each of these smaller cubes. If there are more than, say, 10 polygons in a cube, split it in 2x2x2 smaller cubes and divide the polygons over them. Stop splitting cubes when you reach a certain minimum number of polygons, or a certain minimum cube size. Now you have large cubes where there are few polygons, and smaller cubes where there are many polygons. But how do you determine in which 'subcube' (or 'node') the camera is?

Simple. Start at the root node. If you're left of the centre of this cube, you have just discarded four possible sub-cubes. Now check if you're above the centre. Finally, check if you're in front of or behind the centre of the node. Now you know in which 'child- node' you are. Determine the centre of this node, and repeat the process. You'll be in the right node in no-time. To be more precise: If you have 256 octree nodes, you can find the right one with just 8 tests (although that's the optimal case). There is another interesting thing that you can do with this structure: You can traverse the nodes in depth sorted order. With a very simple algorithm, you can visit each node in the correct order, first the one that the camera is in, then the one that is closest to the camera, and so on, until you visited every node. This is how to do it:

Start at the root node. This node probably has subnodes. Draw these subnodes in the correct order: If the camera position is on the left side of the centre of the node, first draw the four nodes on the left side, and then the four nodes on the right side. The four sub-nodes on both sides are processed similarly; if the camera is above the centre of the node, draw the top sub-nodes first, then the bottom ones. Finally, draw the top nodes first, if the camera is above the centre of the node. If a subnode has child nodes, repeat this process recursively, until you encounter a node without child nodes. Draw the polygons in this node. Drawing an octree this way is an important concept. If you didn't grasp the idea the first time, please take some time to read those few lines again. Also try to understand how this approach guarantees correct sorting of nodes.

Note that if a node contains more than a single polygon, you still need to sort the polygons in the nodes. This will not take much time, since there will be just a couple of polygons, but this will introduce the painter's algorithm sorting artifact. The painter's algorithm is rather simple: You determine the average depth of a polygon by averaging the depth (z) of it's vertices. Then you use this value to sort the polygons. This introduces a problem: A small polygon that is in front of a large polygon can still have a greater average distance to the camera than the larger polygon. Consider this classic example:

It's impossible to sort these polygons using their average depth. Of course, these are concave polygons, but similar examples are possible with three or more convex polygons (any case with cyclic overlap). These problems can be solved by using a BSP tree.

BSP Trees

If you got the basic idea behind building and traversing octrees, you'll have no problems with BSP trees. In a way, BSP trees are simpler than octrees: Instead of splitting space in equally sized cubes, they divide space in two parts. Like octrees, BSP trees start with a 'root' node, representing the entire world. This root node has child nodes, one for 'left', and one for 'right'. What's left and right is determined by a splitting plane (for three dimensions) or a splitting line (for two dimensions). So, if you wanted to store the room that you're in right now in a BSP tree, you could start with a plane that divides the room in a part behind you, and a part in front of you. These two subspaces are then processed separately: You could divide the space in front of you in a subspace below your desk and a subspace above your desk. In that case, your desk is the splitting plane of the node that represents the space in front of you. As you can see, dividing space can be done with quite arbitrary planes. For automatic BSP generation, you'll want to use something that is easier to determine, though. Most BSP compilers simply use the first polygon in the dataset for that. Consider this little room:

(It's my famous room-with-a-pilar again) This room could be splitted like this: The first polygon that we ecounter in the dataset is the north wall of the pillar (for some reason). So, the entire scene (root) gets splitted by this plane. That means that the north wall of the room is in front of the plane (or left, whatever you prefer), the south wall is on the backside of this plane, and so are the other walls of the pillar. The east and west walls of the room need to be splitted using the plane, to store them in the tree. The front side of the plane is now ready, it can't be splitted any further using the polygons in it. The back side however can be splitted by the east and west walls of the pillar. Once we have done that, our tree is ready.

Something that is not incredibly obvious from this example is this: If we would use this algorithm on the classic problematic cases of the painter's algorithm, the polygons would get split in such a way that sorting is no problem anymore. That is, if we also use the BSP tree to sort our polygons, like we did with the octree. How do we do that?

Very simple. Start at the root again. Check on which side of the splitting plane the camera is. Go to the node that represents that side (call it left, right, front or behind, it really doesn't matter) and handle that node just like the root. This way, you can quickly determine in which node the camera is. And if you encounter a node without child nodes (and thus without a splitting plane), you draw the polygons in it, and return to the parent node. In the parent node, you can now trace the other side of the plane in the same manner. This gives you a sorting-artifact free set of polygons, without any sorting. The only penalty for not having to sort is the larger number of polygons: Quite a few polygons will get splitted during the tree building process.

In case you want to know more: There's lots of info about BSP trees and other spatial subdivisions all over the web. For now, it's important that you know that these things exist, as they are important ingredients for good rendering engines. You gotta know the techniques, that's the only way to make your own special blend that will some day blow away all competition. Remember, 'Mixing is the key to success... Mixing...'

Article Series:

Building a 3D Portal Engine - Issue 01 - Introduction
Building a 3D Portal Engine - Issue 02 - Graphics Output Under Windows
Building a 3D Portal Engine - Issue 03 - 3D Matrix Math
Building a 3D Portal Engine - Issue 04 - Data Structures For 3D Graphics
Building a 3D Portal Engine - Issue 05 - Coding A Wireframe Cube
Building a 3D Portal Engine - Issue 06 - Hidden Surface Removal
Building a 3D Portal Engine - Issue 07 - 2D & 3D Clipping: Sutherland-Hodgeman
Building a 3D Portal Engine - Issue 08 - Polygon Filling
Building a 3D Portal Engine - Issue 09 - 2D Portal Rendering
Building a 3D Portal Engine - Issue 10 - Intermezzo - 8/15/16/32 Bit Color Mixing
Building a 3D Portal Engine - Issue 11 - 3D Portal Rendering
Building a 3D Portal Engine - Issue 12 - Collision Detection (Guest Writer)
Building a 3D Portal Engine - Issue 13 - More Portal Features
Building a 3D Portal Engine - Issue 14 - 3D Engine Architecture
Building a 3D Portal Engine - Issue 15 - Space Partitioning, Octrees, And BSPs
Building a 3D Portal Engine - Issue 16 - More On Portals
Building a 3D Portal Engine - Issue 17 - End Of Transmission