flipcode - Pixel Shader 2.0 Example

Pixel Shader 2.0 Example - A Simple Example of Using Pixel Shader Version 2.0
by (07 January 2003)

Introduction

The release of Microsoft's DirectX 9 enables you to write programmable shaders using version 2.0 (ps 2.0) of the pixel shader instruction set. Version 2.0 of DirectX pixel shaders gives us many more arithmetic instructions, more registers, and a larger number of instructions per program than did version 1.4 (ps 1.4).

All of these added features make it worthwhile to learn about ps 2.0, but because it is still on the leading edge of pixel shader programming, there aren't many examples or tutorials describing how to write a ps 2.0 shader. As far as I know, there are currently no examples of using ps 2.0 in a program in the DirectX 9 SDK documentation. I'm confident that Microsoft will produce examples and documentation for using ps 2.0, but in the meantime, here's a small example to get you started. This tutorial discusses a simple pixel shader that just performs a texture lookup. It is intended as a base for you to write your own, more impressive shaders.


; A simple pixel shader
; This uses the ps 2.0 instruction set and registers
ps_2_0
; 
; Declare the s0 register to be the sampler for stage 0
dcl_2d s0


; Declare t0 to have 2D texture coordinates 
; from stage 0. These are the interpolated 
; texture coordinates.
dcl t0.xy


; Sample the texture at stage 0 into the r1 temporary register
texld r1, t0, s0


; move r1 to the output register
mov oC0, r1

The tutorial assumes that you are familiar with DirectX version 8 or 9. It also assumes that you already know how to initialize DirectX, create a device object, create and populate a vertex buffer, load a texture from a file, and render a scene. If you aren't familiar with these operations, you might want to look at the complete source code accompanying this tutorial. All of these steps are also very well described in the DirectX 9 SDK documentation under "Tutorial 1: Creating a Device", "Tutorial 2: Rendering Vertices" and "Tutorial 5: Using Texture Maps".

To get started, you will need the Microsoft DirectX 9.0 runtime and SDK. At the time of this writing, they are available at:
http://msdn.microsoft.com/library/default.asp?url=/downloads/list/directx.asp

In a C++ program, start by creating the Direct3D object and Direct3DDevice object.


#include <d3d9.h>  
#include <d3dx9.h> // Needed for D3DXCreateTextureFromFile 

IDirect3D9* lpD3D = Direct3DCreate9(D3D_SDK_VERSION);

// Make sure you do all of your error checking.  I've left
// it out for the sake of readability.

One problem that you are now faced with is whether to create a HAL device or a REF device. A HAL (Hardware Abstraction Layer) device represents your graphics hardware, so it's usually nice and fast. A REF (the reference rasterizer) device is a Direct3DDevice9 interface implemented in software, so it's full-featured but slow. The problem with creating a HAL device is that very few graphics cards on the market today support ps 2.0. As far as I know, only the ATI RADEON 9700 currently supports ps 2.0 in hardware.

You can check to see if your graphics card supports ps 2.0 by using IDirect3D9::GetDeviceCaps.


D3DCAPS9 hal_caps;
ZeroMemory(&hal_caps, sizeof(D3DCAPS9)); // Can't hurt to zero the struct.

HRESULT hRes = lpD3D->GetDeviceCaps(D3DADAPTER_DEFAULT,
                                    D3DDEVTYPE_HAL,
                                    &hal_caps);

You can use the D3DSHADER_VERSION_MAJOR and D3DSHADER_VERSION_MINOR macros to determine the pixel shader version supported by your HAL.


UINT ps_major = D3DSHADER_VERSION_MAJOR(hal_caps.PixelShaderVersion);
UINT ps_minor = D3DSHADER_VERSION_MINOR(hal_caps.PixelShaderVersion);

// If all goes well, ps_major should be 2, and ps_minor 
// should be 0.

You can also use D3DPS_VERSION to compare the pixel shader version that your HAL supports against the version that you need.


if (hal_caps.PixelShaderVersion < D3DPS_VERSION(2, 0)) 
{
        // We'll have to use the REF device or 
        // settle for a lower ps version number. 
}

Since this tutorial is concerned with using ps 2.0, we'll use the REF device if ps 2.0 isn't supported by the HAL device. Using the REF device probably isn't a good overall development strategy, but it will suffice for the purposes of this tutorial.
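If you're curious how the version check and the fallback fit together, here is a minimal sketch that doesn't need d3d9.h. It mirrors the bit packing used by D3DPS_VERSION and the two D3DSHADER_VERSION macros, and then picks a device type. The names DeviceType, ps_version, version_major, version_minor and choose_device_type are my own stand-ins for illustration, not SDK identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins so the logic can be read without d3d9.h.
// They play the role of D3DDEVTYPE_HAL and D3DDEVTYPE_REF.
enum class DeviceType { Hal, Ref };

// Same packing as D3DPS_VERSION(major, minor): 0xFFFF in the high word marks
// a pixel shader version, then one byte each for the major and minor number.
inline std::uint32_t ps_version(std::uint32_t major, std::uint32_t minor) {
    assert(major <= 0xFFu && minor <= 0xFFu);
    return 0xFFFF0000u | (major << 8) | minor;
}

// Same unpacking as D3DSHADER_VERSION_MAJOR and D3DSHADER_VERSION_MINOR.
inline std::uint32_t version_major(std::uint32_t v) { return (v >> 8) & 0xFFu; }
inline std::uint32_t version_minor(std::uint32_t v) { return v & 0xFFu; }

// Fall back to the reference rasterizer when the HAL reports less than ps 2.0.
inline DeviceType choose_device_type(std::uint32_t hal_ps_version) {
    return hal_ps_version >= ps_version(2, 0) ? DeviceType::Hal : DeviceType::Ref;
}
```

In a real program you would pass hal_caps.PixelShaderVersion to a function like choose_device_type and hand the matching D3DDEVTYPE value to IDirect3D9::CreateDevice.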

You now have to create the device in the usual way, using IDirect3D9::CreateDevice. Also create a vertex buffer, set the stream source and the FVF (flexible vertex format). The source code that accompanies this tutorial uses a simple quad and the vertex format D3DFVF_XYZRHW|D3DFVF_DIFFUSE|D3DFVF_TEX1. The only required piece of this vertex format is the D3DFVF_TEX1. There must be at least one set of texture coordinates stored with each vertex or texture mapping won't work quite right.
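As a rough illustration of that vertex format, here is one way the quad's vertices could be laid out in memory. This is a sketch under my own assumptions, not the struct from the accompanying source: the names QuadVertex and make_quad are hypothetical, and std::uint32_t stands in for the SDK's DWORD so that the snippet compiles without the DirectX headers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical vertex matching D3DFVF_XYZRHW | D3DFVF_DIFFUSE | D3DFVF_TEX1.
// Members follow the order the FVF implies: position, color, texture coords.
struct QuadVertex {
    float x, y, z, rhw;     // D3DFVF_XYZRHW: already-transformed screen position
    std::uint32_t diffuse;  // D3DFVF_DIFFUSE: packed ARGB color
    float u, v;             // D3DFVF_TEX1: one set of 2D texture coordinates
};

// 4 floats + 1 packed color + 2 floats. This is the stride for SetStreamSource.
static_assert(sizeof(QuadVertex) == 28, "unexpected padding in QuadVertex");

// Builds a screen-space quad ordered for a triangle strip:
// top-left, top-right, bottom-left, bottom-right.
inline std::array<QuadVertex, 4> make_quad(float left, float top,
                                           float right, float bottom) {
    assert(left < right && top < bottom);
    const std::uint32_t white = 0xFFFFFFFFu;
    return {{
        {left,  top,    0.5f, 1.0f, white, 0.0f, 0.0f},
        {right, top,    0.5f, 1.0f, white, 1.0f, 0.0f},
        {left,  bottom, 0.5f, 1.0f, white, 0.0f, 1.0f},
        {right, bottom, 0.5f, 1.0f, white, 1.0f, 1.0f},
    }};
}
```

The u and v members are the values that show up, interpolated, in the t0 register of the pixel shader.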

The next important thing you have to do before rendering your scene is create a test texture. I recommend using D3DXCreateTextureFromFile. (Note: to use the D3DX functions, you have to add d3dx9.lib to the list of libraries that you link with, which in turn requires you to link with advapi32.lib.) You can use D3DXCreateTextureFromFile to load a texture from common image file formats like BMP and JPEG.


IDirect3DTexture9* lpTexture = NULL;
hRes = D3DXCreateTextureFromFile(lpDevice, // your IDirect3DDevice9* 
                                 "filename.jpg", 
                                 &lpTexture);


// Make sure you check your return codes.

The next step is to set the texture in texture stage 0. You can use any of the texture stages that are supported by your device, but this example assumes that the texture is in stage 0.


hRes = lpDevice->SetTexture(0, lpTexture);

We're finished with the lpTexture interface, so we can go ahead and release it. This way, it will get cleaned up when we release the device or set another texture to stage 0.


lpTexture->Release();
lpTexture = NULL;
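If releasing a texture that is still in use looks suspicious, a toy model of COM reference counting may help. The sketch below assumes the behavior described above, namely that the device keeps its own reference to whatever is set on a stage. MockTexture and MockDevice are hypothetical stand-ins and not Direct3D types.

```cpp
#include <cassert>

// Toy reference-counted object. The creator holds the first reference.
struct MockTexture {
    int refs = 1;
    bool* destroyed;  // set to true when the last reference goes away
    explicit MockTexture(bool* flag) : destroyed(flag) {}
    void AddRef() { ++refs; }
    void Release() {
        assert(refs > 0);
        if (--refs == 0) { *destroyed = true; delete this; }
    }
};

// Toy device with a single texture stage.
struct MockDevice {
    MockTexture* stage0 = nullptr;
    // Take a reference to the new texture, drop the one to the old texture.
    void SetTexture(MockTexture* tex) {
        if (tex) tex->AddRef();
        if (stage0) stage0->Release();
        stage0 = tex;
    }
};

// Walks through the same sequence as the tutorial and reports what happened.
inline bool released_texture_outlives_caller() {
    bool destroyed = false;
    MockDevice device;
    MockTexture* tex = new MockTexture(&destroyed);  // refs == 1
    device.SetTexture(tex);                          // refs == 2
    tex->Release();                                  // refs == 1, still alive
    const bool alive_while_bound = !destroyed;
    device.SetTexture(nullptr);                      // refs == 0, destroyed
    return alive_while_bound && destroyed;
}
```

The caller's Release only gives up the caller's reference. In this model the object goes away when the device lets go of it as well.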

Now we're ready to dive into ps 2.0 code. In a separate file, using your favorite text editor, you can edit your pixel shader in the DX9 ps 2.0 assembly language. The C++ source code that accompanies this note assumes that the pixel shader program is in a separate file named "simple_texture_map.ps".

As with previous versions of the DirectX pixel shader language, all pixel shaders need to start with a ps statement. For ps 2.0, the statement is ps_2_0. Note that the semicolon is used to denote a comment in the pixel shader assembly language.


; Here's the first statement in a pixel shader 
ps_2_0

The next step in the pixel shader is to declare the s0 register. This means telling the shader that we're going to use s0 (a texture sampler input register) to sample the texture set at stage 0. Recall that the texture was set at stage 0 in the C++ program using IDirect3DDevice9::SetTexture.


; Declare the sampler register s0 as a 2D texture map
dcl_2d s0

There are also dcl_ statements for cube maps and volume maps: dcl_cube and dcl_volume, respectively. The sampler registers are new in ps 2.0. In ps 1.4, there was no need to declare that an input register was going to be used to sample from a particular texture stage.

The next step in the pixel shader is to declare t0, a texture coordinate input register, as holding 2D interpolated texture coordinates. This declaration is a new requirement in ps 2.0. In ps 1.4, the input texture coordinate registers just held the interpolated texture coordinates automatically. It's also important to point out that ps 2.0, like ps 1.4, refers to the texture data and the texture coordinates in separate registers. In ps 2.0, the texture data sampler is referred to by an s# register, and the interpolated texture coordinates are typically in a t# register, where '#' is some digit identifying the specific register.


; Declare the t0 register as having the interpolated texture
; coordinates belonging to the texture being sampled from stage 0.
dcl t0.xy

The .xy modifier is used to indicate that the texture coordinates only have two components.

Next, we actually sample the texture in stage 0 using the texture coordinates from t0.


; Use the texture coordinates in t0 and the texture sampler in s0 to
; sample the texture in stage 0. This samples the texture and loads
; the result into the temporary register r1.  
texld r1, t0, s0

You can pick any of the temporary registers as the target of the texld instruction; there's nothing special about r1.

In ps 2.0, r0 isn't the output of your shader program like it was in previous pixel shader versions. Every pixel shader written using ps 2.0 must write to the output register oC0 or your pixel shader will not assemble successfully. I assume that oC0 means "output color 0". There are other output registers, but all pixel shaders must at least write to oC0.

So the last step in this simple pixel shader is:


; move r1 to the output register
mov oC0, r1

This simply moves the color that is sampled from the texture to the output register. You might wonder why the program doesn't simply use oC0 as the target of the texld instruction. The reason is that one of the rules of ps 2.0 is that oC0 can only be written to using a mov instruction.
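To make the data flow of the finished shader concrete, here is a small CPU-side imitation of it for a single pixel. It is only a sketch: I'm assuming point filtering and clamped addressing, whereas the real behavior depends on the sampler state set on the device. Float4, Texture2D, texld and simple_texture_map are names I made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Float4 { float x, y, z, w; };  // one four-component shader register

// Hypothetical CPU-side texture: row-major RGBA texels.
struct Texture2D {
    int width;
    int height;
    std::vector<Float4> texels;
};

// Rough analogue of "texld r1, t0, s0": look up the texel under (u, v).
// Assumes point filtering and clamps coordinates that fall outside [0, 1].
inline Float4 texld(const Texture2D& s0, float u, float v) {
    assert(s0.width > 0 && s0.height > 0);
    assert(s0.texels.size() ==
           static_cast<std::size_t>(s0.width) * static_cast<std::size_t>(s0.height));
    const int x = std::min(std::max(static_cast<int>(u * s0.width), 0), s0.width - 1);
    const int y = std::min(std::max(static_cast<int>(v * s0.height), 0), s0.height - 1);
    return s0.texels[static_cast<std::size_t>(y) * s0.width + x];
}

// The whole shader for one pixel: sample into a temporary, then move the
// temporary to the output color.
inline Float4 simple_texture_map(const Texture2D& s0, float t0_x, float t0_y) {
    const Float4 r1 = texld(s0, t0_x, t0_y);  // texld r1, t0, s0
    const Float4 oC0 = r1;                    // mov oC0, r1
    return oC0;
}
```

The hardware runs this once for every pixel the quad covers, with t0 holding the texture coordinates interpolated from the vertices.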

The next step is back in the C++ program. We have to assemble the pixel shader file and tell the device to use that pixel shader rather than the shader from the fixed function pipeline.


// Assemble and set the pixel shader 
IDirect3DPixelShader9* lpPixelShader = NULL;
LPD3DXBUFFER pCode = NULL;
LPD3DXBUFFER pErrorMsgs = NULL;

hRes = D3DXAssembleShaderFromFile("simple_texture_map.ps",
                                  NULL,         // no preprocessor defines
                                  NULL,         // no include handler
                                  0,            // no flags
                                  &pCode,
                                  &pErrorMsgs);

The pErrorMsgs parameter for D3DXAssembleShaderFromFile is optional, but I strongly encourage you to use this parameter since it's a very valuable aid in debugging your pixel shader program. The error messages from the assembler are useful, and you'll see them if you insert this block of code after the call to D3DXAssembleShaderFromFile:


if (FAILED(hRes) && (pErrorMsgs != NULL))
{
    // The buffer holds the assembler's messages as text.
    const char* message = (const char*)pErrorMsgs->GetBufferPointer();
    
    // Other error handling here. 
}

In the debugger, you can put a breakpoint on the call to pErrorMsgs->GetBufferPointer() to see what went wrong with shader assembly. Obviously, this will only tell you about errors in assembling the shader, not runtime errors.
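If you would rather log the messages than inspect them in the debugger, one option is to copy the buffer into a string first. The helper below is a hedged sketch: buffer_to_string is a hypothetical name, and it takes a raw pointer and a size so that it compiles without D3DX.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy a message buffer into a std::string.
// Stops at the first terminator and never reads past the given size.
inline std::string buffer_to_string(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);  // a null buffer must be empty
    if (data == nullptr || size == 0) return std::string();
    const char* text = static_cast<const char*>(data);
    std::size_t len = 0;
    while (len < size && text[len] != '\0') ++len;
    return std::string(text, len);
}
```

With D3DX you would call it as buffer_to_string(pErrorMsgs->GetBufferPointer(), pErrorMsgs->GetBufferSize()) and then write the result to your log or to OutputDebugString.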

After the shader has been successfully assembled, you can set the shader using IDirect3DDevice9::CreatePixelShader and IDirect3DDevice9::SetPixelShader.


hRes = lpDevice->CreatePixelShader((DWORD*)pCode->GetBufferPointer(), 
                                   &lpPixelShader);
if (FAILED(hRes)) return false;

hRes = lpDevice->SetPixelShader(lpPixelShader);
if (FAILED(hRes)) return false;

We're not going to refer to the pixel shader interface any further, so releasing it at this point is a good idea.

If you want to just make sure that the pixel shader is working without the added complexity of textures, you can just write a constant value to the output register. For example, a pixel shader can write "red" to each pixel. This isn't very interesting, but it can help you to make sure that shader assembly, creation, and execution are working.

Here's a simple pixel shader that always outputs a constant color:


ps_2_0
def c0, 1.0, 0.0, 0.0, 1.0
mov oC0, c0

The def instruction just sets the value of a constant register. It's equivalent to calling IDirect3DDevice9::SetPixelShaderConstantF from the C++ program.
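To show what "setting a constant register" amounts to, here is a small CPU-side model of the float constant registers. It is a sketch under my own assumptions: ConstantFile, set_constant_f and constant_color_shader are hypothetical names, and the only thing taken from the real API is the parameter shape of SetPixelShaderConstantF (a start register, a flat float array, and a count of four-float vectors).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Float4 { float x, y, z, w; };  // one four-component shader register

// Hypothetical model of the float constant registers c0 through c31.
struct ConstantFile {
    std::array<Float4, 32> c{};
};

// Same parameter shape as IDirect3DDevice9::SetPixelShaderConstantF.
// Each register consumes four consecutive floats from the input array.
inline void set_constant_f(ConstantFile& file, std::size_t start,
                           const float* data, std::size_t vec4_count) {
    assert(data != nullptr);
    assert(start + vec4_count <= file.c.size());
    for (std::size_t i = 0; i < vec4_count; ++i) {
        file.c[start + i] = Float4{data[4 * i], data[4 * i + 1],
                                   data[4 * i + 2], data[4 * i + 3]};
    }
}

// CPU analogue of "mov oC0, c0".
inline Float4 constant_color_shader(const ConstantFile& file) {
    return file.c[0];
}
```

Passing the array {1.0f, 0.0f, 0.0f, 1.0f} for register 0 in this model corresponds to the def c0 line in the shader above.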

That's it. While the shader discussed here isn't a particularly complex one, I hope it will be a benefit to you if you're getting started writing ps 2.0 shaders.

The full source code for this tutorial can be found in the files simple_texture_map.cpp and simple_texture_map.ps. It assumes that you have a .jpg file named test.jpg in the same directory.

Download: article_ps2_tutorial.zip (26k)

Ben would like to thank Morgan McGuire and Kathleen Tibbetts for their helpful comments on this tutorial.
