Object-Oriented Programming for Game Programmers

This text is Copyright (c) 1999 Sami Vaaraniemi. Send comments to vaaranie@cc.helsinki.fi. This page was created using the Webford page development software. Last updated 3 Feb 1999.

Every once in a while the debate whether applying OO and C++ brings any benefit to game programmers pops up in the newsgroups. Opponents say that applying OO results in slow code, it is too complicated and does not bring any real benefit. Proponents have varying arguments ranging from making the code simpler to resulting in easier maintenance.

I wrote this text hoping to sort out some common misconceptions regarding OO and C++ in relation to game programming. I'm going to concentrate on one specific aspect of OO, namely single inheritance and polymorphism. In this text I will not bother you with other, albeit important, aspects of OO such as encapsulation. My point of view is that using C++ and a subset of the features it provides can really benefit a game programmer, without implying any kind of visible performance hit. I will try to convert you to that view as well by showing some sample code both in C and in C++. I have tried to keep the samples simplistic just to make the point and not get swamped with details.

A First-person Shooter in C
Suppose we want to implement a Quake-like first-person shooter in which there is a number of different monster types. We'll begin by defining a structure for the monsters:

#define MONSTER_GRUNT 0
#define MONSTER_OGRE 1
#define MONSTER_KNIGHT 2

typedef struct Monster_
{
        int type; 
        int strength;
        int intelligence;
        int armor;
        struct Monster_* next;
} Monster;

When a Monster instance is created at run-time, the type-field is initialized with one of the MONSTER_XXX values to identify the type of the monster. The newly created Monster structure is then inserted into a global linked list of monsters. Somewhere in our main game loop there is code that loops over the monsters and calls DoAI routine that makes the monster behave in a type-specific way:

/* type-specific AI routines implemented somewhere else */
void DoGruntAI(Monster* m);
void DoOgreAI(Monster* m);
void DoKnightAI(Monster* m);

void DoAI(Monster* m)
{
        switch(m->type)
        {
                case MONSTER_GRUNT:
                        DoGruntAI(m);
                        break;

                case MONSTER_OGRE:
                        DoOgreAI(m);
                        break;

                case MONSTER_KNIGHT:
                        DoKnightAI(m);
                        break;

                default:
                        /* not reached */
                        assert(FALSE);
                        break;
        }
}

/* somewhere in main game loop */
Monster* m = MonsterList->head;
while (m != NULL)
{
        DoAI(m);
        m = m->next;
}

Once we have implemented the type-specific AI routines, we'll test it and see that it works just fine. Grunts stomp around looking for somebody to shoot at and Ogres are as aggressive and stupid as ever. The default branch with the assert in the switch/case statement gives us a nagging feeling that everything is not quite right, but we decide to stick with this implementation for now.

More Trouble
Later, we add more monster types to our game. First we'll modify the defines section:

#define MONSTER_GRUNT 0
#define MONSTER_OGRE 1
#define MONSTER_KNIGHT 2
#define MONSTER_WIZARD 3
#define MONSTER_CRAZYNAZI 4

Then we'll implement the type-specific AI routines and finally modify the switch/case statement in DoAI to call these new routines. As we'll be adding more monster types every now and then, we notice that the task of modifying the switch/case statement every time is somewhat tedious and error prone. It is easy to forget to add a case to some monster, and that would be discovered only later by testing when the assert in the default branch fires. When adding monsters in some late-night programming session, there is a risk that we make mistakes - e.g., by forgetting the break-statement from one of the new branches. The resulting monster AI would probably be the computer equivalent of schizophrenic behaviour. Also, it is likely that there are other places in our code in which there is an almost identical switch/case statement calling some other type-specific routines. For instance, the rendering loop uses similar switch to draw the monster graphics based on the type of the monster. All in all, we're increasingly unsatisfied with the situation.

One often-used "improvement" to the situation is to store monsters to separate lists based on their type. Instead of doing a switch/case per monster instance, you loop over multiple lists:

/* somewhere in main game loop */
Monster* m = GruntList->head;
while (m != NULL)
{
        DoGruntAI(m);
        m = m->next;
}

m = OgreList->head;
while (m != NULL)
{
        DoOgreAI(m);
        m = m->next;
}
// etc... for each monster type

This solution is not very appealing since in this case we'll have to add a new list and write the code to loop over it when new monster types are added. Moreover, we would have to do this for the rendering loop as well. We'd prefer a solution that allows adding monster types without having to touch the core code in the main game loop.

Function Pointers
Still sticking to C, we decide to use function pointers to save us from the switch/case trouble. There are different ways to do it; after some thought we decide to modify the Monster structure by adding pointers to type-specific functions:

typedef void (*AIFunction)(Monster*);
typedef void (*RenderFunction)(Monster*, Device*);

typedef struct Monster_
{
        AIFunction aifunc;
        RenderFunction renderfunc;

        int strength;
        int intelligence;
        int armor;
        Monster* next;
} Monster;

The DoAI routine becomes a whole lot shorter:

void DoAI(Monster* m)
{
        assert(m->aifunc != NULL);
        if (m->aifunc != NULL)
                (m->aifunc)(m);
}

Everything works nicely as long as we ensure all Monster instances are initialized properly with the correct function pointers. You may wonder why there is a double-guard against NULL pointer in DoAI. The assert is used to catch incorrectly initialized Monsters as early as possible when compiling in debug mode. The second check is an example of defensive programming. In release version the assert will be #ifdef'd out. If the unthinkable happens and m->aifunc is NULL (and somehow, someday, it will) in the release version that we ship, the second check prevents the game from crashing on the gamer. It is a matter of another debate whether this kind of double-checking is worth doing. I personally think it is.

Final Touch
Back to our design. We're still not quite satisfied with it since there is now redundancy: every Monster structure instance stores the pointers separately thus causing memory space overhead. What we really want is to store the function pointers once per Monster type. Back in our sketchboard, we decide to have a separate structure for the function pointers, and a pointer to that structure in the Monster structure:

typedef struct FunctionTable_
{
        AIFunction aifunc;
        RenderFunction renderfunc;
} FunctionTable;

typedef struct Monster_
{
        FunctionTable* fptr;

        int strength;
        int intelligence;
        int armor;
        Monster* next;
} Monster;

This time the design is just perfect. The start-up code will initialize a global array of FunctionTable structures, in which we have an entry for each monster type:

FunctionTable gFunctionTable[MAX_NUMBER_OF_MONSTER_TYPES];

int main()
{
  gFunctionTable[MONSTER_GRUNT].aifunc = &DoGruntAI;
  gFunctionTable[MONSTER_GRUNT].renderfunc = &RenderGrunt;
  gFunctionTable[MONSTER_OGRE].aifunc = &DoOgreAI;
  gFunctionTable[MONSTER_OGRE].renderfunc = &RenderOgre;
  /* etc... */
}

When a monster is created at run-time, the fptr member is initialized as follows:

Monster* AllocGrunt()
{
        Monster* m = (Monster*)malloc(sizeof(Monster));
        m->fptr = &gFunctionTable[MONSTER_GRUNT];
        return m;
}

Function DoAI now looks like this:

void DoAI(Monster* m)
{
        assert(m->fptr != NULL);
        assert(m->fptr->aifunc != NULL);
        if (m->fptr && m->fptr->aifunc)
                (m->fptr->aifunc)(m);
}

The rendering function is essentially identical to DoAI. If you are feeling lucky, you may decide to go without the double-check against NULL pointers.

C Summary
Let's summarize what we have achieved so far. When new monster types are added, the global FunctionTable array needs to be updated with a new FunctionTable structure which is initialized with function pointers to the type-specific function implementations. When a Monster instance is created at run-time, we have to ensure its fptr field is set to point to the correct entry in the global FunctionTable array. The major benefit is that there is no need for us to touch the core code, DoAI and other similar functions. The code is now more maintainable since it is less likely that we break something when new monster types are added to the game.

Game programmers are obsessed with the efficiency of their code. So what about efficiency in our case? Instead of a switch/case statement in DoAI, with different branch calling different function, there is now one pointer dereference, a lookup into the FunctionTable structure, and an indirect function call (and a couple of checks against NULL but we don't count them here). With no compiler optimizations enabled, the switch/case corresponds to a linear search over all cases, and a subsequent function call. When comparing worst case performance, our latest version with a lookup is certainly better. A good optimizing compiler will probably generate a table lookup out of the switch/case statement. In that case, the performance of our latest version is about the same.

Speaking of efficiency, there is another point to note. Every time I profile my DirectX game code, I find that it is the calls to DirectX rendering functions that eat up most of the processor time (this is on a non-hardware-accelerated platform). The first function that does something other than rendering is a function that transforms vectors - and it takes less than 1 % of the overall processor time. The point is this: there is no point in worrying about a few cycles in places where it simply does not matter.

Enter OO
So when will OOP enter the stage? As a matter of fact, OOP - or more specifically polymorphism - already did enter the stage when we got rid of the switch/case statement in the DoAI function. In essence, we have re-engineered the C++ implementation of polymorphism: the virtual funtion call mechanism. This is an example of OO programming using a non-OO language. You can do it, but you will have to do it manually. A language that supports OO constructs, such as C++, gives us the stuff we just implemented for free.

Let's see how we'd achieve the same in C++. First, declare the Monster types as classes:

class Monster
{
public:
        virtual void doAI() = 0;
        virtual void render(Device*) = 0;

        int strength;
        int intelligence;
        int armor;
        Monster* next;
};

class Grunt : public Monster
{
public:
        void doAI();
        void render(Device*);
};

class Ogre : public Monster
{
public:
        void doAI();
        void render(Device*);
};
// etc...

The DoAI function looks similar to the one we ended up with in the C version:

void DoAI(Monster* m)
{
        m->doAI();
}

As a matter of fact, since the DoAI function is now so simple, we can get rid of it alltogether without cluttering the main loop with checks or switches:

/* somewhere in main game loop */
Monster* m = MonsterList->head;
while (m != NULL)
{
        m->doAI();
        m = m->next;
}

The C++ virtual function mechanism ensures that the doAI call ends up in the correct doAI implementation which depends on the actual type of the monster. And that's basically it. No need to manually set up function pointers. No need to guard against NULL function pointers. And most importantly, no need to touch the core code or DoAI function as new Monster classes are added.

How does the efficiency of the C++ version compare to the C version? As I already said, in the places where we use the virtual function calls (twice for each monster per frame) it probably does not matter. In places where it really matters - e.g., in your texture mapper's inner loop - you should not call virtual functions. In short, in places where you can afford a switch/case statement, you can afford a virtual function call.

Still, for the sake of the discussion, let's see how C++ compilers typically implement virtual functions. It turns out that the implementation is very similar to our C implementation of polymorphism. In most compilers, each object has a pointer called vptr ("virtual table pointer") that corresponds to our fptr pointer. For each class, there is an array of function pointers called vtbl ("virtual table") that corresponds to our FunctionTable struct. At run-time, when an object is constructed, the vptr pointer in the object is set to point to the vtbl of its class, exactly like we do in AllocGrunt function. A virtual function call means following the vptr pointer of the object to the vtbl and indexing it to find the appropriate function.

To really concretize the discussion let's dive deep down and take a look at how Microsoft Visual C++ 5.0 implements a virtual function call at the assembly language level. The call pGrunt->doAI() generates code like this:

pGrunt->doAI();
1: mov         eax,dword ptr [pGrunt]
2: mov         edx,dword ptr [eax]
3: mov         ecx,dword ptr [pGrunt]
4: call        dword ptr [edx]

On line 1, the value of the pointer pGrunt is moved to register eax. On line 2 the 32 bit value pointed to by eax is moved to edx. That value is the first 32 bits of the Grunt structure - in other words, it is the value of the vptr pointer. The actual call takes place on line 4 in which the expression [edx] evaluates to the first entry of the vtbl array. The meaning of line 3 may not be immediately obvious: it stores the value of the pGrunt pointer to ecx. What purpose does this serve? The answer is that's how the 'this' pointer gets passed to doAI (in situations in which the function takes multiple parameters 'this' may be passed to the function on the stack). Prior to the actual call, the value for the 'this' pointer is stored to ecx, and subsequent references to member variables in doAI() will use the value in ecx as the 'this' pointer.

The example above was simple because it called the first virtual function of class Grunt. Here's what happens when we call pGrunt->render() which is the second virtual function of that class (for the sake of simplicity, I removed the Device* parameter):

pGrunt->render();
1: mov         eax,dword ptr [pGrunt]
2: mov         edx,dword ptr [eax]
3: mov         ecx,dword ptr [pGrunt]
4: call        dword ptr [edx+4]

As you see, lines 1, 2 and 3 are identical to the previous example. The difference is on line 4. Instead of taking the value of the first entry in the vtbl array, [edx+4] is the value of the second entry in the array, which is the pointer to the function render().

In addition to the minor run-time cost of a virtual function call, there is memory space cost since every object with virtual functions will have the vptr pointer. For most of the time this is not an issue, but in some cases it may be. For instance, if you have a simple Point structure with two 16-bit ints for storing a point in 2D space, then adding one virtual function to the structure will double its size: two 16-bit ints plus a 32 bit vptr. After that the compiler will not be able to put a Point structure into a 32 bit register which will affect performance if you use it in a critical loop.

Conclusion
Instead of simply asking "do OO features of C++ add overhead?", you should compare procedural and OO approaches to a problem and make your decision based on that. There are situations in which procedural programming style with ordinary functions will do just fine. In other situations, like in our example above, applying OOP will result in code that is as efficient as the procedural version. In addition, the C++ version using OOP was smaller, cleaner and more maintaineable.

I hope that by now I have convinced at least some people that C++ and its OO features can benefit a game programmer without sacrificing performance. Virtual member functions are not the silver bullet but they can be used for solving certain kinds of problems in a manner that results in clean, efficient and maintainable code. You can think of them as automatic switch/case statements that your friendly OO compiler supports by doing the dirty work for you. But before you rush out and start using C++ for your next project, let me have a couple of words of warning.

C++ is a complex language. The advanced features of C++ come with a large number of gotchas for the programmer to get caught by. Don't try to use every single feature of C++ right from the beginning. Mastering C++ and programming in general simply is not possible in 21 days, no matter what they say.

Finally, don't believe single-mindedly anything you read on the web. That includes this text. Fire up your profiler and profile, profile, step through the code in the debugger, and then do your own decision based on hard facts.

Ps. If you are also interested in COM, then read another article I wrote: COM for Game Programmers.

This page has been accessed times. 1