This text is Copyright (c) 1999 Sami Vaaraniemi. Send comments to vaaranie@cc.helsinki.fi. This page was created using the Webford page development software. Last updated 21 February 1999.
Click HERE to download the accompanying C++ source.
This text is not a COM tutorial. Knowing the basics of COM is a good idea before reading this text, since I'm not going to explain in detail basic things such as IDL and IUnknown. If you are totally unfamiliar with these topics, go fishing on the net and read an article or two, or read the first few chapters of the book "Essential COM" by Don Box (btw I was a bit disappointed to see he had cut his hair). On the other hand, if you are a seasoned C++ veteran you will probably be able to follow the text and perhaps gain some understanding of COM along the way.
I can imagine many uses for COM in game programming, but in this text I'm going to focus solely on using COM for making a game engine extendible (for an example of how to structure a 3D engine using COM, check out Jorrit Tyberghein's Crystal Space at http://crystal.linuxgames.com). Furthermore, I'm going to consider only in-process COM and not delve into the vast territories of out-of-process COM. Instead of going heavy on theory, we'll dive straight into practice and see how COM can be used for writing extendible software in C++. I will pay special attention to the one issue that game programmers are obsessed with, namely efficiency.
Programmers usually presuppose that using COM automatically introduces lots of overhead. There are all kinds of rules and restrictions on what kind of parameters a COM method can take; the COM marshalling layer always copies parameter data and transforms it into some kind of wire representation; context switches and/or inter-process communication cause inevitable delays. The fact is that all these overheads can be avoided when using in-process COM. The other side of the in-process coin is that once we make the decision, we're pretty much stuck with it: there is no easy way to convert in-process-only interfaces into remotable interfaces. But in this article we'll acknowledge the fact and sacrifice everything in the name of efficiency.
We'll take it as our goal to replace the C++ class-based monster system in OO for Game Programmers with a COM-based implementation. The target is a system in which new monster types can be added to the game without touching the engine code, and without recompiling or relinking it. Adding new monster types will be possible even while the engine is running. All this we will achieve with only a minor performance hit over the C++ version that we already have. In this text I'm going to outline what the code looks like on the engine side, and I'm going to discuss relevant implementation issues as we run into them. I'm not going to get into the implementation side of the monster COM class in this article.
How would we achieve the above requirements in a non-COM world? Most likely we'd write the engine so that it could load monster implementations from DLLs. We'd define a set of functions and/or classes to be exported from the DLL. The engine would locate and load the DLLs at run-time, load the function pointers from the DLLs, and call the functions to let them do their job. We'd get the job done - but it would be tedious, messy and error prone. The first chapter of Don Box’s "Essential COM" has a nice explanation of the intricate issues you will run into with plain non-COM DLLs.
Welcome To The COMhood
I feel I have to say a couple of words about what COM is and what it is not, since there are all kinds of strange claims about it floating around in the game programming newsgroups. COM is not a class library. Neither is it a programming language or an API. COM stands for Component Object Model; it is a specification of how independently written software components can interoperate.
I will take one more side-step right now, and it's the last one for a while, I promise. I want to point out that COM provides an infrastructure for doing extendible software, much like C++ provides an infrastructure for writing polymorphic code. But as Frederick Brooks would put it, COM does not attack the essence of the difficulties in software development. COM tells us how to write both the client and the server sides of interfaces - once we have the interfaces designed. Designing the interfaces in the first place still requires intellectual effort and is as difficult as it ever was.
On to the topic then. When our engine loads a COM object in-process, it turns into something that is in many ways similar to an ordinary C++ object. It looks and feels the same and the performance of a COM method call is the same as a C++ virtual member function call. As a matter of fact, when a COM object is in-process, a COM method call *is* an ordinary C++ virtual function call. There is no marshalling, no stubs, no proxies or any other hideous things in between the caller and the COM object. We'll verify this later by taking a look at the assembler code that does a COM call.
Despite all the similarity, an in-process COM object and a C++ object do have some important differences as well. First, COM objects are always accessed via pointers to interfaces. Interface pointers are essentially ordinary C++ pointers to an abstract class. This means that one cannot access the data members of a COM object directly, which has a concrete implication for us that we'll tackle in a moment. Second, COM methods cannot be inlined. The reason is self-evident: inlining means that the body of a function is inserted at the call site at compile time. With COM objects, the body of a function simply is not available at compile time.
What it all amounts to is that implementing a class as a COM class means that whenever you access it you will pay the price of a C++ virtual function call. We have already established in OO for Game Programmers that this is not an overwhelming price to pay. All other implications of COMness can be pretty much worked around in one way or another if needed. For instance, instead of using the system provided CoCreateInstance for instantiating COM objects, we can implement the instantiation in a custom interface of our class factory in a way that is more suitable for our purposes. All in all, we don't want to do COM calls in critical loops, but implementing the Monsters as COM classes will result in no visible performance hit over the C++ implementation.
Assume we are going to write one in-process COM class called CGrunt that implements an interface we will define. The implementation will be compiled into a single DLL that is plugged into the engine at run-time. One monster type per DLL is the simplest configuration; it would be fairly trivial to implement multiple different monster types in one DLL.
Interface
Let's begin by writing down the interface for the monster class. We'll do that in IDL, although we could just write the interface as a C/C++ header file (just like d3d.h does) since we're only going to do in-process COM. The monster interface, just like all other COM interfaces, derives from the IUnknown interface. The purpose of the IUnknown interface is to provide a uniform way of managing the lifetime of COM objects (the AddRef and Release methods) and of asking objects questions like "are you a monster?" (the QueryInterface method).
interface IMonster : IUnknown
{
    HRESULT DoAI();
    HRESULT Render();
};
Since this is a toy example the IMonster methods take no parameters. I left out some IDL stuff that is just implementation detail and not relevant to the discussion. According to COM rules all methods should return an HRESULT, which is a 32-bit integer. The reason for this is that the COM remoting infrastructure needs a way to return RPC errors to the caller. Our IMonster interface will never be remoted, so in principle we could ignore that rule and return whatever we want from the methods, and everything would work just fine. But even for interfaces that are never called across a process boundary it is recommended that HRESULTs are used as method return values, simply for the sake of a consistent programming model.
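For readers who have not seen what such an interface looks like from the C++ side, here is a minimal, portable sketch. It deliberately does not use the real COM headers: `HRESULT`, `S_OK`, `E_NOINTERFACE`, the integer interface ids and the `Grunt` class are simplified stand-ins assumed for illustration (real COM identifies interfaces with GUIDs and declares the methods `__stdcall`).

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the Windows definitions (assumptions for this sketch).
using HRESULT = std::int32_t;
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_NOINTERFACE = static_cast<HRESULT>(0x80004002u);
enum InterfaceId { IID_IUnknown, IID_IMonster };   // real COM uses 128-bit GUIDs

// An interface is just an abstract class: pure virtual methods, no data members.
struct IUnknown {
    virtual HRESULT QueryInterface(InterfaceId iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

struct IMonster : IUnknown {
    virtual HRESULT DoAI() = 0;
    virtual HRESULT Render() = 0;
};

// Hypothetical implementation; in the article it would live in the plug-in DLL.
class Grunt final : public IMonster {
    unsigned long refs_ = 1;                      // the creator holds the first reference
public:
    HRESULT QueryInterface(InterfaceId iid, void** ppv) override {
        if (iid == IID_IUnknown || iid == IID_IMonster) {
            *ppv = static_cast<IMonster*>(this);  // "yes, I am a monster"
            AddRef();
            return S_OK;
        }
        *ppv = nullptr;
        return E_NOINTERFACE;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;               // the object manages its own lifetime
        return left;
    }
    HRESULT DoAI() override { return S_OK; }
    HRESULT Render() override { return S_OK; }
};

// Create, query, use and release; returns the count reported by the first Release().
unsigned long DemoRoundTrip() {
    IMonster* monster = new Grunt;                              // count is 1
    void* raw = nullptr;
    HRESULT hr = monster->QueryInterface(IID_IMonster, &raw);   // count is 2
    assert(hr == S_OK);
    (void)hr;
    IMonster* again = static_cast<IMonster*>(raw);
    again->DoAI();
    unsigned long left = again->Release();                      // count back to 1
    monster->Release();                                         // count 0, object deletes itself
    return left;
}
```

The client code only ever sees `IMonster*`; nothing in it depends on `Grunt` beyond the one line that creates the object, which is the line CoCreateInstance replaces in the real engine.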
Into The Implementation: The Engine Side
The changes in the engine side will be rather small which reflects the fact that COM objects are much like C++ objects from the client's point of view.
From the fact that a COM object's data members can't be accessed directly it follows that we can't store pointers to COM objects in an intrusive list like we did in our engine with the C++ classes. To solve this we could simply convert to using non-intrusive containers such as the STL list, but in order not to introduce yet another new player to the game, we'll work around the problem without STL (STL is another great topic for an article; luckily I don't have to write it since there is already one in Amit's Game Programming Information page in the C versus C++ section). Recall the code in the game main loop that loops over the monster instances in an intrusive list and calls doAI() for each object:
Monster* pMonster = MonsterList->head;
while (pMonster)
{
    pMonster->doAI();
    pMonster = pMonster->next;
}
Note that in real life we'd use a doubly linked list in order to be able to add and remove monsters in constant time.
In order to be able to use COM objects as monsters, I will accept my fate and pay the price of adding one level of indirection to the code above. Let us have the following small structure:
struct MonsterListItem
{
    IMonster* item;
    MonsterListItem* next;
};
Instead of storing the interface pointers directly into the list, we'll have a list with items of type MonsterListItem. The loop code now looks like this:
MonsterListItem* pListItem = MonsterList->head;
while (pListItem)
{
    pListItem->item->DoAI();
    pListItem = pListItem->next;
}
A similar change will be made in all places where we loop over monster instances, e.g., in the rendering loop.
We had to give up one pointer dereference in the bargain, and the memory space for one additional pointer per monster instance. In return, the game core code has now been adapted to a system that can be extended with new monster types after the engine has been finalised. The code could be made cleaner by hiding the extra pointer dereference inside an operator->() which we'd declare inline, but for now we won't bother since it doesn't change the fact that the extra step is there.
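As a rough sketch of the operator->() idea: since the loop walks raw `MonsterListItem*` pointers, the overloaded operator has to live on a small wrapper object. The `MonsterCursor` class, the trimmed `IMonster` and the `CountingMonster` test double below are hypothetical names introduced for this example, not part of the engine.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;           // stand-in for the Windows typedef
constexpr HRESULT S_OK = 0;

struct IMonster {                       // trimmed stand-in: IUnknown methods omitted
    virtual HRESULT DoAI() = 0;
};

struct MonsterListItem {
    IMonster* item;
    MonsterListItem* next;
};

// Hypothetical cursor that hides the item-> hop behind an inline operator->().
class MonsterCursor {
    MonsterListItem* node_;
public:
    explicit MonsterCursor(MonsterListItem* head) : node_(head) {}
    explicit operator bool() const { return node_ != nullptr; }
    IMonster* operator->() const { return node_->item; }   // the extra dereference
    void Next() { node_ = node_->next; }
};

// Test double that counts how often the engine asked it to think.
struct CountingMonster final : IMonster {
    int calls = 0;
    HRESULT DoAI() override { ++calls; return S_OK; }
};

// The main-loop fragment, written with the cursor; returns the number of monsters visited.
int RunAI(MonsterListItem* head) {
    int visited = 0;
    for (MonsterCursor it(head); it; it.Next()) {
        it->DoAI();                     // reads like the old intrusive-list code
        ++visited;
    }
    return visited;
}
```

The wrapper only changes how the loop reads. The generated code still loads `item` before it can load the vptr, so the cost discussed above is unchanged.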
What's left to do is to change the way monsters are instantiated and destroyed, and to consider the reference counting strategy.
Spawning Monsters
Instantiating monsters is different from the way it is done in C++. For the sake of brevity, we'll just call CoCreateInstance whenever we want to instantiate a monster. CoCreateInstance does all the dirty work in locating and loading the implementation DLL and getting the interface pointer to the COM monster.
The idea is that the monsters to be spawned are specified in the level data along with other data such as the BSP tree. In the COM-based version, each monster entry in the level will contain the class id (CLSID) of the COM class that implements the monster and probably other data such as the initial location. The level loader part of the engine extracts the class id, instantiates the monster and inserts it into the global list of monsters.
// error handling omitted
IMonster* InstantiateMonster(const CLSID& clsid)
{
    IMonster* pMonster = NULL;
    HRESULT hr = ::CoCreateInstance(
        clsid,                 // e.g., CLSID_Grunt
        NULL,                  // no aggregation
        CLSCTX_INPROC_SERVER,  // in-process!
        IID_IMonster,          // type of returned interface
        reinterpret_cast<void**>(&pMonster));
    return pMonster;
}

bool LoadLevel(Stream* level)
{
    Token token = NextToken(level);
    if (token == Token_Monster)
    {
        // extract the class id from the level data
        CLSID clsid = ExtractCLSID(level);
        // instantiate the COM object that implements the monster
        IMonster* pMonster = InstantiateMonster(clsid);
        // add the new monster to the global linked list
        MonsterListItem* pItem = new MonsterListItem;
        pItem->item = pMonster;
        pItem->next = MonsterList->head;
        MonsterList->head = pItem;
    }
    // etc
    return true;
}
CoCreateInstance does a lot of work for us but it is rather inflexible. For instance, in order to pass constructor parameters such as the initial position of the monster, we would implement a custom interface, say IMonsterFactory, in the monster's class factory. Instead of calling CoCreateInstance, we'd obtain an interface to the class factory and do the instantiation through the custom interface.
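To make the factory idea a bit more concrete, here is a portable sketch of what such a custom interface might look like. Everything in it is an assumption made for illustration: the `CreateMonster` method, the `Vec3` type, the trimmed `IMonster`, and the `GruntFactory` that is simply constructed locally. In a real engine the factory pointer would come from the monster DLL's class object (for example via CoGetClassObject) rather than from a local variable.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;           // stand-in for the Windows typedef
constexpr HRESULT S_OK = 0;

struct Vec3 { float x, y, z; };

struct IMonster {                       // trimmed stand-in for the real interface
    virtual unsigned long Release() = 0;
    virtual HRESULT GetPosition(Vec3* out) = 0;
};

// Hypothetical custom class-factory interface taking "constructor parameters".
struct IMonsterFactory {
    virtual HRESULT CreateMonster(const Vec3* start, IMonster** ppMonster) = 0;
};

class Grunt final : public IMonster {
    unsigned long refs_ = 1;            // the caller of CreateMonster owns this reference
    Vec3 pos_;
public:
    explicit Grunt(const Vec3& start) : pos_(start) {}
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    HRESULT GetPosition(Vec3* out) override { *out = pos_; return S_OK; }
};

class GruntFactory final : public IMonsterFactory {
public:
    HRESULT CreateMonster(const Vec3* start, IMonster** ppMonster) override {
        *ppMonster = new Grunt(*start); // the position is known at construction time
        return S_OK;
    }
};

// Spawns a monster at a given position and reads the position back.
float DemoSpawnX() {
    GruntFactory factory;               // assumption: stands in for the DLL's class object
    IMonsterFactory* pFactory = &factory;
    Vec3 start = { 3.0f, 0.0f, 7.0f };
    IMonster* pMonster = nullptr;
    pFactory->CreateMonster(&start, &pMonster);
    Vec3 where = { 0.0f, 0.0f, 0.0f };
    pMonster->GetPosition(&where);
    pMonster->Release();
    return where.x;
}
```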
Counting References
Let's move on from the light into the dark and dismal land of reference counting. COM objects manage their lifetime by counting each reference a client has on them. Doing this properly requires the client to take action every time a reference is copied, and frankly, it is a nuisance. Then again, I can't think of any reasonable alternative to reference counting that wouldn't impose major run-time overhead.
Basically the situation is not much different from that with ordinary C++ pointers. You dynamically allocate an object, have multiple pointers to it and take care that no-one uses a pointer after the object has been deallocated. With COM it is just easier to make a mistake since every reference is counted separately. The rule that will guide us out of the shadows is this: think about your reference counting strategy for a while, adopt a rule and stick to it.
There are two basic strategies. Either we do the reference counting as the book tells us to: when a pointer is copied, call AddRef(), and when the copied pointer is no longer used, call Release(). This way the COM object knows exactly how many references clients have to it, and knows to deallocate itself when the reference count goes to zero. Or we can adopt a whole different strategy: call AddRef() and Release() only once per object, when the interface pointer is first obtained and when it is finally given up, and nowhere else. One can live with the first strategy by letting ATL smart COM pointers do the dirty work.
We are not going to get into ATL in this article, so we’ll adopt the second strategy that is based on the following reasoning: "If the life-time of an interface pointer p2 is fully embedded within the lifetime of an interface pointer p1, then when copying p1 to p2 there is no need to AddRef(), and when p2 is not used any more there is no need to Release()". We make this rule into a law and follow it everywhere in the engine.
In practice the rule yields the following. The list in the main game loop owns the monster interface pointers: each entry holds exactly one counted reference, acquired when the monster is added to the list and given up with Release() when it is removed. In other places the same rule as with C++ pointers applies: no-one else owns the pointers, and no-one else should assume they do. We can copy interface pointers without AddRefs and Releases and use them as long as the object stays alive in the main list. The whole reference counting issue reduces to being essentially identical to having ordinary pointers to C++ objects.
When a monster is instantiated the first AddRef() is automatically done by CoCreateInstance. In other words, when we get our hands on an IMonster, its reference count is already one, so there's no need for us to increment it again. Later, when the engine notices that a monster has been annihilated by the player, we'll remove the entry from the global list of monsters and call Release() on the interface pointer.
All the pieces required on the engine side are now in place. As promised, I'll now take a look at the different kinds of overhead COM imposes on us.
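The ownership rule can be sketched in portable C++ as follows. The helper names `AddMonster` and `RemoveMonster`, the `g_liveMonsters` counter and the trimmed `IMonster` are assumptions made for the example; `new Grunt` stands in for the reference that CoCreateInstance hands out.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;           // stand-in for the Windows typedef
constexpr HRESULT S_OK = 0;

struct IMonster {                       // trimmed stand-in for the real interface
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual HRESULT DoAI() = 0;
};

int g_liveMonsters = 0;                 // instrumentation for the example only

class Grunt final : public IMonster {
    unsigned long refs_ = 1;            // creation hands out the first reference
public:
    Grunt() { ++g_liveMonsters; }
    ~Grunt() { --g_liveMonsters; }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    HRESULT DoAI() override { return S_OK; }
};

struct MonsterListItem { IMonster* item; MonsterListItem* next; };
struct MonsterList { MonsterListItem* head = nullptr; };

// The list takes over the reference obtained at creation: no extra AddRef().
void AddMonster(MonsterList& list, IMonster* fresh) {
    list.head = new MonsterListItem{ fresh, list.head };
}

// Unlinks the entry and gives up the list's reference; the only Release() in the engine.
void RemoveMonster(MonsterList& list, IMonster* dead) {
    for (MonsterListItem** link = &list.head; *link; link = &(*link)->next) {
        if ((*link)->item == dead) {
            MonsterListItem* gone = *link;
            *link = gone->next;
            gone->item->Release();
            delete gone;
            return;
        }
    }
}

// Returns the number of live monsters after one spawn / use / kill cycle.
int DemoLifetime() {
    MonsterList list;
    AddMonster(list, new Grunt);
    IMonster* borrowed = list.head->item;   // copied without AddRef(): lifetime is nested
    borrowed->DoAI();
    assert(g_liveMonsters == 1);
    RemoveMonster(list, borrowed);          // borrowed must not be used after this line
    return g_liveMonsters;
}
```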
How Efficient Is an In-Process COM Method Call?
Let's take a microscopic look at what really happens when we do an in-process COM method call. In order to remove all extra moving parts from the specimen, I added one more method to the IMonster interface, shamelessly breaking the guidelines with a void return type:
interface IMonster : IUnknown
{
    HRESULT DoAI();
    HRESULT Render();
    void TestMethod();
};
Method TestMethod() is the sixth method in the interface (three methods come from IUnknown). Here's the assembler code that Microsoft Visual C++ 5.0 generates for the COM method call pMonster->TestMethod() (compiled in debug mode):
pMonster->TestMethod();
1: mov ecx,dword ptr [pMonster]
2: mov edx,dword ptr [ecx]
3: mov eax,dword ptr [pMonster]
4: push eax
5: call dword ptr [edx+14h]
Lines 1, 2 and 5 are essentially identical to the code generated for an ordinary C++ virtual function call, as you may recall from OO for Game Programmers (if you don't know what vptr and vtbl are, then check that article out). Line 1 moves the value of the pMonster pointer into register ecx. Line 2 moves the value pointed to by ecx into edx; that value is the first 32 bits of the object behind the interface pointer, in other words the vptr, which points to the vtbl of the interface. On line 5 the actual call takes place. Since edx contains the address of vtbl[0] and each entry is four bytes wide, edx+14h is the address of vtbl[5], the sixth entry in the vtbl array. As expected, the sixth entry is the pointer to the TestMethod method.
Lines 3 and 4 implement the passing of the hidden 'this' parameter to the function. This is actually the only real difference from an ordinary C++ virtual function call. If you look at the code generated for an ordinary virtual function call, you'll notice that MSVC++ passes the 'this' pointer to the function in the ecx register instead of on the stack. This is an optimization that MSVC++ can do since it also generates the code for the virtual function body, which is then aware of 'this' coming in ecx. In the case of a COM method call, the function body may have been implemented using another compiler which is not aware of this optimization. Therefore the virtual function implementing the method is declared with the __stdcall directive, which instructs the compiler to use a standard stack frame for the call and pass 'this' on the stack.
By stepping through the call in the debugger I verified that it ends up directly in the body of TestMethod in the COM class. So in practice an in-process COM method call is a C++ virtual function call, no more and no less. If this were a real DCOM call going across a process boundary, the call would end up in the COM proxy that marshals the parameters and does a remote method call.
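The slot arithmetic can be reproduced without a disassembler by writing the vtbl out by hand as a struct of function pointers. This is only an illustrative model: the names `IMonsterVtbl`, `MonsterObject` and `Grunt_TestMethod` are made up for the example, and the real vtbl is generated by the C++ compiler rather than declared like this.

```cpp
#include <cassert>
#include <cstddef>

struct MonsterObject;   // the object whose first member is the vptr

// The vtbl written out by hand: one function pointer per interface method.
struct IMonsterVtbl {
    long (*QueryInterface)(MonsterObject*, int iid, void** ppv);  // slot 0
    unsigned long (*AddRef)(MonsterObject*);                      // slot 1
    unsigned long (*Release)(MonsterObject*);                     // slot 2
    long (*DoAI)(MonsterObject*);                                 // slot 3
    long (*Render)(MonsterObject*);                               // slot 4
    void (*TestMethod)(MonsterObject*);                           // slot 5
};

struct MonsterObject {
    const IMonsterVtbl* vptr;   // first word of the object, what line 2 loads into edx
    int testCalls;
};

static void Grunt_TestMethod(MonsterObject* self) { ++self->testCalls; }

// Only the slot exercised below is filled in.
static const IMonsterVtbl kGruntVtbl = {
    nullptr, nullptr, nullptr, nullptr, nullptr, Grunt_TestMethod
};

// Index of TestMethod in the table, computed from its byte offset.
std::size_t TestMethodSlot() {
    return offsetof(IMonsterVtbl, TestMethod) / sizeof(void (*)());
}

// The call sequence from the listing, spelled out in C++.
int DemoCall() {
    MonsterObject grunt = { &kGruntVtbl, 0 };
    MonsterObject* pMonster = &grunt;
    pMonster->vptr->TestMethod(pMonster);   // load vptr, pass 'this' explicitly, call slot 5
    return pMonster->testCalls;
}
```

With the four-byte pointers of a 32-bit build, slot 5 sits at byte offset 5 * 4 = 20 = 14h, which matches the `[edx+14h]` operand in the listing.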
Passing Data
So far our toy example has ignored the fact that in real life you will need to pass data between the monster implementations and the engine. At one extreme, we can add access methods such as GetPosition() for all the monster data that the engine might be interested in. It is also very likely that the monster implementation will need to call back into the engine to obtain some data, such as the player position. To implement that, we could define an interface called IGameEngine that somebody on the engine side implements, and we’d give a pointer to that interface to the monster. The monster implementation can then call the methods of IGameEngine when it needs to know something.
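A minimal sketch of such a callback arrangement, with portable stand-ins instead of the real COM headers. The method names `GetPlayerPosition` and `SetEngine`, the `Vec3` type and the one-unit-per-frame chase logic are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;           // stand-in for the Windows typedef
constexpr HRESULT S_OK = 0;

struct Vec3 { float x, y, z; };

// Implemented on the engine side and handed to every monster.
struct IGameEngine {
    virtual HRESULT GetPlayerPosition(Vec3* out) = 0;
};

struct IMonster {                       // trimmed stand-in: IUnknown methods omitted
    virtual HRESULT SetEngine(IGameEngine* engine) = 0;
    virtual HRESULT DoAI() = 0;
    virtual HRESULT GetPosition(Vec3* out) = 0;   // access method used by the engine
};

class Engine final : public IGameEngine {
    Vec3 player_ = { 5.0f, 0.0f, 0.0f };
public:
    HRESULT GetPlayerPosition(Vec3* out) override { *out = player_; return S_OK; }
};

class Grunt final : public IMonster {
    IGameEngine* engine_ = nullptr;     // borrowed: the engine outlives its monsters
    Vec3 pos_ = { 0.0f, 0.0f, 0.0f };
public:
    HRESULT SetEngine(IGameEngine* engine) override { engine_ = engine; return S_OK; }
    HRESULT DoAI() override {
        Vec3 player = { 0.0f, 0.0f, 0.0f };
        engine_->GetPlayerPosition(&player);      // call back into the engine
        if (player.x > pos_.x) pos_.x += 1.0f;    // step one unit toward the player
        else if (player.x < pos_.x) pos_.x -= 1.0f;
        return S_OK;
    }
    HRESULT GetPosition(Vec3* out) override { *out = pos_; return S_OK; }
};

// Runs the AI for a number of frames and returns the monster's x coordinate.
float DemoChase(int frames) {
    Engine engine;
    Grunt grunt;
    IMonster* pMonster = &grunt;
    pMonster->SetEngine(&engine);
    for (int i = 0; i < frames; ++i) pMonster->DoAI();
    Vec3 where = { 0.0f, 0.0f, 0.0f };
    pMonster->GetPosition(&where);
    return where.x;
}
```

Every `DoAI()` here costs one extra virtual call into the engine plus a copy of the position.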
The downside of using access methods is of course that we would pay the price of a virtual function call every time a piece of data is needed. To avoid that we can pass pointers to larger chunks of data to the other party. E.g., the DoAI method can take as a parameter a pointer to a StateData structure which is defined along with the interface in the IDL file:
typedef struct StateData_
{
    float px, py, pz;   // player position
    float vx, vy, vz;   // player velocity
    float deltaTime;    // ticks per this frame
    // etc...
} StateData;
On the engine side we can have a global StateData variable that the engine updates each frame. A pointer to that variable can be passed to DoAI. Note that there is no extra copying taking place when passing pointers to in-process COM methods. This means that using pointers to structs is an efficient way to pass larger chunks of data in to a method (and also out). It also means that if the DoAI implementation modifies the data then the rest of the monsters will also see the modified data. This may be an issue for some applications, but our engine can trust the monster implementations not to modify the data. It would also be possible to pass data types such as an HWND to a COM method and it would work with no problems.
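Here is a small portable sketch of the pointer-passing variant. The `const` qualifier on the parameter is my own suggestion for expressing the "do not modify" agreement in the signature, and the `lastSeen` / `lastPlayerX` members exist only so the example can show that the monster received the engine's own variable rather than a copy.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;           // stand-in for the Windows typedef
constexpr HRESULT S_OK = 0;

struct StateData {
    float px, py, pz;   // player position
    float vx, vy, vz;   // player velocity
    float deltaTime;    // ticks per this frame
};

struct IMonster {                       // trimmed stand-in: IUnknown methods omitted
    virtual HRESULT DoAI(const StateData* state) = 0;
};

StateData g_state = {};                 // engine-side global, updated once per frame

class Grunt final : public IMonster {
public:
    const StateData* lastSeen = nullptr;   // instrumentation for the example only
    float lastPlayerX = 0.0f;
    HRESULT DoAI(const StateData* state) override {
        lastSeen = state;               // the very address the engine passed in
        lastPlayerX = state->px;        // plain memory read, no virtual call, no copy
        return S_OK;
    }
};

// True if the monster saw the engine's own variable and its current contents.
bool DemoNoCopy() {
    Grunt grunt;
    IMonster* pMonster = &grunt;
    g_state.px = 42.0f;
    g_state.deltaTime = 0.016f;
    pMonster->DoAI(&g_state);
    return grunt.lastSeen == &g_state && grunt.lastPlayerX == 42.0f;
}
```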
Note that parameter types used in an interface are part of the interface, and thus, just like COM interfaces in general, they must never change after the interface has been published. If there is a need to change a parameter type, we would have to create a new version of the interface using it, say IMonster2, and define a new structure for it.
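A sketch of how the engine might cope with both versions at once, again with portable stand-ins. The trimmed `StateData`, the extra `gravity` field, the `DoAI2` method and the integer interface ids are all assumptions made for the example; real COM would use GUIDs and the real headers.

```cpp
#include <cassert>
#include <cstdint>

using HRESULT = std::int32_t;           // stand-in for the Windows typedef
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_NOINTERFACE = static_cast<HRESULT>(0x80004002u);
enum InterfaceId { IID_IUnknown, IID_IMonster, IID_IMonster2 };

struct StateData  { float px, py, pz; float deltaTime; };                 // published, frozen
struct StateData2 { float px, py, pz; float deltaTime; float gravity; };  // new layout, new name

struct IUnknown {
    virtual HRESULT QueryInterface(InterfaceId iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct IMonster : IUnknown { virtual HRESULT DoAI(const StateData* s) = 0; };
struct IMonster2 : IMonster { virtual HRESULT DoAI2(const StateData2* s) = 0; };

// The test objects live on the stack, so AddRef/Release only count and never delete.
class OldGrunt final : public IMonster {
    unsigned long refs_ = 1;
public:
    HRESULT QueryInterface(InterfaceId iid, void** ppv) override {
        if (iid == IID_IUnknown || iid == IID_IMonster) {
            *ppv = static_cast<IMonster*>(this); AddRef(); return S_OK;
        }
        *ppv = nullptr;
        return E_NOINTERFACE;           // knows nothing about IMonster2
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override { return --refs_; }
    HRESULT DoAI(const StateData*) override { return S_OK; }
};

class NewGrunt final : public IMonster2 {
    unsigned long refs_ = 1;
public:
    HRESULT QueryInterface(InterfaceId iid, void** ppv) override {
        if (iid == IID_IMonster2) {
            *ppv = static_cast<IMonster2*>(this); AddRef(); return S_OK;
        }
        if (iid == IID_IUnknown || iid == IID_IMonster) {
            *ppv = static_cast<IMonster*>(this); AddRef(); return S_OK;
        }
        *ppv = nullptr;
        return E_NOINTERFACE;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override { return --refs_; }
    HRESULT DoAI(const StateData*) override { return S_OK; }
    HRESULT DoAI2(const StateData2*) override { return S_OK; }
};

// Uses the newest interface the monster supports; returns the version that was called.
int RunAI(IMonster* pMonster) {
    void* raw = nullptr;
    if (pMonster->QueryInterface(IID_IMonster2, &raw) == S_OK) {
        IMonster2* p2 = static_cast<IMonster2*>(raw);
        StateData2 s2 = { 0.0f, 0.0f, 0.0f, 0.016f, 9.81f };
        p2->DoAI2(&s2);
        p2->Release();                  // balance the AddRef done by QueryInterface
        return 2;
    }
    StateData s1 = { 0.0f, 0.0f, 0.0f, 0.016f };
    pMonster->DoAI(&s1);
    return 1;
}
```

Old monster DLLs keep working untouched because `IMonster` and `StateData` never change; only monsters that answer the `IMonster2` query are given the new structure.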
It may seem strange that we can expose data with a pointer to a structure on the other side of the interface boundary (at least to me it does). After all, COM is supposed to be compiler independent, so how can a component built with one compiler access a structure laid out by another compiler? That would require the compilers to agree on the runtime representation of a structure. The answer is that COM indeed requires the compilers to agree on the runtime representation of structs. In some cases this may require special compiler switches, but in practice the representation varies little within a single platform anyway.
It is also possible to expose internal data to the other party directly, with or without using a structure that is part of an interface. But doing this breaks encapsulation since it makes a component aware of the internal details of another component. It is really a trade-off between efficiency and encapsulation. It would be better to favor encapsulation and avoid using COM in time-critical places.
Unfortunately real-life does not always fit well into the ideal of full encapsulation. Suppose we had a physics module in the engine side that computes and applies impulses to monsters based on collisions. That module would need access to potentially lots of data describing the internal state of the monster object. In order to avoid virtual function calls and/or copying data, one might decide to expose the state data to the physics module directly, essentially making the internal data part of an interface and promising not to change it ever.
Conclusion
In-process COM can be used for providing extendibility in an efficient way in a game engine. A COM method call is every bit as efficient as a C++ virtual function call. By using COM the programmer does not have to worry about the details of locating and loading DLLs but can leave such details to the system. The only real overhead over a C++ class-based version follows from the fact that using COM automatically brings better encapsulation. However, in-process COM does not prevent you from trading encapsulation for efficiency if absolutely necessary.
In all fairness I have to say that despite the conceptual simplicity of the COM-based monster implementation there are some pesky practical details to deal with. The COM components need to be registered in the user's system before they are visible to the engine. This is of course a job for the installer; likewise the uninstaller has to take care to unregister the components. On the other hand, the registration process makes the system flexible. When somebody implements a level, new monsters can be delivered along with the level in one package to the end user. A new component can be registered even while the engine is running and the engine will be able to load it when it sees the component’s class id.
I'm pretty sure that the appearance of games in which the actual game is implemented as a set of COM components that are plugged into an engine is just a matter of time. Actually I’d be surprised if there is no game company doing it already. In addition to monster types, we could just as well implement the game logic in a plugin component. In my mind there is only one thing that can prevent this from happening. The question is whether an engine is long-lived enough for it to pay off to write it so that it is open to extension. Writing reusable code always requires a greater effort than writing code for a special case. The fact that many engines today support extension through a scripting language proves that there are cases in which writing reusable code pays off. While extendibility via scripting is nice since even the end user can do the extension, scripting has shortcomings in efficiency and flexibility. Extension via COM will be more flexible and efficient; the downside is that extending will no longer be possible for an end user without proper tools and programming knowledge.