Implementation Notes
I won't describe the implementation in detail here, but you can look
in an AI book or at another web page that has a description. See the
references section for a list of books.
Essentially, there is a set of nodes (map locations, each with a cost) called
OPEN and another set called CLOSED. Each time through the main loop,
you remove the best element from OPEN (where "best" means "the one
with the lowest cost"), move it to CLOSED, and look at its neighbors.
You then put any unvisited neighbors into the OPEN set. The cost of a
node is the sum of the actual cost of walking from the start to that
node and the heuristic estimate of the cost from that node to the goal.
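As a rough sketch (with hypothetical Location, neighbors, cost, and
heuristic helpers; this is not the code from path.cpp, and it re-inserts
improved nodes rather than adjusting them in place), the main loop might
look like:

```cpp
#include <cstddef>
#include <queue>
#include <unordered_map>
#include <vector>

struct Location { int x, y; };
bool operator==(Location a, Location b) { return a.x == b.x && a.y == b.y; }
struct LocationHash {
    std::size_t operator()(Location l) const {
        return std::size_t(l.x) * 73856093u ^ std::size_t(l.y) * 19349663u;
    }
};

// Hypothetical helpers, supplied by your map representation:
std::vector<Location> neighbors(Location l);  // adjacent map locations
float cost(Location a, Location b);           // actual cost of one step
float heuristic(Location a, Location goal);   // estimated cost to the goal

struct Entry { float f; Location loc; };
struct EntryGreater {
    bool operator()(const Entry& a, const Entry& b) const { return a.f > b.f; }
};

void a_star(Location start, Location goal) {
    std::unordered_map<Location, float, LocationHash> g;  // best g so far
    std::priority_queue<Entry, std::vector<Entry>, EntryGreater> open;

    g[start] = 0.0f;
    open.push({heuristic(start, goal), start});
    while (!open.empty()) {
        Location current = open.top().loc;  // best = lowest f = g + h
        open.pop();
        if (current == goal) break;
        for (Location next : neighbors(current)) {
            float new_g = g[current] + cost(current, next);
            auto it = g.find(next);
            if (it == g.end() || new_g < it->second) {  // unvisited or improved
                g[next] = new_g;
                open.push({new_g + heuristic(next, goal), next});
            }
        }
    }
    // A full implementation also records each node's parent so the path
    // can be walked back once the goal is reached.
}
```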
My own C++ A* code is available: path.cpp and path.h. There is also an older (slower, but easier to
understand) version of my code, as well as many other implementations
of A*, on Steve Woodcock's
Games AI Page.
A lot of people don't know why their A* implementation is so slow, and
ask on rec.games.programmer or comp.ai.games. The most common reasons
are a heuristic that doesn't match the actual movement costs and a slow
data structure for the OPEN set.
For example, on a grid map, you should have a heuristic that
reflects the actual movement costs along normal directions and
diagonals. If your game doesn't allow diagonal movement, you
probably want abs(dx)+abs(dy). If it does, you probably want
max(abs(dx),abs(dy)).
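In code, those two heuristics might look like this (assuming diagonal
steps cost the same as straight ones, which is what the max form
implies):

```cpp
#include <algorithm>  // std::max
#include <cstdlib>    // std::abs

// Four-directional movement: Manhattan distance.
int heuristic_4way(int dx, int dy) { return std::abs(dx) + std::abs(dy); }

// Eight-directional movement with diagonals costing the same as
// straight steps: Chebyshev distance.
int heuristic_8way(int dx, int dy) { return std::max(std::abs(dx), std::abs(dy)); }
```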
Matching scales this way is counterintuitive -- it seems to make the
most sense to have a "true" distance heuristic. A*, however, compares
g values (actual movement costs) to h values (heuristic estimates).
If the two don't match, or if they aren't at the same scale, A* may
give bad paths or may search too much.
The problem with such heuristics is that they give you paths that look
bad, even if they aren't really bad! To remedy this, add a small
(around 1% of the heuristic) penalty when the node isn't near the
straight-line path from the start to the goal. Also see the Heuristics
section of this document for more information.
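One way to implement such a penalty (a hypothetical sketch; the exact
form matters less than the effect) is to measure how far a node strays
from the straight start-to-goal line with a cross product, and add a
fraction of that to the heuristic:

```cpp
#include <cstdlib>  // std::abs

// Cross product of (node -> goal) and (start -> goal): zero on the
// straight line, growing as the node strays from it. Scaling by ~0.1%
// keeps the penalty around 1% of a typical heuristic value.
float penalized_heuristic(int x, int y, int sx, int sy, int gx, int gy, float h) {
    int dx1 = x - gx, dy1 = y - gy;    // node -> goal
    int dx2 = sx - gx, dy2 = sy - gy;  // start -> goal
    int cross = std::abs(dx1 * dy2 - dx2 * dy1);
    return h + cross * 0.001f;
}
```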
What's the first thing you'll think of using for these two sets? If
you're like me, you probably thought "array". You may have thought
"linked list", too. The problem is that finding the best node out of
an array or list is slow. Lists and arrays are very general.
Usually, more specific data structures can outperform general data
structures. For A*, we do not need the flexibility of a list (fast
insertion and deletion anywhere), nor do we need the flexibility of an
array (fast access to any element). Instead, we want two operations:
insert and delete best. (Actually, we need a third;
see the note below.)
For these operations, what you really should have is a priority queue. These
can be implemented in a variety of ways, including balanced binary
trees, skip lists, and splay trees. My favorite is the heap.
Heaps are easy to implement, and they are stored in an array, which is
quite efficient. I use the STL's efficient heap operations in my code,
so I get speed without having to write and test a heap data structure
myself. An example set of numbers: if you have
1,000 nodes in your OPEN set, then inserting into and removing from a
sorted list takes an average of 500 comparisons and 2 moves; for an
unsorted list it takes 1000 comparisons and 2 moves; for a sorted
array it takes 500 comparisons and 500 moves; for a heap it takes 10
comparisons and 10 moves. That's approximately a factor of
100 faster. However, these are worst case values. More
realistically, with 1000 nodes in the set, you should expect heaps to
be 25% faster than linked lists, and for 1500 nodes in the set, you
should expect heaps to be 50% faster. The more nodes you have, the
faster the heap will be relative to an array or list.
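For example, with the STL heap functions the OPEN set can live in an
ordinary vector (a minimal sketch, assuming a hypothetical Node type
that carries its f value):

```cpp
#include <algorithm>  // std::push_heap, std::pop_heap
#include <vector>

struct Node { float f; int x, y; };

// std::push_heap/std::pop_heap build a max-heap by default, so compare
// with > to keep the lowest-f node at the front.
bool node_greater(const Node& a, const Node& b) { return a.f > b.f; }

std::vector<Node> open;

void insert(const Node& n) {
    open.push_back(n);                                      // ~log N comparisons
    std::push_heap(open.begin(), open.end(), node_greater);
}

Node delete_best() {
    std::pop_heap(open.begin(), open.end(), node_greater);  // best moves to back
    Node best = open.back();
    open.pop_back();
    return best;
}
```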
It is important to keep in mind that we are not merely looking for
asymptotic ("big O") behavior. We also want to look for a low
constant. To see why, consider an algorithm that is O(log N) and
another that is O(N), where N is the number of elements in the heap.
It may be that on your machine, an implementation of the first
algorithm takes 10,000 * log(N) seconds, while an implementation of
the second one takes 2 * N seconds. For N = 256, the first would take
80,000 seconds and the second would take 512 seconds. The "faster"
algorithm takes more time in this case, and would only start to be
faster when N > 200,000.
You cannot merely compare two algorithms. You should also
compare the implementations of those algorithms. You also have to
know what size your data might be. In the above example, the first
implementation is faster for N > 200,000, but if in your game, N stays
under 30,000, then the second implementation would have been better.
A friend of mine (who does research in data structures for
shortest-path algorithms) says that heaps are good unless you have
more than 10,000 elements in your heap. Unless your game's map is
extremely large, you probably don't need a more complicated data
structure (like multi-level buckets). You should probably stay
away from Fibonacci heaps, which have good asymptotic performance but
have slow implementations unless N is huge.
Important Note for anyone using heaps: Unlike Dijkstra's
Algorithm, A* requires that you determine whether the new node you are
examining is already in the OPEN set, and if it is, whether it is
better than the one in the OPEN set, and if it is better, then the one
in the OPEN set must be adjusted. The problem is that when using
heaps, it is difficult to find a particular node in the heap -- it
requires a search through the entire heap, which is slow. To avoid
this expensive search, my A* code has a "Marking" array to help it
answer the two questions above. To answer the first question (whether
something is in the OPEN set), the Marking array keeps track of which
map locations are in the OPEN and CLOSED sets. To answer the second
question (whether the new node is better than the old one), the
Marking array keeps track of the g value at each node. Only if both
questions are answered "yes" does my code invoke Find (which I call
find_in_open). Experiments suggest that for most searches in my game,
one of the two questions will be answered "no" most of the time, so
almost all the calls to Find are eliminated when using the Marking
array.
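The idea looks roughly like this (a sketch with hypothetical names and
a fixed-size grid; the real details are in path.cpp):

```cpp
const int W = 512, H = 512;                  // example map dimensions
enum Mark { UNVISITED, IN_OPEN, IN_CLOSED };

Mark  mark[W][H];                            // which set each location is in
float best_g[W][H];                          // best g value seen per location

void push_open(int x, int y, float g);          // heap insert
void find_in_open(int x, int y, float new_g);   // slow heap search + adjust

// Called for each neighbor generated by the main loop.
void consider_neighbor(int x, int y, float new_g) {
    if (mark[x][y] == UNVISITED) {
        // First question answered "no": plain insert, no search needed.
        mark[x][y] = IN_OPEN;
        best_g[x][y] = new_g;
        push_open(x, y, new_g);
    } else if (mark[x][y] == IN_OPEN && new_g < best_g[x][y]) {
        // Both questions answered "yes": only now pay for find_in_open.
        best_g[x][y] = new_g;
        find_in_open(x, y, new_g);
    }
    // In CLOSED, or in OPEN with an equal or better g: do nothing.
}
```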
Heaps are a tree-based structure with expected O(log N) time
operations. The problem with A*, however, is that the common behavior
is that a low cost node is removed (causing O(log N) behavior, since
values have to move up from the very bottom of the tree), followed by
low cost nodes being added (causing O(log N) behavior, since these
values are added at the bottom and bubble up to the very top). Here
the expected case behavior of heaps is equivalent to the worst case
behavior. We may be able to do better if we find a data structure
where the expected case is better, even if the worst case is no
better.
Splay trees are a self-adjusting tree structure. Any access to a node
in the tree tends to bring that node up to the top. The result is a
"caching" effect, where rarely used nodes go to the bottom and don't
slow down operations. It doesn't matter how big your splay tree is,
because your operations are only as slow as your "cache size". In A*,
the low cost nodes are used a lot, and the high cost nodes aren't used
for a long time, so those high cost nodes can move to the bottom of
the tree.
There's another good data structure that may be better than heaps.
Often, you can restrict the range of values that would be in the
priority queue. Given a restricted range, there are often better
algorithms. For example, sorting can be done on arbitrary values in
O(N log N) time, but when there is a fixed range it can be done in
O(N) time.
We can use HOT Queues (Heap On Top queues) to take advantage of a
particular property of A* and Dijkstra's algorithms: when we are
removing nodes with value f, the nodes we insert have value f + delta,
where delta <= C. (Note, however, that delta >= 0 only if your
heuristic is admissible; this technique is not as useful with an
inadmissible heuristic.) The constant C is the maximum change in cost
from one point to an adjacent point. We can therefore
keep "buckets" that store some subset of the values (just like we do
in the O(N) sorting algorithms), and in the most important bucket we
can use a heap. For example, if there are ten buckets, then the heap
only contains (on average) one tenth of the values, so it runs faster
than using a heap for all of them. In addition, we do not organize
the other nine tenths of the elements unless they are actually needed.
In A*, we often put nodes into OPEN that we never actually need. HOT
Queues are a big win because the elements that are not needed are
inserted in O(1) time. Only elements that are needed get heapified
(which is not too expensive). The only operation that takes more than
O(1) time is node deletion from the heap, which is O(log N); but
remember that the heap is small, so it's not as bad as you'd expect.
In addition, if C is small, we do not even need a heap for the
smallest bucket, so insertion and deletion are both O(1) time!
(Compare this to heaps, where insertions and deletions take O(log N)
time.) One person reported that HOT queues are as fast as heaps
for at most 800 nodes in the OPEN set, and are 20% faster when there
are at most 1500 nodes. I would expect that HOT queues get faster as
the number of nodes increases.
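Here is a hypothetical sketch of the structure, assuming integer f
values, an admissible heuristic (so f never decreases), and a known
maximum f:

```cpp
#include <algorithm>  // std::make_heap, std::push_heap, std::pop_heap
#include <cstddef>
#include <vector>

// Bucket i holds f values with f / C == i. Only the lowest nonempty
// bucket is kept heapified; insertions into higher buckets are O(1)
// and those buckets are never organized unless they are reached.
struct HotQueue {
    int C;                                  // max change in f per expansion
    std::size_t low = 0;                    // index of the heapified bucket
    std::vector<std::vector<int>> buckets;

    HotQueue(int c, int max_f) : C(c), buckets(max_f / c + 1) {}

    static bool greater(int a, int b) { return a > b; }  // min-heap order

    void insert(int f) {
        std::size_t b = std::size_t(f) / C;
        buckets[b].push_back(f);
        if (b == low)  // only the lowest bucket maintains the heap property
            std::push_heap(buckets[b].begin(), buckets[b].end(), greater);
    }

    int delete_best() {      // precondition: the queue is not empty
        while (buckets[low].empty()) {
            ++low;                                  // advance to the next bucket
            std::make_heap(buckets[low].begin(),    // and heapify it lazily
                           buckets[low].end(), greater);
        }
        std::pop_heap(buckets[low].begin(), buckets[low].end(), greater);
        int best = buckets[low].back();
        buckets[low].pop_back();
        return best;
    }
};
```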
In the main A* loop, the OPEN set stores all the nodes that may need
to be searched to find a path. Beam search is a variation of A* that
places a limit on the size of the OPEN set. If
the set becomes too large, the node with the worst chances of giving a
good path is dropped. One drawback is that you have to keep your set
sorted to do this; a priority queue implementation may not work well.
(HOT queues would work, though.) On the other hand, sorting it is not
much slower than a priority queue, and it may help so much to be able
to drop nodes that it pays for the extra overhead of sorting. (Note
that you don't do a full sort every time you add a node to the list;
you only need to place the new node in the right place, which can be
done in logarithmic time. Inserting into a priority queue is also
logarithmic. It's just that priority queues can have a lower constant
factor associated with the costs.)
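A sketch of that bookkeeping, using a sorted container and a
hypothetical Node type (the beam width here is just an example):

```cpp
#include <cstddef>
#include <iterator>  // std::prev
#include <set>

struct Node { float f; int x, y; };
bool operator<(const Node& a, const Node& b) { return a.f < b.f; }

const std::size_t MAX_OPEN = 1000;  // beam width (example value)
std::multiset<Node> open;           // kept sorted by f

void insert(const Node& n) {
    open.insert(n);                          // O(log N), like a heap insert
    if (open.size() > MAX_OPEN)
        open.erase(std::prev(open.end()));   // drop the worst (highest f) node
}

Node delete_best() {
    Node best = *open.begin();               // lowest f is at the front
    open.erase(open.begin());
    return best;
}
```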
Iterative Deepening is used in many AI algorithms to start with an
approximate answer, then make it more accurate. The name comes from
game tree searches, where you look some number of moves ahead (for
example, in Chess). You can try to deepen the tree by looking ahead
more moves. Once your answer doesn't change or improve much, you
assume that you have a pretty good answer, and it won't improve when
you try to make it more accurate again. In ID-A*, the "depth" is a
cutoff for f values. When the f value is too large, the node won't
even be considered (i.e., it won't be added to the OPEN set). The
first time through you process very few nodes. On each subsequent
pass, you increase the number of nodes you visit. If you find that the
path improves, then you continue to increase the cutoff; otherwise,
you can stop. For more details, read these lecture notes on ID-A*.
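A minimal recursive sketch of ID-A* (hypothetical map helpers again; it
assumes a path exists and, for brevity, doesn't avoid stepping back to
the parent):

```cpp
#include <algorithm>  // std::min
#include <limits>
#include <vector>

struct Location { int x, y; };
std::vector<Location> neighbors(Location l);  // hypothetical map helpers
float cost(Location a, Location b);
float heuristic(Location a, Location goal);

const float FOUND = -1.0f;  // sentinel; real f values are never negative

// Depth-first search that ignores nodes whose f value exceeds the cutoff.
// Returns FOUND on success, otherwise the smallest f that exceeded the
// cutoff, which becomes the cutoff for the next pass.
float search(Location node, float g, float cutoff, Location goal) {
    float f = g + heuristic(node, goal);
    if (f > cutoff) return f;          // too expensive: don't even consider it
    if (node.x == goal.x && node.y == goal.y) return FOUND;
    float next_cutoff = std::numeric_limits<float>::infinity();
    for (Location n : neighbors(node)) {
        float r = search(n, g + cost(node, n), cutoff, goal);
        if (r == FOUND) return FOUND;
        next_cutoff = std::min(next_cutoff, r);
    }
    return next_cutoff;
}

void ida_star(Location start, Location goal) {
    float cutoff = heuristic(start, goal);  // first pass: very few nodes
    for (;;) {
        float r = search(start, 0.0f, cutoff, goal);
        if (r == FOUND) break;              // path found
        cutoff = r;                         // widen the cutoff, try again
    }
}
```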
I personally don't see much need for ID-A* in games. ID
algorithms tend to increase computation time while reducing memory
requirements. In games, however, the "nodes" are very small -- they
are simply coordinates. I don't see a big win from not storing those
nodes. As for processing time, although lists, arrays, and heaps are
affected by a large number of nodes added to the OPEN set, HOT queues
can be used to avoid using any CPU time for handling nodes with high
f values.
With dynamic weighting, you assume that at the beginning of your
search, it's more important to get (anywhere) quickly; at the
end of the search, it's more important to get to the goal.
f(p) = g(p) + w(p) * h(p)
Here, there is a weight (w >= 1) associated with the heuristic.
As you get closer to the goal, you decrease the weight; this decreases
the importance of the heuristic, and increases the relative importance
of the actual cost of the path.
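One possible weighting scheme (hypothetical; there are many ways to
choose w) starts with a strong heuristic bias and decays it as h
shrinks:

```cpp
#include <algorithm>  // std::max

// w starts near max_w at the beginning of the search (h close to
// h_start) and decays toward 1 as the node approaches the goal.
float dynamic_f(float g, float h, float h_start) {
    const float max_w = 2.0f;              // initial weight (example value)
    float progress = 1.0f - h / h_start;   // ~0 at the start, ~1 near the goal
    float w = std::max(1.0f, max_w - (max_w - 1.0f) * progress);
    return g + w * h;
}
```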
There are two properties of Bandwidth Search that some people may find
useful. This variation assumes that h is an overestimate, but that it
doesn't overestimate by more than some number e. If this is the case
in your search, then the path you get will have a cost that doesn't
exceed the best path's cost by more than e. Once again, the better you
make your heuristic, the better your solution will be.
Another property is that you can drop some nodes in the OPEN set.
Whenever h+d is greater than the true cost of the path (for some d),
you can drop any node that has an f value at least e+d higher than the
f value of the best node in OPEN. This is a strange property. You have
a "band" of good values for f; everything outside this band can be
dropped, because there is a guarantee that it will not be on the best
path.
Curiously, you can use different heuristics for the two properties,
and things still work out. You can use one heuristic to guarantee
that your path isn't too bad, and another one to determine what to
drop in the OPEN set.
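As code, the drop test is just a comparison (a sketch, with e and d as
described above):

```cpp
// A node can be dropped from OPEN when its f value is at least e + d
// above the f value of the best node in OPEN; it is guaranteed not to
// be on the best path.
bool can_drop(float f, float best_f, float e, float d) {
    return f >= best_f + e + d;
}
```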
Instead of searching from the start to the finish, you can start two
searches in parallel -- one from start to finish, and one from finish
to start. When they meet, you should have a good path.
This sounds like a good idea, but I don't think it'll give you very
much. The idea behind bidirectional searches is that searching
results in a "tree" that fans out over the map. A big tree is much
worse than two small trees, so it's better to have two small search
trees. With A*, however, my experiments suggest that you don't
get a tree. You get a path that has nearby map areas explored, but it
doesn't fan out like Dijkstra's algorithm. In fact, that's what makes
A* so fast -- no matter how long your path is, it doesn't search like
crazy, unless the path really is crazy. It tends to search only small
areas of the map. Bidirectional search may be more useful if your map
is complex.
The front-to-front variation links the two searches together. Instead
of choosing the best forward-search node (g(start,x) + h(x,goal)) or
the best backward-search node (g(y,goal) + h(start,y)), this algorithm
chooses a pair of nodes with the best g(start,x) + h(x,y) + g(y,goal).
The retargeting approach abandons simultaneous searches in the
forward and backward directions. Instead, it performs a forward
search for a short time, chooses the best forward candidate, and then
performs a backward search -- not to the starting point, but to that
candidate. After a while, it chooses a best backward candidate and
performs a forward search from the best forward candidate to the best
backward candidate. This process continues until the two candidates
are the same point.
Path-Finding algorithms tend to be worse than linear: if you
double the distance needed to travel, it takes more than twice
as long to find the path. You can think of path-finding as searching
some area like a circle -- when the circle's diameter doubles, it has
four times the area.
One way to reduce the problem is to have multiple levels of searching.
For example, to get from your home to a location in another city, you
would find a path from your chair to your car, from the car to the
street, from the street to a freeway, from the freeway to the edge of
the city, from there to the other city, then to a street, to a parking
lot, and finally to the door of the destination building. There are
several levels of searching here:
- At the building level, you are concerned with walking
from one location to a nearby location, but you do not go out
on the street.
- At the city level, you go from one street to another
until you find the freeway. You do not worry about going into
buildings or parking lots, nor do you worry about going on freeways.
- At the state level, you go from one city to another
on the freeway. You do not worry about streets within cities
until you get to your destination city.
Dividing the problem into levels allows you to ignore most of your
choices. When moving from city to city, it is quite tedious to
consider every street in every city along the way. Instead, you
ignore them all, and only consider freeways. The problem becomes
small and manageable, and solving it becomes fast.
You can use multiple levels with graph-searching algorithms such as
A*. One paper on this topic is Hierarchical A*: Searching Abstraction Hierarchies Efficiently. You
do not need to use the same algorithm at each level. See the section about splines for an example of
using a path-finding algorithm at one level (from tile to tile) and a
simple spline drawing algorithm at the lower level (from pixel to
pixel).
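A sketch of the two-level idea (every type and helper here is
hypothetical): plan a coarse path over regions first, then run the
detailed search one region at a time:

```cpp
#include <cstddef>
#include <vector>

struct Region { int id; };      // coarse node: a room, city, etc.
struct Location { int x, y; };  // fine node: a tile or pixel

// Hypothetical helpers:
Region region_of(Location l);
std::vector<Region> coarse_path(Region from, Region to);     // high-level search
Location crossing(Region from, Region to);                   // where to cross over
std::vector<Location> fine_path(Location from, Location to,  // low-level search,
                                Region within);              // one region only

std::vector<Location> hierarchical_path(Location start, Location goal) {
    std::vector<Location> result;
    std::vector<Region> regions = coarse_path(region_of(start), region_of(goal));
    Location at = start;
    for (std::size_t i = 0; i + 1 < regions.size(); ++i) {
        // Detailed search only within the current region, to its exit.
        Location exit_point = crossing(regions[i], regions[i + 1]);
        std::vector<Location> leg = fine_path(at, exit_point, regions[i]);
        result.insert(result.end(), leg.begin(), leg.end());
        at = exit_point;
    }
    // Final leg within the goal's own region.
    std::vector<Location> last = fine_path(at, goal, regions.back());
    result.insert(result.end(), last.begin(), last.end());
    return result;
}
```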
A similar approach is to use varying resolution. First, plot a path
with low resolution. As you get closer to a point, refine the path
with a higher resolution. This approach can be used with path splicing to avoid moving obstacles.
My impression of Dynamic A* (D*) was that it was good for robots but
not necessarily for games. D* is intended for use when you don't have
complete information. If you don't have all the information, A* can
make mistakes; D*'s contribution is that it can correct those mistakes
without taking much time.
A* assumes you have all the information at the beginning, and then
gives you a path to follow. D* does not require complete
information at the beginning -- instead, you give it what you know,
and it gives you a path. When you learn more, D* will make
improvements to the path. D* solves a different problem than A*.
D* never does worse than A*. If you have all the information
at the beginning, they produce the same path. If you do not have all
the information, they initially produce the same path but D* improves
it, so its path is better than the one A* produces.
However, D* requires a lot of space -- essentially you run A* and keep
around its internal information (OPEN/CLOSED sets, path tree, g
values), and then when the map changes, D* will
tell you if you need to adjust your path to take into account the map
changes.
For a game with more than one unit, you usually don't want to keep all
that information around, so D* can't really be used. It was designed
for robotics, where there is just one robot -- you don't need to reuse
the memory for some other robot's path. If your game has only one or
a small number of units, you may want to investigate D*.