The DNS works as a distributed database because of two fundamental
ideas: replication and caching.
We have already seen how caching works -- at any point in a query,
if a nameserver has a current copy of the desired information, it
can supply it instead of contacting other nameservers.
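
As a rough illustration of the caching idea, here is a minimal sketch in Python. It assumes a "current copy" means one whose time-to-live (TTL) has not yet expired. The names `TTLCache`, `lookup` and `ask_upstream` are invented for this example and do not come from any real nameserver implementation.

```python
import time


class TTLCache:
    """Toy cache keyed by (name, record type); entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be simulated
        self._entries = {}           # (name, rtype) -> (expiry time, value)

    def put(self, name, rtype, value, ttl):
        self._entries[(name.lower(), rtype)] = (self._clock() + ttl, value)

    def get(self, name, rtype):
        key = (name.lower(), rtype)  # domain names compare case-insensitively
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # stale copy: forget it
            return None
        return value


def lookup(cache, name, rtype, ask_upstream):
    """Answer from the cache when possible; otherwise ask and remember.

    `ask_upstream` is a hypothetical callable returning (value, ttl_seconds);
    in a real server it would contact other nameservers.
    """
    cached = cache.get(name, rtype)
    if cached is not None:
        return cached
    value, ttl = ask_upstream(name, rtype)
    cache.put(name, rtype, value, ttl)
    return value
```

A second `lookup` for the same name within the TTL never calls `ask_upstream`, which is exactly the saving that caching provides.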
The DNS requires that every zone be
replicated at least once -- that is, for each zone
of authority there must be at least two
authoritative nameservers. The rules for replication of nameservers
make for quite entertaining reading...
DNS queries and responses are an excellent example of an
application where the reliable, connection-oriented transport
mechanism of TCP is usually not necessary, and simply has too much
overhead. In fact, queries are normally encapsulated in unreliable
UDP datagrams (see later). UDP is a
connectionless transport service, with the same
level of reliability as IP packet delivery itself -- in other
words, UDP messages can be lost, delivered out of order, or even
duplicated. If a resolver does not receive a reply from a
nameserver, it usually either tries again or tries the next
nameserver for the same domain.
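
The retry behaviour can be sketched in a few lines of Python using the standard `socket` module. This is only an illustration under stated assumptions: the function name `query_with_retry`, the timeout and the number of attempts are all invented for the example, and real resolvers use more careful timing. The one DNS-specific detail it relies on is that every DNS message begins with a 16-bit identifier, which the reply copies from the query.

```python
import socket


def query_with_retry(message, servers, timeout=2.0, attempts=2):
    """Send `message` to each (host, port) in turn, retrying on timeout.

    Returns (reply_bytes, server) for the first server that answers, or
    raises TimeoutError if none of them does. The defaults are arbitrary
    values chosen for illustration.
    """
    for server in servers:
        for _ in range(attempts):
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
                sock.settimeout(timeout)
                try:
                    sock.sendto(message, server)
                    reply, _sender = sock.recvfrom(4096)
                except OSError:
                    # Timed out or could not reach the server: UDP gives no
                    # delivery guarantee, so simply try again.
                    continue
                # Accept the reply only if its 16-bit ID matches the query's.
                if reply[:2] == message[:2]:
                    return reply, server
    raise TimeoutError("no nameserver replied")
```

Note that all of the reliability logic lives in the application here; UDP itself does nothing about lost datagrams.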
Finally, although it is beyond the scope of our subject, DNS
messages are NOT simple ASCII strings -- they use a compact binary
format that is quite complex and designed for efficient parsing. For
that reason, it's not trivial to write a DNS client. In a
sense, DNS is not strictly an application protocol -- it provides
support for application protocols, but isn't one itself.
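
To give a flavour of the binary format, the following Python sketch builds the bytes of a simple query for an address record. It is a minimal example, not a complete client: the function names `encode_name` and `build_query` are invented here, only the query direction is covered, and parsing a reply (which involves name compression) is left out entirely.

```python
import struct


def encode_name(name):
    """Encode 'www.example.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, query_id, qtype=1, qclass=1):
    """Build a one-question query; qtype 1 is an A record, qclass 1 is IN."""
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT, each a 16-bit
    # big-endian integer. Flags 0x0100 sets only the "recursion desired" bit.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question
```

For instance, `build_query("example.com", 0x1234)` yields 29 bytes: a 12-byte header, 13 bytes for the encoded name, and 4 bytes for the type and class. None of it is readable text apart from the labels themselves.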