
Sunday, December 20, 2009

Troubleshooting DNS


Troubleshooting, by its nature, is a tough subject to teach. You start with any of a world of symptoms and try to work your way back to the cause. We can’t cover the whole gamut of problems you may encounter on the Internet, but we will certainly do our best to show you how to diagnose the most common of them. And along the way, we hope to teach you troubleshooting techniques that will be valuable in tracking down more obscure problems that we don’t document.

Is DNS Really Your Problem?

Before we launch into a discussion of how to troubleshoot a DNS problem, we should make sure you know how to tell whether a problem is caused by DNS, not by another naming service. On Windows hosts, figuring out whether the culprit is actually DNS can be difficult. Windows supports a whole panoply of naming services: DNS, WINS, HOSTS, LMHOSTS, and more. The stock Windows 2000 nslookup, however, doesn’t pay any attention to these other naming services. You can run nslookup on a Windows 2000 box and query the name server till the cows come home while the program with the problem is using a different naming service.
How do you know where to put the blame? First, you need to consider what kind of program is having the problem. If it’s a TCP/IP client, such as telnet or ftp, the possible culprits are DNS and the HOSTS file. If it’s a utility that supports NetBIOS naming, such as net (as in net use), the likely suspects also include WINS and the LMHOSTS file. Other clients, such as ping, that also take either a DNS name or a NetBIOS name as an argument can use any of these naming services.
Next, consider the order in which Windows uses the naming services. You should look through the various services in that order when troubleshooting the problem.
These hints should help you identify the guilty party or at least exonerate one suspect. If you narrow down the suspects and DNS is still implicated, you’ll just have to read this chapter.

Checking the Cache

As we’ve said earlier, you can check the contents of your name server’s cache with the DNS console. This can come in handy if you suspect that your name server has cached bad or out-of-date data from another server. To inspect a server’s cache, click the plus sign to the left of the name of the server in the DNS console’s left pane. You’ll see a folder named Cached Lookups. Either click on the plus sign to the left of it or double-click the folder icon or the label to expand the next level. This shows you the top-level domains for which your name server has cached data. Expand your way to the domain name to which the cached data you’re looking for is attached. In Figure 13-1, we’ve clicked our way down to acmebw.com to look for cached data.





Figure 13-1: NS and A records for acmebw.com in the cache
As you can see in the right pane, our name server has cached three NS records and one A record for acmebw.com. If we double-clicked net and then acmebw, we could find the cached addresses of these name servers, too.
If you’d like to see the TTL on the cached data, double-click on a record in the right pane. Provided the DNS console is in advanced view mode (select View → Advanced), the resulting window shows the record’s TTL. For example, in Figure 13-2, we’ve double-clicked the acmebw.com A record.



Figure 13-2: The TTL on a cached record
Be sure to refresh the DNS console with Action → Refresh or F5 before checking the TTL, or the TTL you see may be bigger than the current TTL.
If you right-clicked the record, you may have noticed a Delete Record selection. Now there’s something you can’t do in BIND. Using the DNS console, you can actually delete cached data record by record! If you know that some records in your name server’s cache are out of date, you can delete them and let your name server pick up updated records from an authoritative name server.

Potential Problem List

Let’s go through some common real-world DNS problems. Many of these problems are easy to recognize and correct. We cover these problems as a matter of course–they’re some of the most common problems because they’re caused by some of the most common mistakes. Here are the contestants, in no particular order.
1. Forget to Increment Serial Number
This particular problem will occur only if you make changes to your zone data file by hand, without using the DNS console. The DNS console remembers to increment the serial number in the SOA record each time it changes zone data, so you don’t have to worry about it. However, this also means that you probably won’t be in the habit of updating the serial number, so you may forget when making that one-off manual modification.
The main symptom of this problem is that slave name servers don’t pick up any changes you make to the zone on the primary server. The slaves think the zone data hasn’t changed since the serial number is still the same.
How do you check if you remembered to increment the serial number? Unfortunately, that’s not so easy. If you don’t remember what the old serial number was and your serial number gives you no indication of when it was updated, there’s no direct way to tell whether it has changed.
When you start the primary, it will load the updated zone data file regardless of whether you’ve changed the serial number. About the best you can do is to use nslookup to compare the data returned by the primary and by a slave. If they return different data, you probably forgot to increment the serial number. If you can remember a recent change you made, you can look for that data. If you can’t remember a recent change, you can try transferring the zone from a primary and from a slave, sorting the results, and using a file-comparison tool to compare them.
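The sort-and-compare step can be sketched in Python. This is only a minimal illustration, assuming you have already saved the records from each server (for example, from a zone listing) as lines of text. The function names and sample records are made up for the example.

```python
def normalize(lines):
    """Collapse whitespace, lowercase each record, drop blanks, and sort."""
    records = set()
    for line in lines:
        fields = line.split()
        if fields:
            records.add(" ".join(fields).lower())
    return sorted(records)


def zone_diff(primary_lines, slave_lines):
    """Return (only_on_primary, only_on_slave) as sorted lists of records."""
    primary = set(normalize(primary_lines))
    slave = set(normalize(slave_lines))
    return sorted(primary - slave), sorted(slave - primary)


# Hypothetical listings saved from the primary and from a slave.
primary_dump = [
    "robocop      A   192.249.249.2",
    "beetlejuice  A   192.249.249.23",
]
slave_dump = [
    "robocop  A  192.249.249.2",
]

missing_on_slave, extra_on_slave = zone_diff(primary_dump, slave_dump)
print("Only on primary:", missing_on_slave)
print("Only on slave:  ", extra_on_slave)
```

Any record that shows up in only one list is a sign that the slave has not picked up the primary's current data.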
The good news is that, although determining whether the zone was transferred is tricky, making sure the zone is transferred is simple. Just increment the serial number on the primary’s copy of the zone by double-clicking the SOA record in the DNS console and manually editing the serial number field. The slaves should pick up the new data within their refresh interval, or sooner if they use NOTIFY.
2. Forget to Restart Primary Master Server
Like the last problem, you’ll see this problem only if you make changes to your zone data files by hand. The DNS console adds and deletes data on the fly, so there’s no need to restart your primary master name server.
If you’re not using the DNS console, though, you may forget to restart your primary master name server after editing a zone data file. The name server won’t know to load the new data–it doesn’t automatically check the file to see if it has changed. Consequently, any changes you’ve made won’t be reflected in the name server’s data: new zones won’t be loaded, and new records won’t percolate out to the slaves.
To check when you last restarted the name server, scan the Event Viewer output for the last entry that looks like this:
The DNS Server has started.
The date and time on these events will tell you the last time you restarted the name server.
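If you have many events to scan, a short script can do the comparison for you. The sketch below assumes you have somehow exported the DNS event log as a list of (timestamp, message) pairs. That export format, the function names, and the sample dates are assumptions, not something the DNS server provides in this form.

```python
from datetime import datetime

START_MESSAGE = "The DNS Server has started."


def last_restart(events):
    """Return the timestamp of the most recent start event, or None."""
    starts = [when for when, message in events if message.strip() == START_MESSAGE]
    return max(starts) if starts else None


def restarted_since(events, last_edit):
    """True if the server started at or after the time of the last manual edit."""
    restart = last_restart(events)
    return restart is not None and restart >= last_edit


# Hypothetical exported events and the time of a manual zone file edit.
events = [
    (datetime(2009, 12, 18, 9, 0), "The DNS Server has started."),
    (datetime(2009, 12, 19, 14, 30),
     "The DNS server wrote version 37 of zone movie.edu to file movie.edu.dns."),
]
edit_time = datetime(2009, 12, 20, 8, 15)

print("Last restart:", last_restart(events))
print("Restarted since edit:", restarted_since(events, edit_time))
```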
If the time of the restart doesn’t correlate with the time you made the last change, use the DNS console to stop and restart the name server and reload its data. Check that you incremented the serial numbers on the zone data files you changed, too.
3. DNS Server Loses Manual Changes
One final but important note about making manual changes: remember that the Microsoft DNS Server periodically updates its zone data files. Each time you make changes to a zone’s data using the DNS console, a write is pending: before the DNS server exits, it must rewrite the zone’s data file or it will lose the changes you made. Think of this as a dirty page in memory: the operating system must write it to disk before exiting.
If you make a manual change to a zone data file while a write is pending, you’ll mysteriously lose the change when the name server exits. Say you add delegation to a new subdomain of movie.edu while the server is running and a write is pending. After you’ve made the change, you have to stop the server and start it again to get it to read the zone data again. But as the server exits, it rewrites the movie.edu zone data file, and your delegation disappears. If you’re watching the Event Viewer carefully (like you should be), you’ll see this message before the server stops:
The DNS server wrote version 37 of zone movie.edu to file movie.edu.dns.
Once you force the server to rewrite its zone data files with Action → Update Server Data Files, the server is in sync with the zone data files and doesn’t have to rewrite them on exit. So, if you’re going to make manual changes to the zone data files, you should either stop the server first (although that means your server won’t answer queries while you make the change), or use the DNS console to sync the server with the zone data files and then make the change.
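To see why the order of operations matters, here is a toy model in Python. It is not how the Microsoft DNS Server is implemented. It assumes only what is described above: console changes live in memory until written, and a pending write overwrites the file on exit. The class, method names, and records are hypothetical.

```python
class ToyServer:
    """Toy model: in-memory zone data plus a 'write pending' flag."""

    def __init__(self, zone_file):
        self.zone_file = zone_file        # list standing in for the file on disk
        self.memory = list(zone_file)     # the server's own in-memory copy
        self.write_pending = False

    def console_change(self, record):
        """A change made through the console updates memory only."""
        self.memory.append(record)
        self.write_pending = True

    def update_data_files(self):
        """Write memory to the file and clear the pending write."""
        self.zone_file[:] = self.memory
        self.write_pending = False

    def stop(self):
        """On exit, a pending write replaces whatever is in the file."""
        if self.write_pending:
            self.update_data_files()


def manual_edit_survives(sync_first):
    disk = ["terminator A 192.249.249.3"]
    server = ToyServer(disk)
    server.console_change("robocop A 192.249.249.2")
    if sync_first:
        server.update_data_files()
    delegation = "fx NS bladerunner.fx.movie.edu."
    disk.append(delegation)               # manual edit made directly to the file
    server.stop()
    return delegation in disk


print("Edit without syncing first survives:", manual_edit_survives(False))
print("Edit after syncing first survives:  ", manual_edit_survives(True))
```

In the model, the manual edit is lost only when it is made while a write is still pending.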
4. Slave Server Can’t Load Zone Data
If a slave name server can’t get the current serial number for a zone from its master server, you won’t be warned about it initially. However, if the problem persists and the slave can’t determine within the expire interval whether or not its data is up to date, it will expire the zone. On a Microsoft DNS Server, you’ll see a message like this in the Event Viewer:
Zone movie.edu expired before it could obtain a successful zone transfer or update from a master server acting as its source for the zone. The zone has been shut down.
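The expiration rule itself is simple enough to express in a few lines. This sketch assumes times are plain seconds (for example, Unix timestamps) and uses a seven-day expire value purely as an example. The function name and the timestamps are hypothetical.

```python
def zone_expired(last_good_check, now, expire_seconds):
    """Return True once the expire interval has passed without a successful check."""
    return (now - last_good_check) > expire_seconds


EXPIRE = 604800          # 7 days, used here only as an example expire value
last_good = 1_000_000    # hypothetical time of the last successful serial check

print(zone_expired(last_good, last_good + 3600, EXPIRE))        # one hour later
print(zone_expired(last_good, last_good + EXPIRE + 1, EXPIRE))  # just past the interval
```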
Once the zone has expired, you’ll start getting SERVFAIL errors when you query the name server for data in the zone:
C:\> nslookup robocop wormhole.movie.edu.
Server:  wormhole.movie.edu
Addresses:  192.249.249.1, 192.253.253.1
*** wormhole.movie.edu can't find robocop.movie.edu: Server failed
There are three leading causes of this problem: a loss in connectivity to the master server due to network failure, an incorrect IP address configured for the master server, and a syntax error in the zone data file on the master server.
First, use the DNS console to check the address of the master server(s) from which the slave is attempting to load data. Right-click the domain name of the zone in the left pane, choose Properties, and look at the General tab, shown in Figure 13-3.



Figure 13-3: Zone properties window showing master server(s)
Make sure that’s really the IP address of the master name server. If it is, check connectivity to that IP address:
C:\> ping 192.249.249.3
Pinging 192.249.249.3 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.
If the master server isn’t reachable, make sure that the server’s host is really running (for example, is powered on) or look for a network problem.
You may also want to check that the master server is returning authoritative responses to queries for data in the zone. If the master server is responding as not authoritative for the zone, the slave won’t transfer the zone from it. Here’s how you could use nslookup to check for an authoritative response for the zone’s SOA record from the master server:
C:\>  nslookup -norec -type=SOA movie.edu. 192.249.249.3
This command sends a nonrecursive query for the SOA record for movie.edu to the name server at 192.249.249.3. We need to send a nonrecursive query so that the name server at 192.249.249.3 doesn’t try to forward the query to another server.
If this master server is correctly configured, the answer to this query should be authoritative. (Remember that unless nslookup reports “Non-authoritative answer,” the answer is authoritative.) A nonauthoritative reply may indicate that the master server had a problem loading the zone, usually because of a syntax error in the zone data file. Contact the administrator of the master server and have him check his Event Viewer or syslog output for indications of a syntax error. We’ve never seen a Windows 2000 name server go nonauthoritative for a zone based on a syntax error in a zone data file, but older BIND name servers exhibit this behavior. So if your name server is a slave to a zone whose primary master is a BIND name server that’s not claiming authority for the zone, a syntax error could be your problem.
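If you capture nslookup's output in a script, the rule in the parenthetical above is easy to apply. The sketch below assumes the output is available as a string. The function name is hypothetical, and the check relies only on the "Non-authoritative answer" marker and the "***" error prefix seen in nslookup output in this chapter.

```python
def answer_status(nslookup_output):
    """Classify captured nslookup output as error, non-authoritative, or authoritative."""
    text = nslookup_output.lower()
    if "***" in text:
        return "error"
    if "non-authoritative answer" in text:
        return "non-authoritative"
    # nslookup flags only non-authoritative answers, so no flag means authoritative.
    return "authoritative"


sample = """Server:  hpsdlo.sdd.hp.com
Addresses:  15.255.160.64, 15.26.112.11

Non-authoritative answer:
hp.com
        origin = relay.hp.com
"""
print(answer_status(sample))
```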
If the answer to the query is authoritative but the slave server still can’t transfer the zone successfully, you can use nslookup’s ls command to try to transfer the zone manually (ls, as we said in Chapter 12, performs a zone transfer). If you see an error like this, it’s a good bet that the master server restricts zone transfers:
C:\> nslookup - 192.249.249.3
Default Server:  terminator.movie.edu
Address:  192.249.249.3
> ls movie.edu
[terminator.movie.edu]
*** Can't list domain movie.edu: Query refused
>
Contact the administrator of the master server and ask whether she is restricting zone transfers. Ask her to check the options on the Zone Transfers tab of the Properties window for the zone you’re trying to transfer (if she’s running the Microsoft DNS Server). If the remote server is running BIND, ask if she’s using the xfrnets or allow-transfer features to restrict zone transfers.
Once the problem has been cleared up and your server successfully transfers the zone, you’ll see messages like these in the Event Viewer:
A more recent version, version 212 of zone movie.edu was found at DNS server at 192.249.249.3. Zone transfer is in progress.
The DNS server wrote version 212 of zone movie.edu to file movie.edu.dns.
5. Add Address to Zone, but Forget to Add Corresponding PTR Record
Because the mappings from hostnames to IP addresses are separate from the mappings from IP addresses to hostnames in DNS, it’s easy to forget to add a PTR record for a new host. Adding the A record is intuitive, but many people who are used to host tables assume that adding an address record takes care of the reverse mapping, too. That’s not true–you need to add a PTR record for the host to the appropriate in-addr.arpa zone. Thankfully, the DNS console makes that easy by providing a checkbox to Create associated pointer (PTR) record when you choose New Host….
Neglecting to add the PTR record for a host usually causes that host to fail authentication checks. For example, users on the host won’t be able to rsh or rcp to other hosts. The servers these programs talk to need to be able to map the connection’s IP address to a domain name to check authorization files.
In addition, many large FTP archives, including ftp.uu.net, refuse anonymous ftp access to hosts whose IP addresses don’t map back to domain names. ftp.uu.net’s FTP server emits a message that reads, in part:
530- Sorry, we're unable to map your IP address 140.186.66.1
530- to a hostname in the DNS. This is probably because your
530- nameserver does not have a PTR record for your address in its
530- tables, or because your reverse nameservers are not registered.
530-   We refuse service to hosts whose names we cannot resolve.
531-
That makes the reason you can’t use anonymous ftp pretty evident. Other FTP sites, however, don’t bother printing informative messages; they simply deny service.
nslookup is handy for checking whether or not you’ve forgotten the PTR record:
C:\> nslookup
Default Server:  terminator.movie.edu
Address:  192.249.249.3
> beetlejuice   --Check for a hostname-to-address mapping
Server:  terminator.movie.edu
Address:  192.249.249.3
Name:    beetlejuice.movie.edu
Address:  192.249.249.23
> 192.249.249.23   --Now check for a corresponding address-to-hostname mapping
Server:  terminator.movie.edu
Address:  192.249.249.3
*** terminator.movie.edu can't find 192.249.249.23: Non-existent domain
On the primary master for 249.249.192.in-addr.arpa, a quick check of the DNS console or the 249.249.192.in-addr.arpa.dns file will tell you if the PTR record has been added to the zone yet.
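When you have more than a handful of hosts, it can help to compute the reverse-mapping name for each address and check it against your PTR data. The sketch below uses Python's standard ipaddress module to build the in-addr.arpa name. The two dictionaries stand in for zone data you would have to collect yourself, and the function names are hypothetical.

```python
import ipaddress


def ptr_owner_name(address):
    """Return the in-addr.arpa name under which the PTR record should live."""
    return ipaddress.ip_address(address).reverse_pointer


def missing_ptrs(a_records, ptr_records):
    """a_records: {hostname: address}; ptr_records: {reverse name: hostname}."""
    missing = []
    for host, address in sorted(a_records.items()):
        owner = ptr_owner_name(address)
        if owner not in ptr_records:
            missing.append((host, owner))
    return missing


# Hypothetical zone contents, keyed as described in the docstring.
a_records = {
    "beetlejuice.movie.edu": "192.249.249.23",
    "terminator.movie.edu": "192.249.249.3",
}
ptr_records = {
    "3.249.249.192.in-addr.arpa": "terminator.movie.edu",
}

print(ptr_owner_name("192.249.249.23"))
print(missing_ptrs(a_records, ptr_records))
```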
6. Wrong Domain Name in RDATA of Record
When you add CNAME, MX, and NS records with the DNS console, remember to specify the fully qualified domain name of the host for the resource record-specific data. The DNS console assumes that the name you type as the RDATA field is fully qualified. So if you try to create a CNAME record as shown in Figure 13-4, the CNAME record looks like this in the zone data file:
bigt    IN  CNAME  terminator.
This is probably not what you intended, since there’s no top-level terminator domain. You probably assumed the DNS console would append the name of the zone to the name if you left off the dot. Nope.



Figure 13-4: Creating a CNAME record (the wrong way)
These mistakes are easy to discover if you simply examine the zone data file (after Action → Update Server Data Files) or use nslookup:
C:\> nslookup -type=ns movie.edu.
Server:  terminator.movie.edu
Address:  192.249.249.3
movie.edu       nameserver = wormhole.movie.edu
movie.edu       nameserver = terminator
wormhole.movie.edu      internet address = 192.253.253.1
wormhole.movie.edu      internet address = 192.249.249.1
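Examining the zone data file by eye works for small zones. For larger ones, a rough scan like the following can flag CNAME, MX, and NS records whose target is a single label with a trailing dot, such as terminator. above. This is a simplified sketch that assumes one record per line with the target as the last field. It is not a full zone file parser, and the function name is hypothetical.

```python
TARGET_TYPES = {"CNAME", "MX", "NS"}


def suspicious_targets(zone_lines):
    """Return (line, target) pairs whose CNAME, MX, or NS target is one dotted label."""
    flagged = []
    for line in zone_lines:
        fields = line.split(";", 1)[0].split()     # drop comments, split on whitespace
        if len(fields) < 3:
            continue
        # Skip the owner name when the line has one (no leading whitespace).
        body = fields if line[:1].isspace() else fields[1:]
        if not any(field.upper() in TARGET_TYPES for field in body[:-1]):
            continue
        target = fields[-1]
        labels = [label for label in target.split(".") if label]
        if target.endswith(".") and len(labels) == 1:
            flagged.append((line.strip(), target))
    return flagged


zone_lines = [
    "bigt    IN  CNAME  terminator.",
    "@       IN  NS     wormhole.movie.edu.",
    "@       IN  NS     terminator.",
    "ns      IN  A      192.249.249.3",
]
for line, target in suspicious_targets(zone_lines):
    print("Check this record:", line)
```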
7. Loss of Network Connectivity
Though the Internet is more reliable today than it was back in the wild and woolly days of the ARPANET, network outages are still relatively common. These failures usually look like poor performance:
C:\> nslookup nisc.sri.com.
Server:  terminator.movie.edu
Address:  192.249.249.3
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 4 seconds.
DNS request timed out.
    timeout was 8 seconds.
*** Request to terminator.movie.edu timed-out
Using nslookup, you can look up the names and addresses of the name servers your name server needs to talk to in order to resolve the name:
C:\> nslookup
Default Server:  terminator.movie.edu
Address:  192.249.249.3
> set type=ns
> sri.com.
Server:  terminator.movie.edu
Address:  192.249.249.3
Non-authoritative answer:
sri.com nameserver = NS.sri.com
sri.com nameserver = NS.CSL.sri.com
sri.com nameserver = TURTLE.MCC.COM
sri.com nameserver = NS1.sri.com
NS.sri.com      internet address = 128.18.30.66
NS.CSL.sri.com  internet address = 130.107.4.94
NS.CSL.sri.com  internet address = 192.12.33.94
TURTLE.MCC.COM  internet address = 128.62.1.215
NS1.sri.com     internet address = 128.18.30.65
> com.
Server: terminator.movie.edu
Address:  192.249.249.3
Non-authoritative answer:
com     nameserver = C.ROOT-SERVERS.NET
com     nameserver = D.ROOT-SERVERS.NET
com     nameserver = E.ROOT-SERVERS.NET
com     nameserver = I.ROOT-SERVERS.NET
com     nameserver = F.ROOT-SERVERS.NET
com     nameserver = G.ROOT-SERVERS.NET
com     nameserver = J.GTLD-SERVERS.INTERNIC.NET
com     nameserver = A.ROOT-SERVERS.NET
com     nameserver = H.ROOT-SERVERS.NET
com     nameserver = B.ROOT-SERVERS.NET
C.ROOT-SERVERS.NET      internet address = 192.33.4.12
D.ROOT-SERVERS.NET      internet address = 128.8.10.90
E.ROOT-SERVERS.NET      internet address = 192.203.230.10
I.ROOT-SERVERS.NET      internet address = 192.36.148.17
F.ROOT-SERVERS.NET      internet address = 192.5.5.241
G.ROOT-SERVERS.NET      internet address = 192.112.36.4
J.GTLD-SERVERS.INTERNIC.NET     internet address = 198.41.0.21
A.ROOT-SERVERS.NET      internet address = 198.41.0.4
H.ROOT-SERVERS.NET      internet address = 128.63.2.53
B.ROOT-SERVERS.NET      internet address = 128.9.0.107
Then you can check your host’s connectivity to those servers. Odds are, ping won’t have much better luck than your name server did. If it does, you should check that the remote name servers are really running.
C:\> ping 128.18.30.66   --ping first sri.com name server
Pinging 128.18.30.66 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.
C:\> ping 130.107.4.94   --ping second sri.com name server
Pinging 130.107.4.94 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Now all that’s left to do is to locate the break in the network. Utilities like tracert can help you determine whether the problem is on your network, on the destination network, or somewhere in the middle.
You should also use common sense when tracking down the break. If, for example, your ping testing showed that you couldn’t reach any of the Internet’s root name servers, it’s not likely that each root’s local network went down or that the Internet’s commercial backbone networks collapsed entirely. Occam’s razor says that the simplest condition that could cause this behavior–namely, the loss of your network’s link to the Internet–is the most likely cause.
8. Missing Subdomain Delegation
Even though your ICANN-accredited registrar does its best to process your requests as quickly as possible, it may take a week or two for your subdomain’s delegation to appear in the root name servers. Depending on your parent (whether an ICANN-accredited registrar or some other zone administrator), your mileage may vary. Some parents are quick and responsible; others are slow and inconsistent. Just like in real life, though, you’re stuck with them.
Until your delegation data appear in your parent zone’s name servers, your name servers will be able to look up data in the Internet domain namespace, but no one else on the Internet (outside of your domain) will know how to look up data in your namespace.
That means that even though you can send mail outside of your domain, the recipients won’t be able to reply to it. Furthermore, no one will be able to telnet to, ftp to, or even ping your hosts by name.
Remember that this applies equally to any in-addr.arpa subdomains you may run. Until the parent delegates those subdomains to your servers, name servers on the Internet won’t be able to reverse-map addresses on your networks.
To determine whether or not your zone’s delegation has made it into your parent zone’s name servers, query a parent name server for the NS records for your zone. If the parent name server has the data, any name server on the Internet can find it:
C:\> nslookup
Default Server:  terminator.movie.edu
Address:  192.249.249.3
> server a.root-servers.net.   --Query a root name server
Default Server:  a.root-servers.net
Address:  198.41.0.4
> set norecurse               --Instruct the server to answer out of
> set type=ns                 --its own data and to look for NS records
> 249.249.192.in-addr.arpa.   --for 249.249.192.in-addr.arpa
Server:  a.root-servers.net
Address:  198.41.0.4
*** a.root-servers.net can't find 249.249.192.in-addr.arpa.: Non-existent domain
Here, the delegation clearly hasn’t been added yet. You can either wait patiently or, if an unreasonable amount of time has passed since you requested delegation from your parent zone, you can contact your parent zone’s administrator and ask what’s up.
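The nslookup errors shown in this chapter all start with "***", which makes them easy to pick out of captured output. The sketch below maps each message to the problems it most often points to. The mapping is a rough guide based only on the examples in this chapter, not a complete list of nslookup errors, and the names are hypothetical.

```python
# (text to look for in the error line, what it usually suggests)
HINTS = [
    ("server failed", "SERVFAIL: the zone may have expired on a slave (problem 4)"),
    ("non-existent domain", "no such name: check for a missing record or delegation (problems 5 and 8)"),
    ("query refused", "refused: the master may restrict zone transfers (problem 4)"),
    ("timed-out", "timeout: look for a loss of connectivity (problem 7)"),
]


def classify(nslookup_output):
    """Return a hint for the first '***' error line, if there is one."""
    for line in nslookup_output.splitlines():
        if line.lstrip().startswith("***"):
            lowered = line.lower()
            for needle, hint in HINTS:
                if needle in lowered:
                    return hint
            return "unrecognized error"
    return "no error reported"


output = """Server:  a.root-servers.net
Address:  198.41.0.4

*** a.root-servers.net can't find 249.249.192.in-addr.arpa.: Non-existent domain
"""
print(classify(output))
```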
9. Incorrect Subdomain Delegation
Incorrect subdomain delegation is another familiar problem on the Internet. Keeping delegation up-to-date requires human intervention–informing your parent zone’s administrator of changes to your set of authoritative name servers. Consequently, delegation information often becomes inaccurate as administrators make changes without letting their parents know. Far too many administrators believe that setting up delegation is a one-shot deal: they let their parents know which name servers are authoritative once, when they set up their zones, and then they never talk to them again. They don’t even call on Mother’s Day.
An administrator may add a new name server, decommission another, and change the IP address of a third, all without telling the parent zone’s administrator. Gradually, the number of name servers correctly delegated to by the parent zone dwindles. In the best case this leads to long resolution times, as querying name servers struggle to find an authoritative name server for the zone. If the delegation information becomes badly out-of-date and the last authoritative name server host is brought down for maintenance, the information within the zone will be inaccessible.
If you suspect bad delegation, whether from your parent to your zone, from your zone to one of your children, or from a remote zone to one of its children, you can check with nslookup:
C:\> nslookup
Default Server:  terminator.movie.edu
Address:  192.249.249.3
> server a.gtld-servers.net.   --Set server to the parent name
                               --server you suspect has bad delegation
Default Server:  a.gtld-servers.net
Address:  198.41.0.4
> set type=ns     --Look for NS records
> hp.com.         --for the zone in question
Server:  a.gtld-servers.net
Address:  198.41.0.4
Non-authoritative answer:
hp.com          nameserver = RELAY.HP.COM
hp.com          nameserver = HPLABS.HPL.HP.COM
hp.com          nameserver = NNSC.NSF.NET
hp.com          nameserver = HPSDLO.SDD.HP.COM
Authoritative answers can be found from:
hp.com          nameserver = RELAY.HP.COM
hp.com          nameserver = HPLABS.HPL.HP.COM
hp.com          nameserver = NNSC.NSF.NET
hp.com          nameserver = HPSDLO.SDD.HP.COM
RELAY.HP.COM    internet address = 15.255.152.2
HPLABS.HPL.HP.COM       internet address = 15.255.176.47
NNSC.NSF.NET            internet address = 128.89.1.178
HPSDLO.SDD.HP.COM       internet address = 15.255.160.64
HPSDLO.SDD.HP.COM       internet address = 15.26.112.11
Let’s say you suspect that the delegation to hpsdlo.sdd.hp.com is incorrect. Query hpsdlo for data in the hp.com zone, and check the answer:
> server hpsdlo.sdd.hp.com.
Default Server:  hpsdlo.sdd.hp.com
Addresses:  15.255.160.64, 15.26.112.11
> set norecurse
> set type=soa
> hp.com.
Server:  hpsdlo.sdd.hp.com
Addresses:  15.255.160.64, 15.26.112.11
Non-authoritative answer:
hp.com
  origin = relay.hp.com
  mail addr = hostmaster.hp.com
  serial = 1001462
  refresh = 21600 (6 hours)
  retry   = 3600 (1 hour)
  expire  = 604800 (7 days)
  minimum ttl = 86400 (1 day)
Authoritative answers can be found from:
hp.com          nameserver = RELAY.HP.COM
hp.com          nameserver = HPLABS.HPL.HP.COM
hp.com          nameserver = NNSC.NSF.NET
RELAY.HP.COM    internet address = 15.255.152.2
HPLABS.HPL.HP.COM       internet address = 15.255.176.47
NNSC.NSF.NET    internet address = 128.89.1.178
If hpsdlo really were authoritative, it would have responded with an authoritative answer. The administrator of the hp.com zone can tell you whether hpsdlo should be an authoritative name server for hp.com, so that’s who you should contact.
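Once you have asked each delegated server the same nonrecursive SOA question, the bookkeeping can be sketched as follows. The code assumes you have recorded, for every server, whether its answer was authoritative (True), non-authoritative (False), or absent (None). The function names are hypothetical, and the results for the three servers other than hpsdlo are made up for the example.

```python
def canon(name):
    """Compare names without regard to case or a trailing dot."""
    return name.rstrip(".").lower()


def check_delegation(parent_ns, authoritative_by_server):
    """Sort delegated servers by how they answered; list any the parent doesn't know."""
    delegated = {canon(name) for name in parent_ns}
    status = {canon(name): auth for name, auth in authoritative_by_server.items()}
    report = {"ok": [], "not_authoritative": [], "no_answer": [], "undelegated": []}
    for name in sorted(delegated):
        auth = status.get(name)
        if auth is True:
            report["ok"].append(name)
        elif auth is False:
            report["not_authoritative"].append(name)
        else:
            report["no_answer"].append(name)
    for name in sorted(status):
        if status[name] and name not in delegated:
            report["undelegated"].append(name)
    return report


parent_ns = ["RELAY.HP.COM", "HPLABS.HPL.HP.COM", "NNSC.NSF.NET", "HPSDLO.SDD.HP.COM"]
observed = {
    "relay.hp.com.": True,          # assumed result
    "hplabs.hpl.hp.com.": True,     # assumed result
    "nnsc.nsf.net.": True,          # assumed result
    "hpsdlo.sdd.hp.com.": False,    # the non-authoritative answer shown above
}
report = check_delegation(parent_ns, observed)
print("Delegated but not authoritative:", report["not_authoritative"])
```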

Interoperability Problems

The Microsoft DNS Server has at least one known interoperability issue with BIND name servers: zone transfers sometimes fail because of the proprietary WINS record.
When a Microsoft DNS Server is configured to consult a WINS server for names it can’t find in a given zone, it inserts a special record into the zone data file. The record looks like this:
@   IN     WINS    <IP address of WINS server>
Unfortunately, WINS is not a standard record type in the IN class. Consequently, any BIND slaves that transfer this zone will choke on the WINS record and refuse to load the zone. Here’s the message the administrator of the BIND server would see in his syslog output:
May 23 15:58:43 terminator named-xfer[386]: "fx.movie.edu IN 65281" - unknown type (65281)
The workaround for this problem is to configure the Microsoft DNS Server to filter out the proprietary record before transferring the zone. You do this by selecting the zone in the left pane of the DNS console, right-clicking it, and selecting Properties. Click on the WINS tab in the resulting properties window, which is shown in Figure 13-5.



Figure 13-5

Checking Do not replicate this record will filter out the WINS record for that zone. However, any Microsoft DNS Server slaves won’t see the record, even though they could use it.

Problem Symptoms

Some problems, unfortunately, aren’t as easy to identify as the ones we’ve listed. You’ll probably experience some misbehavior that you won’t be able to attribute directly to its cause, often because any of a number of problems may cause the symptoms you see. For cases like this, we’ll suggest some of the common causes of these symptoms and ways to isolate them.
Can’t Look Up Local Name
The first thing to do when a program like telnet or ftp can’t look up a local name is to use nslookup to try to look up the same name. When we say “the same name,” we mean literally the same name–don’t add a domain name and a trailing dot if the user didn’t type either one. Don’t query a different name server than the user did.
As often as not, the user will have mistyped the name or misunderstood how the search list works and just needs direction. Occasionally, you’ll turn up real host configuration errors, such as a mistake in the resolver configuration (e.g., the wrong IP address for a name server). You can check for errors like this using nslookup’s set all command.
If nslookup points to a problem with the name server, rather than with the host configuration, check for the problems associated with the type of name server. If the name server is the primary master for the zone but it doesn’t respond with data you think it should:
  • Check that the zone or zone data file contains the data in question.
  • Ensure that the domain names in the records are correct (problem 6).
If the name server is a slave server, you should first check whether or not its master has the correct data. If it does, and the slave doesn’t:
  • Make sure you’ve incremented the serial number on the primary (problem 1).
  • Look for a problem on the slave in updating the zone (problem 4).
If the primary doesn’t have the correct data, of course, diagnose the problem on the primary.
If the problem server isn’t authoritative for the zone that contains the data, check that your parent zone’s delegation to your zone exists and is correct (problems 8 and 9). Remember that to that name server, your zone looks just like any other remote zone. Even though the host it runs on may be inside your zone, the name server must be able to locate an authoritative server for your zone from your parent zone’s servers.
Can’t Look Up Remote Names
If your local lookups succeed but you can’t look up names outside your local zones, there is a different set of problems to check:
  • Can you ping the remote zone’s name servers? Maybe you can’t reach the remote zone’s servers because of connectivity loss (problem 7).
  • Is the remote zone new? Maybe its delegation hasn’t yet appeared (problem 8). Alternatively, the delegation information for the remote zone may be wrong or out of date, due to neglect (problem 9).
  • Does the domain name actually exist on the remote zone’s servers? Does it exist on all of them (problems 1, 2, and 4)?
Wrong or Inconsistent Answer
If you get the wrong answer when looking up a local name or you get an inconsistent answer, depending on which name server you ask or when you ask, first check the synchronization between your name servers:
  • Are they all holding the same serial number for the zone? Did you forget to increment the serial number on the primary after you made a manual change (problem 1)? If you did, the name servers may all have the same serial number, but they will answer differently out of their authoritative data.
  • Did you forget to restart the primary after making a manual change (problem 2)? Then the primary will return (via nslookup, for example) a different serial number than the serial number in the zone data file.
  • Are the slaves having trouble updating from the primary (problem 4)?
  • Is the name server’s round-robin feature rotating the addresses of the domain name you’re looking up?
If you get these results when looking up a name in a remote zone, you should check whether the remote zone’s name servers have lost synchronization. You can use tools like nslookup to determine whether the remote zone’s administrator has forgotten to increment the serial number, for example. If the name servers answer differently from their authoritative data but show the same serial number, the serial number probably wasn’t incremented. If the primary’s serial number is much lower than the slaves’, the primary’s serial number was probably accidentally reset. We usually assume a zone’s primary name server is running on the host listed as the origin in the SOA record.
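These rules of thumb can be written down as a small comparison. The sketch assumes you have collected each server's SOA serial number and some comparable snapshot of its answers (here, a frozenset of record strings). The names, labels, and sample data are hypothetical. Note that the code flags any slave serial higher than the primary's, which is looser than "much lower" in the text.

```python
def diagnose(primary, slaves):
    """primary is (serial, data); slaves maps server name to (serial, data)."""
    p_serial, p_data = primary
    findings = []
    for name, (serial, data) in sorted(slaves.items()):
        if serial == p_serial and data != p_data:
            findings.append((name, "serial-not-incremented"))
        elif serial > p_serial:
            findings.append((name, "primary-serial-reset"))
        elif serial < p_serial:
            findings.append((name, "slave-behind"))
    return findings


primary = (37, frozenset({"robocop A 192.249.249.2", "beetlejuice A 192.249.249.23"}))
slaves = {
    "wormhole": (37, frozenset({"robocop A 192.249.249.2"})),
    "zardoz": (36, frozenset({"robocop A 192.249.249.2"})),
}
for server, finding in diagnose(primary, slaves):
    print(server, finding)
```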
You probably can’t determine conclusively that the primary hasn’t been restarted, though. It’s also difficult to pin down updating problems between remote name servers. In cases like this, if you’ve determined that the remote name servers are giving out incorrect data, contact the zone administrator and (gently) relay what you’ve found. This will help the administrator track down the problem on the remote end.
Lookups Take a Long Time
Long name resolution periods are usually due to one of two problems:
  • Connectivity loss (problem 7), which you can diagnose with tools like ping and tracert
  • Incorrect delegation information (problem 9), which points to the wrong name servers or the wrong IP addresses
Usually, sending a few pings will point to one or the other of these causes. Either you can’t reach the name servers at all, or you can reach the hosts but the name servers aren’t responding.
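As a rough sketch of that reasoning, the function below takes per-server ping and query results that you have gathered by hand and suggests where to look. The function name, the result format, and the wording of the suggestions are assumptions made for this example.

```python
def triage(results):
    """results maps each delegated name server to (answers_ping, answers_query)."""
    if not results:
        return "no name servers to test"
    if not any(ping for ping, _ in results.values()):
        return "connectivity loss (problem 7)"
    if not any(query for _, query in results.values()):
        return "hosts reachable but not answering: suspect delegation (problem 9)"
    if all(query for _, query in results.values()):
        return "all delegated servers answer: look elsewhere"
    return "some servers fail: delegation may be partly stale (problem 9)"


# Hypothetical results for the two sri.com name servers pinged earlier.
results = {
    "128.18.30.66": (False, False),
    "130.107.4.94": (False, False),
}
print(triage(results))
```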
Sometimes, though, the results are inconclusive. For example, the parent name servers may delegate to a set of name servers that don’t respond to pings or queries, but connectivity to the remote network seems all right (a tracert, for example, will get you to the remote network’s “doorstep”–the last router between you and the host). Is the delegation information so badly out-of-date that the name servers have long since moved to other addresses? Are the hosts simply down? Or is there really a remote network problem? Usually, finding out will require a call or a message to the administrator of the remote zone. (And remember, whois gives you phone numbers!)
That’s about all we can think of to cover. It’s certainly a less than comprehensive list, but we hope it’ll help you solve the more common problems you encounter with DNS and give you ideas about how to approach the rest. Boy, if we’d only had a troubleshooting guide when we started!