



Help! Something is Wrong!

DNSSEC adds complexity to DNS operations, which means more ways for something to go wrong. Usually, these errors can be quickly diagnosed with a few queries and some knowledge of how DNSSEC works. However, for new administrators and administrators deploying DNSSEC for the first time, this can be a daunting task. To make things easier, some typical deployment and configuration errors are listed below, broken down by symptom. Diagnosing them usually requires knowing how to use the 'dig' tool or some other utility to send DNS queries to a server. There are also other online and downloadable tools that can help diagnose and monitor DNSSEC health. A good place to start is the DNSSEC.NET tools page.

To get started, here is a short list of tools/information you'll probably need:

In addition, some information that would be useful but not strictly necessary at first:

First Things First -

It is important to narrow down the problem as much as possible. There are several different roles in the DNS protocol: authoritative name servers (which provide the data), stub clients (which send queries) and recursive name servers (which service queries on behalf of stubs and often validate DNSSEC signatures). Separate from that is the DNS data itself and whatever management system is used to maintain it. That system is outside of the DNS protocol, so there is no standard way of managing a DNS infrastructure for an enterprise.

So to start - if you think you have a problem with a signed zone on a name server, try sending a query directly to that name server. If the zone is "example.gov" on a name server with IP address "serverIPaddr", using dig, try:

dig @serverIPaddr example.gov DNSKEY +dnssec

Then (assuming foobar.example.gov doesn't exist in the zone):

dig @serverIPaddr foobar.example.gov DNSKEY +dnssec

You should get a name error message (RCODE = NXDOMAIN) and see NSEC or NSEC3 RR's in the response. If not, there is a problem (see below).
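When these checks are repeated often, the test above can be scripted. The following is a minimal Python sketch, assuming the text output of dig has already been captured in a string. The function name and the abbreviated sample output are hypothetical and only illustrate the idea.

```python
import re


def check_negative_response(dig_output):
    """Return (rcode, has_denial) from captured dig text output.

    rcode is the status word from dig's header line (e.g. 'NXDOMAIN'),
    or None if no status was found. has_denial is True when at least
    one NSEC or NSEC3 record appears in the output.
    """
    status = re.search(r"status:\s*([A-Z]+)", dig_output)
    rcode = status.group(1) if status else None
    has_denial = False
    for line in dig_output.splitlines():
        if line.startswith(";") or not line.strip():
            continue  # skip dig comment lines and blank lines
        fields = line.split()
        # Record lines look like: owner TTL class type rdata...
        if len(fields) >= 4 and fields[3] in ("NSEC", "NSEC3"):
            has_denial = True
    return rcode, has_denial


# Hypothetical, abbreviated sample output for illustration only.
SAMPLE = """\
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4242
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1
example.gov. 3600 IN SOA ns1.example.gov. admin.example.gov. 1 2 3 4 5
example.gov. 3600 IN NSEC mail.example.gov. NS SOA RRSIG NSEC DNSKEY
"""
print(check_negative_response(SAMPLE))
```

A result with an NXDOMAIN status but no NSEC or NSEC3 records would point to the problems described below.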

If you think the problem is with a recursive server (validating or non-validating) try:

dig @serverIPaddr gov DNSKEY +dnssec

Then:

dig @serverIPaddr notthere.gov DNSKEY +dnssec

Again, as above, you should get a name error or some other error message. If not, then there is a problem.



So What Seems to be the Problem?


Non-DNSSEC queries seem to work, but DNSSEC queries fail...

What could be the problem:

If traditional DNS queries seem to work as before, but queries with the DNSSEC-OK bit set (DNSSEC enabled) result in a "Server not Found" error or a timeout, there could be an issue with a firewall, router or some other middlebox. DNSSEC responses are often larger than the 512 byte limit of traditional DNS responses. DNSSEC requires the use of the Extension Mechanisms for DNS (EDNS), which give clients and servers a way to signal that they can accept larger UDP packets. However, many older firewalls and intrusion detection systems (IDS) do not understand DNSSEC and interpret EDNS and DNSSEC responses as possible attacks or errors. An older router or switch may also be unable to handle larger UDP packets correctly and may drop them.
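To make the EDNS signalling concrete, here is a minimal Python sketch that builds the raw bytes of a DNSSEC-enabled query. It shows where the advertised UDP size and the DNSSEC-OK (DO) bit sit in the OPT pseudo-record. The function name, the default size of 4096 and the query ID are assumptions chosen for illustration. It only builds the packet and does not send it.

```python
import struct


def build_dnssec_query(name, qtype=48, udp_size=4096, query_id=0x1234):
    """Build a DNS query with an EDNS0 OPT record and the DO bit set.

    qtype 48 is DNSKEY. udp_size is the UDP payload size the client
    advertises; it is carried in the CLASS field of the OPT record.
    """
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    qname = b""
    for label in name.rstrip(".").split("."):
        if not label:
            continue  # the root name has no labels
        encoded = label.encode("ascii")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # root label terminates the name
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    # OPT pseudo-RR: root name, TYPE 41, CLASS = UDP size,
    # TTL = extended RCODE 0, version 0, flags with DO = 0x8000, RDLEN 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, 0x00008000, 0)
    return header + question + opt


packet = build_dnssec_query("example.gov")
advertised = struct.unpack("!H", packet[-8:-6])[0]
print(len(packet), "byte query, advertised UDP size:", advertised)
```

With dig, the same advertised size can be changed using the +bufsize option, which is a convenient way to see whether smaller responses get through where larger ones are dropped.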


Suggested Fixes:

Check to see if there is a firewall or IDS in front of the DNS server. If so, see if there are any log messages or configuration options that relate to allowing larger UDP messages on port 53 (used by DNS). The suggested size is at least 1500 bytes, with 4096 bytes being the practical maximum. Re-try the DNSSEC enabled query to see if this resolves the issue.


-OR-


Check any switch or router in front of the DNS server to see if it has any issues with UDP packets over 512 bytes. If that is the issue, try re-configuring (if possible) or replacing the middlebox. Re-try the DNSSEC enabled query to see if this resolves the issue. You might also want to use a trace tool like traceroute for additional information about a potentially problematic middlebox.



Some DNSSEC queries work fine, but DNS error messages fail to validate...

What could be the problem:

If DNSSEC seems to work until you send a query for a name you know doesn't exist (for example, a query for the SOA RR in the zone 'example.com' works, but a query for 'foofoofoo.example.com' doesn't) there could be several potential issues:


  1. Is the zone signed using NSEC3? If so, check to see if the server implementation understands NSEC3. Older (pre-NSEC3 specification) implementations know how to add RRSIG RRs to DNSSEC responses, but don't know how to form correct error responses using NSEC3 and instead return traditional looking DNS error messages: the SOA RR (and signature) in the Authority section of the response. Note: All servers must be able to understand NSEC3. If one secondary server is not NSEC3-enabled, it will give incorrect responses while the other NSEC3-enabled servers will behave normally. These types of errors are especially hard to track down.

  2. Often (but not always), error responses are larger than positive responses. It could be an issue with a box in the middle dropping the response packet. See "Non-DNSSEC queries seem to work, but DNSSEC queries fail..." to see if that is the real issue.


Suggested Fixes:

If the zone is signed using NSEC3 and the server implementation does not understand NSEC3 RRs, then the solution is to upgrade the server implementation or (if that is not possible) re-sign the zone using NSEC. The latter fix will require the generation of new keypairs. After re-signing, load the new zone file on the server and restart/reload the zone. After the fix, send a query for a name known to not exist in the zone and look for an appropriate error message (one with NSEC3 or NSEC RRs).


If that doesn't work, see the suggested fixes in Non-DNSSEC queries seem to work, but DNSSEC queries fail... if there is an issue with a router, firewall or IDS system blocking larger error responses.


Sometimes I see signatures (RRSIGs) in responses, sometimes I don't...

What could be the problem:

If DNSSEC responses are only seen on some queries, it might be an issue with a non-DNSSEC aware secondary. Some implementations need to be specifically configured to serve DNSSEC zones. If one secondary out of a collection of authoritative servers does not have DNSSEC configured, it will return traditional DNS responses (i.e. no signatures) while the other name servers return DNSSEC enabled responses.


This may also happen if the SOA serial number is not incremented when the zone is signed. Signing a zone is a zone change, and any zone change requires the serial number in the zone SOA RR to be incremented. If it is not, the secondary name servers will not perform a zone transfer and will continue to serve the (old) unsigned zone. Most modern zone signing tools increment the serial number automatically, but older tools (like older versions of dnssec-signzone in BIND) do not, and the administrator must do it manually.
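One way to spot a secondary that never picked up the signed zone is to compare the SOA serial reported by each authoritative server. The sketch below is a hedged Python illustration: it assumes the serials have already been collected (for example with a dig SOA query to each server) and compares them using the serial number arithmetic of RFC 1982. The function names and server names are hypothetical.

```python
def serial_is_newer(candidate, current):
    """True if SOA serial `candidate` is newer than `current` (RFC 1982)."""
    candidate &= 0xFFFFFFFF
    current &= 0xFFFFFFFF
    if candidate == current:
        return False
    # Serials are 32-bit and wrap around; a forward distance of less
    # than 2**31 means the candidate is ahead of the current value.
    diff = (candidate - current) & 0xFFFFFFFF
    return diff < 0x80000000


def stale_secondaries(primary_serial, secondary_serials):
    """Return names of secondaries whose serial is older than the primary's."""
    return sorted(name for name, serial in secondary_serials.items()
                  if serial_is_newer(primary_serial, serial))


# Hypothetical serials gathered from three servers.
print(stale_secondaries(2010020402, {"ns2": 2010020401, "ns3": 2010020402}))
```

If every server reports the same serial and some still return unsigned answers, the serial was probably not incremented at signing time, or the secondary is not configured for DNSSEC.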


If all the name servers for the zone are configured to send DNSSEC responses, check to see if dynamic update is being used. If a name server accepts dynamic update messages, but does not have a ZSK to generate signatures for newly added Resource Records, those RRs could be added unsigned. There could also be an error in the zone file or zone database that causes a signing tool not to consider a particular name part of the zone, and thus not to generate any RRSIGs for the RRsets at that name.


Suggested Fixes:

  1. If the zone allows updates via dynamic update, check to make sure that the primary name server has the appropriate keys to generate signatures for added names. Check the configuration (after re-signing any RRsets that lack signatures) and retry the queries.

  2. Check the zone file to see if any owner names are mis-typed. Note that if they are, they will not be signed, but also may not be returned normally.

  3. Check the configuration of all the authoritative servers to ensure that they are DNSSEC enabled. Many implementations do this by default when a DNSSEC signed zone is loaded, but some (like BIND) do not. To check, make sure the following option is enabled in the named.conf file of each name server:


options {
    ...
    dnssec-enable yes;
};



All I get are SERVFAIL responses when I send DNSSEC queries...

What could be the problem:

SERVFAIL is often a "catch-all" error message meaning it could be one of several potential issues. First, if using 'dig' or some other query tool, try issuing the following:

dig @serverIPaddr zonename SOA +dnssec +cdflag

If a non-error message is returned (i.e. not a SERVFAIL) and DNSSEC RRs are present (like RRSIGs), then the problem is likely that the validating resolver has an old or incorrect trust anchor. For BIND, look in the 'named.conf' file for the 'trusted-keys' statement block and check that it is correct for the given zone. Windows Server 2008 uses an alternative method for installing trust anchors. Also, check the validity of the RRSIGs in the zone. To check, look at the signature expiration and inception timestamps in the RRSIG RRs in the zone file or in responses (using the +cdflag option in 'dig' or similar). For example:

gov. 259200 IN RRSIG SOA 7 1 259200 20100206131703 20100201131703 51998 gov. QhqDbOq8hiOAhwPEElB8ZP3D0T3/aWwk6L79ciXFEZgjbr5TqEXA4QJ/ pPnD/hW71/ccvXLHFchra7I8UnFcTjF/7+59uw85jJaeAAgkDE5D1AsJ YhwnSb6FQo3atEkIkD1MC6CNGdGsYv85aYPNAZ54+B7g5bEVbOEMUm/T 6Jk=

Above, the first timestamp (20100206131703) is the expiration date of the signature and the second (20100201131703) is the inception date. In text they appear as "YYYYMMDDHHMMSS" (in UTC) where YYYY is the year, MM is the month, DD is the day, HH is the hour, MM is the minutes and SS is the seconds. Make sure that the signatures are valid - that is, the current time is AFTER the inception date and BEFORE the expiration date. If the signatures are expired or not yet valid, re-sign and reload the zone and try again. If no validation is being done (that is, no trust anchors are installed), there could be an issue with the server. Admins should check the server logs to see if there was a problem loading the signed zone file or another configuration problem resulting in a general server failure.
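The validity window check can also be done programmatically. Below is a minimal Python sketch, assuming the two timestamps have been copied out of an RRSIG RR as text. The function names are illustrative, and the sketch ignores the serial-number wraparound rules that apply to these fields far in the future.

```python
from datetime import datetime, timezone


def parse_rrsig_time(text):
    """Parse an RRSIG timestamp in YYYYMMDDHHMMSS form (UTC)."""
    parsed = datetime.strptime(text, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def rrsig_status(expiration, inception, now=None):
    """Classify a signature as 'valid', 'expired' or 'not yet valid'."""
    now = now or datetime.now(timezone.utc)
    if now < parse_rrsig_time(inception):
        return "not yet valid"
    if now > parse_rrsig_time(expiration):
        return "expired"
    return "valid"


# Timestamps taken from the gov. RRSIG shown above, checked against a
# fixed example time inside the window.
when = datetime(2010, 2, 4, 12, 0, 0, tzinfo=timezone.utc)
print(rrsig_status("20100206131703", "20100201131703", now=when))
```

Calling rrsig_status without the now argument compares against the current system time, so an incorrect clock on the checking machine (or on the validator) can make valid signatures look expired.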

If that doesn't work, check to ensure DNSSEC is enabled on the server. BIND (and some other implementations) require a specific configuration option to be enabled in order for DNSSEC responses to be sent. In the named.conf file options statement, there should be:


options {
    ...
    dnssec-enable yes;
};


Suggested Fixes:

Check by using dig with the '+cdflag' to see if the error is due to a validation failure. If RRSIGs are seen, check to see if the recursive server's trust anchors are out of date.

If that does not work, check the server logs or configuration file for any potential errors that would cause the server to fail.


I just rolled the ZSK and now nothing works...

What could be the problem:

Was the proper procedure for a ZSK rollover followed? Best practice is to use the pre-publishing method of key rollover for the ZSK. This process is described in RFC 4641 and NIST Special Publication 800-81r1. If done incorrectly, the old (expired) ZSK could still be in caches while the new ZSK is used to generate RRSIGs.


To check if this is the case, send a query to the server with the Checking Disabled (CD) bit set (see the dig command in "All I get are SERVFAIL responses when I send DNSSEC queries..."). If a response is seen, then the old ZSK is still in cache and causing the validation failures.


Suggested fixes:

If a ZSK rollover was done incorrectly, the solution is to partially back out the rollover and sign the zone with BOTH the old (expired) and new ZSK. After reloading the zone (and restarting the server), the administrator should wait for at least the TTL of the zone data before removing (again) the old ZSK, re-signing the zone with the new ZSK only, and reloading the zone. The administrator may wish to add a new pre-published ZSK at this point to make the next ZSK rollover easier and less prone to the same error.


I just rolled the KSK and now nothing works...

What could be the problem:

Was the proper procedure for a KSK rollover followed? Best practice is to use the dual-signature method of key rollover for the KSK. This process is described in RFC 4641 and NIST SP 800-81r1. Rollover of the KSK includes interacting with the parent zone to ensure that the new KSK has a Delegation Signer (DS) RR alongside the delegation information. If this isn't present, the chain of authentication is broken and validation will fail.


The first step is to query for the DS RR in the parent zone (via 'dig' or some other tool) and compare the key tag value in the DS RR with the key tag of the KSK (seen in the RRSIG covering the DNSKEY RRset). If the DS and KSK match, check to see if the validating recursive server has the old KSK installed as a trust anchor. See "All I get are SERVFAIL responses when I send DNSSEC queries..."
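If it is unclear which DNSKEY a given key tag refers to, the tag can be recomputed from the DNSKEY RR itself. The following Python sketch follows the key tag calculation given in RFC 4034, Appendix B. The function name is illustrative, and the short base64 string in the usage line is a dummy value, not a real key.

```python
import base64
import struct


def key_tag(flags, protocol, algorithm, public_key_b64):
    """Compute the key tag of a DNSKEY RR (RFC 4034, Appendix B).

    Not valid for algorithm 1 (RSA/MD5), which uses a different rule.
    """
    # DNSKEY RDATA: 2-byte flags, 1-byte protocol, 1-byte algorithm, key
    rdata = struct.pack("!HBB", flags, protocol, algorithm)
    rdata += base64.b64decode(public_key_b64)
    total = 0
    for index, byte in enumerate(rdata):
        # Even-indexed bytes are the high byte of each 16-bit word.
        total += byte << 8 if index % 2 == 0 else byte
    total += (total >> 16) & 0xFFFF
    return total & 0xFFFF


# Dummy key material ("AQID" decodes to the bytes 1, 2, 3).
print(key_tag(257, 3, 8, "AQID"))
```

The computed value should equal both the first field of the DS RR in the parent zone and the key tag field of the RRSIG that covers the DNSKEY RRset. A mismatch suggests the parent is still publishing a DS for a different key.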


Suggested fixes:

If the KSK rollover was done incorrectly, the solution is to re-sign the zone with BOTH the old KSK and new KSK. Then begin the dual-signature rollover method described in RFC 4641 and NIST SP 800-81r1. Note that this also depends on how the parent zone handles KSK rollovers: Does the registrar want the DS RR or the Keyset uploaded? If the error is due to a stale trust anchor issue, see the suggested fix for All I get are SERVFAIL responses when I send DNSSEC queries...


I migrated to NSEC3 and now nothing works...

What could be the problem:

Migrating to an NSEC3 signed zone from an NSEC signed zone is a challenge as it is more complicated than a “normal” algorithm rollover. It requires a set of steps outlined in NIST SP 800-81r1. It also requires that all the tools involved (signing tools and name servers), as well as the client resolver, be NSEC3 enabled. To narrow down the potentially problematic component, try the following:

  1. Was the proper procedure followed for migrating from NSEC to NSEC3? Check to see if old data in a cache is causing validation failures when the new zone is introduced.

  2. Check to make sure that the name server is running a version that understands how to serve NSEC3 enabled zones. See Some DNSSEC queries work fine, but DNS error messages fail to validate...

  3. Check the validating recursive server to make sure it also understands NSEC3. If it is an older (pre-NSEC3) implementation, it will incorrectly interpret error responses with NSEC3 RR's as bogus.

  4. NSEC3 responses are often larger than responses from the same zone signed using NSEC. It is possible that responses from the new NSEC3 signed zone are just above the MTU limit for a router, firewall or IDS on the network. If the above suggestions did not solve the problem, see "Non-DNSSEC queries seem to work, but DNSSEC queries fail..." for a possible solution.


Suggested fixes:

If the issue is due to number 1 above, the administrator should back out the change and revert to the NSEC signed zone. They should then follow the procedure described in NIST SP 800-81r1. This procedure requires a set of actions that may take some time depending on the TTL of the zone data and the TTL of the DS RR in the parent zone.


If that is not possible, check to ensure that the authoritative servers are NSEC3 enabled and that any client can interpret NSEC3 responses. Tools like dig that do no validation or caching may be more useful than a tool that may do validation but is not NSEC3-aware. Any component that is not NSEC3-aware would need to be upgraded, although it is more important that the authoritative servers are NSEC3-aware.



I get responses and see signatures, but I don't see the AD bit...

What could be the problem:

DNSSEC validators set the AD bit in the DNS response header when (and only when) the validator (or validating recursive server) is able to successfully validate all the RRSIGs in the response. When the validator cannot validate the signatures because it doesn't have a correct key, the AD bit is not set but the response is still returned to the client. This does NOT mean that the response is bogus, just that it could not be validated.
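When checking many names, it can help to pull the header flags out of captured dig output automatically. Below is a small Python sketch under the assumption that dig's usual ";; flags: ...;" header line is present in the captured text. The function name and both sample lines are hypothetical.

```python
import re


def response_flags(dig_output):
    """Return the set of header flags from dig's ';; flags:' line."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    return set(match.group(1).split()) if match else set()


# Hypothetical header lines for a validated and an unvalidated response.
VALIDATED = ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1"
UNVALIDATED = ";; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1"

print("ad" in response_flags(VALIDATED))
print("ad" in response_flags(UNVALIDATED))
```

Signatures in the answer together with a missing "ad" flag match the situation described above, where the validator lacks a usable trust anchor for the zone.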


Suggested fixes:

Technically, this isn't an error, but if validation is required for the zone, the way to address this issue is to check the validator (or recursive server doing validation) to make sure that the appropriate trust anchors are installed. In BIND, the trust anchors are configured in the named.conf file as part of the 'trusted-keys' statement. Microsoft Windows Server uses an Administration interface to install trust anchors. Administrators should ensure that any installed trust anchors remain up to date to avoid validation failures caused by stale keys. Administrators may want to ensure that their validating recursive servers understand the key rollover signalling process specified in RFC 5011, or use a third party tool such as trustman to keep trust anchors up to date.


DNSSEC was working yesterday, but not today...

What could be the problem:

This problem could have several causes. See:



I tried to upload my Keyset file to dotgov.gov, but get “0 keys saved”...

What could be the problem:

The dotgov.gov web portal accepts keyset files (and DNSKEY RR's) of KSK's for signed .gov domains. There is a series of checks the system performs before uploading the key. They are:


If any one of these conditions is not true, the key will likely be rejected.


Suggested fixes:

The only way around this issue is to ensure that the uploaded keyset meets all the criteria listed above. This may include waiting and checking that the updated signed zone is replicated to all the name servers for the zone (so that the DNSKEY RRset shows up at all the name servers). If this is the cause, see "Sometimes I see signatures (RRSIGs) in responses, sometimes I don't..."


Note: Even after the Keyset file is uploaded, the administrator will need to wait for the next regular update of the .gov TLD zone. This happens twice a day (currently) at around 8:00 am and 4:00 pm in the US Eastern time zone.



Questions or comments should be sent to the SNIP admin

NIST is an agency of the U.S. Department of Commerce.
Date created 2/4/2010. Last updated 2/4/2010.
