On 2016-11-24 Trevor Cordes wrote:
Maybe I should now reformulate the crux of my problem as this: Can I configure bind to return for all AAAA requests in the local zone "I'm authoritative but I don't have the answer" instead of SERVFAIL *even if the subzone has been delegated*? Or even specify a delegation for certain records (A & MX) only (not AAAA), though I specifically read somewhere that that's impossible on purpose.
Eureka!! The path you set me out on that led to my above reformulation led me to some other avenues of Google attack. Two ideas in, I found a solution!
First I found this named option: filter-aaaa-on-v4 (and -v6) "It is intended to help the transition from IPv4 to IPv6 by not giving IPv6 addresses to DNS clients unless they have connections to the IPv6 Internet." Super description and chart here: https://kb.isc.org/article/AA-00576/0/Filter-AAAA-option-in-BIND-9-.html
Perfect!! It did indeed filter AAAA when I tested with names like google.com. But it failed for my own problematic sites.
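For anyone wanting to try the same option, here is a minimal sketch of how it could be enabled in named.conf. It assumes a BIND 9 build that actually includes the filter-aaaa options (not every build does), and it is only an illustration, not the exact config used here.

```
options {
    // Assumption: this named build supports the filter-aaaa options.
    // Omit AAAA records from answers to clients that query over IPv4.
    filter-aaaa-on-v4 yes;
};
```

After editing named.conf, named has to reload its configuration before the option takes effect.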
So, I turned on more debugging in named and saw that the external NS responses for the subdomain were giving me: lame-servers: info: FORMERR resolving ... and then my named would give me: query failed (SERVFAIL)
So it wasn't the external NS giving me SERVFAIL, but FORMERR... which then turned into a SERVFAIL from my command's point of view.
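One way to surface messages like the lame-servers line above is to route that logging category to its own file. The fragment below is a sketch under assumptions: the channel name and log path are hypothetical, and this is not necessarily how debugging was turned up in this case.

```
logging {
    // Hypothetical channel name and file path; adjust for your system.
    channel resolver_debug {
        file "/var/log/named/resolver.log";
        severity info;
        print-time yes;
        print-category yes;
        print-severity yes;
    };
    // lame-servers is the category that reported the FORMERR here.
    category lame-servers { resolver_debug; };
    category resolver { resolver_debug; };
};
```

With something like this in place, the upstream FORMERR shows up in the log even though the client only ever sees SERVFAIL.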
After some more searching, armed with new keywords, I found: https://lists.isc.org/pipermail/bind-users/2012-April/087465.html
"The root cause is that the name servers for www.ryanair.com are misconfigured. They are returning answers as if they are configured for ryanair.com (see the SOA record) instead of www.ryanair.com as can be seen below."
Aha! Ding! named was barfing because I had two NS's authoritative for the same domain and one referencing the other. Even though I was only trying to reference the delegated subdomain, named didn't like that arrangement... but not in general, only as it pertained to non-existent records. Weird! (Would have been easier to debug if it didn't work for any records at all!)
Crossing my fingers, I changed the external server to break out the zones (the root zone and the delegated subzone) into two zone files, so now both BOX1 and BOX2 have very similar zone files with regard to the handling of the ".out." subdomain... they both just delegate it to the root zone. It's as if I were running the out subdomain on an entirely separate box from both root NS's. Restart, pray, and it works!
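Roughly, the split on the external server can be sketched in named.conf as two separate zone statements. The domain example.com and the file names are placeholders (the real names are not given in this thread), so treat this as an illustration of the layout rather than the actual config.

```
// Placeholder names: example.com stands in for the real domain.
// Parent zone: its zone file only carries the NS records that
// delegate the "out" subdomain, not the subdomain's own data.
zone "example.com" {
    type master;
    file "example.com.zone";
};

// The delegated subzone is now a zone in its own right, with its
// own SOA and NS records in a separate zone file.
zone "out.example.com" {
    type master;
    file "out.example.com.zone";
};
```

The point of the split is that answers for names under the subdomain now come back with the subzone's own SOA rather than the parent's, which is the mismatch the ryanair.com post describes.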
host problem gone. sshd problem gone. dig results are the same everywhere I attempt the query. So the problem was a misconfig on my part because of this very convoluted example when trying to delegate on "shared but different" domains. I hope if anyone else ever has this problem this thread can help them solve it more easily.
Finally, I guess the filter-aaaa-on-v4 didn't help here because of the nature of the FORMERR. I guess named was trying to tell me something.