DNS not resolving for a particular domain only

DNS not resolving for a particular domain only

Bind-Users forum mailing list
Hi All,

We have been experiencing a weird issue for the past week or two.

We run bind9 on RHEL/CentOS, and one of our international offices, which has its own authoritative and caching servers, cannot resolve lenovo.com for some odd reason. If that office's clients use Google DNS it works, but using their own caching servers it can't resolve. Both dig and nslookup time out, although dig with +trace is able to get to the final answer. Nothing in the logs indicates an issue. Also, this is the only name that can't be resolved; everything else works fine.

We've contacted the ISP to make sure nothing is being blocked, and that's all clear. The network team has confirmed they haven't changed anything on the edge devices or modified any firewall rules that could cause it.

Any ideas please???

_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
[hidden email]
https://lists.isc.org/mailman/listinfo/bind-users

Re: DNS not resolving for a particular domain only

Bind-Users forum mailing list
On 08/11/2017 06:49 AM, U Zee via bind-users wrote:
> Any ideas please???

I'm seeing different A records returned depending on where I query from.

As such I can only speculate that something related to DNS for a CDN is
not working as desired.

I'd suggest a packet capture of the client's DNS traffic and possibly
(if not probably) of the traffic from the client's recursive DNS server
for the same query.



--
Grant. . . .
unix || die


Re: DNS not resolving for a particular domain only

Bind-Users forum mailing list
Thanks for the suggestion Grant.

Here's what I get from a capture on the recursive server (I ran the query on the recursive server itself from another ssh session, so it is the client as well):


# tcpdump -v -v -nt -i eth0 udp port 53|grep lenovo
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
    86.36.AA.BB.45776 > 86.36.AA.CC.domain: [bad udp cksum 8a1b!] 34468+ A? www.lenovo.com. (32)
    86.36.AA.BB.45776 > 86.36.AA.CC.domain: [bad udp cksum 8a1b!] 34468+ A? www.lenovo.com. (32)
    86.36.AA.BB.36143 > 193.108.91.79.domain: [bad udp cksum c63c!] 12966 [1au] A? www.lenovo.com. ar: . OPT UDPsize=4096 OK (43)
    193.108.91.79.domain > 86.36.AA.BB.36143: [udp sum ok] 12966*- q: A? www.lenovo.com. 1/0/1 www.lenovo.com. CNAME cs47.can.lnvcdn.net. ar: . OPT UDPsize=4096 OK (76)
    86.36.AA.BB.45776 > 86.36.AA.CC.domain: [bad udp cksum 8a1b!] 34468+ A? www.lenovo.com. (32)
    86.36.AA.BB.10224 > 86.36.DD.EE.domain: [bad udp cksum 18c7!] 12721 [1au] A? www.lenovo.com.ourdomain.com. ar: . OPT UDPsize=4096 OK (57)
    86.36.DD.EE.domain > 86.36.AA.BB.10224: [udp sum ok] 12721 NXDomain*- q: A? www.lenovo.com.ourdomain.com. 0/1/1 ns: ourdomain.com. SOA master.ourdomain.com. host-master.ourparentdomain.com. 138524105 900 450 3600000 60 ar: . OPT UDPsize=4096 OK (138)
    86.36.AA.CC.domain > 86.36.AA.BB.45776: [udp sum ok] 34468 ServFail q: A? www.lenovo.com. 0/0/0 (32)


86.36.AA.BB = localhost (our recursive server), where I ran the query and the capture
86.36.AA.CC = our secondary recursive server (no idea why that was contacted)
86.36.DD.EE = one of our two anycast addresses, which point to the recursive servers


So it looks like we do get to the CNAME (4th line of the capture) but it still fails...?
I also tried a capture from a regular Linux client; the output was similar except that it didn't include the CNAME line.

Frankly I have no idea if this is giving any useful info. I also saw bad udp cksum messages for other queries, so I'm not sure whether that's an actual problem.

Do you see anything specific that might help us diagnose further?

Thanks


Re: DNS not resolving for a particular domain only

Mark Andrews

In message <[hidden email]>, U Zee via bind-users writes:

> So it looks like we do get to the CNAME (4th line) but still it
> fails...? I also tried a capture from a regular linux client but the
> output was similar except that it didn't include the CNAME line.

Well the next stage is to trace what happens when the recursive
server looks for cs47.can.lnvcdn.net, the target of the CNAME.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: [hidden email]
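
For anyone following along, here is a minimal Python sketch of how the CNAME target could be pulled out of tcpdump's text output, so that it is clear which name to trace next. The function name cname_targets and the regular expression are assumptions made for illustration; they are not part of tcpdump or BIND, and the pattern only covers the "owner CNAME target" form seen in the capture above.

```python
import re

# Matches the "<owner> CNAME <target>" fragment that tcpdump prints when it
# decodes the answer section of a DNS response (form assumed from the capture).
CNAME_RE = re.compile(r"(\S+) CNAME (\S+)")


def cname_targets(capture_lines):
    """Return (owner, target) pairs for every CNAME found in the given lines."""
    pairs = []
    for line in capture_lines:
        for owner, target in CNAME_RE.findall(line):
            # tcpdump separates multiple answers with commas; drop a trailing one.
            pairs.append((owner, target.rstrip(",")))
    return pairs


# Two lines copied from the capture posted earlier in the thread.
sample = [
    "86.36.AA.BB.36143 > 193.108.91.79.domain: [bad udp cksum c63c!] 12966 "
    "[1au] A? www.lenovo.com. ar: . OPT UDPsize=4096 OK (43)",
    "193.108.91.79.domain > 86.36.AA.BB.36143: [udp sum ok] 12966*- q: A? "
    "www.lenovo.com. 1/0/1 www.lenovo.com. CNAME cs47.can.lnvcdn.net. "
    "ar: . OPT UDPsize=4096 OK (76)",
]

for owner, target in cname_targets(sample):
    print(f"{owner} -> {target}")
```

With the two sample lines, the only pair reported is www.lenovo.com. pointing at cs47.can.lnvcdn.net., which is the name whose resolution needs to be followed.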

Re: DNS not resolving for a particular domain only

Bind-Users forum mailing list
Thanks Mark,

Mysteriously, the problem is now gone and I have no idea how; I know that I didn't change anything.

While investigating, I looked but didn't find anything in the packet capture on the recursive server, I think mainly because I had to grep for something; otherwise there was just too much traffic. So it's possible that my grep for lenovo didn't show the related packets... But I will never know now.
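
To illustrate why a grep for "lenovo" alone can hide the interesting packets, here is a small Python sketch that filters capture lines on several name fragments at once. The helper matching_lines is a hypothetical name, and the third sample line is invented for illustration (with documentation addresses from 192.0.2.0/24); no such packet appears in the capture posted in this thread.

```python
def matching_lines(capture_lines, fragments):
    """Keep lines that mention any of the given name fragments (case-insensitive)."""
    lowered = [fragment.lower() for fragment in fragments]
    return [
        line
        for line in capture_lines
        if any(fragment in line.lower() for fragment in lowered)
    ]


capture = [
    # Copied from the capture earlier in the thread.
    "86.36.AA.BB.45776 > 86.36.AA.CC.domain: 34468+ A? www.lenovo.com. (32)",
    "193.108.91.79.domain > 86.36.AA.BB.36143: 12966*- q: A? www.lenovo.com. "
    "1/0/1 www.lenovo.com. CNAME cs47.can.lnvcdn.net. (76)",
    # Hypothetical follow-up query for the CNAME target (not from the thread).
    "192.0.2.10.40000 > 192.0.2.53.domain: 4242 [1au] A? cs47.can.lnvcdn.net. (48)",
]

only_lenovo = matching_lines(capture, ["lenovo"])
both_names = matching_lines(capture, ["lenovo", "lnvcdn"])

print(len(only_lenovo), "lines match 'lenovo' alone")
print(len(both_names), "lines match 'lenovo' or 'lnvcdn'")
```

The follow-up query for cs47.can.lnvcdn.net does not contain the string "lenovo", so it only shows up once "lnvcdn" is added to the filter. Writing the capture to a file with tcpdump -w and reading it back with tcpdump -r also lets the same traffic be filtered several times after the fact.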





Re: DNS not resolving for a particular domain only

Mark Andrews

In message <[hidden email]>, U Zee writes:
> Thanks Mark,
> So mysteriously the problem is now gone and I have no idea how, I know
> that I didn't change anything.
> While investigating, I tried looking but didn't get anything in packet
> capture on the recursive server, I think mainly because I had to grep for
> something otherwise there was just too much traffic. So its possible, my
> grep for lenovo didn't show related packets.... But I will never know now 

A single missing routing entry could have taken the site down.  The
delegation for lnvcdn.net lists only 2 of the 4 nameservers, and
those 2 are in the same /24.  There is a reason it is recommended
that multiple nameservers which are geographically and topologically
dispersed are used, and that both sides of the delegation have
consistent NS and address records.  It reduces the probability that
single faults cause failures like this.

Mark

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7439
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;lnvcdn.net. IN NS

;; AUTHORITY SECTION:
lnvcdn.net. 172800 IN NS ns1.lnvcdn.net.
lnvcdn.net. 172800 IN NS ns2.lnvcdn.net.
A1RT98BS5QGC9NFI51S9HCI47ULJG6JH.net. 86400 IN NSEC3 1 1 0 - A1RUUFFJKCT2Q54P78F8EJGJ8JBK7I8B  NS SOA RRSIG DNSKEY NSEC3PARAM
A1RT98BS5QGC9NFI51S9HCI47ULJG6JH.net. 86400 IN RRSIG NSEC3 8 2 86400 20170825051539 20170818040539 57899 net. ZbkC2I24NO+y91E+sPWOADqbjsVpHfuFhnox5QfeuImFsL2z0x3X+UG6 Lt9emQ23VFesgs8+J1WQVjHHBuhvc1XdWG7jBpv3Tr776oBcSF5rrqMp zC5CjRIzOlojSVpNG3snkW0xfijBuOl51RzaKrSqKb2x/tcXWUWkHpDw ga8=
2K5T76ECDUK1RJEDVHKHNL0LCCENKMES.net. 86400 IN NSEC3 1 1 0 - 2K673TEK531CUGB8J9QHASJNDFOVU87L  NS DS RRSIG
2K5T76ECDUK1RJEDVHKHNL0LCCENKMES.net. 86400 IN RRSIG NSEC3 8 2 86400 20170828050756 20170821035756 57899 net. s905nQwEBRv9cbVzZMWFLfb0Jnq/K+R32MJdnYa9CaPpJCtGIMzWkmPt yl7MKawRlhJE01n4ll4/4Grj3asVi5/LsrGSH7bjO9GkclWqsuxoeepl JrUh/UkZFw5qhnCvw1teWAPcZ6T93DBmq02c8UemFAYRrMO1ugbvHGQo QPw=

;; ADDITIONAL SECTION:
ns1.lnvcdn.net. 172800 IN A 192.16.0.5
ns2.lnvcdn.net. 172800 IN A 192.16.0.6

;; Query time: 257 msec
;; SERVER: 2001:503:d414::30#53(2001:503:d414::30)
;; WHEN: Tue Aug 22 09:45:04 AEST 2017
;; MSG SIZE  rcvd: 592

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22513
;; flags: qr aa; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;lnvcdn.net. IN NS

;; ANSWER SECTION:
lnvcdn.net. 3600 IN NS ns1.lnvcdn.net.
lnvcdn.net. 3600 IN NS ns3.lnvcdn.net.
lnvcdn.net. 3600 IN NS ns4.lnvcdn.net.
lnvcdn.net. 3600 IN NS ns2.lnvcdn.net.

;; ADDITIONAL SECTION:
ns1.lnvcdn.net. 3600 IN A 192.16.0.5
ns2.lnvcdn.net. 3600 IN A 192.16.0.6
ns3.lnvcdn.net. 3600 IN A 198.7.30.5
ns4.lnvcdn.net. 3600 IN A 198.7.30.6

;; Query time: 12 msec
;; SERVER: 192.16.0.5#53(192.16.0.5)
;; WHEN: Tue Aug 22 09:45:04 AEST 2017
;; MSG SIZE  rcvd: 175

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: [hidden email]
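
The two checks described in the message above (NS sets matching on both sides of the delegation, and nameserver addresses spread over more than one /24) can be sketched in Python roughly as follows. The names and addresses are copied from the two dig responses in that message; the helper names missing_from_parent and distinct_slash24s are assumptions made for this illustration, and the /24 comparison is only a crude stand-in for real topological diversity.

```python
import ipaddress

# NS names and glue addresses from the .net delegation response.
parent_ns = {
    "ns1.lnvcdn.net.": "192.16.0.5",
    "ns2.lnvcdn.net.": "192.16.0.6",
}

# NS names and addresses from the authoritative answer given by 192.16.0.5.
child_ns = {
    "ns1.lnvcdn.net.": "192.16.0.5",
    "ns2.lnvcdn.net.": "192.16.0.6",
    "ns3.lnvcdn.net.": "198.7.30.5",
    "ns4.lnvcdn.net.": "198.7.30.6",
}


def missing_from_parent(parent, child):
    """Nameservers the zone itself lists but the delegation does not."""
    return sorted(set(child) - set(parent))


def distinct_slash24s(addresses):
    """The set of /24 networks that the given IPv4 addresses fall into."""
    return {
        ipaddress.ip_network(f"{address}/24", strict=False)
        for address in addresses
    }


print("missing from delegation:", missing_from_parent(parent_ns, child_ns))
print("/24s seen via delegation:", len(distinct_slash24s(parent_ns.values())))
print("/24s seen via zone data:", len(distinct_slash24s(child_ns.values())))
```

On this data, ns3 and ns4 are reported as missing from the delegation, and the two delegated servers share a single /24, so a resolver starting from a cold cache has only that one network to reach.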
