Logs full of "timed out resolving" entries

Scott Gasch
I have a BIND 9.16.3 server running as master for a couple of zones and forwarding all other queries to my ISP's resolvers (and 8.8.8.8 / 8.8.4.4, etc.).  It "works" OK, but I notice odd delays when browsing the web from clients: the browser says "Resolving host...", hangs noticeably for a few seconds, then loads the whole page.

In the server logs, I see lots of messages like these:

Jun 16 17:21:04 <daemon.info> wannabe named[6982]: timed out resolving 'trafficmanager.net/DS/IN': 8.8.4.4#53
Jun 16 17:21:05 <daemon.info> wannabe named[6982]: timed out resolving 'apiv4-east.myq-cloud.com/AAAA/IN': 8.8.4.4#53

However, when I run dig from the command line, I can't replicate this.  The query comes back quickly and successfully:
# dig @8.8.4.4 apiv4-east.myq-cloud.com
; <<>> DiG 9.16.3 <<>> @8.8.4.4 apiv4-east.myq-cloud.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52825
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;apiv4-east.myq-cloud.com.      IN      A
;; ANSWER SECTION:
apiv4-east.myq-cloud.com. 4     IN      A       40.71.236.219
;; Query time: 42 msec
;; SERVER: 8.8.4.4#53(8.8.4.4)
;; WHEN: Tue Jun 16 17:22:17 PDT 2020
;; MSG SIZE  rcvd: 69

It almost feels like something between me and the upstream nameserver is dropping packets when many queries are in flight at the same time.  I've looked at my router logs, and the router doesn't seem to be the culprit.  When I run tcpdump, though, I definitely see queries that never get a response and are retried against another DNS server.
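One way to test the "many queries in flight" theory outside of named is to fire a burst of queries at the forwarder from a script and count how many replies come back. The following is only a minimal sketch, not anything named does internally: the helper names `build_query` and `burst` are made up for illustration, the queries carry no EDNS/DNSSEC options (unlike what a validating named would send), and the default type is AAAA because that is one of the types in the log lines above, whereas the dig test asked for A.

```python
import random
import socket
import struct
import time

# Numeric RR type codes for the query types seen in the log and dig output.
QTYPE = {"A": 1, "AAAA": 28, "DS": 43}


def build_query(name, qtype, txid):
    """Encode a minimal DNS query: RD flag set, one question, no EDNS."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE and QCLASS (1 = IN).
    return header + labels + b"\x00" + struct.pack("!HH", QTYPE[qtype], 1)


def burst(server, names, qtype="AAAA", timeout=2.0):
    """Send one UDP query per name back to back; return (sent, answered)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Unique transaction IDs so each reply can be matched to its query.
        txids = random.sample(range(65536), len(names))
        pending = set()
        for name, txid in zip(names, txids):
            sock.sendto(build_query(name, qtype, txid), (server, 53))
            pending.add(txid)

        deadline = time.monotonic() + timeout
        while pending:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _ = sock.recvfrom(4096)
            except socket.timeout:
                break
            if len(data) >= 2:
                # The first two bytes of a reply echo the transaction ID.
                pending.discard(struct.unpack("!H", data[:2])[0])
        return len(names), len(names) - len(pending)
    finally:
        sock.close()
```

Calling, for example, `burst("8.8.4.4", ["apiv4-east.myq-cloud.com"] * 50)` and comparing the two returned numbers for small and large bursts would show whether replies go missing only under load. A gap between them points at the path to the forwarder rather than at named itself.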

Can anyone advise me how to troubleshoot this further, please?

Thx,
Scott




_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.


bind-users mailing list
[hidden email]
https://lists.isc.org/mailman/listinfo/bind-users

Re: Logs full of "timed out resolving" entries

Scott Gasch
Replying to myself with more information... I've checked my local router and my own machine's packet filter.  Neither seems to be dropping packets.  And yet, when I enable statistics in named.conf and surf the web a bit, I see counters like these:

Queryv4 16962
ClientCookieOut 16670
QueryTimeout 11814   <===
Retry 11721
Responsev4 5161
QryRTT100 4808
ValAttempt 4107
ValOk 3136
ValNegOk 931
Truncated 588
GlueFetchv4 387
QryRTT500 300
NXDOMAIN 152
BucketSize 128
ServerCookieOut 81
GlueFetchv4Fail 56
CookieIn 34
CookieClientOk 34
QryRTT1600 31
QryRTT1600+ 11
QryRTT800 10
ValFail 5
NumFetch 2
QryRTT10 1
Priming 1
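To put the flagged counter in proportion, the numbers above can be turned into ratios. This is a small sketch that assumes the "name value" layout shown in this post (not the raw named.stats layout); the function name `parse_counters` is made up for illustration, and only a few of the counters are copied in.

```python
# A few counters copied from the dump above, in the same "name value" layout.
STATS = """\
Queryv4 16962
QueryTimeout 11814   <===
Retry 11721
Responsev4 5161
Truncated 588
"""


def parse_counters(text):
    """Map counter name to integer value, ignoring trailing annotations."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


counters = parse_counters(STATS)
timeout_ratio = counters["QueryTimeout"] / counters["Queryv4"]
response_ratio = counters["Responsev4"] / counters["Queryv4"]

print(f"timed out: {timeout_ratio:.1%} of IPv4 queries sent")
print(f"answered:  {response_ratio:.1%} of IPv4 queries sent")
```

With these figures, roughly 70% of outgoing IPv4 queries timed out and only about 30% drew a response, so the timeouts are the norm here rather than an occasional glitch.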

I tried raising the resolver query timeout in the config to 20s, to no avail.  I tried a config file very close to the sample named.conf.  I tried setting up a new jail to run the nameserver.  None of this fixes the problem.  Does anyone have any advice?

Thx,
Scott


