Dig Hangs during axfr request when not on localhost.

Bind-Users forum mailing list
Hi,

Versions:
BIND 9.9.4-RedHat-9.9.4-74.el7_6.1 (Extended Support Version)
CentOS Linux release 7.6.1810 (Core)

We are having a problem where large zone files (around 5MB) on our masters
are failing to load on our slaves.

After some investigation, we found that the following command, run locally on the master,

dig @localhost ZONE axfr

completes and exits successfully.

However, if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same subnet,
the zone starts to transfer, then hangs at around 150k bytes (give or take) and fails to complete.

Any idea what I can look into?

Smaller zones are transferring OK.

Thanks for your help



_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
[hidden email]
https://lists.isc.org/mailman/listinfo/bind-users

Re: Dig Hangs during axfr request when not on localhost.

Anand Buddhdev
On 14/06/2019 09:53, Pete Fry via bind-users wrote:

Hi Pete,

> however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
> subnet
> the zone starts to transfer and then hangs at certain points around 150k
> bytes give or take and fails to complete.
>
> any idea on what i can look into?
>
> smaller zones are transferring all OK

I would immediately suspect something on your network. Packet loss,
mismatched MTU, etc.

If I were you, I would run tcpdump on both master and slave and then
attempt a zone transfer, and examine that packet trace. See what's going
on. Are there TCP retransmits? Which side is stalling? The sender or
receiver? What, if anything, do you see in the log files of your BIND on
both the master and slave?
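
A rough sketch of the capture and MTU check suggested above. The interface name, address and zone are placeholders, the helper function names are made up for illustration, and the ping flags assume Linux iputils:

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own interface, master address and zone.
IFACE="eth0"
MASTER_IP="192.0.2.1"
ZONE="example.com"

# Capture filter matching only DNS-over-TCP traffic to or from one host.
axfr_filter() {
    printf 'host %s and tcp port 53' "$1"
}

# Run as root on BOTH master and slave; stop with Ctrl-C once dig has stalled.
# -s 0 keeps whole packets, -w writes a pcap file for later inspection.
capture_axfr() {
    tcpdump -i "$IFACE" -s 0 -w "/tmp/axfr-$(hostname -s).pcap" "$(axfr_filter "$1")"
}

# Largest ICMP payload that fits in one IPv4 packet of the given MTU
# (20-byte IPv4 header plus 8-byte ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Ping with "don't fragment" set; failures at the expected MTU
# would point towards an MTU mismatch on the path.
check_mtu() {
    ping -M do -c 3 -s "$(icmp_payload_for_mtu "$2")" "$1"
}

# Typical use, on the slave:
#   capture_axfr "$MASTER_IP"          (in one terminal, as root)
#   dig @"$MASTER_IP" "$ZONE" axfr     (in another terminal)
#   check_mtu "$MASTER_IP" 1500
```

Comparing the two pcap files side by side should show which end stops sending and whether segments are being retransmitted.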

Anand

Re: Dig Hangs during axfr request when not on localhost.

Bind-Users forum mailing list
Interestingly, we have the same problem on our dev box (running the same versions).

I decided to install the ISC BIND packages from (https://copr.fedorainfracloud.org/coprs/isc/bind/).

That gave me 9.14.2; I repeated the tests and the transfer works. However, our config will need work before it loads without errors, and as we generally deploy via Puppet, some rework will be required.

We generally use the Red Hat approved BIND for support reasons.

If it were a network issue, just upgrading BIND shouldn't affect it, should it?

Pete



Re: Dig Hangs during axfr request when not on localhost.

Ray Bellis


On 14/06/2019 09:38, Pete Fry via bind-users wrote:

> Interestingly, we have the same problem on our dev box (running the
> same versions)
>
> I took the decision to install the ISC-BIND following
> (https://copr.fedorainfracloud.org/coprs/isc/bind/)
>
> running 9.14.2 and repeated the tests and it works, however the config
> will need work to have no errors and as we generally deploy via puppet
> rework will be required.
>
> We generally use the REDHAT approved bind for support reasons.
>
> If it were a network issue, just upgrading BIND shouldn't affect it, should it?

Somewhere around BIND 9.11 the default size of AXFR messages was reduced
from the maximum of 65535 bytes down to 16384 because that allows for
optimal DNS message compression.

I also suspect a network level issue such as MTU, but it's feasible that
the above change may be allowing the packets to slip through.
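
For testing this theory: in BIND 9.11 and later the limit is exposed as the transfer-message-size option in named.conf (the option does not exist in 9.9.4). A minimal sketch, assuming a newer test server such as the 9.14.2 box; the function names, the file path you pass in and the value 65535 are illustrative choices, not taken from this thread:

```shell
#!/bin/sh
# Write a standalone test fragment to the path given as $1.
# 65535 restores the old maximum so the pre-9.11 behaviour can be compared.
write_fragment() {
    cat > "$1" <<'EOF'
options {
    // Upper bound on the uncompressed size of AXFR messages (BIND 9.11+).
    transfer-message-size 65535;
};
EOF
}

# named-checkconf exits non-zero if the installed BIND rejects the option.
check_fragment() {
    named-checkconf "$1"
}

# Typical use:
#   write_fragment /tmp/transfer-size-test.conf
#   check_fragment /tmp/transfer-size-test.conf
```

If the transfer from the newer server starts hanging again with the larger message size, that would support the network-level explanation.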

kind regards,

Ray


Re: Dig Hangs during axfr request when not on localhost.

Bind-Users forum mailing list
In reply to this post by Bind-Users forum mailing list
Interesting. I don't suppose you know where the default AXFR message size can be set, so I can do some testing?

Re: Dig Hangs during axfr request when not on localhost.

Bind-Users forum mailing list
In reply to this post by Bind-Users forum mailing list
Would the setting in 10.12.2.1 here, the data segment size limit, be that default?

https://docstore.mik.ua/orelly/networking_2ndEd/dns/ch10_12.htm#dns4-CHP-10-SECT-12.1.6.html

Re: Dig Hangs during axfr request when not on localhost.

John Horne
In reply to this post by Bind-Users forum mailing list
On Fri, 2019-06-14 at 08:53 +0100, Pete Fry via bind-users wrote:

> Hi
>
> versions:
> BIND 9.9.4-RedHat-9.9.4-74.el7_6.1 (Extended Support Version)
> CentOS Linux release 7.6.1810 (Core)
>
> We are having a problem on our masters that have large zone files (around
> 5MB) are failing to be loaded on our slaves.
>
> after some investigation
>
> we can perform the following commands whilst local on the master
>
> dig @localhost ZONE axfr
>
> and the command performs and exits successfully
>
> however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
> subnet the zone starts to transfer and then hangs at certain points around
> 150k bytes give or take and fails to complete.
>
Hello,

We have had the same problem on CentOS 7 servers after a recent bind yum
update. For the moment we have downgraded BIND back to
bind-9.9.4-73.el7_6.x86_64 and the zone transfers are working again.



John.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK

This email and any files with it are confidential and intended solely for the use of the recipient to whom it is addressed. If you are not the intended recipient then copying, distribution or other use of the information contained is strictly prohibited and you should not rely on it. If you have received this email in error please let the sender know immediately and delete it from your system(s). Internet emails are not necessarily secure. While we take every care, University of Plymouth accepts no responsibility for viruses and it is your responsibility to scan emails and their attachments. University of Plymouth does not accept responsibility for any changes made after it was sent. Nothing in this email or its attachments constitutes an order for goods or services unless accompanied by an official order form.

Re: Dig Hangs during axfr request when not on localhost.

Bind-Users forum mailing list
>
> We have had the same problem on CentOS 7 servers after a recent bind yum
> update. For the moment we have downgraded BIND back to
> bind-9.9.4-73.el7_6.x86_64 and the zone transfers are working again.
>

John

Many thanks for this; can't believe we didn't try it first!

Thanks again

Pete

Re: Dig Hangs during axfr request when not on localhost.

John Horne
In reply to this post by John Horne
On Fri, 2019-06-14 at 10:05 +0000, John Horne wrote:

> Hello,
>
> We have had the same problem on CentOS 7 servers after a recent bind yum
> update. For the moment we have downgraded BIND back to
> bind-9.9.4-73.el7_6.x86_64 and the zone transfers are working again.
>
Hi,

Looking a bit further into this, as far as I can tell the only difference
between version '9.9.4-73' and '9.9.4-74' is a fix for CVE-2018-5743 which
relates to TCP clients.

It's a bit confusing as we do set a limit for the TCP clients to 250. However,
the server sending the zone and the client requesting it are both lightly
loaded, and rndc shows that we are nowhere near the TCP limit on either server.

Also confusing, for me at least, is that we do log zone transfers, but usually
at a channel severity of 'info'. I changed this to dynamic, and controlled the
debug level using rndc. If I set the debug level to 9 then the transfer works.
Anything less than 9 and it fails.

It seems that the TCP connection is being lost for some reason, as the log file
(at a low debug level) shows the AXFR starting a few times. Each start
corresponds to when the transfer seems to hang. On the client side it shows the
connection as having timed out.
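
The debug-level experiment described above could be repeated with something like the following sketch. The function names are made up for illustration, and the parsing assumes that rndc status prints a line of the form "tcp clients: USED/LIMIT":

```shell
#!/bin/sh
# Raise named's debug level to 9. This only changes what is logged by
# channels whose severity is set to "dynamic" in named.conf.
debug_on() {
    rndc trace 9
}

# Drop the debug level back to 0.
debug_off() {
    rndc notrace
}

# Read rndc status output on stdin and print "USED LIMIT" for TCP clients.
tcp_usage() {
    sed -n 's/^tcp clients: *\([0-9][0-9]*\)\/\([0-9][0-9]*\).*/\1 \2/p'
}

# Show how close the server is to its tcp-clients limit.
tcp_clients() {
    rndc status | tcp_usage
}

# Typical use, on the master:
#   tcp_clients      (check usage against the limit, e.g. 250)
#   debug_on         (then retry the transfer from the slave)
#   debug_off        (then retry again and compare)
```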




John.


Re: Dig Hangs during axfr request when not on localhost.

Bind-Users forum mailing list
John,

We are seeing exactly the same. I've raised

https://bugs.centos.org/view.php?id=16183

Do you want to add your information to it, so that hopefully this can be fixed?

Pete