JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-13 6:24 pm
Sonicwall NSA 240 - Intermittent DNS Issues
We have the NSA 240 configured with 2 general-use connections on failover and a T1 for web services etc., so the primary DNS server from each WAN interface is forwarded to computers on the network. The order is such that a computer's primary DNS corresponds to the connection that it should primarily use. Over the last couple of weeks all computers on the network have begun experiencing extremely slow WAN performance. After some diagnostics it looks like the root cause is random "DNS server not responding" errors on the LAN computers. Testing with direct connections to any of the modems verifies that the slow speeds and DNS issues only occur behind the NSA 240. Additionally, the problem still exists when alternative DNS servers are substituted, such as Google's public DNS servers. So far I've checked firewall rules, NAT policies, the connection MTU, all sorts of things, and can't figure out what is wrong. |
|
|
Which DNS servers are you using?
You mentioned as well "primary DNS server from each WAN interface." I'm presuming that each WAN interface is connected to a different ISP who's feeding you different DNS servers.
About the only thing I can think of at this point is setting up Wireshark with a filter for ONLY DNS traffic, and seeing if you can capture one of these 'failures' to tell whether it's getting mangled, the DNS server is not responding, or what.
My 00000010 bits.
Regards |
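For anyone following along, here is a minimal Python sketch of a complementary check to the Wireshark capture suggested above: send a single DNS query straight to a given server and record whether it answered, refused, or timed out. This is only an illustration, not a SonicWALL tool. The function names (`build_query`, `rcode_name`, `probe`) and the 2-second timeout are assumptions made for the example.

```python
import socket
import struct
import time

# Standard DNS response codes from the low 4 bits of the header flags.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (recursion desired)."""
    # Header: ID, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def rcode_name(response):
    """Return the response code of a raw DNS reply as a readable string."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, "RCODE%d" % code)


def probe(server, name, timeout=2.0):
    """Send one UDP query to server:53 and return (outcome, elapsed_seconds)."""
    query = build_query(name)
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            response, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT", time.monotonic() - start
        except OSError:
            return "ERROR", time.monotonic() - start
    return rcode_name(response), time.monotonic() - start
```

Calling something like `probe("8.8.8.8", "example.com")` in a loop from a LAN machine, and again from a machine plugged directly into a modem, would show whether timeouts or refusals only appear behind the firewall.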
|
|
to JF05
"so the the primary dns server from each WAN interface is forwarded to computers on the network."
none of my computers on the network have WAN DNS servers assigned to their NICs.
do you have a DNS server in your network? |
|
JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-14 11:19 am
Yes, what I meant by that is the primary DNS server given by each different ISP. The primary DNS server from each of those connections is given out by the NSA 240 to computers on the LAN. Each computer then has its DNS servers in the order below:
x.x.x.x (Primary DNS provided by cable company)
x.x.x.x (Primary DNS provided by DSL company)
x.x.x.x (Primary DNS provided by T1 company)
I have also tried setting the DNS to Google's public DNS servers for testing, and the same issue remains. In that case the DNS servers on the test computer are:
8.8.8.8 8.8.4.4
And finally: no, we don't have a DNS server on our network, but if that is a solution to the issue, it is something we can create. |
|
|
tomdlgns
Premium Member
2011-Jun-14 11:41 am
so it looks like the sonicwall is doing DHCP....
why not use the sonicwall address for the DNS servers on the network?
my NIC settings in the locations where i dont have a DNS server in place and use the sonicwall for DHCP:
ip address- 192.168.1.100
sub- 255.255.255.0
gate- 192.168.1.1
DNS- 192.168.1.1
DNS2- 8.8.8.8
sonicwall ip is 192.168.1.1. dhcp range typically starts at .100 and ends at .150
since i only have one sonicwall device, i use the free, public DNS server as a backup. but this would only come into play if the DNS servers provided by the ISP (hard coded in the sonicwall) are down or acting up. then it uses the secondary DNS i provided. |
|
|
to JF05
I seem to remember there was a bug with DNS on Sonicwalls specifically.
This issue may be exacerbated by the fact that many DNS replies today are over 512 bytes (DNSSEC).
Have you tried upgrading the firmware on this unit? |
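A small, hedged Python sketch of how one might check a captured reply against the 512-byte point raised above. It reports the size of a raw DNS response and whether the TC (truncated) bit is set in the header. The helper name `describe_reply` is made up for this example, and the raw bytes are assumed to come from your own capture or probe.

```python
import struct

# Classic maximum size of a DNS message over UDP without EDNS0.
CLASSIC_UDP_LIMIT = 512


def describe_reply(response):
    """Summarise the size and truncation flag of a raw DNS response."""
    flags = struct.unpack("!H", response[2:4])[0]
    return {
        "size": len(response),
        "truncated": bool(flags & 0x0200),  # TC bit in the header flags
        "over_512": len(response) > CLASSIC_UDP_LIMIT,
    }
```

If large replies consistently go missing behind the firewall while small ones arrive, that would point towards the kind of size-related handling problem mentioned above. This sketch alone cannot confirm a firmware bug.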
|
JF05 @comcastbusiness.net |
JF05 to tomdlgns
Anon
2011-Jun-14 12:33 pm
to tomdlgns
Changing to the Sonicwall's IP with the public DNS server as the backup only results in the computer immediately switching to the public DNS server (8.8.8.8), because it doesn't get a response from the Sonicwall.
Looking around in wireshark it looks like problems specifically with DNS only arise when the computer attempts the wrong DNS server for the internet connection. In which case the server responds that it is not an authority on the domain and the query is refused. So in general it has to be something else that is causing the congestion which also seems to affect DNS queries intermittently... Other things that have been becoming more apparent are extreme slowness inside the network when dealing with computer names, etc.
My only guess is that some piece of hardware is on its way out and is congesting the network, or something is improperly configured and causing general network slowness. Last time we had a switch go out, it began flooding the network with ARP, but that doesn't seem to be the issue right now... |
|
|
tomdlgns
Premium Member
2011-Jun-14 1:14 pm
can you give us the output of ipconfig /all on a few workstations?
if all of the NIC cards DNS settings are pointing to the outside (ISP or public DNS), then how is internal DNS supposed to work?
something on your network should be designated as the DNS server/role/function. |
|
JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-14 5:54 pm
It looks like the sonicwall can only forward DNS, so it would appear that yes, our network was set up without an internal dns server. The only thing I cannot explain is why the network worked without one for so long. |
|
|
tomdlgns
Premium Member
2011-Jun-14 9:56 pm
is sonicwall your DHCP server?
i know i asked above, but lets keep this simple and start over.
if the sonicwall is doing DHCP, you can go in the scope settings and specify the DNS servers you want to use/what the sonicwall pushes out to the LAN.
use the same ip that you are using as the gateway. both gateway and DNS servers should be the inside address of the sonicwall.
i am assuming it is something like 192.168.1.1...10.10.10.1....something like that.
it has been working all this time because DNS is doing its job. you are using an external DNS server to get to your websites. but if you try to do stuff internally, i dont see how your network can match, lets say...server1 with an IP address on your network. since there is nothing telling it what the IP is.
i dont know that this will solve your problem, but i highly recommend that you change your scope options on the sonicwall DHCP settings and get that working properly before moving onto the next issue, assuming there is still an issue.
i'd also recommend doing ipconfig /flushdns on all the machines on the network after they have gotten the new DHCP info from the sonicwall. |
|
|
to JF05
said by JF05 :
Looking around in wireshark it looks like problems specifically with DNS only arise when the computer attempts the wrong DNS server for the internet connection. In which case the server responds that it is not an authority on the domain and the query is refused. So in general it has to be something else that is causing the congestion which also seems to affect DNS queries intermittently... Other things that have been becoming more apparent are extreme slowness inside the network when dealing with computer names, etc.
Do you have the wireshark trace around for review? I'm still trying to wrap my head around this "not an authority on the domain and the query is refused" bit, which doesn't quite make sense to me. Is this all 3 DNS servers or just one?
Also trying to wrap my head around your setup again, so the Sonicwall's got one connection to the cable, one to the DSL, and one to a T1, and the rest of the interfaces are your LAN, correct?
One thing I'd try and correlate: when you get the "not an authority" error from the DNS server(s) in question, can you at least still ping it? Authority errors, as I understand them from a DNS perspective, have to do with the type of response, not with no response being received.
Regards |
|
JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-15 3:50 pm
So, figuring out the authority issue: it seems to happen only when the connection is in failover. In that case the internet connection in use does not correspond to the primary DNS server that the computers are still stuck with. The computer waits the 2 seconds and goes down the list until the DNS server associated with the active connection accepts the query. This part seems like just the way it is with the allowed configuration in the sonicwall. Pinging any of the DNS servers works fine all the time; they just refuse the query when it comes from another company's connection.
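To put rough numbers on the behaviour described above, here is a minimal Python sketch. It is a simplified model, not how any particular resolver is implemented: it assumes every server that will not answer for the active connection costs one full timeout (the 2 seconds mentioned above) before the client moves to the next entry. The function name and the `cable`/`dsl`/`t1` labels are placeholders.

```python
def resolution_delay(dns_order, answering, timeout=2.0):
    """Model one lookup against an ordered DNS list.

    dns_order: servers in the order the client tries them.
    answering: set of servers that accept queries on the active WAN.
    Returns (server_used, seconds_spent_waiting); server_used is None
    when no server in the list answers.
    """
    waited = 0.0
    for server in dns_order:
        if server in answering:
            return server, waited
        waited += timeout  # assumed: a full timeout per unusable server
    return None, waited
```

With the order cable, DSL, T1 and the DSL line active, `resolution_delay(["cable", "dsl", "t1"], {"dsl"})` gives `("dsl", 2.0)` under these assumptions, so each uncached lookup during failover would start with a 2-second stall.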
The connections on the sonicwall are as follows:
X0 - LAN - 192.168.1.1 - DATA LAN
X1 - WAN - 6*.***.***.*4 - T1 WAN
X2 - LAN - 192.168.5.5 - VOICE LAN
X3 - WAN - 1**.***.***.*7 - CABLE WAN
X4 - WAN - 1**.***.***.*2 - DSL WAN
X5 - X8 - Unassigned
For internal DNS, my best guess is we have nothing. If I try an nslookup of anything internal it will fail. However, through some protocol we must be getting server names associated with IPs, seeing as you can ping and use the servers by either name or IP. By name it seems to be slower at times, but for some reason it still works.
Additionally, the issues with internet connection speed, DNS and so forth all seemed to temporarily clear up after rebooting all major network devices (routers, modems and switches) after-hours. However, today the problem has returned. This leads me to think something on the network (likely someone's computer?) is causing the issues.
As for wireshark, it would appear nothing is flooding the network with a particular type of protocol. Occasionally I see a TCP Dup ACK or retransmission. If there is anything else to look for I can definitely do so. So far though it would seem that the issues with DNS and connection slowness are being caused by a clog or failure somewhere else. |
|
|
to JF05
said by JF05 :
So figuring out the authority issue, it seems to happen only when the connection is in failover.
Is there any indication of a connection failover when this problem occurs? How's routing or failover set up between the three connections you have here? Dynamic? Static? Load balanced?
said by JF05 :
However through some protocol we must be getting server names associated with IPs seeing as you can ping and use the servers by both name or IP. By name it seems to be slower at times, but for some reason it still works.
Likely the DNS cache. IIRC, Windows default is to hold for 24 hours, or till you run an ipconfig /flushdns. I don't know what other OSes' defaults are.
said by JF05 :
Additionally, the issues with internet connection speed, DNS and so forth all seemed to temporarily clear up after rebooting all major network devices (routers, modems and switches) after-hours. However today the problem has returned. This leads me to think something on the network (likely someone's computer?) is causing the issues.
Do you have any sort of monitoring / trending in place? The thing that peeves me the most about networks is that it's something everyone can point to and say it's the problem, but never have to provide any evidence to back it up. At the most basic level, interface duplex / stats, pings, traceroutes, and DEFINITELY some sort of solid baselining need to be established to know what 'normal' is before saying it's not normal.
As for the rebooting of your network devices, unless it's a POS device running POS code, rebooting more often masks the problem than solves anything long-term.
If you're leaning towards DNS failure, look into a DNS response baseliner. GRC DNS Benchmark is one to try out.
Regards |
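As a rough illustration of the baselining idea above, here is a minimal Python sketch. It assumes you have already collected latency samples in milliseconds (ping times, DNS response times, whatever you trend) during a period you consider normal. The function names and the 3-standard-deviation threshold are arbitrary choices for the example, not a recommendation from any vendor.

```python
import statistics


def build_baseline(samples_ms):
    """Summarise 'normal' from latency samples taken while things work."""
    return {
        "mean": statistics.mean(samples_ms),
        "stdev": statistics.stdev(samples_ms),
    }


def is_abnormal(value_ms, baseline, k=3.0):
    """Flag a measurement more than k standard deviations from the mean."""
    return abs(value_ms - baseline["mean"]) > k * baseline["stdev"]
```

For example, with a baseline built from `[10, 12, 11, 9, 13]` ms, a 250 ms reading is flagged and a 12 ms reading is not. The point is only that "slow" gets compared against recorded numbers instead of impressions.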
|
JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-16 11:31 am
said by HELLFIRE:
So figuring out the authority issue, it seems to happen only when the connection is in failover. So there any indication of a connection failover when this problem occurs? How's routing or failover setup between the three connections you have here? Dynamic? Static? Load Balanced?
It is a "Basic Failover" where the interfaces are ordered (X3 first, then X4). It probes every 5 seconds and when the connection on X3 fails, switches to X4. The X1 (T1) is statically routed for specific purposes such as web hosting, static vpn etc and is not used for internet access for the general LAN.
That being said all 3 connections exhibit the same slowness and ping response times to anything connected to the router and behind. Essentially all that seems certain is something is causing general network turmoil. In addition to all the internet connections being extremely slow, file transfers to local servers are now running in the kbps, etc.
Trying out the DNS benchmark, it shows that DNS is largely working fine at least for external name resolution. |
|
|
tomdlgns
Premium Member
2011-Jun-16 4:47 pm
JF05- can you please give me the DHCP Scope settings?
i still think there is an issue you need to resolve, first.
lets say you are using ISP A's IPs in your DHCP scope for the first connection...when that connection drops, DHCP isnt reassigned right away, there is an active lease that is still in effect. so while your sonicwall has failed over to ISP B, the DHCP scope is still using some of ISP A's IPs.
can you also include an ipconfig /all from one of the workstations that is part of the DHCP scope?
what is the ip of your sonicwall? |
|
JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-17 12:22 pm
The Sonicwall is 192.168.1.1; the DHCP scope is attached. When failover occurs, all it does is switch the WAN routing. Checking your computer's external IP will show the default IP of the connection that the router is using. So usually it says IP A - Comcast Business, then if that fails it switches to IP B - DSL. |
|
|
tomdlgns
Premium Member
2011-Jun-17 5:02 pm
those are the active leases. click on the scope options icon and you will see the information i am looking for.
IP, sub, gateway. and on another tab, i think you will see DNS info. on my sonicwall, pro2040, DNS is on its own tab. |
|
tomdlgns |
to JF05
6_.__.__._8 2_.__.__._5 6_.__.__._6
this is what i was referring to in your DHCP scope. i would set this address as 192.168.1.1 and for the second DNS address, i would use a 3rd party address....
8.8.8.8
that way, regardless of whichever connection you are using, ISP A or ISP B, you are using a 3rd party DNS server that doesnt belong to either.
I would not want my computer using ISP A if ISP A is down. |
|
JF05 @comcastbusiness.net |
JF05
Anon
2011-Jun-21 5:28 pm
After trying it out again, using 192.168.1.1 as a DNS server doesn't work. I'm pretty sure the NSA 240 does not have any sort of DNS server capability and can only forward the DNS server address when assigning DHCP. Adding that IP to the DNS list only results in "DNS request timed out".
Using public DNS servers does ensure that regardless of the active WAN connection, the computer is allowed to resolve an address.
Ultimately it would appear that the issue might not be caused directly by DNS. The symptoms remain extremely slow file transfers within the network, as well as dismal performance with any of the WANs. Occasionally a computer seems to get full-speed service, but this happens rarely and at random. Next plan of action is to just go through all the simple possible causes and hopefully rule some more out... |
|