sangsi
join:2010-03-10

sangsi

Member

Weird behavior with USG 50 WAN failover

I have set up the Zywall USG 50 (3.30(BDS.2)) to take advantage of two separate internet connections. I'm using the spillover method with an active/passive setup. So far nearly everything works: the Zywall senses that WAN1 connectivity has failed and activates WAN2. However, the VoIP phones do not work even after WAN2 is activated.

I ran a packet capture on the device itself, and the capture shows that SIP registration requests are still trying to go out the WAN1 interface, which is dead. I have no idea why the router would forward packets out a port it already considers dead. To rule out the VoIP phones as the culprit, I rebooted them manually; while they initialize, they connect to the FTP server and fetch their configuration files just fine. Only the SIP registrations are not being diverted to the working WAN interface. I have no policy routes configured, so that is not the cause either.
polarisdb
join:2004-07-12
USA

polarisdb

Member

It's been a while so I can't cite specifics, but I have seen some WAN failover weirdness with my USG50 since I've had it. I do remember one occurrence where it was difficult getting traffic to use WAN1 once it was back up since my WAN2 connection blocked SMTP traffic that WAN1 allowed. This happens so infrequently that I just reboot the USG50 and move on. Most of the time I don't even notice when one of the WAN connections goes down/up.

Anav
Sarcastic Llama? Naw, Just Acerbic
Premium Member
join:2001-07-16
Dartmouth, NS

Anav to sangsi

Premium Member

to sangsi
I think policy routing may also be at play here. There are some specific settings you need to make to allow or disallow routing when one network goes down. I can't remember them at the moment, but it's worthwhile checking into.
sangsi
join:2010-03-10

1 edit

sangsi to polarisdb

Member

to polarisdb
Yes, a reboot of the USG 50 fixes this issue. To simulate an Internet outage, I unplug the coaxial cable that feeds the cable modem on WAN1, my main (active) WAN connection. After about 2 minutes, WAN failover kicks in and starts routing traffic through the WAN2 connection. Everything works just fine except the Polycom VoIP phones. After seeing this behavior, I rebooted the USG 50 while the coaxial cable to the cable modem was still unplugged. When the router came back up, it deemed WAN1 dead from the get-go, diverted traffic through WAN2, and somehow the Polycom phones just worked.

This is telling me that the USG 50 is the culprit here. I shouldn't have to manually reboot the USG 50 myself. Everything should be automatic. I won't always be at the office to perform this task.
sangsi

sangsi to Anav

Member

to Anav
Hmm... I cannot think of anything off the top of my head to make this work. In my opinion, less is more when it comes to policies. In this case, there are zero entries in my policy routing table, and the same is true for static routes. I'm not familiar with using Policy Routing for an "if this route is dead, divert it through interface X" type of setup. Is this even possible? If so, what's the purpose of the WAN failover engine?
polarisdb
join:2004-07-12
USA

polarisdb to sangsi

Member

to sangsi
Do you have the SIP ALG enabled on the USG50? I don't have it enabled and can't recall having a similar issue with my Vonage ATA.
sangsi
join:2010-03-10

sangsi

Member

Nope. I stay the hell away from SIP ALG. It does more harm than good in my humble opinion.
sangsi

sangsi

Member

I did some further testing tonight and found out the following:

SIP registration attempts are no longer being sent out the WAN1 interface while WAN1 is deemed dead. At least that part is now the correct/expected behavior.

However, the VoIP phones still do not work properly when WAN1 (active) fails and WAN2 (passive) takes over.

At this point, while the coaxial cable to ISP1's cable modem (which feeds the WAN1 port) is still disconnected to simulate an outage, I reboot the USG 50.

Once the USG 50 comes back up, everything works perfectly fine, including the VoIP phones.

After that, I wait 5 to 10 minutes and reconnect the coaxial cable to ISP1's cable modem. The USG 50 restores the WAN1 connection and deactivates WAN2. And guess what, the VoIP phones still work.

So there is something wrong with the active-to-passive switchover, because passive-to-active works just fine. For active-to-passive to work, I have to reboot the router, which makes it a bit silly to call this an automated WAN failover procedure.

Any thoughts?
FirebirdTN
join:2012-12-13
Brighton, TN

FirebirdTN

Member

This is a little over my head, as I have zero experience with SIP.

But the behavior you describe sounds like an "active sessions" problem.

-Alan

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo to sangsi

Member

to sangsi
This just happened to me.
morbo

morbo to sangsi

Member

to sangsi
said by sangsi:

for some odd reason, SIP registration requests are still trying to go out from the WAN1 interface, which happens to be dead.

Your scenario just happened to me: WAN1 failed, WAN2 took over, the VoIP phones failed registration, and internet browsing on the workstations kept working fine.

A reboot solves the issue.

When WAN1 resumes, the switchover is fine and VoIP continues.
morbo

morbo to FirebirdTN

Member

to FirebirdTN
said by FirebirdTN:

This is a little over my head, as I have zero experience with SIP.

But the behavior you describe sounds like an "active sessions" problem.

Is that a configuration option in the USG?
morbo

morbo to sangsi

Member

to sangsi
Do you think this setting could be to blame?
sangsi
join:2010-03-10

sangsi

Member

morbo,

Unfortunately not. It has nothing to do with what's happening to us.

On another note, here's what that feature does. When your active interface comes back up after an outage, Zywall forces the traffic to go back through the active interface by disconnecting it from the passive interface.
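
A rough sketch of that behavior in Python, for illustration only. The session table layout, the `fall_back` helper, and the addresses are all hypothetical; this is not how the Zywall implements the feature, just a model of "drop everything that is still using the passive interface once the active one returns".

```python
from typing import Dict, Tuple

# Flow key: (protocol, source IP, source port, destination IP, destination port)
FlowKey = Tuple[str, str, int, str, int]


def fall_back(sessions: Dict[FlowKey, str], passive_wan: str) -> Dict[FlowKey, str]:
    """Return the session table with every session on the passive WAN dropped.

    Hypothetical helper: dropped flows must be re-established, and new
    sessions would then be created on the restored active interface.
    """
    return {key: wan for key, wan in sessions.items() if wan != passive_wan}


# Two made-up sessions that were created while WAN2 was carrying traffic.
sessions = {
    ("udp", "192.168.1.50", 5060, "203.0.113.10", 5060): "wan2",
    ("tcp", "192.168.1.51", 40001, "203.0.113.20", 443): "wan2",
}

remaining = fall_back(sessions, passive_wan="wan2")
print(len(remaining))  # no sessions are left on the passive interface
```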

I'm at a loss just as you are. I have a trouble ticket open with Zyxel and I've had two lousy responses from a technician. I don't know what else to do at this point.

I'm thinking of switching to another vendor that does WAN failover, true ingress/egress QoS, fast IDP, etc.

Has anyone looked into Sophos? They have a free version for up to 50 IPs, and it seems like a solid solution with geographic access rules, etc.

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo

Member

said by sangsi:

I'm thinking of switching to another vendor that does WAN failover, true ingress/egress QoS, fast IDP, etc.

What are you considering besides Sophos? Any small business class Cisco products that aren't a fortune?
sangsi
join:2010-03-10

sangsi

Member

To be honest with you, I have no idea.

Even though I haven't tried Sophos yet, from what I've been reading on the forums, it seems like a solid offering. Plus, I've come to learn that all these routers on the market are incapable of handling high-bandwidth throughput with some sort of IDP, anti-virus, or content filtering (ActiveX/Java-type blocking) enabled...

Might as well build a small box with my own parts and get superior performance out of it.

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo

Member

What are you planning to build? A pfSense box with dual WAN?
sangsi
join:2010-03-10

sangsi

Member

A Sophos box with dual WAN/LAN. I have an older (unused) Supermicro 5017C-MTRF server that is sitting idle here. I'm going to install VMware ESXi 5.5 on it and try out Sophos. I need to purchase a quad-port PCI-E NIC to make it a dual WAN and dual LAN capable router - a replica of what I have with Zywall basically. Sophos UTM (home edition) is free up to 50 IP(s) by the way. If you have any old, unused hardware, I'd say try it too. Once the proof-of-concept works for you, you can build a mini-ITX setup with a decent CPU. It may cost around $400 to $600, but you know damn well that there won't be any performance issues stemming from the hardware.

mozerd
Light Will Pierce The Darkness
MVM
join:2004-04-23
Nepean, ON

mozerd

MVM

said by sangsi:

I need to purchase a quad-port PCI-E NIC to make it a dual WAN and dual LAN capable router - a replica of what I have with Zywall basically. Sophos UTM (home edition) is free up to 50 IP(s) by the way.

For your quad-port PCI-E NIC I strongly recommend that you purchase Intel

And Yes Sophos UTM (home edition) is one BEAUTIFUL software product.
sangsi
join:2010-03-10

1 edit

sangsi

Member

Hello mozerd,

I was planning on getting the Intel I350-T4. Would you recommend the one you linked over the I350-T4?

As far as I can see from Intel's own website, the Intel PRO/1000 PT Quad Port PCIe cards were released in 2007, whereas the I350 was released around the 2nd quarter of 2011.

mozerd
Light Will Pierce The Darkness
MVM
join:2004-04-23
Nepean, ON

mozerd

MVM

My point was to buy Intel NICs, and the one I linked to is the one I have experience with... The Intel I350-T4 looks good to me.
sangsi
join:2010-03-10

sangsi

Member

Thanks mozerd!

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo to sangsi

Member

to sangsi
I have no old, unused hardware so I'm looking at other options. The Peplink 20 (BPL-021) looks decent.

FYI: I submitted a support ticket to Zyxel for the Zywall failover issue with SIP registrations. Maybe they will fix this problem.
sangsi
join:2010-03-10

sangsi

Member

I contacted them a while back when I first found out about it and exchanged several emails with lower-level tech support. I think they decided to escalate the issue to an engineer of some sort today, because I just received an email from a different person asking for my configuration file.

Let's see what will happen. I just hope that it won't take a long time to resolve this issue.

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo

Member

said by sangsi:

Let's see what will happen. I just hope that it won't take a long time to resolve this issue.

Agreed.
morbo

morbo to sangsi

Member

to sangsi
Just received email from them as well. Tier2 is looking into it.
sangsi
join:2010-03-10

sangsi

Member

Someone by the name of Marcus by any chance?

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo

Member

Ronald.
JPedroT
Premium Member
join:2005-02-18

JPedroT to sangsi

Premium Member

to sangsi
If I were to venture a guess, some state table/mechanism is not being cleared properly when the failover happens.

I do wonder whether it only happens with NAT and the firewall in use. Instead of rebooting, you could try flushing the NAT and firewall tables to see if that helps.
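
To make that guess concrete, here is a toy Python model of a session table that survives a failover. Everything in it is an assumption for illustration: the `ToyNat` class, the addresses, and the idea that the phone always registers from source port 5060 are all made up, and none of it reflects the USG's real internals. It only shows one way a stale table could match the symptoms: a flow that keeps the same 5-tuple (the SIP registration) stays pinned to the dead interface, a flow that opens from a new source port (the FTP config fetch) goes out the new one, and a flush behaves like the reboot.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Flow key: (protocol, source IP, source port, destination IP, destination port)
FlowKey = Tuple[str, str, int, str, int]


@dataclass
class ToyNat:
    """Hypothetical connection-tracking table; not the USG's real internals."""

    active_wan: str = "wan1"
    sessions: Dict[FlowKey, str] = field(default_factory=dict)

    def egress_for(self, key: FlowKey) -> str:
        # An existing session keeps the interface it was created on;
        # only a flow with no session is bound to the current active WAN.
        return self.sessions.setdefault(key, self.active_wan)

    def fail_over(self, new_wan: str, flush: bool = False) -> None:
        # Switch the active WAN; optionally clear all tracked sessions.
        self.active_wan = new_wan
        if flush:
            self.sessions.clear()


# Made-up flows: the SIP key never changes, the FTP client picks a new port.
sip = ("udp", "192.168.1.50", 5060, "203.0.113.10", 5060)
ftp_before = ("tcp", "192.168.1.50", 40001, "203.0.113.10", 21)
ftp_after = ("tcp", "192.168.1.50", 40002, "203.0.113.10", 21)

nat = ToyNat()
nat.egress_for(sip)         # session created on wan1
nat.egress_for(ftp_before)  # session created on wan1

nat.fail_over("wan2")                  # failover without clearing state
stale_sip = nat.egress_for(sip)        # same 5-tuple, old session reused
fresh_ftp = nat.egress_for(ftp_after)  # new source port, new session

nat.fail_over("wan2", flush=True)      # failover plus flush (or a reboot)
flushed_sip = nat.egress_for(sip)

print(stale_sip, fresh_ftp, flushed_sip)  # wan1 wan2 wan2
```

If the real box behaves anything like this, clearing the sessions for the phones after the failover should have the same effect as the reboot, which is why the flush test seems worth trying.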

morbo
Complete Your Transaction
join:2002-01-22
00000

morbo to sangsi

Member

to sangsi
Any peep further from tech support?