tyche
join:2014-03-06

tyche to Sanek

Member

to Sanek

Re: SFTP/SCP uploads break

That's 3 of us in the Nepean/Kanata area, and I'm sure there are more; it's just subtle depending on how/where you transfer (you wouldn't see it with http). My problem is going to the Netherlands, and probably elsewhere overseas; I just haven't tried other tests.

Yeah, if I use ftp instead of sftp and set the timeout low, it constantly loops, disconnecting and then continuing the transfer, but that's just showing the problem exists.
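For anyone who wants to script that kind of workaround, here is a minimal sketch of a resume-on-timeout loop. It is only an illustration under assumptions: `send_from` is a hypothetical callable standing in for whatever FTP/SFTP client you use, and the attempt limit is arbitrary.

```python
from typing import Callable


def transfer_with_resume(
    send_from: Callable[[int], int],
    total_bytes: int,
    max_attempts: int = 50,
) -> int:
    """Drive a transfer to completion by resuming after each timeout.

    send_from(offset) is a hypothetical callable (an assumption, not a real
    client API): it tries to send data starting at `offset` with a short
    timeout and returns the new offset. If it stalls, it raises TimeoutError
    with the offset reached so far as its first argument.

    Returns the number of attempts used.
    """
    offset = 0
    for attempt in range(1, max_attempts + 1):
        try:
            offset = send_from(offset)
        except TimeoutError as exc:
            # Keep whatever progress was made before the stall.
            if exc.args and isinstance(exc.args[0], int):
                offset = max(offset, exc.args[0])
        if offset >= total_bytes:
            return attempt
    raise RuntimeError(f"gave up after {max_attempts} attempts at byte {offset}")
```

As in the post above, this only papers over the stalls; it does not say anything about why they happen.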
Sanek
join:2006-08-10
Kanata, ON

1 edit

Sanek

Member


[Attached screenshots: Stats Before Reset, Stats After Reset, Logs Before Reset, Logs After Reset]
I'll see if I can get that info for you, rocca.

Please note that my server is in North Carolina, so there might be a different issue with EU that you guys are having (or maybe it has nothing to do with the server and is indeed a local issue). Could also be an issue in the area...

//Edit: Attached before & after factory reset screenshots of the stats and logs. I don't see any difference in behavior after the factory reset. Overall, I do not see any speed-related issues and the signal seems fine to me.


rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca

Premium Member

Thanks, that's what I suspected - your upstream levels are off and need a ticket. I think what's happening is that FEC is resending those packets. Anyway, give us a call and get a ticket in so we can get the signal levels fixed up. Thanks.
mlord
join:2006-11-05
Kanata, ON

mlord

Member

Maybe, perhaps. But that's definitely not the case here.
Sanek
join:2006-08-10
Kanata, ON

1 edit

Sanek to rocca

Member

to rocca
said by rocca:

Thanks, that's what I suspected - your upstream levels are off and need a ticket. I think what's happening is that FEC is resending those packets. Anyway, give us a call and get a ticket in so we can get the signal levels fixed up. Thanks.

I thought the upstream levels were almost perfect - aren't they supposed to be in the ~20dBmV - ~50dBmV range? That said, it does look like 40dBmV is the recommended min; it's just weird, as the connection "appears" to be stable.

I would expect there to be some upstream speed issues if it had to resend stuff all the time, but I seem to get the full 10 Mbit? I guess it would depend on how the networking hardware caps the speed though (I think it would count resends as traffic, so I would expect to see speeds lower than the cap)...
mlord
join:2006-11-05
Kanata, ON

mlord

Member

Another buddy of mine also reports failed downloads over start.ca's network. Definitely something new/wrong happening here. Using VPN avoids the issue.
Sanek
join:2006-08-10
Kanata, ON

Sanek

Member

Oh, if it helps, tyche gave me a couple of http links to try to download and they both stalled and failed in Windows, but had no issues under Linux.

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca to mlord

Premium Member

to mlord
Yours are out of spec on the forward path. -11 to +11 is within Red spec but in practice that's pushing it and as close to 0 as possible is best. Having 11.3 and 11.1 is definitely going to be causing problems.

It seems like we have a handful of people all in the same area and all with similar problems here, and so far all of the signals I've seen from them have been just out of spec, i.e. enough to not lose connectivity but obviously enough that FEC is kicking in, and some OS stacks don't seem to like that very much. Given that this happened at the same time and in the same area, it's likely that noise has crept into the plant or that the temperature change has exposed connections that were marginal but whose issues went unnoticed.

As for why it's working with VPN, I suspect that's down to protocol driver differences, i.e. the same reason the connection also works on Mac and various Linux distributions but is a problem with Windows - and/or perhaps Windows Defender preventing what it considers malicious traffic directed to the stack.

If it is a plant issue it's possible that Red will get it fixed on their own (i.e. this will affect their customers and they'll be correlating complaints too), however it'd be prudent to get a ticket in with us because of the signal issues; that way we can ensure it gets resolved.
rocca

rocca to Sanek

Premium Member

to Sanek
said by Sanek:

aren't they supposed to be in the ~20dBmV - ~50dBmV range?

+35 to +52. Anything less and SNR erodes and you get packet loss; anything above and you're at risk of the CMTS kicking you during warm parts of the day. +45 is ideal.
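If you want to check a modem status page against the numbers quoted in this thread, a rough sketch follows. The ranges are simply the ones stated in the posts above (+35 to +52 dBmV on the return path, -11 to +11 dBmV on the forward path); they are not taken from any official spec, and the function and constant names are made up for illustration.

```python
# Ranges as quoted in this thread (dBmV); assumptions, not an official spec.
UPSTREAM_RANGE = (35.0, 52.0)     # "+35 to +52", with +45 called ideal
DOWNSTREAM_RANGE = (-11.0, 11.0)  # "-11 to +11 is within Red spec"


def check_level(value_dbmv: float, low: float, high: float) -> str:
    """Return 'low', 'ok', or 'high' for a power level against a range."""
    if value_dbmv < low:
        return "low"
    if value_dbmv > high:
        return "high"
    return "ok"


def check_modem(downstream: list, upstream: list) -> dict:
    """Classify every downstream and upstream channel level."""
    return {
        "downstream": [check_level(v, *DOWNSTREAM_RANGE) for v in downstream],
        "upstream": [check_level(v, *UPSTREAM_RANGE) for v in upstream],
    }


if __name__ == "__main__":
    # The 11.3 and 11.1 forward-path readings mentioned earlier in the thread.
    print(check_modem(downstream=[11.3, 11.1], upstream=[45.0]))
```

A reading that lands inside the range can still be marginal, which is the point made above about staying as close to 0 as possible on the forward path.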
tyche
join:2014-03-06

tyche to Sanek

Member

to Sanek
Are my stats bad?
mlord
join:2006-11-05
Kanata, ON

mlord to rocca

Member

to rocca
said by rocca:

Yours are out of spec on the forward path. -11 to +11 is within Red spec but in practice that's pushing it and as close to 0 as possible is best. Having 11.3 and 11.1 is definitely going to be causing problems.

So long as it's within -12 to +12 it ought to be fine, and the 0% BER confirms that.

This really smells like a bad router on an EU route somewhere, just like every time it has happened before. Opening a Red ticket for +11 signal is a total waste of time for all concerned, and won't resolve anything.

I'll just continue to use shorter timeouts to work around the start.ca issue until things return to normal.

Cheers.
tyche
join:2014-03-06

tyche to Sanek

Member

to Sanek
I've been trying every day and this morning around 7am it still was failing.

HOWEVER

I have just tried now and on my Windows PC I am getting 5.3 MB/s which is max speed and successful downloads. I've tried 6 times now and no problems so I get the feeling whatever was broke is now fixed.

Others should check as well.
Sanek
join:2006-08-10
Kanata, ON

Sanek

Member

said by tyche:

I've been trying every day and this morning around 7am it still was failing.

HOWEVER

I have just tried now and on my Windows PC I am getting 5.3 MB/s which is max speed and successful downloads. I've tried 6 times now and no problems so I get the feeling whatever was broke is now fixed.

Others should check as well.

My SFTP Uploads work now as well!

Modem signal levels are the same as before. I guess our issues were related and whatever the fix was remains a mystery.
tyche
join:2014-03-06

tyche

Member

And...it's broke again.

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca

Premium Member

I'll reach out to you, would like to do some testing.
chromius
join:2012-10-13

chromius to Sanek

Member

to Sanek
Don't know if it's related, but the last couple of days I've been having almost identical problems, and I'm also in Nepean/Barrhaven. Nothing network-wise has changed on my end, and it seems to only be affecting uploads.

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca

Premium Member

Are you able to post your modem stats?
chromius
join:2012-10-13

chromius

Member

I'm at work at the moment, but I'll post them as soon as I get home.

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca

Premium Member

Great thanks.
chromius
join:2012-10-13

chromius to rocca

Member

to rocca
[Attached image: modem stats]
Here they are...I think it looks normal?

sbrook
Mod
join:2001-12-14
Ottawa

sbrook

Mod

Not what I'd call normal but certainly not out of range ... you should have no problems with these.

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

1 recommendation

rocca to chromius

Premium Member

to chromius
Thanks for those that helped out with some testing and posting signals.

We've confirmed the issue as being a problem with packets being sent from connections that have high return path power levels on the Rogers network. Specifically we've identified that the packets are malformed and have invalid TCP checksums.

What appears to be happening is one of three things: the packets are being corrected by forward error correction (FEC) but the TCP checksum is not being corrected, or the packet is being rewritten incorrectly after FEC, or the CMTS is passing the corrupted packets through without validating them. In all cases, the effect is that packets with this issue would fail TCP checksum validation when received on our end and be dropped.

To 'resolve' the issue in the meantime, we've disabled checksum validation, which means that we allow the packet to traverse the network. When the receiving server gets the malformed packet, it'll do one of two things: either not validate the checksum (which seems to be the case for many stacks) and process the packet, which is usually an ACK, or reject the packet as invalid and issue a resend request for it, often seen as a duplicate packet on the sender side.
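For readers wondering what "TCP checksum validation" means in practice, here is a small sketch of the standard check: the 16-bit one's-complement Internet checksum (RFC 1071) computed over an IPv4 pseudo-header plus the TCP segment. This is a generic illustration only, not the code or filter running on our network, and the function names are made up for the example.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by IP, TCP and UDP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: bytes, dst_ip: bytes, tcp_length: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero byte, protocol 6, TCP length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 6, tcp_length)


def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """True if the checksum embedded in the segment matches its contents.

    Summing the pseudo-header and the whole segment (checksum field included)
    gives 0xFFFF for an intact packet, so the complemented result is 0.
    """
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0
```

A single flipped bit anywhere in the segment makes `tcp_checksum_ok` return False, which is the condition under which a validating device or stack drops the packet.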

This behaviour is not happening with Bell nor Cogeco, nor Rogers customers with good signals and/or most CMTS's, ie it could be that only a few Rogers CMTS's don't have this filter enabled - or it could be that some areas have a higher noise floor and are causing more packet errors on marginal connections.

Those that have return path power levels in the 50's should probably put in a ticket request since even when we pass these packets you're still having packet retransmissions, just better recovery since the receiving stack is dealing with it directly. Or if you're content to leave it alone that's certainly your prerogative as well.

In the meantime we'll leave checksum validation off. It had been disabled for quite some time before and was only enforced recently, which is what exposed the flaws in these connections.

Thanks again, and if you're still having upload issues please PM me your account # and the details and I'll be happy to take a peek.

Have a good night.
loki9
join:2014-07-14

loki9

Member

So are you talking to Red? Are they going to fix it?

I don't have any issues at the moment, but this seems like some sort of temp solution and that problems can continue to crop up until permanently fixed. So, ongoing news about this would be appreciated.
chromius
join:2012-10-13

chromius to rocca

Member

to rocca
Wow, thanks rocca! I've been doing a couple of checks this morning, and all seems to be back to normal. This is why I love Start!

I do have the same question as loki though. Is this a permanent fix, or is it possible this could crop back up?
penman4
join:2011-10-20

penman4

Member

Just to throw a monkey wrench into the discussion....
Two days ago I was trying to download a file (from EU BTW) with my Windows 7 PC. The download would halt after ~20% download. I tried the download about 4-5 times with no success. Then I decided to try the download using my OLD XP netbook. The download succeeded first time.
my 2 cents.
P

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca to loki9

Premium Member

to loki9
said by loki9:

So are you talking to Red? Are they going to fix it?

Yes, we'll let them know but also leave this disabled as a permanent fix.
loki9
join:2014-07-14

loki9

Member

OK, gotcha.
mlord
join:2006-11-05
Kanata, ON

mlord to rocca

Member

to rocca
said by rocca:

To 'resolve' the issue in the meantime, we've disabled checksum validation, which means that we allow the packet to traverse the network. When the receiving server gets the malformed packet, it'll do one of two things: either not validate the checksum (which seems to be the case for many stacks) and process the packet, which is usually an ACK, or reject the packet as invalid and issue a resend request for it, often seen as a duplicate packet on the sender side.

Okay, for the past few days now my connections to/from Europe have been behaving as they did back when this "checksum validation" thing was first broken. Did somebody mistakenly enable it again recently?

Wicked fast 60/10 to speedtest.net of course, but wicked slow (like 8 Mbit/sec) from EU unless I use a VPN to avoid start.ca routing.

rocca
Start.ca
Premium Member
join:2008-11-16
London, ON

rocca

Premium Member

No, sounds like a different problem too, as the other one caused stalled uploads. Can you send a trace from the direction the packets are going (i.e., presumably downloading from the EU server)?
mlord
join:2006-11-05
Kanata, ON

2 edits

mlord

Member

Here is traceroute from the EU server to my home box:

traceroute to xxx (24.53.240.xxx), 30 hops max, 60 byte packets
1 185.21.216.129 (185.21.216.129) 0.314 ms 0.310 ms 0.301 ms
2 185.21.216.66 (185.21.216.66) 8.796 ms 8.785 ms 185.21.216.64 (185.21.216.64) 0.103 ms
3 host-46-18-174-253.in-addr.ixreach.com (46.18.174.253) 0.420 ms 1.079 ms 1.063 ms
4 r2.thn.lon.ixreach.com (91.196.184.137) 6.590 ms 6.580 ms 8.803 ms
5 r1.tx1.nyc.ixreach.com (91.196.184.114) 81.645 ms 81.635 ms 81.620 ms
6 * * *
7 * * *
8 dhcp-198-2-121-46.cable.user.start.ca (198.2.121.46) 196.790 ms 196.695 ms 92.514 ms
9 * * *
10 dhcp-24-53-240-101.cable.user.start.ca (24.53.240.xxx) 114.019 ms 107.162 ms 107.157 ms

And here is traceroute from my home box to the EU server:

traceroute to xxx (185.21.216.130), 30 hops max, 60 byte packets
1 tomato (xxx) 0.443 ms
2 10.106.193.1 (10.106.193.1) 7.281 ms
3 66.185.90.221 (66.185.90.221) 15.766 ms
4 so-4-0-0.gw02.ym.phub.net.cable.rogers.com (66.185.82.125) 15.995 ms
5 10ge7-4.core1.tor1.he.net (209.51.164.81) 15.747 ms
6 100ge1-2.core1.nyc4.he.net (184.105.80.9) 24.742 ms
7 100ge7-2.core1.lon2.he.net (72.52.92.165) 101.246 ms
8 100ge3-2.core1.ams1.he.net (72.52.92.214) 100.999 ms
9 amsix.network.feral.io (80.249.211.10) 135.142 ms
10 185.21.216.130 (185.21.216.130) 107.250 ms

Both routes are now operating at a fraction of the normal speed (7 MByte/sec).
And even the VPN trick only nets me around 2 MByte/sec now.