|reply to kaugustino9 |
Re: [ALL] Ask ShawSean
said by kaugustino9:
I've been having some really poor speeds here in Edmonton for the past couple of weeks. I think I might have tracked the problem to a lossy router of a transit provider Shaw is peered with. Here is an mtr from myself to my server:
I know mtr isn't really foolproof, since a lot of backbone routers prioritize traffic directed at them differently than traffic routed through them. However, I believe the as6453.net routers to be the source of this problem. I did some further testing using looking glasses from both as6453.net and cogentco. From the as6453 looking glass, pinging back to me showed packet loss in 2 out of every 5 runs, but there was no packet loss when I ran the test going forward to the server I mtr'd. Using the cogentco looking glass there was also packet loss pinging back to me, but again none pinging toward the server. So I'm fairly certain the problem lies with as6453.
If you could help me with this issue I'd be very grateful.
I took a look at your destination from my connection in Calgary. While I don't end up routing through as6453.net, I still end up with a large amount of packetloss near the end of cogentco's network (more than you were seeing). Wanting to avoid any re-prioritization of ICMP packets, I re-ran my test using TCP SYN packets to port 80, followed by a RST packet, by the bucketful (assuming a bucket is 100 packets for each respective TTL :P). I didn't want more than 100 since that could be seen as a SYN flood. The same principle applies with the default *nix UDP traceroute though; when a packet expires, an ICMP packet should be sent back indicating the original packet expired in transit.
You can have a look at my results (and the command I ran) here »pastebin.com/q4W4eVSe - it's pretty huge and I didn't want to clutter up this forum.
What I see is that around hop 15 ("around" because different routes are used for each packet right from the get-go [I enforced a start TTL of 4, and even then we had divergent routes], not that different routes are necessarily a bad thing) the 'time exceeded in-transit' packets start going missing consistently.
A few hops further (around 17), however, we start to see reliable 'time exceeded in-transit' responses again. This means that while these cogentco routers are not sending the 'time exceeded in-transit' packets (either load related or design related), they are not seeing much packetloss; the TCP SYN packets are making it through when the TTL is high enough.
The last hop (18) is supposed to be your target (dragon305.startdedicated.com), but due to the differing paths 126.96.36.199 shows up for ~50% of the responses. Since it is listening on port 80 we should see SYN ACK packets back from it (i.e. packets that would not be re-prioritized or preferentially dropped by the network in between). The 188.8.131.52 responses there are the same-old 'time exceeded in-transit'. If we prune out the divergent route responses (and only focus on the SYN ACK responses we need to look at), I count 52 responses and 2 lost packets, just shy of 4% packetloss right at the destination.
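A quick sketch of the arithmetic behind that figure. It assumes the 52 counts every probe that stayed on the expected route, with 2 of those going unanswered; if the 2 lost packets were on top of 52 answers, the figure comes out slightly lower. The function name is just for illustration and is not part of any tool.

```python
def loss_percent(sent: int, lost: int) -> float:
    """Return packet loss as a percentage of the probes sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= lost <= sent:
        raise ValueError("lost must be between 0 and sent")
    return 100.0 * lost / sent


# Reading 1 (assumed): 52 probes on the expected route, 2 unanswered.
print(round(loss_percent(52, 2), 2))  # 3.85

# Reading 2 (assumed): 52 answers plus 2 unanswered, 54 probes in total.
print(round(loss_percent(54, 2), 2))  # 3.7
```

Either reading lands just under 4%.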
If you have a Linux machine, or can get a Linux VM running, I would try running the same test from your location to see if your results match mine.
Thanks for taking the time to help test.
I ran an mtr from the server to the first hop listed in your test to see if the return route would be much different than the forward route. It is: it does use as6453 going back to you (well, as close as I could get back to you, since I don't know your IP):
Host                                             Loss%  Snt  Last   Avg  Best   Wrst StDev
 1. static-ip-69-64-35-253.inaddr.ip-pool.com     0.0%  101   2.5   2.7   1.3    7.2   1.5
 2. static-ip-209-239-125-2.inaddr.ip-pool.com    0.0%  100   0.5   3.6   0.4   54.3  10.3
 3. te3-7.ccr01.stl03.atlas.cogentco.com          0.0%  100  18.2  46.0   0.6  460.6  75.9
 4. te0-2-0-5.ccr22.ord01.atlas.cogentco.com      0.0%  100   7.7   7.6   7.4    8.0   0.1
 5. te0-5-0-3.ccr22.ord03.atlas.cogentco.com      0.0%  100   7.8   7.8   7.6    8.1   0.1
 6. ix-1-3-1-0.tcore1.CT8-Chicago.as6453.net      7.0%  100  18.4  24.3  17.7   71.5  10.7
 7. 184.108.40.206                                4.0%  100  29.7  24.7  19.0   34.4   3.6
 8. rc2ec-tge0-0-1-0.il.shawcable.net             3.0%  100  22.4  22.1  18.9   72.5   5.6
 9. rd2cs-tge1-1-1.ok.shawcable.net               3.0%  100  38.1  38.4  34.3   42.4   2.3
10. rc2so-tge0-3-0-0.cg.shawcable.net             7.0%  100  73.5  72.8  70.8   77.3   1.3
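For anyone who wants to read a report like this programmatically, here is a rough sketch that parses the rows above and finds the first hop from which every later hop also shows loss. The function names are made up for illustration, and the "loss that persists to the end is more likely real" reading is only a rule of thumb, not proof of where the fault is.

```python
MTR_ROWS = """\
1. static-ip-69-64-35-253.inaddr.ip-pool.com 0.0% 101 2.5 2.7 1.3 7.2 1.5
2. static-ip-209-239-125-2.inaddr.ip-pool.com 0.0% 100 0.5 3.6 0.4 54.3 10.3
3. te3-7.ccr01.stl03.atlas.cogentco.com 0.0% 100 18.2 46.0 0.6 460.6 75.9
4. te0-2-0-5.ccr22.ord01.atlas.cogentco.com 0.0% 100 7.7 7.6 7.4 8.0 0.1
5. te0-5-0-3.ccr22.ord03.atlas.cogentco.com 0.0% 100 7.8 7.8 7.6 8.1 0.1
6. ix-1-3-1-0.tcore1.CT8-Chicago.as6453.net 7.0% 100 18.4 24.3 17.7 71.5 10.7
7. 184.108.40.206 4.0% 100 29.7 24.7 19.0 34.4 3.6
8. rc2ec-tge0-0-1-0.il.shawcable.net 3.0% 100 22.4 22.1 18.9 72.5 5.6
9. rd2cs-tge1-1-1.ok.shawcable.net 3.0% 100 38.1 38.4 34.3 42.4 2.3
10. rc2so-tge0-3-0-0.cg.shawcable.net 7.0% 100 73.5 72.8 70.8 77.3 1.3
"""


def parse_mtr_row(line):
    """Split one report row into hop number, host, loss percent and packets sent."""
    parts = line.split()
    return {
        "hop": int(parts[0].rstrip(".")),
        "host": parts[1],
        "loss": float(parts[2].rstrip("%")),
        "sent": int(parts[3]),
    }


def first_persistent_loss(rows):
    """Return the first hop from which every later hop also shows loss, else None."""
    start = None
    for row in rows:
        if row["loss"] > 0:
            if start is None:
                start = row
        else:
            start = None  # a clean hop later on resets the streak
    return start


rows = [parse_mtr_row(line) for line in MTR_ROWS.splitlines() if line.strip()]
hit = first_persistent_loss(rows)
print(hit["hop"], hit["host"], hit["loss"])
# 6 ix-1-3-1-0.tcore1.CT8-Chicago.as6453.net 7.0
```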
I haven't done a forward test like you did yet, but will shortly.
|reply to Jumpy |
Here is the tcptraceroute from home to the server: »pastebin.com/hjnZdgD7
Looks like it gets bad toward the end of as6453 again... I don't know if this is really a reliable test since we are still sending packets directly to routers... they may still deprioritize TCP packets sent to them.
Please tell me your thoughts on these results.
You aren't sending them directly to the routers; you're still addressing your packets to your destination (dragon305). The only difference between the tcptraceroute and a tcp connection is that tcptraceroute artificially limits the TTL of the packet, causing the normal packet handling of each router along the way to encounter an 'expire' condition which _should_ be dealt with by sending an ICMP packet in response. This ICMP response packet is what is used to determine the route the packets are taking.
Looking at your paste, I see the same thing I saw in mine. One router (220.127.116.11, hop 10) shows a bunch of timeouts. This simply means that the ICMP response packet was never seen. The very next hop (11, 18.104.22.168) shows no timeouts. This means that your TCP packets are making it through just fine and the SYN ACK response is making its way back just fine.
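To make that reasoning concrete, here is a toy simulation (not how tcptraceroute itself is implemented). The router names and the `silent` set are made up for illustration, and the model ignores real packet loss and divergent routes. It only shows that a router which forwards traffic but never sends 'time exceeded' looks like a timeout at its own hop while later hops answer normally.

```python
def probe(path, ttl, silent=frozenset()):
    """Simulate one TTL-limited probe along `path` (destination is the last entry).

    Returns the name of whoever answers, or None for a timeout.
    """
    if ttl < 1:
        raise ValueError("ttl must be at least 1")
    if ttl >= len(path):
        return path[-1]      # probe reaches the destination, which answers (SYN ACK)
    router = path[ttl - 1]   # TTL runs out at this router
    if router in silent:
        return None          # it still forwards traffic, but sends no ICMP reply
    return router            # normal case: ICMP 'time exceeded' comes back


# Hypothetical four-hop path where the third router never sends ICMP replies.
path = ["r1", "r2", "r3", "dest"]
for ttl in range(1, len(path) + 1):
    print(ttl, probe(path, ttl, silent={"r3"}))
# 1 r1
# 2 r2
# 3 None
# 4 dest
```

The timeout at TTL 3 says nothing about loss through r3, since the TTL 4 probe crosses r3 and still gets its answer.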
Actually, it is only hop 10 that is showing any sort of issue. Since we know that it is relaying the TCP packets properly, and that there are no timeouts after it, I don't think we can blame anything on a network issue. I added a longer wait (add -w 10 somewhere in that command) to my trace, and the final hop for me showed no loss (though there were some 3+ second responses that were not seen before that final hop). I'd start by opening a ticket with your hosting provider (I assume that dragon305 is your host).
I have opened tickets with the server host. They think it's a problem with as6453 too, but they don't have a peering agreement with them, so they can't complain about it. Shaw does have a peering agreement with as6453 though, so I was kind of hoping they could complain about the loss. I can download fast from that server using other servers that don't route through as6453.