Modem diag screen showing maintenance at 2:16am when Hector went offline
I just want to smack the network engineer at some headend or hubsite who screwed this up.
It seems that during last night's maintenance, some idiot null routed my friend's subnet.
Addresses 126.96.36.199-188.8.131.52 inclusive (184.108.40.206/23) are ALL unreachable. Traceroutes below.
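For anyone wondering how big a /23 is, here is a minimal sketch using Python's standard `ipaddress` module. The prefix `10.20.30.0/23` is a made-up stand-in, not the affected Comcast block, and `in_dead_range` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical stand-in prefix; NOT the real affected subnet.
subnet = ipaddress.ip_network("10.20.30.0/23")

print(subnet.network_address)    # first address in the block
print(subnet.broadcast_address)  # last address in the block
print(subnet.num_addresses)      # a /23 covers two adjacent /24s


def in_dead_range(ip: str) -> bool:
    """Return True if ip falls inside the (stand-in) unreachable block."""
    return ipaddress.ip_address(ip) in subnet


print(in_dead_range("10.20.31.200"))  # inside the /23
print(in_dead_range("10.20.32.1"))    # just outside it
```

A /23 is 512 addresses, so a null route on one would plausibly take out a few hundred customers at once.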
Small backstory: I'm the go-to network guy for some friends of mine in the Denver area. I've got custom OpenWRT running on their routers with SSH open, and each router has a dyndns address in case I need to fix something remotely.
I get a call from my friend Hector. He tells me that since about 2am last night (~02:00 December 14 2011) his internet has been dead.
I explain to him that Comcast often performs network upgrades and maintenance at that time of night. Back when I subscribed to their service and stayed up all night, I'd lose connectivity around then for a few hours every few months.
So I asked if it came back on later, and no, it was still dead as of 7am December 14 before he left for work.
So I traceroute from my house, which from CentQwesturyLink goes through Texas instead of staying _directly in town_. (I'll NEVER understand why the 2 main ISPs in Denver screw their customers over by routing to DALLAS first. HLRN means Highlands Ranch, a suburb of Denver where, apparently, the DSL services are hubbed.)
traceroute to hectors.dyndns.host (220.127.116.11), 30 hops max, 60 byte packets
1 Heimdall.mystica.lan (192.168.x.x) 0.163 ms 0.185 ms 0.220 ms
2 192.168.0.1 (192.168.0.1) 0.870 ms 0.884 ms 0.904 ms
3 hlrn-dsl-gw91-156.hlrn.qwest.net (18.104.22.168) 21.901 ms 22.038 ms 22.315 ms
4 hlrn-agw1.inet.qwest.net (22.214.171.124) 22.325 ms 22.900 ms 22.911 ms
5 dap-brdr-04.inet.qwest.net (126.96.36.199) 46.566 ms 47.289 ms 46.719 ms
6 ix-0-1-0-0.tcore2.DT8-Dallas.as6453.net (188.8.131.52) 48.026 ms 46.924 ms 45.620 ms
7 184.108.40.206 (220.127.116.11) 47.866 ms 47.925 ms 18.104.22.168 (22.214.171.124) 47.874 ms
8 pos-2-2-0-0-cr01.dallas.tx.ibone.comcast.net (126.96.36.199) 46.524 ms 46.374 ms 45.685 ms
9 pos-2-10-0-0-cr01.denver.co.ibone.comcast.net (188.8.131.52) 60.631 ms 60.842 ms 60.801 ms
10 * * *
This traceroute is very unusual because of where it dies. It does not die at the 'ar' hop, or the 'ur' hop, or the 'cdn' hop. It dies right after the 'cr' hop, presumably meaning 'core router'. The IBONE HOP. Baffled at this, I tried from my cellphone.
traceroute to 184.108.40.206 (220.127.116.11), 30 hops max, 38 byte packets
1 10.170.205.48 (10.170.205.48) 701.783 ms 116.394 ms 118.317 ms
[10.x addresses in tmobile's internal network were removed for space ]
9 10.176.188.190 (10.176.188.190) 118.408 ms 96.497 ms 119.812 ms
10 te-9-3.car2.Denver1.Level3.net (18.104.22.168) 119.782 ms 96.710 ms 119.690 ms
11 ae-2-52.edge3.Denver1.Level3.net (22.214.171.124) 119.812 ms 135.681 ms 110.535 ms
12 COMCAST-IP.edge3.Denver1.Level3.net (126.96.36.199) 119.721 ms COMCAST-IP.edge3.Denver1.Level3.net (188.8.131.52) 100.037 ms COMCAST-IP.edge3.Denver1.Level3.net (184.108.40.206) 96.680 ms
13 * * *
No Ibone this time, but again it dies at the hop just before the Denver Comcast core router, this time on Level3's side.
Stumped at the traces dying there, I had someone currently on Comcast try:
traceroute to 220.127.116.11 (18.104.22.168), 30 hops max, 60 byte packets
1 10.120.150.240 (10.120.150.240) 0.663 ms 1.440 ms 1.821 ms
2 22.214.171.124 (126.96.36.199) 24.303 ms 25.202 ms 27.116 ms
3 te-9-2-ur01.aurora.co.denver.comcast.net (188.8.131.52) 16.506 ms 16.629 ms 16.737 ms
4 * * *
5 * * *
So this seems to be a largish routing failure. It never reaches the CDN hop that a proper trace shows:
9 pos-2-15-0-0-cr01.denver.co.ibone.comcast.net (184.108.40.206) 60.671 ms 58.924 ms 60.328 ms
10 pos-0-13-0-0-ar02.aurora.co.denver.comcast.net (220.127.116.11) 64.588 ms 60.150 ms 64.050 ms
11 te-8-3-ur01.aurora.co.denver.comcast.net (18.104.22.168) 60.068 ms 59.640 ms 59.886 ms
12 te-17-10-cdn12.aurora.co.denver.comcast.net (22.214.171.124) 72.856 ms 61.788 ms 79.876 ms
13 c-71-196-255-xxx.hsd1.co.comcast.net (71.196.255.xxx) 68.618 ms 70.031 ms 71.013 ms
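Comparing traces like these by eye gets old fast, so here is a rough sketch of a script that pulls out the last hop that answered and guesses its role from the hostname. The function names are hypothetical, the role tags are just the abbreviations seen in the hostnames above, and the sample addresses are documentation placeholders rather than the real ones.

```python
import re

# Role tags as they appear in the hostnames above ("cr" presumably = core router).
ROLE_RE = re.compile(r"-(cr|ar|ur|cdn)\d+\.")


def last_responding_hop(trace_text):
    """Return (hop_number, hostname) of the last hop that answered, or None."""
    last = None
    for line in trace_text.splitlines():
        m = re.match(r"\s*(\d+)\s+(\S+)", line)
        if not m:
            continue  # header or blank line
        hop, host = int(m.group(1)), m.group(2)
        if host == "*":
            continue  # all probes for this hop timed out
        last = (hop, host)
    return last


def hop_role(hostname):
    """Return the role tag found in the hostname, or None if there is none."""
    m = ROLE_RE.search(hostname)
    return m.group(1) if m else None


# Sample modeled on the first trace; IPs are placeholders.
sample = """\
traceroute to example.invalid (192.0.2.99), 30 hops max, 60 byte packets
 8 pos-2-2-0-0-cr01.dallas.tx.ibone.comcast.net (192.0.2.1) 46.524 ms
 9 pos-2-10-0-0-cr01.denver.co.ibone.comcast.net (192.0.2.2) 60.631 ms
10 * * *
"""

hop = last_responding_hop(sample)
print(hop, hop_role(hop[1]))
```

Run against the dead trace, this reports the Denver 'cr' hop as the last one alive, while the good trace carries on through 'ar', 'ur' and 'cdn' to the customer address.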
So I meet him at home after he gets out of work, and I find his router is PROPERLY DHCPING THE IP address. IT HAS PROPER DNS SERVERS. IT HAS THE PROPER GATEWAY. IT SIMPLY CANNOT ROUTE.
I try the laptop directly on the modem? Works perfectly fine. Of course, it's a different IP, as the IPs are assigned by DHCP based on MAC address.
I CHANGE THE ROUTERS MAC ADDRESS to DHCP a new IP; IT ROUTES FINE!
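If you need a replacement MAC for this trick, one option is a random locally administered address so it cannot collide with a real vendor's range. This is only a sketch: `random_local_mac` is a hypothetical helper, it is not necessarily how I picked the new address, and actually applying the MAC to the router's WAN interface is not shown.

```python
import random


def random_local_mac(rng=random):
    """Return a random unicast, locally administered MAC as aa:bb:cc:dd:ee:ff."""
    first = rng.randrange(256)
    # Set the locally-administered bit (0x02) and clear the multicast bit (0x01).
    first = (first | 0x02) & 0xFE
    rest = [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{b:02x}" for b in [first] + rest)


print(random_local_mac())
```

The DHCP server sees a new client and hands out a different lease, which with any luck lands outside the broken subnet.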
How did you MESS THIS UP SO BAD COMCAST?!
Without me actively understanding IP networks, my friend would STILL be without internet.
I called in a trouble ticket and explained to the first-level tech that the packets were not routing at all, and gave the IP inside the subnet that was affected... I just hope that said trouble ticket gets to the right techs, else a few more people are likely without internet right now and DON'T KNOW WHY.