Nice job L3 :)
Considering it's a router in NY cutting off twitter... I have to assume they are being yelled at by a few people .....
Just impressed by what L3 has done in NY...
Tracing route to twitter.com [22.214.171.124]
over a maximum of 30 hops:
1 1 ms 1 ms 1 ms Wireless_Broadband_Router.home [10.10.1.1]
2 10 ms 7 ms 6 ms L100.NWRKNJ-VFTTP-86.verizon-gni.net [126.96.36.199]
3 9 ms 6 ms 6 ms G2-0-3-886.NWRKNJ-LCR-08.verizon-gni.net [188.8.131.52]
4 9 ms 11 ms 10 ms so-3-1-0-0.NWRK-BB-RTR2.verizon-gni.net [184.108.40.206]
5 13 ms 14 ms 14 ms 0.xe-11-1-0.BR1.NYC1.ALTER.NET [220.127.116.11]
6 12 ms 11 ms 12 ms ae11.edge2.NewYork.Level3.net [18.104.22.168]
7 11 ms 9 ms 12 ms vlan51.ebr1.NewYork2.Level3.net [22.214.171.124]
8 * * * Request timed out.
9 * * * Request timed out.
10 * * * Request timed out.
11 * * * Request timed out.
12 * * * Request timed out.
13 12 ms 11 ms 9 ms vlan51.ebr1.NewYork2.Level3.net [126.96.36.199]
14 * * * Request timed out.
15 * * * Request timed out.
16 * * * Request timed out.
17 * * * Request timed out.
18 * * * Request timed out.
19 16 ms 9 ms 10 ms vlan51.ebr1.NewYork2.Level3.net [188.8.131.52]
20 * * * Request timed out.
21 * * * Request timed out.
22 * * * Request timed out.
23 * * * Request timed out.
24 * * * Request timed out.
25 10 ms 12 ms 11 ms vlan51.ebr1.NewYork2.Level3.net [184.108.40.206]
26 * * * Request timed out.
27 * * * Request timed out.
28 * * * Request timed out.
29 * * * Request timed out.
30 * * * Request timed out.
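For anyone who wants to spot this pattern without eyeballing the paste: the giveaway above is the same Level3 hostname answering at hops 7, 13, 19 and 25. Here is a rough Python sketch that flags hosts answering at more than one hop in Windows tracert output. The function name, the regex, and the sample addresses are my own placeholders, and it keys on hostname rather than IP. Treat it as a quick check under those assumptions, not a proper loop detector.

```python
import re
from collections import defaultdict

# One responding hop in Windows tracert output: hop number, three RTT
# columns (each "<n> ms", "<1 ms", or "*"), then the hostname or address.
HOP_RE = re.compile(r"^\s*(\d+)\s+(?:(?:<?\d+ ms|\*)\s+){3}(\S+)")


def repeated_hops(trace_text):
    """Map each host that answers at more than one hop to its hop numbers."""
    seen = defaultdict(list)
    for line in trace_text.splitlines():
        if "Request timed out" in line:
            continue  # silent hop, nothing to attribute
        match = HOP_RE.match(line)
        if match:
            seen[match.group(2)].append(int(match.group(1)))
    return {host: hops for host, hops in seen.items() if len(hops) > 1}


# Abbreviated sample; the addresses are placeholders, not the real ones.
SAMPLE = """\
 6 12 ms 11 ms 12 ms ae11.edge2.NewYork.Level3.net [192.0.2.1]
 7 11 ms 9 ms 12 ms vlan51.ebr1.NewYork2.Level3.net [192.0.2.2]
 8 * * * Request timed out.
13 12 ms 11 ms 9 ms vlan51.ebr1.NewYork2.Level3.net [192.0.2.2]
"""

if __name__ == "__main__":
    for host, hops in repeated_hops(SAMPLE).items():
        print(f"{host} answered at hops {hops}")
```

A host that shows up more than once usually means the packet is being bounced back to a router it already passed through, which is consistent with a bad route rather than a dead link.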
Nice job L3
I presume that you've always executed your occupation flawlessly?
|reply to hdlevy |
Yup, looks like a route got corrupted.
Is it working now? I assume it will not stay broken very long.
|reply to cdru |
We saw this too... but we were on the flip side. As a Level3 customer we lost everything.
Level3 somehow botched a software upgrade to something like 80 of their core routers. How something like this is even allowed to happen, I have no idea. I'm still waiting for the RFO to come back so I can talk to someone about it, but there were so many things wrong with this entire situation:
* Upgrades were done to handfuls of routers at a time, not one at a time in an orderly fashion.
* Upgrades were done while there was a fiber cut on a backbone fiber in the Pittston, PA area.
* Tickets to Level3 went unanswered for hours after the maintenance window.
* No one had any idea what was going on.
* The maintenance window listed a maximum outage of "30 minutes", which extended to 4+ hours and dragged on outside the maintenance window.
Speaking from the other side of things, and judging from your individual gripes, matthopp, you've never been
part of such work before, am I right?
Sometimes this stuff is well planned out and you have a couple months' lead time. Other times you're tossed
into the boiling pot and told to sink or swim, for managerial, business-unit, or political reasons; in either case, you
just buckle down and get the job done. I won't say that doing an 80 router upgrade should always be a walk in
the park, BUT so long as you plan it out right AND don't get a visit from Brother Murphy, it's doable. I've seen
and pushed such upgrades myself before, but a) I agree, when this blows up it sucks to be caught in the backblast,
whether you're a downstream user or at the epicenter of it all trying to fix the mess, and b) without that RFO,
though, anything at this point is pure speculation.
My 00000010 bits anyways.
If you get that RFO and it's not under any sort of NDA restrictions, please do share.
Yeah I work on the "other side".. just don't post over here much.
Here's the RFO.. it was a "perfect storm" and Level3 should not have done the GCR with circuits down.
Yesterday at 16:10 GMT an excavating company drove right through our pole line, which took down 4 spans of fiber (total fiber count: 300 pairs). Poles and fiber were restored at 12:06 GMT; total outage 20 hours 10 minutes.
Level 3 had a planned Global Change Request issued for over 100 Juniper Core routers in Europe and on the Eastern seaboard including Chicago and much of the Midwest. Level 3 could not postpone the Global Change Request because a maintenance window was issued to 10,000+ customers. Unfortunately when the Juniper Core was upgraded the working path was taken down and your protect path was cut due to the larger fiber issue.
---- END QUOTE ----
During a scheduled maintenance (GCR 6287157) to upgrade a Washington DC router, a configuration issue caused the two routing engines on the router not to sync. Once the Technical Service Center notified the IP NOC of the customer impact, the NOC wiped out the configuration and loaded a current configuration to restore services.
....I agree, perfect storm indeed. And a total FML / date with Crown Royal moment.
As for that routing engine sync issue... I've worked with enough gear to know that that's
A LOT harder to zero in on... and if it was part of the fiber issue, it makes tracking it down
that much harder, as all your other issues can mask the problem.
Moral of the story... I hope when they fixed the thing, the guys who fixed it were given
a 2-week all-expenses-paid trip to the Bahamas, minimum. Cuz I can say I wouldn't've wanted
any part of that cleanup job ... [facepalm]