Re: It's for the children
They must be furiously updating routers because of the IPv4 route memory exhaustion going on. And when you rush, things break.
Mountain View, CA
Technical details/explanation
The technical details were discussed, albeit briefly, on outages.org and NANOG:
In English: either TWC's or L3's routers announced to the rest of the Internet "don't bother sending us traffic for 220.127.116.11/17 or 18.104.22.168/17 any more" (this isn't the same thing as a route expiring; this is administrative, as in someone made a routing table config change, hence the withdraw). Then, ~1.5 hours later, they asked for traffic to those two /17s to be sent to them again.
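For anyone who wants the announce/withdraw idea in something closer to code, here is a minimal sketch. It is a toy model, not how any real BGP implementation works: the `ToyRib` class, the peer name, and the prefix (taken from a documentation range rather than the actual /17s) are all made up for illustration. It only shows that a withdraw is an explicit "stop sending me this" message, and that the prefix becomes unreachable once nobody is announcing it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRib:
    """Toy routing table: prefix -> set of peers currently announcing it."""
    routes: dict = field(default_factory=dict)

    def announce(self, prefix: str, peer: str) -> None:
        # A peer says "send traffic for this prefix to me".
        self.routes.setdefault(prefix, set()).add(peer)

    def withdraw(self, prefix: str, peer: str) -> None:
        # A peer explicitly says "stop sending me traffic for this prefix".
        peers = self.routes.get(prefix)
        if peers is None:
            return
        peers.discard(peer)
        if not peers:
            # Nobody announces it any more, so the prefix drops off the 'net.
            del self.routes[prefix]

    def reachable(self, prefix: str) -> bool:
        return prefix in self.routes


# Hypothetical prefix and peer name, for illustration only.
PREFIX = "198.51.100.0/24"

rib = ToyRib()
rib.announce(PREFIX, "peer-a")
before = rib.reachable(PREFIX)

rib.withdraw(PREFIX, "peer-a")
during = rib.reachable(PREFIX)

rib.announce(PREFIX, "peer-a")
after = rib.reachable(PREFIX)

print(before, during, after)
```

The three values correspond to the three phases of the outage: routes present, routes withdrawn, routes reannounced.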
Whether or not this was truly a Level3-induced mistake is unknown at this time, but it's certainly possible depending upon the relationship between TWC and L3 (who controls and manages what equipment, etc.).
RIPE's BGPlay replays (you may want to choose "Options" and set animation speed to 4 or 5 to speed things up, as there's a lot of nonsense going on):
You'll see some route changes (at 08:43 UTC, 09:19 UTC, and 09:28 UTC), followed immediately by a complete mess of route path changes and some withdrawals (from 09:29 UTC onward), until 09:31 UTC, when over the next 4 minutes the rest of the Internet said "withdraw? Okay you got it" and they disappeared from the 'net. 1 hour and 15 minutes later (at 10:50 UTC) routes were reannounced.
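If you want to check the arithmetic on that timeline, here is a small sketch using only the timestamps quoted above. The variable names and the `utc` helper are mine, and it assumes every event falls on the same UTC day.

```python
from datetime import datetime, timedelta


def utc(hhmm: str) -> datetime:
    """Parse an HH:MM timestamp; the date part is irrelevant here."""
    return datetime.strptime(hhmm, "%H:%M")


first_withdrawals = utc("09:29")   # route path churn and first withdrawals
mass_withdraw_start = utc("09:31")  # rest of the Internet starts dropping the routes
reannounced = utc("10:50")          # routes come back

# The withdrawals took about 4 minutes to propagate.
fully_withdrawn = mass_withdraw_start + timedelta(minutes=4)

gap_after_withdrawn = reannounced - fully_withdrawn
gap_since_first = reannounced - first_withdrawals

print(fully_withdrawn.strftime("%H:%M"))
print(gap_after_withdrawn)
print(gap_since_first)
```

Measured from the point the routes were fully gone, the gap is the 1 hour 15 minutes mentioned above; measured from the first withdrawals it is a few minutes longer.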
outages.org thread: »puck.nether.net/pipermail/outage···935.html
NANOG thread: »mailman.nanog.org/pipermail/nano···452.html
Making life hard for others since 1977.
I speak for myself and not my employer/affiliates of my employer.
Pacific Palisades, CA
Re: Technical details/explanation
From TWC Untangled:
This Morning's Outage
During an overnight network maintenance activity in which we were managing IP addresses, an erroneous configuration was propagated throughout our national backbone, resulting in a network outage.
We immediately identified and corrected the root cause of the issue and restored service by 7:30 am ET. We apologize for any inconvenience this caused our customers.
A failure of this size is very serious and we are taking the necessary steps to improve our processes with the objective of making sure this doesn't happen again.