

HiVolt
Premium
join:2000-12-28
Toronto, ON
kudos:17

Recovery progress?

Any update on how the recovery is progressing? There has been no update from Justin on that Google Docs page for a while now.
--
BUCK FELL ,,!,,('-'),,!,,


mjf
wish I was blue
Premium,Mod
join:2000-08-05
New Orleans, LA
kudos:2

I have heard from a very good source that he is guessing within a week.

I wouldn't bet the house on that!



HiVolt
Premium
join:2000-12-28
Toronto, ON
kudos:17

Awesome!



shortckt
Watchen Das Blinken Lights
Premium
join:2000-12-05
Tenant Hell

reply to HiVolt
I hope Justin (or someone else who has access to all the details) eventually posts up a detailed description of the failure and restoration with all the small technical details when this is all over and they have time.



state
stress magnet
Premium,Mod
join:2002-02-08
Purgatory
kudos:6
Host:
Webhosting
Android
Sonic.net
Washington & Balti..
UK Chat

There is an update here: »docs.google.com/document/d/1kll8···rHY/edit

Update: 7th May
I am told by the second lab the missing data is intact up to the time of the power fail. I have to now wait for payment to go through and return by courier of the media. Then there will be time spent checking the data, and restoration of both data and hardware without adding too much downtime. For the curious: this event has cost $28,000 in lab recovery fees. This is not including any money that we must now spend on new hardware, does not include financial impact of downtime, permanently lost traffic, and so on.



HiVolt
Premium
join:2000-12-28
Toronto, ON
kudos:17

Yeah I just read that... Great that the recovery was successful.

The cost for recovery is nuts though...
--
BUCK FELL ,,!,,('-'),,!,,



J E F F
Whatta Ya Think About Dat?
Premium
join:2004-04-01
Kitchener, ON
kudos:1
Reviews:
·Rogers Portable ..
·WIND Mobile
·Rogers Hi-Speed
·magicjack.com

reply to state
Is Justin going to come on here so we can talk to him? I had concluded that this mess was costing about $40,000, so I'm close. Someone needs to buy Justin and me a beer.

You think Justin will set up a paypal like what wikipedia does to collect? Or is it good?
--
Not all men are idiots. There are still a lot of bachelors out there.


Bobcat79
Premium
join:2001-02-04
Reviews:
·Verizon Online DSL
·Optimum Online
·EarthLink

reply to shortckt

said by shortckt:

I hope Justin (or someone else who has access to all the details) eventually posts up a detailed description of the failure and restoration with all the small technical details when this is all over and they have time.

Maybe Justin will write a front page article about how RAID is not backup.


shortckt
Watchen Das Blinken Lights
Premium
join:2000-12-05
Tenant Hell

said by Bobcat79:

Maybe Justin will write a front page article about how RAID is not backup.



Would definitely be a timely article, since backups are often an ignored subject in both personal and business settings. BBR got lucky... a white paper I downloaded some time back gave some stark, eye opening figures for business losses and failures caused by data loss. A timeline of what happened here, along with the associated numbers, can be a good example to point to when a client wonders why backups are so important.

Mostly for curiosity I would still like to read a technical writeup of the incident, such as how the drives are configured and what is stored where, where was the damage and how was it recovered, what prevented use of the mirrored data, how are backups performed and what backups were available*, did the hosting site UPS have the means to signal power failure, did they ever determine why the gen didn't start.

Along with that, it would be interesting to see site stats for the first few days BBR was online again.

From: sequence of unfortunate events
Tuesday 17th

Dell support says they don’t know the cause, but we must wipe entire array, do firmware upgrades, and start again. I don’t trust this gear.
*Check backups, mail: ok, nfs:ok, site files: ok. The sql backup is incomplete.


Since NAC is a large hosting facility I wonder how many other clients had problems or loss caused by the power failure.


state
stress magnet
Premium,Mod
join:2002-02-08
Purgatory
kudos:6
Host:
Webhosting
Android
Sonic.net
Washington & Balti..
UK Chat

reply to HiVolt
Just a quick status update:

Data restoration is underway. User accounts, ISP reviews and news have been restored. There will be some hours of down time scheduled for final restoration. Full restoration is anticipated within days (May 9th)



HiVolt
Premium
join:2000-12-28
Toronto, ON
kudos:17

Awesome!



dvd536
as Mr. Pink as they come
Premium
join:2001-04-27
Phoenix, AZ
kudos:4

reply to HiVolt

said by HiVolt:

Yeah I just read that... Great that the recovery was successful.

The cost for recovery is nuts though...

Expensive event! What is nac.net kicking back to Justin, since it was their fault?
-
I saw on a site that lists what sites make on ad revenue that dslr was at around $1300 per day. OUCH!


Matt_31
Who Hit The Power Button
Premium
join:2003-02-21
Jasper, IN

reply to state
Feels good to be back. What a mess, I have missed this place.



Jackarino
Premium
join:2006-12-28
Allendale, NJ
kudos:1

reply to HiVolt
You never realize what you have until it's gone.



StyxKee

join:2001-07-05
GTA, Canada

reply to state
Just saw the announcement at the top of the page... Excellent news. Great work.


UmmaGumma

join:2011-06-19

I guess I don't understand the whole process. If all the data was intact, but just something with the SQL got messed, why the need for recovery? Why not be able to use, or just copy the existing drives?



Weirdal
Premium
join:2003-06-28
Grand Island, NE
kudos:20

reply to HiVolt
Looks like a few threads got mixed up in the recovery process. For example:
»We're back...
(most of that thread was originally in the cooler)

Good job getting everything back on the site though.
--
»[Info] The DSLR Orangeface extension 2.0!



cdru
Go Colts
Premium,MVM
join:2003-05-14
Fort Wayne, IN
kudos:7

reply to UmmaGumma

said by UmmaGumma:

I guess I don't understand the whole process. If all the data was intact, but just something with the SQL got messed, why the need for recovery? Why not be able to use, or just copy the existing drives?

According to the status update document Justin was maintaining, the site ran on multiple servers off a common storage array: a Dell MD3000 plus an MD1000 expansion module. The storage array keeps track of the drives, their RAID configuration, etc., and presents storage to the server operating systems as one or more virtual disks spanning one or more physical drives.

The problem was that the storage array decided to go on vacation and just leave the virtual drives in an inaccessible state. All the bits are still there, or at least almost all of them, depending on what exactly had or hadn't been committed when the power was lost. Working out precisely where all those bits were, and in what order, was the first step toward determining the state of the rest of the system.

Once they determined that the virtual drives could be recovered, the "fun" task began: recovering the files/databases/etc. and trying to reincorporate them into the site that was limping along.
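
For anyone wondering why "all the bits are still there" doesn't mean you can just copy the drives, here is a rough sketch of how a striped virtual disk maps onto physical drives. This is a toy RAID-0-style layout only: the stripe size, the drive names, and every function name are made up for illustration, and nothing in this thread says what RAID level or on-disk layout the MD3000 was actually using.

```python
# Hypothetical illustration only: a toy RAID-0-style stripe map.
# The stripe size, drive names, and function names are assumptions,
# not the real Dell MD3000 on-disk layout.

def locate_block(logical_block, drive_order, stripe_blocks):
    """Return (drive_id, block_on_drive) for a logical block of the virtual disk."""
    stripe_index, offset_in_stripe = divmod(logical_block, stripe_blocks)
    row, column = divmod(stripe_index, len(drive_order))
    return drive_order[column], row * stripe_blocks + offset_in_stripe


def write_virtual_disk(data, drive_order, stripe_blocks):
    """Spread the logical blocks in `data` across the drives, stripe by stripe."""
    drives = {drive_id: {} for drive_id in drive_order}
    for logical_block, value in enumerate(data):
        drive_id, physical_block = locate_block(logical_block, drive_order, stripe_blocks)
        drives[drive_id][physical_block] = value
    return drives


def read_virtual_disk(drives, drive_order, stripe_blocks, total_blocks):
    """Reassemble the virtual disk using a given (possibly wrong) layout."""
    result = []
    for logical_block in range(total_blocks):
        drive_id, physical_block = locate_block(logical_block, drive_order, stripe_blocks)
        result.append(drives[drive_id][physical_block])
    return result


data = list(range(12))
drives = write_virtual_disk(data, ["A", "B", "C"], stripe_blocks=2)

# With the correct layout metadata the virtual disk reads back intact.
good = read_virtual_disk(drives, ["A", "B", "C"], 2, len(data))

# With the wrong drive order every block is still present, but scrambled.
scrambled = read_virtual_disk(drives, ["B", "A", "C"], 2, len(data))
```

In the toy version the only "metadata" is the drive order and stripe size. Lose that and the raw drives are a jumble, which is roughly the situation a recovery lab has to reverse-engineer before any files or databases can be pulled out.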


aannoonn

@optonline.net

The real problems:

1. Justin used Dell hardware.
2. Justin didn't have real backups.
3. NAC is a lousy datacenter.


Saturday, 18-May 12:40:01 Terms of Use & Privacy | feedback | contact | Hosting by nac.net - DSL,Hosting & Co-lo
over 13.5 years online © 1999-2013 dslreports.com.