It's all good
Lake Villa, IL

Data Storage for archiving & backup

For anyone who has major regulatory requirements around data archiving & integrity - are you still using internal storage or looking at external storage? I'm possibly looking at using Amazon Glacier.

Anyone else using them? What has your experience been getting the data up to Amazon, and have you had to retrieve anything?

I would be starting from scratch (don't ask why). No reliable backups in house. No SAN replication. Rather than start with something that I know will grow almost unmanageable in house - I'm looking to set up DR/HA in house and keep backups and archives outside.

My concern is retrieval. Amazon is stating 3 to 5 hours for a "job". I think when I get a plan completed - the DR/HA would be for the last 72 to 96 hours in house, and everything else gets shipped off. There may be some archived data (180 days or less) that will need to be frequently accessed, and I'm thinking I put the cash and infrastructure into that as well.
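A rough sketch of how that split could be written down as a policy. The 96-hour and 180-day cutoffs come from the plan above; the tier names and the `storage_tier` function are made up for illustration, and the sketch assumes everything under 180 days stays in house, even though only some of that data would really need frequent access.

```python
from datetime import timedelta

# Cutoffs taken from the plan in the post; tier names are hypothetical.
HOT_WINDOW = timedelta(hours=96)   # upper end of the 72-96 hour DR/HA window
WARM_WINDOW = timedelta(days=180)  # archived data that may be accessed often


def storage_tier(age: timedelta) -> str:
    """Return the tier a piece of data would live in, based only on its age."""
    if age <= HOT_WINDOW:
        return "in-house DR/HA"
    if age <= WARM_WINDOW:
        return "in-house archive"
    return "offsite archive"


if __name__ == "__main__":
    for age in (timedelta(hours=24), timedelta(days=30), timedelta(days=400)):
        print(age, "->", storage_tier(age))
```

Only the last tier would be subject to the 3 to 5 hour retrieval job, which is why the cutoffs matter more than the provider.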

If you aren't using Amazon - are you using some other cloud-based provider, or something in house?

So you know what I'm dealing with - the data would be e-mail, financial data for one segment. The other is recordings of messaging between clients (sms/e-mail/IM/voice). The messaging is required to be kept for 7 years due to various regulations.
"All that is necessary for the triumph of evil is that good men do nothing." - Edmund Burke

Lanett, AL
I don't have experience with Amazon specifically or any other cloud based backup service but I can give you some pointers.

You mentioned "job" time; I'm assuming this is backup/restore job completion time? I would ask what kind of internet bandwidth you have at the main site where this data will flow to/from. I did a comparison for backing up my shared data folder (which at the time was a mere 180 GB) to a server at my house. I had 16/2 from Comcast at the time, so 16 Mbps was the theoretical fastest a "backup" could go. Using the calculator on this site I found that at the maximum possible speed it would have taken 26.8 hours to complete that backup.

Obviously this time is going to vary based on your internet speed and the size of your backup. The initial backup will take the longest of course, along with any consistency checks if things come out of sync (a backup fails or something). Amazon or another provider will certainly have more bandwidth than I had at home, but take a look at your upload speed. For us, it wasn't feasible because we weren't even planning on having that much bandwidth (we were planning this in conjunction with a change in ISP and went with 22/5 from Comcast Business). FWIW, at 5 Mbps that would have taken 85.9 hours, or about half a week.
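For anyone who wants to redo that math with their own numbers, here is a minimal sketch. It assumes the calculator counts a GB as 2^30 bytes and a Mbps as 10^6 bits per second, since that combination reproduces both figures above; it also ignores protocol overhead, so real transfers will be slower. The function name `transfer_hours` is just for illustration.

```python
def transfer_hours(size_gib: float, link_mbps: float) -> float:
    """Best-case hours to move size_gib over a link running at link_mbps.

    Assumes 1 GB = 2**30 bytes and 1 Mbps = 1,000,000 bits/s,
    with the link fully saturated and no protocol overhead.
    """
    bits = size_gib * 2**30 * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    for mbps in (16, 5):
        print(f"180 GB at {mbps} Mbps: {transfer_hours(180, mbps):.1f} hours")
    # 180 GB at 16 Mbps: 26.8 hours
    # 180 GB at 5 Mbps: 85.9 hours
```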

It's all good
Lake Villa, IL
I'm not too worried about the connection speed/upload time. I would have some type of dedicated commercial internet or MPLS/PTP circuit to Amazon (or xxx service) to push the data to them. I'm only talking about 20 to 50 GB of data being pushed maybe 2 or 3 times a day.

When I was referring to job time, I was talking about retrieval/restoration of data. 3 to 5 hours for archive retrieval isn't bad, since that data would be over 72-96 hours old. I would have a policy that x type of data takes 24 hours to retrieve.

reply to djtim21
said by djtim21:

I'm not too worried about the connection speed/upload time.

First off, I would be... especially if you have 20 - 50 GB of data. As JoelC707 said, you may want to
do some math to figure out how long it's going to take to transmit 50 GB of data to Amazon. Short version: unless
you get an OC-3 or something, expect to be staring at the hourglass icon, worst-case scenario on the order
of days. If you need this archived reliably and on time, MAKE SURE YOUR INFRASTRUCTURE IS SCALED FOR IT.
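One way to do the scaling math is to turn it around and ask what sustained rate the circuit needs so a push finishes inside its window. The sketch below is only that, a sketch: it uses decimal units (1 GB = 10^9 bytes), and the `efficiency` factor of 0.8 is a guessed allowance for protocol overhead and contention, not a measured value. The function name `required_mbps` is hypothetical.

```python
def required_mbps(size_gb: float, window_hours: float, efficiency: float = 0.8) -> float:
    """Sustained link rate (Mbps) needed to move size_gb within window_hours.

    efficiency is an assumed fraction of the nominal rate that is actually
    usable for payload; 0.8 is a placeholder, not a measurement.
    """
    if window_hours <= 0 or not 0 < efficiency <= 1:
        raise ValueError("window_hours must be > 0 and efficiency in (0, 1]")
    bits = size_gb * 1e9 * 8
    return bits / (window_hours * 3600) / 1e6 / efficiency


if __name__ == "__main__":
    # 50 GB pushed 3 times a day means each push has at most 8 hours.
    for window in (8, 4, 1):
        rate = required_mbps(50, window)
        print(f"50 GB in {window} h needs about {rate:.1f} Mbps")
```

Plug in the real window you can tolerate (and the upload side of the circuit, not the download side) before signing anything.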

Second: Amazon / cloud / [insert fancy buzzword here] -- take with a grain of salt and READ THE FINE PRINT.
What is their SLA, what are their guarantees, who are your contact(s) / escalation point(s), what internal backups / DR
does Amazon have if their datacenter suddenly became a smoking crater? I think all of us DSLR'ers only have to look back
at this board's outage earlier this year to see why you want to make sure whoever you outsource it to a) knows what to do
when the lights go up like a Christmas tree, and b) isn't going to give you the runaround of "it's being worked on. We'll get
back to you."

Third: what regulation(s) does your company fall under, and does Amazon fulfill them, especially from a legal standpoint?
Again, if Amazon suddenly became a smoking crater in the ground, that's the WORST time to find out you're in breach
of a legal requirement.

Just my initial 00000010bits.