How to lose your data

See that picture? When it happens to you, it may not look quite that bad (or be quite that obvious), but data loss sucks. And it does happen. I’ve been working with computers for 10+ years, and I’ve had it happen a couple times myself. Did I mention how much it sucks?
I’m not going to spend a couple pages telling you why you should back up. I’m just going to be straight about it: unless you really couldn’t care less if that happened to your computer, you are flat out stupid if you are not backing up your data on a regular basis.
Instead of telling you why to back up, I’m going to tell you how to ensure that you are not going to get your data back, even if you think you are backing it up.
Method 1: I’ll just back the data up to CD/DVD.
Well sure, this will work for a bit, but:
- Ever try to save 20GB to CD? Or 250GB to DVD? Ugh.
- How long do you think that optical disk is going to be readable?
Going this route, you can quickly end up trapped behind a small mountain of plastic. Or let’s say you somehow manage to keep the optical disks to a manageable quantity. Will the marker you labeled the disk with make it unreadable in a year? Is the dye layer unstable, rendering the disk unreadable in six months? Will the glue on the label you made for the disk make it worthless in a year or two? These are just a few examples of why optical media should not be considered an archive-grade solution.
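To put rough numbers on that mountain of plastic, here is a quick sketch in Python. The capacities are my assumptions (a nominal 700 MB CD-R and a 4.7 GB single-layer DVD, in decimal units). Real usable space is a bit lower, so treat the counts as a floor.

```python
# Assumed nominal capacities in bytes (decimal units, not from any spec sheet).
CD_BYTES = 700 * 10**6        # 700 MB CD-R
DVD_BYTES = 4_700 * 10**6     # 4.7 GB single-layer DVD


def discs_needed(data_bytes: int, disc_bytes: int) -> int:
    """Smallest number of discs that can hold data_bytes (integer ceiling)."""
    return -(-data_bytes // disc_bytes)


cds = discs_needed(20 * 10**9, CD_BYTES)      # the 20GB example
dvds = discs_needed(250 * 10**9, DVD_BYTES)   # the 250GB example
print(f"20GB needs {cds} CDs, 250GB needs {dvds} DVDs")
```

Under those assumptions that works out to 29 CDs or 54 DVDs for a single full copy, before you even think about keeping more than one version.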
Method 2: ok then, I’ll just copy the data to a USB hard drive.
Sure, it’s better than nothing, but single-HDD solutions are not going to keep your data safe. Hard drives fail. In fact, it will happen to every single hard disk you will ever come across. The only question is: when? It’s not a matter of if, or of MTBF (mean time between failures); it is more a matter of “you never know, it could fail in ten years, or in ten seconds”.
Don’t get me wrong, if this is the only way you can back the data up, then it is your only choice, and it’s better than nothing. Just be aware, as soon as you copy the data to that USB HDD, the “Clock of Death” is ticking.
Much better would be to copy the data over to a machine with a RAID storage system (preferably RAID5).
Method 3: I bought actual Backup Software (or use a vetted Open Source solution), and run Incremental Backups (to tape!) every single day!
Ok, so you spent some money on a tape backup solution, and spent hours reading the manual and configuring your backup. Congratulations, I bet you think your data is safe! Until you find out how Incremental Backups really work (this usually happens after a disaster, when the tapes are all you have left of your pr0n, illegal mp3’s, downloaded movies, warez... er, mission critical data).
Let’s pretend for a minute that your backup tapes look something like this:
Full_backup_tape (tape 1 – doesn’t matter what you tell it to be, the first backup is always and without exception, a full backup)
Incremental_backup_1 (tape 2)
Incremental_backup_2 (tape 3)
Incremental_backup_3 (tape 4)
Incremental_backup_4 (tape 5)
Incremental_backup_5 (tape 6)
And then you have a catastrophic failure. So you’re sitting there at 2am merrily running the restore, and you hit a snag: tape 2 won’t read. It doesn’t matter why. The tape could be bad, or maybe you left it out of the tape safe overnight and the radio station next door managed to erase it with the magnetic waves they transmit (this actually happened). Either way, the data is gone, and so is all the data after it. See, incremental backups require that all tapes since the last full backup be present and working. So tapes 3-6 may as well be empty, because you are never getting the data off of them. Ever.
If you can’t run full backups every day, use Differential backups instead of Incrementals. A differential contains everything that has changed since the last full backup, so each one stands on its own. Let’s say that in the scenario above the user had been running differentials rather than incrementals. They could then restore to current using just the original full backup and the last differential.
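The difference between the two schemes is easy to model. The sketch below is my own toy illustration, not how any particular backup product works: the tape labels and the function name `latest_restorable` are made up, and it assumes a single full backup followed by a run of either incrementals or differentials.

```python
from typing import List, Optional, Set


def latest_restorable(tapes: List[str], scheme: str,
                      unreadable: Set[str]) -> Optional[str]:
    """Return the newest backup point you can still restore to, or None.

    tapes[0] is the full backup; the rest are all incrementals or all
    differentials, oldest first, depending on `scheme`.
    """
    full, later = tapes[0], tapes[1:]
    if full in unreadable:
        return None  # without the full backup, nothing can be restored
    if scheme == "incremental":
        # Each incremental builds on the one before it, so the chain
        # ends at the first unreadable tape.
        newest = full
        for tape in later:
            if tape in unreadable:
                break
            newest = tape
        return newest
    if scheme == "differential":
        # Each differential only needs the full backup, so take the
        # newest one that still reads.
        for tape in reversed(later):
            if tape not in unreadable:
                return tape
        return full
    raise ValueError(f"unknown scheme: {scheme!r}")


tapes = ["tape1", "tape2", "tape3", "tape4", "tape5", "tape6"]
bad = {"tape2"}
print(latest_restorable(tapes, "incremental", bad))   # tape1
print(latest_restorable(tapes, "differential", bad))  # tape6
```

With tape 2 unreadable, the incremental chain strands you at the full backup, while the differential set still gets you back to tape 6.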
Method 4: Now I’m running differential backups to tape every single day!
But you fail to check the backup logs every day, and the backup job you thought had been running for the last year actually failed 273 days ago, and has been requesting the “correct” tape ever since. I’ve seen this one a lot (in fact, I think this is the most common reason for data loss when you have backup software running).
You’ve got to check your backup logs. It sucks, and it’s boring, but it’s one of those things you just have to do.
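If reading logs by hand is the part you keep skipping, even a tiny staleness check helps. This is only a sketch: the `(date, status)` records and the `"ok"` / `"failed"` strings are a hypothetical format I made up for illustration, so you would have to parse whatever your backup software really writes.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple

JobRecord = Tuple[date, str]  # (day the job ran, "ok" or "failed"), assumed format


def days_since_last_success(history: Iterable[JobRecord],
                            today: date) -> Optional[int]:
    """Days since the most recent successful job, or None if there was none."""
    successes = [day for day, status in history if status == "ok"]
    if not successes:
        return None
    return (today - max(successes)).days


def backup_is_stale(history: Iterable[JobRecord], today: date,
                    max_age_days: int = 1) -> bool:
    """True if there is no successful backup within max_age_days."""
    age = days_since_last_success(list(history), today)
    return age is None or age > max_age_days


today = date(2009, 6, 24)
history = [
    (today - timedelta(days=274), "ok"),      # last good run
    (today - timedelta(days=273), "failed"),  # first failure
    (today - timedelta(days=1), "failed"),    # still failing
]
if backup_is_stale(history, today):
    print("WARNING: no successful backup for",
          days_since_last_success(history, today), "days")
```

A check like this does not replace reading the log, but it turns "failed silently for 273 days" into something that nags you on day two.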
Method 5: Alright, I’m running differentials to tape, and have been checking my logs for the last 2 years every single day!
But you’ve never run a test restore. If you haven’t restored data from the tape successfully, there is no data on the tape. The tape was bad, the backup software failed (silently of course), the gremlins ate it.
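A test restore only counts if you compare what came back against what you meant to save. Here is a minimal sketch, assuming you have restored into a scratch directory and still have the original files to compare against. The function names are mine, and files that changed after the backup ran will show up as mismatches.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> List[str]:
    """Return relative paths that are missing or different in restored_dir."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems
```

Calling `verify_restore(Path("/data"), Path("/scratch/restore"))` (both paths are placeholders) returns an empty list only when every source file came back byte for byte.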
Method 6: Ok, now I spend two hours reading the log and then randomly restoring files from my backups (before putting the tapes in the tape safe) every single day!
And then your server room catches fire. All the machines, and the safe holding the backup tapes, are destroyed. You never took any tapes offsite, because you have a tape safe. It happens. It’s unfortunate.
Method 7: Enough, I give up on tape! Now I run a full backup to a RAID5 NAS every single day!
But you ordered your NAS with the drives from the manufacturer, and they used 4 HDDs from the same batch, and two failed. This is the one that always gets them! The strength of RAID5 is that more than one drive has to fail before the RAID is unrecoverable. The weakness is that hard drives from the same batch tend to fail at the same time (or thereabouts).
To strengthen your RAID system, always make sure that you have drives from different batches, if not from different manufacturers (this is not always the best idea, but that is an argument for another time). For instance: to take care of my backup needs at home, I bought a Buffalo Terastation. Unfortunately, Buffalo sent me a Terastation with 4 drives from the same batch (you can usually tell if they all have the same date on them; sometimes there will be a batch code on the drive). I bought 3 more of the same model drive from 3 different manufacturers, and now have the healthiest RAID I can.
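Why does one dead drive cost you nothing and two cost you everything? RAID5 keeps one parity block per stripe, computed by XOR-ing the data blocks together. The toy model below is my own simplification (one stripe, parity stored as the last block, no rotation across drives), but it shows the arithmetic.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild a stripe (data blocks plus parity); None marks a failed drive."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("more than one drive lost: RAID5 cannot rebuild")
    repaired = list(stripe)
    if missing:
        # XOR of all surviving blocks reproduces the single missing one.
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


data = [b"AAAA", b"BBBB", b"CCCC"]   # three data drives
parity = xor_blocks(data)            # fourth drive holds parity

print(rebuild([data[0], None, data[2], parity])[1])   # b'BBBB'
try:
    rebuild([None, None, data[2], parity])
except ValueError as err:
    print(err)
```

With one block missing, XOR-ing the survivors gives it back exactly. With two missing there is one equation and two unknowns, which is why two drives from the same tired batch dying in the same week takes the whole array with them.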
These are not the only ways to lose data, but they are by far the most common. How would I know? I was the Worldwide Manager of Technical Support for a backup software company for several years. And I always got to be the one to explain to the customers why their data is gone.
So what do I do?
There are as many answers to that question as there are IT shops with backup systems. Here is how I protect data at my office:
I back up all data every day (full backup) to a NAS configured in RAID5, with a hot spare. I check the health of the RAID every day (it takes about two minutes). Once a week I back up the entire RAID to LTO3 tape, and take the tapes offsite. Currently I am taking them home, where they go into a DATA rated fire safe (there is a difference, do your homework), and then into my large safe where I keep all my other valuables. My ideal would be to have them delivered to a bank safety deposit box, but that costs money.
At home, I back up all my data to the aforementioned Terastation. Once per month, I copy all the data off to a USB HDD (actually two of them), and take one to work where it goes into the tape safe.
Is it perfect? No. Does it stand a much better chance of keeping that data alive through a catastrophic event? Absolutely. You don’t have to go to these lengths to protect your data, but you should be aware of the risks.

June 24, 2009 - 5:49 am
Karl,
What are your thoughts on online backup solutions, for example mozy.com?
Kerry
Karl L. Gechlik | AskTheAdmin.com Reply:
June 25th, 2009 at 12:33 pm
I am with JoeG on this one. Unless the data could do no harm if it fell into the wrong hands, basic online backup services are not the way to go. But if it is stuff like recipes or a list of comic books, then by all means check out SugarSync, Mozy and GetDropBox.
June 24, 2009 - 8:17 am
You should seriously consider looking at JungleDisk (www.jungledisk.com), which backs up your data to Amazon S3 or Rackspace hosted solutions. You can set it to back up at various intervals, and having the data offsite is always the best way to prevent permanent loss.
June 24, 2009 - 8:59 am
Ok, this is just going to be personal opinion and experience here.
I have always had concerns about online backup services. There have been several recently that have announced that they will shut down, including HP Upline.
So I guess one issue is that you never know when the service is going to be discontinued, which I guess is not terrible (I am assuming they will offer you plenty of time to go get your data before they close).
My other issue is that you are entrusting your data to someone else, and for the most part, in a format that is easily readable.
Now I am aware of all the disclaimers, and privacy agreements and policies, but I am also very well aware of human nature. After all there are people that manage these services, and they will be tempted, just to maybe take a peek (best case scenario), or to maybe go all out and harvest identity information, and sell it off on the black market (worst case scenario).
I’m sure everyone here has read all about the Geek Squad, and their penchant for ummm, rifling through customers’ data.
Now add to that the fact that I know how easy it would be to write some scripts to look for file names that might contain passwords and such, and then copy that data off to a USB HDD, and it quickly becomes a risk that I personally am not comfortable taking.
Backup is all about mitigating risk to begin with. If you are comfortable taking the risk that some bored technician at the service company may look through your data, then it is just not a factor for you.
Personally I will not allow my data to be placed anywhere outside of my direct control without it being fully encrypted (I’m talking 128 bit AES at least).
Now if we are just talking things like digital pictures of your family, it might make sense to have that stored somewhere in the cloud.
June 25, 2009 - 6:54 am
So I guess my question is – What happened to that computer? Did it burn out from overheating, or did it start a fire?
James
June 29, 2009 - 12:28 am
In my workplace we have 5 tapes, one named for each working day of the week.
We take a full backup every working day, then put that tape in a fireproof safe, take out yesterday’s tape, and send it to the IT department of a sister company, which is in another building in the same city.
If a disaster happens, we at least have yesterday’s backup.
July 10, 2009 - 1:26 am
Best explanation I have ever read:
“Is it perfect? No. Does it stand a much better chance of keeping that data alive through a catastrophic event? Absolutely”