The raison d’être of this entire digital life strategy is a process that will lead to peace of mind. Every bit has a purpose and direction. It is either temporary, important or necessary. Knowing there’s a process for each type of bit, and that the important and necessary stuff exists in multiple places, relatively safe from single points of failure (such as a hard disk crapping out), will bring me peace of mind. That’s the plan anyway.
From a pure archival and backup perspective, the Sun in my digital solar system will be a mass storage device—in this case, a Network Attached Storage (NAS) device. There are plenty of these all-in-one appliances out there. Iomega, D-Link, NetGear, HP, Drobo and LG all make some flavor of NAS. I’m sure any NAS from any of those brands will work for most people, but I really want to understand how my storage solution works so it can evolve and, in the event of a failure, I can recover it.
After researching a bit and talking to my system admins at work, I decided that I want a hybrid system that has drive redundancy, a removable component and off-site replication.
For disk redundancy, I’m going with a RAID5 configuration—a set of striped disks with parity data spread across all disks—which means the NAS can take a single hard drive failure and continue to operate without any data loss. I considered RAID6 (which allows for two drive failures), but that’s really overkill and, combined with the remainder of my backup strategy, unnecessary.
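The trick that makes single-drive recovery possible is XOR parity: XOR all the data blocks in a stripe together, and any one missing block can be rebuilt by XORing the survivors with the parity block. Here’s a toy sketch of that idea in Python (a hypothetical 4-disk stripe, nothing to do with how FreeNAS actually lays out its software RAID):

```python
from functools import reduce

def parity(blocks):
    """XOR equal-sized blocks together byte-by-byte to produce a parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across a hypothetical 4-disk array: 3 data blocks + 1 parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
p = parity(data)

# Simulate disk 2 dying: XOR the surviving data blocks with the parity
# block to rebuild the lost one.
recovered = parity([data[0], data[2], p])
print(recovered)  # b'BBBB'
```

The same math is why RAID5 only survives one failure: with two blocks gone, the XOR equation has two unknowns and nothing can be recovered.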
Why do you want disk redundancy? Simply put, hard drives suck. They are so error prone that each drive has a built-in system (called S.M.A.R.T.) that monitors how many errors it makes in hopes of predicting a disk failure. Google did a study on drive failure rates. You can see the chart to the right, which indicates percent failure by age (years in service). 8.6% of drives with three years of service fail! Ouch. Given that high rate of hard disk failure, redundancy was an absolute requirement for my solution.
The removable component entails replicating certain data shares from the NAS to an additional external disk that I can easily unplug and take with me in the event of a natural disaster (hurricane, tornado, fire, etc.). This is also part of my archival strategy. I can offload older, less frequently accessed data onto multiple external drives for safekeeping.
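In practice this kind of replication is a one-way mirror: anything missing or newer on the NAS share gets copied to the external disk. A tool like rsync does this (and much more) for real; here’s a toy Python sketch of the core idea, with made-up paths, just to show what “replicating a share” amounts to:

```python
import os
import shutil

def mirror(src, dst):
    """One-way sync: copy any file that is missing, or newer at the source,
    over to the destination. A toy stand-in for rsync-style replication."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.join(dst, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target, name)
            # copy2 preserves timestamps, so unchanged files are skipped next run
            if not os.path.exists(d) or os.path.getmtime(s) > os.path.getmtime(d):
                shutil.copy2(s, d)

# Hypothetical paths: a NAS share mirrored onto a removable drive.
# mirror("/mnt/nas/photos", "/mnt/external/photos")
```

Note it never deletes anything on the destination, which is the behavior you want for a grab-and-go disaster copy.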
Finally, the off-site, or remote, backup means I’ll be replicating data to a disk located somewhere out in the Cloud (I knew I could work the Cloud into this post). Why? Because the Cloud is all the rage. No, that’s not why. Having remote storage means my really, really important stuff is geographically dispersed and not sitting next to all of my other data. Sounds kind of contradictory, doesn’t it? What? You want your important stuff out on the hacker haven that is the Internets? Yes, that sounds stupid. But if properly protected (unlike Twitter’s data), it’s fine.
The solution that will allow me to accomplish these goals? I’m going to build it. I will use the open-source software project called FreeNAS. It has quite a few lovely features (including software RAID, iSCSI and rsync) that will [hopefully] do everything I need. I did a little test using the downloadable VMware image and a USB keychain drive. I was able to do exactly what I wanted, albeit in a much more scaled-down version. I will document as I go in case anyone wants to follow in my footsteps. In the next part, I will discuss the machine build and the parts I will be using.