Simple, Practical RAW Archive Backup & Organization
by Ethan G. Salwen
November 01, 2010 — “I had hundreds of thousands of wonderful RAW image files but I don’t know what happened to them.”
—Retired professional photographer, 2052
In 2002 I toured the National Geographic Society’s image archive. The crème de la crème of the images were housed in a subzero, bombproof archive vault. The chilly storage container was a reminder that all traditional photographic materials degrade. Given this, it was interesting that during the shift to digital, many photographers’ chief concern was that their digital archives would become as obsolete and unusable as 8-track tapes. Luckily, photographers no longer seem worried about this. Unfortunately, earlier fears have been replaced with dangerous complacency. It seems that only the serious photo nerds put proper time and effort into ensuring that not a single RAW file will go missing. Although we all have access to better backup options than were available to the National Geographic Society a decade ago, most of us are too busy making and sharing images (and dealing with the pressing concerns of making money with them) to ensure our digital archives are 100 percent secure.
Instead of trying to terrorize you into backup action (that approach never works), I’m going to offer a way to vastly improve the security of your archive of RAW files in a manner that will likely help you address organizational concerns. This approach is fuss-free, requires no complex learning, and best of all, will help you adopt best practices moving forward. This flexible system, which I have embraced with success, is both effective and anxiety reducing. It’s based on using a number of hard drives in an intelligent manner.
The Standard Best Practice
Peter Krogh, author of The DAM Book: Digital Asset Management for Photographers, and Richard Anderson, author of Digital Photography Best Practices and Workflow Handbook, are the driving forces behind dpBestflow.org, an online resource dedicated to teaching best practices for digital imaging. They advocate saving at least three copies of RAW files on two different types of storage media—namely hard drives and write-once storage media, such as DVDs or Blu-ray discs—and storing copies at two different locations.
If you have every one of your RAW files backed up this way, kudos to you, big time! You are in a teeny, tiny minority, and you can stop reading this article. If you are like most photographers, you are not following this best practice, and you lack a game plan for rectifying the situation quickly. I’m a big fan of the best-practice thinking Krogh and Anderson outline in their books and at dpBestflow.org. However, if you are in jeopardy of losing even a single critical RAW file, you don’t have time to figure out best practices. I suggest that you focus on achieving reasonable, practical RAW archival security using hard drives alone. This approach is immediately valuable, and it will also help you work toward best practices.
The Benefits and Problems with Hard Drives and DVDs
The great thing about hard drives is that they are fast, affordable and hold tons of data. The crappy thing about hard drives is that they are fickle, fragile machines that will crash sooner or later. Say goodbye to all that data. Another major pooh-pooh against hard drives is that it’s easy to accidentally write over data. Say goodbye to those images.
The great thing about DVDs and other write-once storage media is that they are seriously stable, won’t crash and can’t be written over. The problem with write-once media is that it takes a heck of a lot of time to burn data to them, and even big, bad Blu-ray discs don’t hold much data compared to hard drives.
What is an Archive Anyway?
In proposing different workflows, Krogh and Anderson talk a lot about the image “archive.” It turns out that “image archive” means different things to different people. Krogh helped me understand a key concept by explaining that once RAW files get archived, they never change. This means that archived RAW files can be backed up and backed up again, without introducing confusion, even as we add further metadata that is linked to these archive image files.
Example One: After you capture 2000 RAW files at a wedding, you rename them, add contact information and copyright metadata, convert them to DNG files and then add them to your RAW image archive. Now that they are archived, you are not going to add more metadata directly to these files. They are in permanent storage, and you only need to ensure they are backed up. Later, in a program like Lightroom, you will likely apply ratings, keywords and metadata processing instructions. This information will live in your Lightroom catalog, but will not get added to the files in your RAW image archive.
Example Two: Before converting to DNGs and archiving your 2000 RAW files, in addition to renaming them and adding your standard, bulk metadata, you apply ratings, keywords and processing instructions. Clearly these files will get archived with more metadata. However, you may well add still more metadata later, in other programs that link to these archived files.
Which is a better way to work? Both are fine. And depending on time and specific needs, you will likely archive RAW images with varying degrees of metadata embedded. The only thing you must do before archiving a RAW file is give it a unique name. What is important to understand is that once you archive an image, it never changes.
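If you are not sure whether every file in your archive really has a unique name, a short script can check for you. What follows is only a minimal sketch in Python, not something from Krogh or Anderson. The extension list and the function name find_duplicate_names are my own assumptions; adjust them for the cameras you use.

```python
from collections import defaultdict
from pathlib import Path

# Assumed list of RAW/DNG extensions; add or remove entries for your cameras.
RAW_EXTENSIONS = {".dng", ".cr2", ".nef", ".arw", ".raf", ".orf"}


def find_duplicate_names(root):
    """Return {lowercased file name: [paths]} for names used more than once."""
    seen = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in RAW_EXTENSIONS:
            # Compare names case-insensitively so IMG_0001.DNG and
            # img_0001.dng count as the same name.
            seen[path.name.lower()].append(path)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

Point it at the top folder of your archive and rename anything it reports before you treat those files as archived.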
Working Files, Derivative Files and Catalogs
Both Krogh and Anderson suggest creating “workflow pipelines” for processing RAW files. Until a RAW file is ready to be archived, they suggest putting it in one parent folder/directory called “Working.” Working files need to be backed up, too, of course, but they are not ready to be “put away” into your
Derivative files are the files that we make from our RAW files. These include Photoshop, TIFF and JPEG files. We need to back up these derivative files, of course. But these are not part of our RAW file archive. Catalog programs, such as Lightroom and Aperture, contain critical data linked to captures in our RAW archives. But once again these are not part of our RAW file archive. Understanding that RAW files belong in their own archive and that, after a certain point, we no longer want to add metadata directly to them, is very helpful. (Only with DNG files can we actually add metadata to the file; proprietary RAW files require pesky “sidecar” files.)
Protecting a RAW File Archive with Hard Drives Only
Step 1: Make a rough estimate of how much RAW data you have. This data might be on one drive, but it is likely spread among many. It’s time to bring it all together.
Step 2: Buy three identical hard drives that are at least 20 percent bigger than your estimate. If you have about one terabyte of RAW files, buy three 1.5 terabyte drives. We’ll call them “A,” “B,” and “C.”
Step 3: Format each drive using a disk utility application that writes all zeros to the disk. This will ensure the drive is in great shape and will give it an excellent workout before you transfer critical data to it. (For Krogh’s step-by-step disk formatting video tutorials for Macs and PCs, go to: dpBestflow.org > Best Practices > Data Storage Hardware > Disk Configurations.)
Step 4: Transfer all your RAW files to drive “A” using validated transfers. Even if your RAW files are only on one drive, they are likely mixed with all kinds of other files. This is a chance to separate them out onto a dedicated RAW archive hard drive. Don’t worry about how these files are organized at this point. Just dump them in a folder named something like, “Incoming RAW Files_Validated.”
About Validated Transfers: Standard file copying can introduce data corruption. Validated transfers check copied data bit-for-bit against original data. ChronoSync for Mac and SyncBack for Windows are great, affordable programs for performing validated transfers. (See Krogh’s step-by-step video tutorials at: dpBestflow.org > Best Practices > Data Validation > Data Transfer.)
Step 5: Back up “A” to both “B” and “C,” again using validated transfers.
Step 6: Store “C” in a safe, secure offsite location.
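For readers comfortable with a little scripting, Steps 1, 2 and 4 can be roughed out in Python. Treat this as a sketch under stated assumptions, not a replacement for ChronoSync or SyncBack. The extension list, the function names and the folder names in the usage note are all hypothetical. The "validation" here compares SHA-256 checksums of the source and the copy rather than doing a literal bit-for-bit comparison, and because the copy is re-read immediately, the operating system may serve it from cache instead of from the disk itself.

```python
import hashlib
import shutil
from pathlib import Path

# Assumed list of RAW/DNG extensions; adjust for your cameras.
RAW_EXTENSIONS = {".dng", ".cr2", ".nef", ".arw", ".raf", ".orf"}


def total_raw_bytes(roots):
    """Step 1: add up the size of every RAW file under the given folders."""
    total = 0
    for root in roots:
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix.lower() in RAW_EXTENSIONS:
                total += path.stat().st_size
    return total


def minimum_drive_bytes(raw_bytes, headroom_percent=20):
    """Step 2: capacity that is headroom_percent bigger than the estimate."""
    return raw_bytes * (100 + headroom_percent) // 100


def sha256_of(path, chunk_size=1024 * 1024):
    """Checksum a file in chunks so large RAW files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validated_copy(source, dest_dir):
    """Step 4: copy one file, then compare checksums of source and copy."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / Path(source).name
    if target.exists():
        # Never silently write over a file that is already on the drive.
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        target.unlink()
        raise OSError(f"checksum mismatch copying {source}")
    return target
```

With one terabyte of RAW files (counted here as 10**12 bytes), minimum_drive_bytes returns 1.2 terabytes, which is why a 1.5 terabyte drive is a comfortable fit. To fill drive “A,” you would call validated_copy on each RAW file with a destination such as the “Incoming RAW Files_Validated” folder. The refusal to overwrite is deliberate: it also flags files that share a name.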
Creating Order and Advancing with Best Practices
Having a backup hard drive stored off site is a standard best practice. The reason to make two backup drives is that doing so gives you much, much greater security through redundancy, without having to burn data to write-once storage media. You will still back up all files to write-once storage media. But now you can do so calmly, when time allows, with greater peace of mind.
If your RAW archive is an organizational nightmare, no worries. You can set up an “organizational pipeline” right on drive “A,” even renaming older files, if desired, and certainly adding keywords and other metadata. Although, ideally, an archive of RAW files does not change, you want to be open-minded about how you can clean up your archive as you move forward. It’s easy now that your RAW files are separated and backed up on multiple hard drives.
Every time you add new RAW files to “A,” reorganize its contents, or add metadata to files not yet burned to write-once media, back it up to “B.” And then, as soon as possible, swap “B” out for “C” and update that drive. You can achieve all of these steps quickly and easily, helping you organize your archive and protecting your RAW files with confidence. No subzero, bombproof vault required.
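After each update, it is worth confirming that the backup drive really matches “A.” Here is one way that check might look in Python. It is a sketch only: the function names are my own, and it compares SHA-256 checksums, which means every file on both drives gets read in full, so expect it to take a while on a large archive.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Checksum a file in chunks so large RAW files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_differences(primary_root, backup_root):
    """List files on the primary drive that are missing or changed on the backup."""
    primary_root = Path(primary_root)
    backup_root = Path(backup_root)
    problems = []
    for source in sorted(primary_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary_root)
        copy = backup_root / relative
        if not copy.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(source) != sha256_of(copy):
            problems.append((relative.as_posix(), "different"))
    return problems
```

An empty list means every file on “A” has an identical twin on the backup. The check runs in one direction only, so files that exist on the backup but were renamed or moved on “A” will not show up; run it again with the two drives reversed if you want to catch those.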
Ethan G. Salwen is an independent photographer and writer based in Buenos Aires, Argentina. He specializes in Latin American cultures, and also covers a wide variety of topics for professional photographers including digital technology, marketing techniques and industry trends. Salwen received his training in photography at Rochester Institute of Technology. Visit his blog at www.aftercapture.com.