What to do with those old hard drives full of photos?
A few people have asked me what to do with all those old hard drives full of photos sitting in the closet. Over the years, as people have gone digital, they have accumulated fragments of their asset collection spread across multiple hard drives, each holding a different portion of the collection, with varying degrees of overlap. I have first-hand experience sorting out this mess, having worked through it a few years ago for someone (who definitely is not my wife).
In Don’t be the turkey // a “belt and suspenders” approach to Personal Digital Asset Management, I presented my Personal Digital Asset Management (“PDAM”) workflow using Lightroom Classic + Dropbox + periodic external backups, along with the “CRR” framework to achieve a “belt and suspenders” approach to risk management. CRR stands for “Consolidated, Redundant, Routine” — aka “Cheesy Rainbow Rave” (figure 1), just to make it all easier to remember.
One of the components of the CRR framework is “consolidated”. This does not conflict with the second component, “redundant”: you should still have multiple copies of things, but they should all sit within one consolidated system, as opposed to being fragmented with some parts over here and some parts over there.
The solution to dealing with the mess of old hard drives is to gather them, dump all their contents into one bucket, and then run a “photo duplicate cleaner” tool to scrub out the duplicates. Searching Google and Apple’s App Store yields a plethora of standalone tools. Meanwhile, Apple Photos has a duplicate scrubber built in >> macOS + iOS. Google Photos appears to have a feature that detects duplicates on import, though it is unclear how well it works.
The tool I used back in 2016, and which still looks the best to me, is PhotoSweeper (figure 2) >>
- it comes highly recommended at 4.7 stars on 6,700 reviews (figure 2)
- compatibility with top photo tools >> it works with files in folders, Apple Photos, Capture One, and Lightroom Classic (figure 3)
- the $15 cost is very reasonable
- I used this tool to perform the process back in 2016, and I was pretty satisfied with the results
There’s also Photos Duplicate Cleaner (figure 4), which has good reviews too, though not as good as PhotoSweeper’s, and it lacks the latter’s compatibility with other photo tools.
PhotoSweeper would be my first choice if I had to run such a process again today, but my guess is that many of the top-rated tools would get the job done just fine. I would recommend picking one from Apple’s App Store rather than something found through Google, just to make sure it has been vetted as legit/safe software. All else being equal, I’d prefer to pay a very reasonable $15 to buy the app, rather than wade into all the “hidden costs” of “free” apps.
That being said, it’s important to understand that there’s some “elbow grease” involved in the process… PhotoSweeper scans the sources you feed to it (folders, Lightroom, Apple Photos, etc.) and then groups/stacks all the duplicates that it finds. Within each stack of duplicates, it picks a “keeper” and “marks” the rejects for removal (figure 5). As you review the results, you can manually adjust which of the duplicates is the “keeper” versus the ones “marked” for removal. Once you have reviewed and adjusted the results as needed, the tool then moves the rejected duplicates to the trash (figure 6).
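For the curious, here is a minimal Python sketch of that “keeper versus rejects” step. It is not how PhotoSweeper works internally; it simply assumes that the largest file in a stack is the best copy, and it moves the rejects into a holding folder rather than the trash so that nothing is lost if the guess is wrong. The function names `pick_keeper` and `quarantine_rejects` are made up for illustration.

```python
import shutil
from pathlib import Path


def pick_keeper(stack):
    """Pick the largest file in a stack of duplicate paths as the keeper.

    Assumption: bigger file means better copy. You would still review
    the choice by hand, just as you do in a dedicated tool.
    """
    return max(stack, key=lambda path: path.stat().st_size)


def quarantine_rejects(stack, keeper, holding_dir):
    """Move every path except the keeper into holding_dir.

    Returns the new locations of the moved files.
    """
    holding_dir = Path(holding_dir)
    holding_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in stack:
        if path == keeper:
            continue
        # Avoid overwriting a reject with the same name from another stack.
        target = holding_dir / path.name
        counter = 1
        while target.exists():
            target = holding_dir / f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Using a holding folder instead of deleting outright mirrors the review step above: you can look through the rejects once more before emptying the folder for good.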
What counts as a duplicate isn’t always so straightforward — you might have the same image in different versions >>
- high res versus low res // different image size (pixel dimensions)
- different file formats // compressed versus uncompressed
- different filenames
- different capture time/date // depending on whether/how the metadata has been corrupted
- different scans of the same image
PhotoSweeper has “comparison settings” where you can dial in how the tool should deal with such differences (figure 7) — and then, as mentioned, you can manually adjust the “keeper” for each set of duplicates. You may need to run the process more than once and tweak the comparison settings in order to refine the ratio of “false positives” (detected duplicates “marked” for removal that should be keepers) versus “false negatives” (undetected duplicates).
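To see why those settings matter, it helps to look at the simplest possible duplicate check. The Python sketch below (my own illustration, not PhotoSweeper’s method) groups files by a SHA-256 hash of their contents. It catches copies that are byte-for-byte identical even when the filenames differ, but it misses every other case in the list above: a resized, re-encoded or re-scanned version of the same picture has different bytes and therefore a different hash. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root by content; return only groups of 2 or more."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

An exact check like this produces no false positives but plenty of false negatives, which is the trade-off the comparison settings in a dedicated tool let you tune in the other direction.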
Following is an outline of the overall process for consolidating your collection of old hard drives:
- Gather up all those old hard drives.
- One by one, connect each of the drives to your computer, and copy the assets from each hard drive into one “bucket” folder. You can use a subfolder within that bucket for each hard drive, but it doesn’t matter, since you will re-sort the scrubbed assets at the end. Depending on the size of your collection and the volume of data on these different drives, you may need to get a larger external drive for the bucket (although this may slow the process). For some suitable SSDs to consider, kindly refer to Diving deeper on External Snapshot Backups.
- Run PhotoSweeper and let it scrub out the duplicates. You have to manually approve the groupings of duplicates “marked” as rejects, and you may have to run the tool/process more than once as you dial in the “comparison settings”.
- Once the bucket folder has been scrubbed of duplicates, import it into Lightroom and move/sort all the remaining assets into a new folder structure (e.g. subdivided by year).
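If you would rather script the copy step than drag folders around by hand, something along these lines could work. Treat it as a rough Python sketch under my own assumptions: the extension list, the function name `copy_drive_to_bucket` and the example paths are all hypothetical. It copies each drive into its own subfolder of the bucket and keeps the drive’s folder layout, which has the side benefit that two files with the same name (say, two different `IMG_0001.jpg`) cannot overwrite each other.

```python
import shutil
from pathlib import Path

# Hypothetical extension list; extend it with whatever formats you shoot.
PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".heic", ".dng"}


def copy_drive_to_bucket(drive_root, bucket_root, drive_label):
    """Copy photo files from one drive into bucket_root/drive_label.

    The folder layout under the drive is preserved inside the subfolder.
    Returns the number of files copied.
    """
    drive_root = Path(drive_root)
    destination = Path(bucket_root) / drive_label
    copied = 0
    for source in drive_root.rglob("*"):
        if not source.is_file():
            continue
        if source.suffix.lower() not in PHOTO_EXTENSIONS:
            continue
        target = destination / source.relative_to(drive_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also tries to keep timestamps
        copied += 1
    return copied
```

You would call it once per drive, for example `copy_drive_to_bucket("/Volumes/OldDrive1", "/Volumes/Bucket", "drive1")` with paths that match your own setup. Note that it copies rather than moves, so the old drives stay untouched until you are happy with the consolidated result.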
As always, please feel free to reach out with any questions or comments >>