by Flenser 2844 days ago
> I've got 20-odd hard drives that have accumulated over the years, filled with backups and libraries from Apple Photos, Aperture, Picasa, several hundred gigs of Google Takeout tarballs, and other ancient DAM apps.

I'm in a similar boat. What I'd like to know is: where are the duplicates and what can I safely delete? Anything that can help me clear it up would be a godsend!

1 comments

In-place duplicate deletion was the approach I originally considered, but I eventually gave it up because of the impact of "undiscovered features" in my code.

The approach I've settled on, which should work for most people, is to establish a new library with a unique copy of each of your originals, skipping exact SHA matches and invalid files.
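
As a rough illustration of that idea only (this is not PhotoStructure's actual code), a copy-unique-files pass could look something like the sketch below. The function names, the choice of SHA-256, the destination naming scheme, and the "zero-byte means invalid" check are all assumptions made for the example, and it assumes the new library lives outside the source folders.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large originals never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_library(sources: list[Path], library: Path) -> dict[str, Path]:
    """Copy one file per unique hash into `library`; originals are never modified."""
    library.mkdir(parents=True, exist_ok=True)
    seen: dict[str, Path] = {}  # hash -> the copy inside the library
    for root in sources:
        for src in sorted(p for p in root.rglob("*") if p.is_file()):
            if src.stat().st_size == 0:
                continue  # assumption: a stand-in for a real "invalid file" check
            sha = sha256_of(src)
            if sha in seen:
                continue  # exact match of something already copied
            # Hypothetical naming scheme: prefix with part of the hash so that
            # same-named files with different content do not overwrite each other.
            dest = library / f"{sha[:12]}_{src.name}"
            shutil.copy2(src, dest)
            seen[sha] = dest
    return seen
```

Because nothing is ever deleted from the source drives, a bug here costs disk space rather than photos, which is the appeal over in-place deletion.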

In your case, though, you'd run PhotoStructure in its "don't copy into the library" mode. Once it finishes scanning your drives, you can run a simple SQL query against your SQLite db to get a list of duplicate files. That query will be in the FAQ.
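
Until the real query is in the FAQ, here is the general shape such a query might take. The `files` table and its `path` and `sha` columns are hypothetical placeholders, not PhotoStructure's actual schema, so treat this as a sketch to adapt rather than something to paste in.

```python
import sqlite3

# Hypothetical schema: files(path TEXT, sha TEXT).
# The real table and column names will almost certainly differ.
DUPLICATES_SQL = """
SELECT sha, path
FROM files
WHERE sha IN (SELECT sha FROM files GROUP BY sha HAVING COUNT(*) > 1)
ORDER BY sha, path
"""


def find_duplicates(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Group file paths by hash, keeping only hashes that appear more than once."""
    groups: dict[str, list[str]] = {}
    for sha, path in conn.execute(DUPLICATES_SQL):
        groups.setdefault(sha, []).append(path)
    return groups
```

Each key in the result is a hash shared by two or more files, and the list under it holds every path with that content, so you can decide which copy to keep.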

Thanks, that would be a great help!

I could manage that query but your average user wouldn't. How about a way to export it to a CSV file so it can be viewed and filtered in the user's choice of spreadsheet app?
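
Something along these lines is what I have in mind. It is only a sketch using Python's standard `sqlite3` and `csv` modules; `export_query_to_csv` is a made-up helper name, and the query passed in would be whatever the FAQ ends up recommending.

```python
import csv
import sqlite3


def export_query_to_csv(conn: sqlite3.Connection, sql: str, csv_path: str) -> int:
    """Run `sql` and write the result, with a header row, to `csv_path`.

    Returns the number of data rows written.
    """
    cursor = conn.execute(sql)
    rows_written = 0
    # newline="" lets the csv module control line endings, as its docs recommend.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # cursor.description holds one tuple per result column; item 0 is its name.
        writer.writerow([col[0] for col in cursor.description])
        for row in cursor:
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

The resulting file opens directly in Excel, Numbers, or LibreOffice, where sorting on the hash column puts duplicates next to each other.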

That's a good idea, thanks.