by alanbernstein 1254 days ago
A photo deduplication tool. For some reason I always resort to writing small, scenario-specific Python scripts. A couple of reasons I can recall:

- three or more input directories with specific roles, like "main archive, prioritize" or "temp folder, remove from here first"

- multiple levels of equivalence test, including file name, EXIF tags, checksum, and perhaps a perceptual hash (e.g. for flagging downscaled images to be deleted)
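
To make the two points above concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the script described above: the function names `file_checksum` and `plan_deletions` are hypothetical, directory roles are reduced to a simple priority order (earlier in the list means "keep"), and the equivalence levels are limited to file size as a cheap prefilter followed by a SHA-256 checksum. File name, EXIF, and perceptual-hash comparisons are left out. It only builds a plan and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_deletions(roots):
    """Plan which exact duplicates to delete.

    roots: directories ordered by priority, the one to keep from first
    (an assumed stand-in for roles like "main archive" vs "temp folder").
    Returns a list of (keep_path, delete_path) pairs.
    """
    # Level 1: group by file size, so only same-size files get hashed.
    by_size = defaultdict(list)
    for priority, root in enumerate(roots):
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                by_size[path.stat().st_size].append((priority, path))

    # Level 2: within each size group, group by checksum.
    plan = []
    for candidates in by_size.values():
        if len(candidates) < 2:
            continue
        by_hash = defaultdict(list)
        for priority, path in candidates:
            by_hash[file_checksum(path)].append((priority, path))
        for group in by_hash.values():
            # Keep the copy from the highest-priority directory.
            group.sort(key=lambda item: (item[0], str(item[1])))
            keep = group[0][1]
            plan.extend((keep, path) for _, path in group[1:])
    return plan
```

Calling `plan_deletions(["archive", "temp"])` (hypothetical directory names) would list copies in `temp` that are byte-identical to files in `archive`. A looser level, such as a perceptual hash for downscaled copies, could be added as another grouping pass, but that needs a third-party library and is not shown here.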

2 comments

I found this library worked extremely well for finding and removing similar images: https://github.com/elisemercury/Duplicate-Image-Finder. It may not fulfill your main and temp folder requirements, but it's worth a try.
I think this one[0] can do most of that.

[0] https://github.com/qarmin/czkawka