by girishso 601 days ago
Just reporting in some plain text format so I can manually delete the duplicates, or create some script to delete.

I can't have like 10 external HDDs attached at the same time, so the tool needs to dump details (hashes?) somewhere on Mac HDD, and compare against those to find the duplicates.

1 comment

Here you go:

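    cd /path/to/drive
    find . -type f -exec sha256sum {} + | sed -E 's/^([^ ]+) +\./\1,/' >> ~/all_hashes.txt

Run that for each drive. sha256sum puts two spaces between the hash and the path, hence the " +" in the sed pattern; if your Mac has no sha256sum, shasum -a 256 prints the same format. When you're done, run: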

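    sort ~/all_hashes.txt > ~/sorted_hashes.txt
    awk -F, 'NR==FNR{count[$1]++; next} count[$1]>1' ~/sorted_hashes.txt ~/sorted_hashes.txt > ~/non_unique_hashes.txt

~/non_unique_hashes.txt will then contain only the lines whose hash appears more than once, i.e. files with identical content stored at more than one path. The awk command reads the sorted file twice: once to count each hash, once to print the lines whose count is above one.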
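
Since the paths above are relative to each drive's root, the report does not say which drive a duplicate lives on. Below is a rough sketch of one way to record that and to get a list of deletion candidates. The function names hash_drive and deletion_candidates, the drive labels, and the "label:/path" layout are all made up for illustration; treat it as a starting point, not a finished tool.

```shell
#!/bin/sh
# Sketch only: hash_drive and deletion_candidates are hypothetical helper names.

# Append "hash,label:/relative/path" lines for every file under a mounted drive.
# Assumes the label contains no "|" or "&" (both are special in the sed expression).
hash_drive() {
    label=$1 root=$2 out=$3
    # Prefer sha256sum; fall back to shasum -a 256, which prints the same format.
    if command -v sha256sum >/dev/null 2>&1; then
        hasher="sha256sum"
    else
        hasher="shasum -a 256"
    fi
    # $hasher is left unquoted on purpose so "shasum -a 256" splits into words.
    ( cd "$root" && find . -type f -exec $hasher {} + ) |
        sed -E "s|^([^ ]+) +[*]?\./|\1,$label:/|" >> "$out"
}

# For each hash, skip the line that sorts first and print the label:/path of
# every later line with the same hash.
deletion_candidates() {
    LC_ALL=C sort "$1" | awk -F, 'seen[$1]++ { print substr($0, length($1) + 2) }'
}
```

Usage would look something like this (the /Volumes path is just an example):

    hash_drive photos1 /Volumes/Photos1 ~/all_hashes.txt
    deletion_candidates ~/all_hashes.txt > ~/to_delete.txt

Which copy counts as "first" is simply whatever sorts first, so review ~/to_delete.txt by hand before deleting anything.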