


	by hvidgaard 3707 days ago
	A hash is often used for an online algorithm. If the hash of a file matches one you have already seen, you know there is a potential for dedupe, and you can then do a byte-for-byte comparison to confirm it. I suppose you could also use size as a prefilter, since files of different sizes cannot be identical. All of this applies if you dedupe at the file level. Dedupe at the block level doesn't care about the file as a whole, only the individual blocks, and it's not unusual to see, for instance, mp3 files with the same mp3 stream but different metadata. You cannot do the latter without hashing.
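
	For what it's worth, here is a minimal sketch of the file-level approach described above: size as a prefilter, then a hash to find potential duplicates, then a byte-for-byte comparison to confirm. The names `file_digest` and `find_duplicates` are made up for illustration, and SHA-256 is just an assumed choice of hash, not something the approach depends on.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Return a list of groups, each a list of byte-identical files."""
    # Prefilter: files with different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # An equal hash only signals a potential duplicate.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        for candidates in by_hash.values():
            # Confirm with a full byte-for-byte comparison.
            groups = []
            for path in candidates:
                for group in groups:
                    if filecmp.cmp(group[0], path, shallow=False):
                        group.append(path)
                        break
                else:
                    groups.append([path])
            duplicates.extend(g for g in groups if len(g) > 1)
    return duplicates
```

	You would call it with any iterable of file paths, e.g. `find_duplicates(["a.mp3", "b.mp3", "c.mp3"])` (hypothetical file names). This only covers the file-level case; it would not catch two mp3 files that share a stream but differ in metadata, which is where block-level hashing comes in.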