by ninkendo 617 days ago

The point of the content hash is to make it trivial to verify that the content hasn’t changed since the hash was made. If you just generate a UUID that has nothing to do with the file’s contents, you could easily forget to update the UUID when you do change the content, leading to stale caches, or generate a new UUID even though the content hasn’t changed, leading to wasteful invalidation.

Having the filename be a simple hash of the content guarantees that you don’t make the mistakes above, and makes it trivial to verify.
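To make that concrete, here is a rough Python sketch of naming a file after its content and checking the name later. The function names, the SHA-256 choice, the 16-character truncation, and the `stem.digest.ext` layout are all my own assumptions for illustration, not what any particular build tool does.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 16) -> str:
    """Return a name like 'site.<digest>.css' derived from the content."""
    # Hypothetical scheme: truncated SHA-256 hex digest between stem and suffix.
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


def matches_content(hashed_filename: str, content: bytes, digest_len: int = 16) -> bool:
    """Check that the digest embedded in the filename matches the content."""
    parts = hashed_filename.split(".")
    if len(parts) < 3:
        return False  # no embedded digest to compare against
    expected = hashlib.sha256(content).hexdigest()[:digest_len]
    return parts[-2] == expected


if __name__ == "__main__":
    css = b"body { color: red; }"
    name = hashed_name("site.css", css)
    print(name)
    print(matches_content(name, css))          # same bytes: name still valid
    print(matches_content(name, css + b"\n"))  # changed bytes: name is stale
```

Anyone holding the file can redo the check without extra bookkeeping, which is the part a UUID can't give you.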

For example, if my CSS files are compiled by a build script, and a caching proxy sits in front of my web server, I can give content-hashed files an infinite lifetime on the caching proxy and not worry about invalidating anything. Even if I clean my build output and rebuild, an identical CSS file gets the same hash again, automatically. If I used UUIDs and blew away my output folder and rebuilt, suddenly all files would have new UUIDs even though their contents are identical, which is wasteful.
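A toy simulation of that clean-and-rebuild scenario, in Python. Everything here (`name_by_hash`, `name_by_uuid`, `rebuild`, the 16-character names) is a made-up stand-in for a real build step, just to show the difference in behaviour.

```python
import hashlib
import uuid


def name_by_hash(content: bytes) -> str:
    # Name depends only on the bytes, so identical output gets an identical name.
    return hashlib.sha256(content).hexdigest()[:16] + ".css"


def name_by_uuid(content: bytes) -> str:
    # The content is ignored: a fresh random name is produced on every call.
    return uuid.uuid4().hex[:16] + ".css"


def rebuild(namer, outputs: list[bytes]) -> set[str]:
    """Simulate a clean build: return the set of filenames it would emit."""
    return {namer(content) for content in outputs}


if __name__ == "__main__":
    outputs = [b"body { margin: 0; }", b"h1 { font-weight: bold; }"]

    # Two clean builds of identical content: hashed names are unchanged,
    # so everything the proxy already cached is still valid.
    print(rebuild(name_by_hash, outputs) == rebuild(name_by_hash, outputs))

    # With random names the two builds share no filenames,
    # so every cached entry is effectively thrown away.
    print(rebuild(name_by_uuid, outputs) & rebuild(name_by_uuid, outputs))
```

Changing even one byte of a file changes only that file's hashed name, so the invalidation is exactly as big as the actual change.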