
The point of the content hash is to make it trivial to verify that the content hasn’t changed since its hash was made. If you just generate a UUID that has nothing to do with the file’s contents, you could easily forget to update the UUID when you do change the content, leading to stale caches. Or you could generate a new UUID even though the content hasn’t changed, leading to wasteful invalidation.

Deriving the filename from a simple hash of the content rules out both of those mistakes, and makes verification trivial: rehash the file and compare the result against its name.
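Here is a rough sketch of what I mean, in Python. The function names, the `name.<digest>.ext` layout, the choice of SHA-256, and the 16-character truncation are all my own assumptions for illustration, not what any particular build tool does.

```python
import hashlib
from pathlib import Path


def hashed_name(original_name: str, content: bytes, digest_len: int = 16) -> str:
    """Return e.g. 'site.<digest>.css' for 'site.css' (hypothetical naming scheme)."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = Path(original_name)
    return f"{path.stem}.{digest}{path.suffix}"


def verify(file_name: str, content: bytes, digest_len: int = 16) -> bool:
    """Check that the digest embedded in file_name matches the content."""
    parts = file_name.split(".")
    if len(parts) < 3:
        # No embedded digest, so there is nothing to verify against.
        return False
    expected = hashlib.sha256(content).hexdigest()[:digest_len]
    # The digest is the second-to-last dot-separated component.
    return parts[-2] == expected
```

With this, `verify(hashed_name("site.css", data), data)` holds for any `data`, and it fails as soon as the content changes without the name being regenerated. A UUID gives you nothing to check against.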

For example, if my CSS files are compiled by a build script, and a caching proxy sits in front of my web server, I can give content-hashed files an infinite lifetime on the caching proxy and not worry about invalidating anything. Even if I clean my build output and rebuild, an identical CSS file will get the same hash again, automatically. If I used UUIDs and blew away my output folder and rebuilt, all files would suddenly have new UUIDs even though their contents are identical, which is wasteful.
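To make the rebuild scenario concrete, here is a toy comparison in Python. `build` is a hypothetical stand-in for a real build script, the file naming is made up, and the `Cache-Control` value is just one common way to approximate "infinite lifetime" (a one-year max-age plus `immutable`); a given proxy may be configured differently.

```python
import hashlib
import uuid

# One common choice for long-lived assets; an assumption, not a requirement.
CACHE_CONTROL = "public, max-age=31536000, immutable"


def build(css_source: str) -> bytes:
    # Hypothetical build step: a real one would compile or minify.
    return css_source.strip().encode("utf-8")


def content_hash_name(content: bytes) -> str:
    # The name depends only on the bytes, so identical output gets an identical name.
    return f"site.{hashlib.sha256(content).hexdigest()[:16]}.css"


def uuid_name() -> str:
    # The name is random, so every build produces a different one.
    return f"site.{uuid.uuid4().hex}.css"


source = "body { margin: 0 }"
first_build = build(source)
second_build = build(source)  # simulates "clean the output folder and rebuild"

same_hash_names = content_hash_name(first_build) == content_hash_name(second_build)
same_uuid_names = uuid_name() == uuid_name()
changed_source_renames = (
    content_hash_name(build("body { margin: 1px }")) != content_hash_name(first_build)
)
```

After running this, `same_hash_names` is true and `same_uuid_names` is false: the hashed file keeps its cache entry across rebuilds, while the UUID-named file is treated as brand new each time. `changed_source_renames` is true, so a real change still gets a fresh URL.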