by ChuckMcM
3278 days ago
Very nice. Content addressable storage has a number of wonderful properties. At Blekko we would hash 'keys' (like a URI), and the hash would identify the 'bucket' where that URI was stored. This spread the web crawl evenly across multiple servers. At NetApp I worked for a bit on a content addressable version of a filer where each 4K block was hashed and the hash became the block address. Unlike Ugarit, the block hashes were kept in an SSD-based metadata server rather than being hashed into directories. The feature that fell out of this was that you got content deduplication for 'free': any block that hashed to a code you had already stored didn't need to be stored again. (And this exploited the fixed-length defense against hash collisions.)
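To make the two ideas above concrete, here is a minimal Python sketch. It is not the Blekko or NetApp implementation: the names `bucket_for` and `BlockStore`, the choice of SHA-256, and the in-memory dict standing in for the metadata server are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as described above


def bucket_for(key, num_buckets):
    """Map a key (e.g. a URI) to a bucket index via its hash.

    Hypothetical helper: SHA-256 and the modulo mapping are assumptions.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


class BlockStore:
    """Toy content-addressed store: a block's hash is its address."""

    def __init__(self):
        # Stand-in for the metadata server: hex digest -> block bytes.
        self.blocks = {}

    def put(self, data):
        """Split data into fixed-size blocks and return their addresses."""
        addresses = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            address = hashlib.sha256(block).hexdigest()
            # Deduplication: a block whose address is already present
            # is not stored a second time.
            self.blocks.setdefault(address, block)
            addresses.append(address)
        return addresses

    def get(self, addresses):
        """Reassemble data from a list of block addresses."""
        return b"".join(self.blocks[a] for a in addresses)
```

With this sketch, writing three identical 4K blocks followed by one different block returns four addresses but leaves only two entries in the store, which is the 'free' deduplication described above.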