by lobster_johnson
5757 days ago
GridFS is just a standard convention for mapping files onto key-value stores like MongoDB -- you can implement GridFS over MongoDB in just a few lines of Ruby code. GridFS breaks files into fixed-size chunks and uses a single MongoDB document per chunk. It's not exactly rocket science. The author of the blog post touts it as a _feature_ of MongoDB, but it's more accurate to say that it's an artifact of MongoDB's 4MB document size limit -- you simply cannot store large files in MongoDB without breaking them up. Sure, by splitting files into chunks you can parallelize loading them, but that's about the only advantage.

Among the key-value NoSQL databases, Cassandra and Riak are much better at storing large chunks of data -- neither has a specific limit on the size of objects. I have used both successfully to store assets such as JPEGs, and both are extremely fast on reads and on writes. Neither is built for that purpose, though, and both will load an entire object into memory instead of streaming it, so if you have lots of concurrent queries you will simply run out of memory at some point -- 10 clients each loading a 10MB image at the same time will have the database peak at 100MB at that moment.

Actually, Riak uses dangerously large amounts of memory even when just saving a number of large files. I don't know if that's because Erlang's garbage collector is lagging behind, or what; I would be worried about swapping or running out of memory when running it in a production system.
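
To make the "few lines of code" point concrete, here is a rough sketch of the chunking convention, in Python rather than Ruby. It is not the real GridFS layout or any driver's API: the `store` is a plain dict standing in for a collection, and `put_file`, `get_file`, the key tuples and the 256KB `CHUNK_SIZE` are all names and values I made up for illustration.

```python
# Assumed chunk size; in a real store it must stay below the per-document limit.
CHUNK_SIZE = 256 * 1024


def put_file(store, filename, data, chunk_size=CHUNK_SIZE):
    """Split `data` into fixed-size chunks and store one entry per chunk.

    `store` is any dict-like object; it stands in for a document collection.
    """
    count = 0
    for offset in range(0, len(data), chunk_size):
        # One "document" per chunk, keyed by file name and chunk number.
        store[("chunk", filename, count)] = data[offset:offset + chunk_size]
        count += 1
    # A small metadata entry records how to reassemble the file.
    store[("file", filename)] = {
        "length": len(data),
        "chunk_size": chunk_size,
        "chunks": count,
    }


def get_file(store, filename):
    """Reassemble a file by concatenating its chunks in order."""
    meta = store[("file", filename)]
    return b"".join(
        store[("chunk", filename, n)] for n in range(meta["chunks"])
    )
```

Usage would look like `put_file(store, "cat.jpg", image_bytes)` followed by `get_file(store, "cat.jpg")`. Since each chunk is an independent entry, the lookups inside `get_file` could be issued in parallel, which is the one advantage mentioned above.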