by kornish 1930 days ago
> Too proprietary.

Every major cloud provider has a blob store. Most are S3-interface compatible, or can be fronted by something which is, like Minio.

> Can't be run locally, on robots, etc.

If the contents of your S3 keys are just line-delimited JSON files, you can easily download those files and run scripts against them or process them locally.
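As a rough illustration of the local-processing point, here is a minimal Python sketch. It assumes the file has already been downloaded and that each non-empty line holds one JSON object; the `robot` and `temp` fields and the `iter_records` helper are made up for the example.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of line-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for the contents of a downloaded file (hypothetical fields).
sample = '{"robot": "r1", "temp": 41}\n\n{"robot": "r2", "temp": 38}\n'

records = list(iter_records(sample.splitlines()))
hottest = max(records, key=lambda r: r["temp"])
```

With a real file you would pass the open file handle instead of `sample.splitlines()`, since iterating a text file also yields one line at a time.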

> Can't be transferred to other cloud providers.

Again, not true: a tool like rclone handles this case in a single command. It depends on the amount of data, but if anything it's easier to move a bunch of flat files between providers than to copy a database backup around. You do have to pay for egress bandwidth, of course.

> Also S3 is horribly slow to write or delete thousands of records at once.

If you're keeping one JSON record per S3 key, that is a blatant misuse of the tool and performance will be terrible. On the other hand, if you batch records into files of appropriate sizes, it's very cost-effective to store and query via Athena, Spark on EMR, etc.
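To make "batch records into files of appropriate sizes" concrete, here is one possible sketch in Python. It only builds the batched blobs in memory; the upload step is left out, and the `batch_by_size` helper and the tiny `max_bytes` value are assumptions chosen for the example, not a recommendation for real file sizes.

```python
import json
from typing import Iterable, Iterator, List


def batch_by_size(records: Iterable[dict], max_bytes: int) -> Iterator[bytes]:
    """Group records into line-delimited JSON blobs of at most max_bytes each.

    A single record larger than max_bytes is still emitted, alone in its blob.
    """
    buf: List[bytes] = []
    size = 0
    for rec in records:
        line = json.dumps(rec, separators=(",", ":")).encode("utf-8") + b"\n"
        # Flush the current blob before it would grow past the limit.
        if buf and size + len(line) > max_bytes:
            yield b"".join(buf)
            buf, size = [], 0
        buf.append(line)
        size += len(line)
    if buf:
        yield b"".join(buf)


# Ten tiny records, with an artificially small limit so batching is visible.
records = [{"id": i} for i in range(10)]
blobs = list(batch_by_size(records, max_bytes=30))
```

Each blob would then be written to its own key, so a query engine reads a handful of larger objects instead of one object per record.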

If you need individual record-level access by key, then you'd probably be better off with Redis or Memcached (though those will not be as good for bulk offline processing). It's all about your access patterns.

2 comments

> On the other hand, if you batch records into files of appropriate sizes

Too much work. If you're a startup, that's too much stuff to maintain. Much easier to get a hosted MongoDB and launch tomorrow.

And when you only have two months' rent in the bank and investors want demo after demo before they give you cash, tools like MongoDB do make a difference. And yes, I've been in that situation before.

Your ability to turn the smallest of technical hurdles into an insurmountable problem is truly impressive.

> Also S3 is horribly slow to write or delete thousands of records at once.

I'll add that S3 is only slow when you delete things sequentially. You can delete or write an effectively unlimited number of items in parallel.
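A rough sketch of the parallel approach in Python follows. It assumes keys are grouped into batches of up to 1,000 (the per-request limit of S3's multi-object delete) and handed to a thread pool. The `delete_batch` callable is a hypothetical stand-in for whatever actually issues the request, and the fake version below only counts keys so the sketch runs without credentials.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def chunked(keys: Sequence[str], size: int) -> List[List[str]]:
    """Split keys into consecutive batches of at most `size` items."""
    return [list(keys[i:i + size]) for i in range(0, len(keys), size)]


def delete_in_parallel(
    keys: Sequence[str],
    delete_batch: Callable[[List[str]], int],
    batch_size: int = 1000,
    workers: int = 8,
) -> int:
    """Run delete_batch over all batches concurrently; return total deleted."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(delete_batch, chunked(keys, batch_size)))


# Fake backend (assumption): records what it was asked to delete.
seen = set()
lock = threading.Lock()


def fake_delete_batch(batch: List[str]) -> int:
    with lock:
        seen.update(batch)
    return len(batch)


keys = [f"logs/part-{i:05d}.json" for i in range(2500)]
total = delete_in_parallel(keys, fake_delete_batch)
```

With boto3, `delete_batch` could wrap the client's `delete_objects` call. In practice throughput is still bounded by the service's request-rate limits, so "effectively unlimited" is best read as "limited by how many requests you can keep in flight".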