by Boxxed 3212 days ago
Completely mirrors my experience with Cassandra. I think they'd have a real contender on their hands if operating a Cassandra cluster didn't basically take a full-time engineer. Its backup story is absolutely abysmal, and the tooling is atrocious -- during a support incident a DataStax guy suggested I dump a table with sstable2json (or something like that), which generated a 100GB JSON file. When I pointed out that basically nothing could consume it because it was one 100GB hash object, he said "Yeah, I guess no one ever uses this stuff."

As a long-time Cassandra user: people use sstable2json all the time, but most people don't have 100GB sstables (or 20GB sstables that make 100GB of JSON).

Certainly something we can do better - how would you break it up? Adding a key to dump an individual partition to JSON?

It wasn't that large a database if I remember -- maybe 1TB? The sstable sizes seemed reasonable at the time; I think it was just the explosion in size from converting to JSON.

Anywhoo, one huge file is fine; what's not fine is having one huge JSON object -- streaming parsers might be ubiquitous in the XML world, but definitely not in JSON land. Something simple like small JSON documents separated by newlines would work.
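
To make the newline-separated idea concrete, here is a rough Python sketch. It is not what sstable2json does; the row shape and the helper names `write_ndjson` and `read_ndjson` are made up for illustration. The point is that each line is a complete small document, so a consumer only ever holds one row in memory.

```python
import io
import json
from typing import Iterable, Iterator


def write_ndjson(rows: Iterable[dict], out) -> int:
    """Write each row as one compact JSON document per line; return the count."""
    count = 0
    for row in rows:
        # json.dumps escapes embedded newlines, so each document stays on one line.
        out.write(json.dumps(row, separators=(",", ":")) + "\n")
        count += 1
    return count


def read_ndjson(src) -> Iterator[dict]:
    """Yield one parsed document per non-empty line, without loading the whole file."""
    for line in src:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical rows standing in for dumped partitions (shape is an assumption).
rows = [
    {"key": "user:1", "columns": {"name": "a\nb"}},
    {"key": "user:2", "columns": {"name": "c"}},
]

buf = io.StringIO()  # stands in for the dump file
written = write_ndjson(rows, buf)
buf.seek(0)
round_tripped = list(read_ndjson(buf))
```

With a real file you would pass an open file handle instead of the `StringIO`, and the reader loop works the same way on a 100GB dump as on a 1KB one, since it never needs the closing brace of some enclosing object before it can start.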

What a guy slash gal; cool!