| HN Mirror



	by Boxxed 3212 days ago
	Completely mirrors my experience with Cassandra. I think they'd have a real contender on their hands if operating a Cassandra cluster didn't basically take a full-time engineer. Its backup story is absolutely abysmal, and the tooling is atrocious -- during a support incident a DataStax guy suggested I dump a table with sstable2json (or something like that), which generated a 100GB JSON file. When I pointed out that basically nothing could consume it because it was one 100GB hash object, he said "Yeah, I guess no one ever uses this stuff."


jjirsa 3212 days ago

As a longtime Cassandra user: people use sstable2json all the time, but most people don't have 100GB sstables (or 20GB sstables that make 100GB of JSON).

Certainly something we can do better - how would you break it up? Adding a key to dump an individual partition to json?


Boxxed 3212 days ago

It wasn't that large a database, if I remember -- maybe 1TB? The sstable sizes seemed reasonable at the time; I think it was just explosion due to JSON.

Anywhoo, one huge file is fine; what's not fine is having one huge JSON object -- streaming parsers might be ubiquitous in the XML world, but definitely not in JSON land. Something simple like small JSON documents separated by newlines would work.
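
To make the newline-separated idea concrete, here is a rough Python sketch. The record shape (`key`, `cells`) and the helper names `write_ndjson` / `read_ndjson` are made up for illustration; this is not what sstable2json actually emits, just one way the "one small document per line" layout could look.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_ndjson(records: Iterable[dict], out: TextIO) -> int:
    """Write each record as one compact JSON document per line."""
    count = 0
    for record in records:
        # json.dumps escapes newlines inside strings, so a record never spans lines.
        out.write(json.dumps(record, separators=(",", ":")))
        out.write("\n")
        count += 1
    return count


def read_ndjson(src: TextIO) -> Iterator[dict]:
    """Yield one parsed record at a time instead of loading the whole file."""
    for line in src:
        line = line.strip()
        if line:  # tolerate blank lines
            yield json.loads(line)


# Hypothetical partition records (assumed shape, not real sstable2json output).
partitions = [
    {"key": "user:1", "cells": [["name", "alice", 1]]},
    {"key": "user:2", "cells": [["name", "bob\nsmith", 2]]},
]

buf = io.StringIO()
written = write_ndjson(partitions, buf)

buf.seek(0)
keys = [rec["key"] for rec in read_ndjson(buf)]
```

The point is that the reader only ever holds one line in memory, so a plain `json.loads` per line is enough and no streaming parser is needed, however big the file gets.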


jjirsa 3212 days ago

https://issues.apache.org/jira/browse/CASSANDRA-13848 Created just for you


Boxxed 3212 days ago

What a guy slash gal; cool!
