by mi100hael 3214 days ago
> Cassandra’s data model is a partitioned row store with tunable consistency where each row is an instance of a column family that follows the same schema

"Total Newbie" apparently means well-versed in database paradigms and terminology.


It's not you, it's the author. That's a horrible way to describe it.

It's partitioned: Cassandra is a clustered database that automatically routes data to the right nodes. It does this by dividing a token ring among the members of the cluster. If you need more capacity, you add nodes and they claim a share of the token ring.
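
To make the token ring idea concrete, here is a toy sketch in Python. It is not how Cassandra is implemented: the MD5-based `token_for`, the 32-bit ring, and the `ToyRing` class are all made up for illustration (real Cassandra uses its own partitioner, a much larger token space, and replication). It only shows the routing idea: hash the partition key to a token, then walk clockwise to the first node token at or after it.

```python
import bisect
import hashlib

# Hypothetical toy ring: tokens are 32-bit integers, so the ring is [0, 2**32).
def token_for(partition_key: str) -> int:
    """Hash a partition key to a position on the toy ring (illustrative only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

class ToyRing:
    """Each node owns the range ending at its token (walking clockwise)."""

    def __init__(self):
        self._tokens = []  # sorted node tokens
        self._owners = {}  # node token -> node name

    def add_node(self, name: str, token: int) -> None:
        # A new node takes over part of the range of the next node clockwise.
        bisect.insort(self._tokens, token)
        self._owners[token] = name

    def owner_of(self, partition_key: str) -> str:
        t = token_for(partition_key)
        i = bisect.bisect_left(self._tokens, t)
        if i == len(self._tokens):
            i = 0  # wrap around the ring
        return self._owners[self._tokens[i]]
```

Adding a node only moves the keys that fall into the new node's slice of the ring; every other key stays where it was, which is why adding capacity does not reshuffle the whole cluster.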

It's a row store: Cassandra groups data into partitions (see above), and the partition determines which hosts get the data. Within each partition, Cassandra sorts the CQL rows based on your schema. If you had a table of "employees", you could partition it by last initial and then cluster it by last name: all of the employees whose last name starts with "J" would be on the same machines, and they'd be sorted on disk as "Ja...", "Je...", etc.
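
The "employees" example can be sketched the same way. Again, this is a toy model in Python, not Cassandra's storage engine: `ToyEmployeeTable` and its methods are hypothetical names, and a dict of sorted lists stands in for partitions spread across hosts. The partition key is the last initial, and rows inside a partition are kept sorted by last name (the clustering column).

```python
import bisect
from collections import defaultdict

class ToyEmployeeTable:
    """Toy model: partition by last initial, cluster (sort) by last name."""

    def __init__(self):
        # partition key (last initial) -> rows kept in sorted order
        self._partitions = defaultdict(list)

    def insert(self, last_name: str, first_name: str) -> None:
        partition_key = last_name[0].upper()
        # insort keeps the partition sorted by (last_name, first_name)
        bisect.insort(self._partitions[partition_key], (last_name, first_name))

    def read_partition(self, last_initial: str):
        """Return every row in one partition, already in clustering order."""
        return list(self._partitions.get(last_initial.upper(), []))

table = ToyEmployeeTable()
table.insert("Jones", "Carol")
table.insert("Jackson", "Alice")
table.insert("Smith", "Dave")
table.insert("Jeffries", "Bob")
j_rows = table.read_partition("J")  # Jackson, Jeffries, Jones, in that order
```

Reading everyone under "J" touches a single partition and gets the rows back already sorted, which is the access pattern this kind of schema is designed around.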

Agreed. If anyone has recommended resources for someone coming from the SQL world and wanting to learn more about databases in the Cassandra and HBase space, please share!

DataStax Academy is probably the best free source.

Searching YouTube for Cassandra Summit talks is probably second.

There was a push to write better docs for the ASF website, but the available manpower is currently spent writing code instead; we have no real full-time doc writers who focus on the open source product. Maybe someday someone will volunteer (and if you want to volunteer, I'll commit the docs for you. The site has a how-to-contribute guide, but honestly I'll take GitHub PRs if they're nontrivial, even though it's an annoying workflow for our non-GitHub master).