by k_bx 3444 days ago
> Riak is not a good model since it's more of a blob store and we wanted to simply range scan through messages rather than sharding blobs (Cassandra is REALLY good at this).

Can you tell a little bit more, please? In our system, range scans are done using secondary indexes (an index on the timestamp). I'm not sure I understood the part about blobs, or the things that are specific to Cassandra. A reply would be highly appreciated.

1 comment

Cassandra uses consistent hashing. A segment of data that is addressed by a key is called a partition, and it is located by its partition key. A partition can contain just 1 "row" if you only use a single column as the key, or you can create a compound key in which one part is dedicated to finding the partition and the rest to finding individual rows within that partition.
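To make the consistent hashing part concrete, here is a rough Python sketch of how a partition key could be mapped to a node on a ring. This is only an illustration, not Cassandra's actual code: the `Ring` class, the `token` helper, and the node names are made up, MD5 is used as a stand-in hash (Cassandra's default partitioner uses Murmur3), and virtual nodes and replication are left out.

```python
import bisect
import hashlib


def token(value: str) -> int:
    # Stand-in hash for illustration only; Cassandra's default
    # partitioner uses Murmur3, not MD5.
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Hypothetical minimal hash ring: one token per node, no replicas."""

    def __init__(self, nodes):
        # Sort nodes by their position (token) on the ring.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, partition_key: str) -> str:
        # Walk clockwise to the first node whose token is >= the key's
        # token, wrapping around to the first node at the end of the ring.
        i = bisect.bisect_left(self._tokens, token(partition_key))
        return self._entries[i % len(self._entries)][1]


ring = Ring(["node-a", "node-b", "node-c"])
# Only the partition-key part of a compound key is hashed, so every row
# under "chat-42" resolves to the same owner.
print(ring.owner("chat-42"))
```

The point is that only the partition key goes through the hash, so the lookup for a given partition always lands on the same node regardless of which row inside it you want.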

If you use a compound key (multiple rows per partition), these rows are all stored in the same partition, which lives in its entirety on each node that owns or replicates that partition in the consistent hash ring, so scanning those rows is very fast and efficient.
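As a toy model of why that scan is cheap, the Python sketch below keeps the rows of each partition sorted by the rest of the key, so a range scan is a single contiguous slice of one partition. The `Table` class and its method names are hypothetical, and real Cassandra storage (memtables, SSTables) is far more involved than a sorted list.

```python
import bisect
from collections import defaultdict


class Table:
    """Hypothetical in-memory model: partition key -> rows sorted by clustering key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        # Keep each partition's rows ordered by clustering key on write.
        bisect.insort(self._partitions[partition_key], (clustering_key, value))

    def range_scan(self, partition_key, start, end):
        # Rows with start <= clustering key < end, read as one contiguous
        # slice of a single partition. No other partition is touched.
        rows = self._partitions.get(partition_key, [])
        lo = bisect.bisect_left(rows, (start,))
        hi = bisect.bisect_left(rows, (end,))
        return rows[lo:hi]


messages = Table()
messages.insert("chat-42", 1003, "see you")
messages.insert("chat-42", 1001, "hello")
messages.insert("chat-42", 1002, "how are you?")
messages.insert("chat-7", 1002, "unrelated")

# Messages in chat-42 with timestamps in [1002, 1004)
print(messages.range_scan("chat-42", 1002, 1004))
```

Here the conversation id plays the role of the partition key and the timestamp plays the role of the rest of the compound key, which is an assumption about the schema made just for this example.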