by errantmind 3214 days ago
These days I work with Cassandra on a daily basis. The company I am contracting with switched to Cassandra a while back for their primary data store. A few poor decisions later and they were spending tens of thousands of dollars a month running Cassandra in Azure. The cost was high because they modeled and queried their data like they were still using a SQL database, which was incredibly inefficient.

The lesson here is to think long and hard about how you are going to access your data before switching to a database like Cassandra. This will help you decide if Cassandra is the right database to fit your use-cases. If so, be sure to model your data appropriately.

In this case, based on how the company wants to query the data, they would have been better off with PostgreSQL.

2 comments

> The cost was high because they modeled and queried their data like they were still using a SQL database, which was incredibly inefficient.

That's literally every Cassandra database I've ever encountered in the wild.

If you use Cassandra, you WILL need to duplicate data across tables for lookups. Don't use Cassandra if you can't stomach that fact (and the disk bills that come with it).
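To make the duplication concrete, here is a rough sketch in plain Python rather than real Cassandra code. The two dicts are hypothetical stand-ins for two tables with different partition keys, and the names (`orders_by_customer`, `orders_by_product`, `record_order`) are made up for illustration. The point is only that each lookup pattern gets its own copy of the data, written at insert time.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two Cassandra tables.
# Each "table" maps a partition key to its rows, mimicking the idea that
# an efficient read addresses a single partition by its key.
orders_by_customer = defaultdict(list)  # partition key: customer_id
orders_by_product = defaultdict(list)   # partition key: product_id


def record_order(order_id, customer_id, product_id, amount):
    """Write the same order twice, one copy per access pattern."""
    row = {
        "order_id": order_id,
        "customer_id": customer_id,
        "product_id": product_id,
        "amount": amount,
    }
    # Each table stores its own copy; this is the duplication (and disk cost).
    orders_by_customer[customer_id].append(dict(row))
    orders_by_product[product_id].append(dict(row))


def get_orders_for_customer(customer_id):
    """Single-key lookup, no join and no scan of other customers."""
    return list(orders_by_customer.get(customer_id, []))


def get_orders_for_product(product_id):
    """Same data, reachable through the other key."""
    return list(orders_by_product.get(product_id, []))


record_order(1, "alice", "widget", 10)
record_order(2, "bob", "widget", 15)
record_order(3, "alice", "gadget", 7)
```

Three orders end up stored as six rows. In a relational database you would keep one orders table and add an index or a JOIN; here every new way of querying means another table to write to and keep in sync.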

Any recommendations or resources to read up on for when to use Cassandra and how to design the schema?
Use Cassandra when you need real time HA cross datacenter without having to manually fail over

Use Cassandra when you're going to need to grow your database cluster often and don't have tooling to handle resharding

Use Cassandra when you do millions of simple queries (per second), not a handful of complex JOINs

I've used Cassandra at 3 different employers now, and I can't imagine using anything else for many use cases, but there will always be some where it's the wrong choice.

I like that your comment's denormalised for better use with Cassandra.
When you need a key-value store that can replicate across multiple data centers (or AWS regions) easily, mostly consistently, and with low latency, in multi-master setups.

In all other cases you'll probably be better off with Postgres, MySQL or similar.

Using Cassandra as a key-value store is OK, but it ignores one of its legit strengths.
Check out their training:

https://academy.datastax.com/courses

Once you make it past the videos trying to sell you on NoSQL, they are incredibly informative.