by graffitici 3625 days ago
I wish the OP had also drawn a comparison between Postgres and Cassandra (which he mentions early on).

Based on the schema he writes at the beginning, it looks as though he works mostly with time-series data. My understanding is that Cassandra is particularly well-suited for these types of workloads (definitely more so than MongoDB?). If the write throughput is such that a single server can handle it, how would Postgres compare with Cassandra? What are the differences in read latency? Storage space?

Perhaps for the next article..?

3 comments

If you enjoy this sort of exploration and comparison of databases and want more, I can't recommend the book Seven Databases in Seven Weeks enough.

https://pragprog.com/book/rwdata/seven-databases-in-seven-we...

So true. I often see people choose a database without weighing the pros and cons: they consider only what they, as developers, want to work with or try out, and don't consider the nature of the data. Time series data fits best in Cassandra and some other column-oriented DBs, but definitely not in MongoDB.
Time series data is actually a great fit for MongoDB.

You create a collection for each measurement and a JSON document for each day, then preallocate the fields for each hour/minute/second. With batching you can update 100K documents a second pretty easily off one node. For example: https://blog.serverdensity.com/using-mongodb-as-a-time-serie...
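
To make the layout concrete, here is a rough sketch of what one preallocated per-day document and a single-slot update might look like. It only builds plain Python dicts and applies the update in memory, so no MongoDB server or driver is involved. The field names (`measurement`, `day`, `values`), the `_id` format, the per-minute granularity, and all three helper functions are assumptions made up for illustration, not taken from the linked post.

```python
from datetime import date, datetime


def preallocated_day_doc(measurement: str, day: date) -> dict:
    """Build one document per day with a slot for every hour and minute."""
    return {
        "_id": f"{measurement}:{day.isoformat()}",  # hypothetical key format
        "measurement": measurement,
        "day": day.isoformat(),
        # values[hour][minute] exists up front, so a write only fills a slot
        "values": {str(h): {str(m): None for m in range(60)} for h in range(24)},
    }


def update_spec(ts: datetime, value: float) -> dict:
    """Return a MongoDB-style $set update that targets one preallocated slot."""
    return {"$set": {f"values.{ts.hour}.{ts.minute}": value}}


def apply_update(doc: dict, spec: dict) -> None:
    """In-memory stand-in for the server applying a dotted-path $set."""
    for path, value in spec["$set"].items():
        *parents, leaf = path.split(".")
        target = doc
        for key in parents:
            target = target[key]
        target[leaf] = value


doc = preallocated_day_doc("cpu_load", date(2015, 1, 1))
apply_update(doc, update_spec(datetime(2015, 1, 1, 13, 37), 0.42))
print(doc["values"]["13"]["37"])
```

Against a real server, the dict from `update_spec` would be the update argument of a normal update call on the day's document, and many of them could be sent together as a batch.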

I dislike MongoDB as much as the next guy, but can people explain why this comment is downvoted?
Because haters gonna hate
I read that more as what he normally uses for tests and demos since it can be really easily generated using native SQL functions, and not necessarily his day job. Not that it matters much, I guess, and I agree it'd be neat to see that compared to Cassandra.
That's pretty much it. No time series just yet, just an existing Mongo set that I'm trying to wrap my head around, and that means picking it up ASAP.

I've messed around with Cassandra, but not in much depth. I ran into the way it treats NULL and the requirement that all searchable fields be indexed, and put it on the back burner for later.