Hacker News new | ask | show | jobs
by al_james 2985 days ago
A great article, and I am a big fan of algolia, Citus and Redshift. However this article ends up making an odd apples to oranges comparison.

They state that "However, achieving sub-second aggregation performances on very large datasets is prohibitively expensive with RedShift", this suggests that they want to do sub-second aggregations across raw event data. However, later in the article, the solution they build is to use rollup tables for sub-second responses.

You can also do rollup tables in Redshift, and I can assure you (if you enable the fast query acceleration option) you can get sub-second queries from the rolled up lower-cardinality tables. If you want even better response times, you can store the rollups in plain old Postgres and use something like dblink or postgres_fdw to perform the periodic aggregations on Redshift and insert into the local rollup tables (see [1]). In this model the solution ends up being very similar to their solution with Citus.... and I would predict that this is cheaper than Citus Cloud as Redshift really is a great price point for a hosted system.

So the question of performing sub-second aggregations across the raw data remains unanswered... however that really is the ideal end game as you can then offer way more flexibility in terms of filtering than any rollup based solution.

Right now, research suggests Clickhouse, Redshift or BigQuery are probably the fastest solutions for that. Not sure about Druid, I dont know it. GPU databasees appear to the be the future of this. I would be interested to see benchmarks of Citus under this use case. I should imagine that Citus is also way better if you have something like a mixed OLAP and OLTP workload (e.g. you need the analytics and the row data to match exactly at all times).

Aside: It would be great to see Citus benchmarked against the 1.1 billion taxi rides benchmark by Mark Litwintschik. Any chance of that?

[1] https://aws.amazon.com/blogs/big-data/join-amazon-redshift-a... [2] http://tech.marksblogg.com/benchmarks.html

2 comments

Similar to your point about mixed workloads, I have a hunch that Mark's benchmarks are not comprehensive enough to correlate well to real-world usage across a lot of different scenarios, even on pure OLAP workloads. It's great that a billion rows can be aggregated in 0.02 seconds, but there's a reason TPC-H uses 9(-ish?) different queries with varying aggregations and joins, vs. these benchmarks on a single table. (Of course, if your use case is heavy on a specific type of aggregation, it probably makes sense to optimize for that at the expense of other query performance.)

And - perhaps I missed it, but his benchmarks don't seem to utilize rollup/materialization unless the DB does it automatically (or at least easily) on the backend.

As is, it's almost certain that Citus would underperform most of the leaders here. The PG9.5 benchmark actually uses the Citus-developed cstore_fdw extension, and it shows up towards the bottom, albeit running on a single node with hardware a few CPU generations old. (Same as used for the Clickhouse benchmark.) I am curious how Citus/Postgres might perform using the HLL / TopN extensions, though.

Also of note is his Redshift benchmark was run on magnetic drives on ds2 instances, not SSDs. Using those would almost certainly bump performance up a bit.

Druid is optimized for aggregation and filtering, and is somewhat similar to BigQuery on the backend, as I understand it. The Cloudflare blog posted elsewhere in the thread covers it briefly. https://blog.cloudflare.com/how-cloudflare-analyzes-1m-dns-q... More on its indexing strategy here: https://hortonworks.com/blog/apache-hive-druid-part-1-3/

Druid's downsides are: more complex deploy and operational needs due to architectural complexity, lack of full SQL support, limited fault tolerance on the query execution path, and the whole query being bottlenecked by the slowest historical data access.

More here: https://medium.com/@leventov/the-problems-with-druid-at-larg...

That's a fair point. Indeed we started looking at doing aggregations across raw events, before realizing this was probably ill fated.

It's very possible we could have done the same with RedShift but it didn't seem obvious how. With Citus offering extensions like topn and hll we however quickly saw how that could work for us.

Thanks for the link btw!

Yeah, thats a good point. Redshift does not have the same level of 'probabilistic counting' functions, that can be used from rollups. Redshift does have HLL (SELECT APPROXIMATE COUNT(*)) however that can only be applied when scanning the full data, I am not sure its possible to store a HLL object in a rollup and later aggregate them.