by BFLpL0QNek 2488 days ago
Do you have any numbers to back up the claim that Flink is faster than KStreams, and under what scenarios?

I am genuinely interested, as I use KStreams a lot, but the engineering discipline in the API leaves a lot to be desired and I'd be more than happy to switch if Flink is that much better.


Here's a benchmark of KStreams and Flink [1]. Note that the Flink vs. Spark comparison is disputed [2], but both Flink and Spark are several orders of magnitude faster than KStreams. This is inevitable given KStreams' architecture: it stores all its state in Kafka rather than in a data store with data structures optimized for the use case, and it doesn't do much coordination among workers. KStreams is there if you want streaming semantics on top of a small-ish Kafka topic you own but don't care too much about perf. Deploying and maintaining Flink is a much bigger hassle than KStreams: you need DevOps support to get Flink running, whereas KStreams runs (albeit quite slowly) inside your application with no new state store needed.

Confluent has a good discussion of the ownership issue (DevOps for Flink, devs for KStreams) here [3], though they seriously downplay the huge gap in perf.

[1] https://databricks.com/blog/2017/10/11/benchmarking-structur...

[2] https://www.ververica.com/blog/curious-case-broken-benchmark...

[3] https://www.confluent.io/blog/apache-flink-apache-kafka-stre...

Hmm, I found this more recent benchmark, where Flink was still faster but KStreams' performance was much closer than in the 2017 benchmarks.

I guess KStreams' performance improved over time? Or is the benchmark design just different?