| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by sitkack 1135 days ago

I agree, sounds like a good approach.

Can the instance do 2 GB/s to disk at the same time it is doing 3.1GB/s across the network? Is that bidirectional capacity or on a single direction? How many threads does it take to achieve those numbers?

That is kind of a nice property, that the network has 50% more bandwidth than the disk. 2x would be even nicer, but that turns out to be 1.5 and 3, so a slight reduction in disk throughput.

Are you able to run a single RP Kafka node and blast data into it over loopback? That could isolate the network and see how much of the available disk bandwidth a single node is able to achieve over different payload configurations before moving on to a distributed disk+network test. If it can only hit 1GB/s on a single node, you know there is room to improve in the write path to disk.

The other thing that people might be looking for when using RP over AK is less jitter due to GC activity. For latency sensitive applications this can be way more important than raw throughput. I'd use or borrow some techniques from wrk2 that makes sure to account for coordinated omission.

https://github.com/giltene/wrk2

https://github.com/fede1024/kafka-benchmark

https://github.com/fede1024/rust-rdkafka