by dzello 3748 days ago
Great post! Makes a lot of sense to move I/O to a thread pool. Do you now control the topology's max I/O with the thread pool size, in addition to Storm settings like the number of executors? Another question: do you monitor how long each bolt blocks waiting for a thread to become available (as a measure of pool saturation)?

Hi

The max thread pool size definitely affects the maximum I/O we can do, and we've tuned it based on perf testing the memcache/Cassandra clients in isolation.

However, the bottom line is that we are still limited by the max spout pending or the max number of queries we can run concurrently. There will be spikes that might use up all the I/O threads, but they are short-lived because the topology is doing other CPU-intensive operations at the same time.

In terms of blocking for each bolt, we don't block: if the I/O thread pool is saturated, the bolt tries to execute the operation in the current thread. This way, pending tuples queue up in Storm's receive queues, which creates backpressure when we have too much work outstanding.
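
For illustration, here is a minimal Java sketch of that "run it in the current thread when the pool is saturated" fallback. It assumes a plain java.util.concurrent ThreadPoolExecutor with no task queue and the standard CallerRunsPolicy; the class and method names (CallerRunsIoPool, newIoPool, threadUsedWhenSaturated) are made up for the example, and the actual topology may implement the fallback differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CallerRunsIoPool {

    // Hypothetical helper: a bounded I/O pool with no task queue.
    // SynchronousQueue hands tasks straight to an idle worker, so when all
    // workers are busy the task is rejected, and CallerRunsPolicy then runs
    // it on the submitting thread (here that would be the bolt's thread).
    static ThreadPoolExecutor newIoPool(int maxThreads) {
        return new ThreadPoolExecutor(
                maxThreads, maxThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Saturates a one-thread pool, submits one more task, and returns the
    // name of the thread that ended up executing that extra task.
    static String threadUsedWhenSaturated() throws InterruptedException {
        ThreadPoolExecutor pool = newIoPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy the only I/O thread until we release it.
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();

            // The pool is now saturated, so this task runs synchronously
            // in the calling thread instead of blocking or being queued.
            String[] ranOn = new String[1];
            pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
            return ranOn[0];
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("caller thread:       " + Thread.currentThread().getName());
        System.out.println("overflow task ran on: " + threadUsedWhenSaturated());
    }
}
```

Because the overflow task occupies the submitting thread, that thread cannot pull its next tuple until the I/O finishes, which is what lets tuples pile up in the receive queue instead of in an unbounded executor queue.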

Hope this helps! Manu

Cool idea for how to still let Storm apply its own backpressure, thanks for sharing!