Hacker News
by agibsonccc 4724 days ago
I can vouch for Storm, if only because it's pretty easy to set up (especially compared to Hadoop). Being able to leverage ZooKeeper also gives you some extra coordination capabilities. That said, just watch how you build your bolts/spouts. There are lots of ways to send data into the system, but in general Storm's documentation has been superb to work with.

I built a mini library for myself that auto-constructs topologies from a set of named dependencies, so it handles the bolt/spout wiring. Aside from that, the builder interface for topologies is really nice if your data pipeline doesn't change.
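
To give a rough idea of what wiring from named dependencies can look like, here is a minimal sketch in plain Python. It is not the library mentioned above and it does not call Storm at all; the function name wiring_plan and the component names are hypothetical, chosen only for illustration. It assumes each component is declared with the names of the components it reads from, treats anything with no upstream as a spout, and orders everything so that a component is always listed after its sources.

```python
def wiring_plan(dependencies):
    """Return (name, kind, upstream_names) tuples, ordered so that every
    component appears after all of the components it reads from.

    `dependencies` maps a component name to the list of names it consumes.
    This is a hypothetical helper for illustration, not part of Storm.
    """
    # Reject references to components that were never declared.
    for name, upstream in dependencies.items():
        for parent in upstream:
            if parent not in dependencies:
                raise ValueError(
                    f"{name!r} depends on unknown component {parent!r}"
                )

    # Track which upstream components each entry is still waiting on.
    remaining = {name: set(up) for name, up in dependencies.items()}
    plan = []
    while remaining:
        # Components whose sources have all been placed already.
        ready = sorted(n for n, pending in remaining.items() if not pending)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        for name in ready:
            # Assumption: no upstream means a spout, otherwise a bolt.
            kind = "spout" if not dependencies[name] else "bolt"
            plan.append((name, kind, sorted(dependencies[name])))
            del remaining[name]
        for pending in remaining.values():
            pending.difference_update(ready)
    return plan


if __name__ == "__main__":
    # Hypothetical word-count style pipeline.
    example = {
        "report": ["count", "split"],
        "count": ["split"],
        "split": ["sentences"],
        "sentences": [],
    }
    for name, kind, upstream in wiring_plan(example):
        print(kind, name, "<-", upstream)
```

Each entry in the plan could then be turned into the matching setSpout or setBolt call on Storm's TopologyBuilder, with one grouping per upstream name. Which grouping to use (shuffle, fields, and so on) is a per-pipeline decision that this sketch leaves out.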

There's good support for testing with a local cluster as well.


Thanks for your suggestion. Can you recommend any specific reading on building bolts/spouts to send data into the system?

Thanks

Here's the root wiki: https://github.com/nathanmarz/storm/wiki

Here's the system architecture: https://github.com/nathanmarz/storm/wiki/Concepts

Here's the page on using non-JVM languages (specifically Python) to build spouts/bolts: https://github.com/nathanmarz/storm/wiki/Using-non-JVM-langu...

Here's an example project: https://github.com/nathanmarz/storm-starter

Thanks!