by iskander 4409 days ago
I'd love to learn more about how they're using Spark. Are there any blog posts or tech talks floating around?

Here's a talk at Hakka Labs given by an Ooyala engineer (@evanfchan), which is how I knew they used Spark: https://www.youtube.com/watch?v=PjZp7K5z7ew and the accompanying slides: http://www.slideshare.net/planetcassandra/south-bay-cassandr...

They use Spark on top of Cassandra, and they also use Shark, Spark's version of Hive.

Thanks for posting this. I'm starting to get a feel for when Spark is usable: you need an underlying indexed data store that lets you fetch small subsets of your data into RDDs (or your data has to be tiny to begin with). We've been trying to use Spark on input sizes which, while smaller than our cluster's available memory, are probably too big for Spark to handle (> 1TB).
These guys look to be doing some nice work integrating Cassandra and Spark: http://blog.tuplejump.com/ They've piggybacked on the Cassandra clustering, using a Java agent to run the Spark masters. There doesn't seem to be a release available yet, though.