Did you consider Summingbird? It seems like a lot of what you are doing could be simplified by using Summingbird rather than building separate speed and batch layers in Hadoop / Spark.
That's a great point for our next session - Summingbird would be a good fit for anyone trying to implement a Lambda Architecture without having to write separate code for streaming and batch.