We don't use Camus; IIRC, it didn't exist at the time that we built most of the infrastructure. We just read data directly out of Kafka using client libraries.
Apologies for the delayed response; I'm guessing you won't see this, but without Camus in place, did you do anything to ensure exactly-once semantics when moving the data to the batch layer?
For the real-time layer, I don't see it as mission critical for most data sets to be 100% correct, but for the ETL part of the process, the guarantees provided by Camus (ensured, I believe, by MapReduce's OutputCommitter mechanism) are invaluable.
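To make the question concrete, here is a rough sketch of the kind of commit protocol I have in mind. It is not Camus's actual code and does not use any Kafka client: a plain Python list stands in for a partition, and the names `next_offset`, `commit_batch`, and `run_once` are made up for illustration. The idea is to write each batch to a temporary file, then publish it with an atomic rename whose file name records the offset range, so a restart can work out where to resume from the committed files alone. It assumes the target filesystem offers an atomic rename.

```python
import json
import os
import tempfile


def next_offset(out_dir):
    """Next offset to consume, recovered from the names of committed files."""
    ends = [int(name[:-len(".jsonl")].split("_")[1])
            for name in os.listdir(out_dir) if name.endswith(".jsonl")]
    return max(ends, default=0)


def commit_batch(out_dir, start, records):
    """Write records for offsets [start, end) to a temp file, then publish it."""
    end = start + len(records)
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
        f.flush()
        os.fsync(f.fileno())
    final_path = os.path.join(out_dir, f"{start:020d}_{end:020d}.jsonl")
    # The rename is the commit point: data and offsets become visible together.
    os.replace(tmp_path, final_path)
    return end


def run_once(out_dir, log, batch_size):
    """Consume one batch from `log` (a list standing in for a partition)."""
    start = next_offset(out_dir)
    batch = log[start:start + batch_size]
    if batch:
        commit_batch(out_dir, start, batch)
    return len(batch)
```

If the job dies before the rename, only a stray `.tmp` file is left behind and the batch is simply re-read on the next run; if it dies after, the file name already carries the new offset, so nothing is written twice.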