by javiermaestro
2440 days ago
I'm not following. I understood from your first comment that you think the amount of data is low ("underwhelming") and from your last comment that it's a lot ("that much data"). In any case, the data is "whatever needs to be logged". And it's not "server logs", which is what I'm interpreting from your comment.

Scribe transports most data at Facebook to be processed by real-time systems (e.g. Puma, Scuba) and also "batch systems" (data warehouse). So, it's quite a lot, being "the ingestion pipe" for Facebook. Does this answer your question? :-?

Puma: https://research.fb.com/publications/realtime-data-processin...
Scuba: https://research.fb.com/publications/scuba-diving-into-data-...
I see. I walked away from the article with the impression that it was meant to be a log aggregation service à la Flume, Splunk, or Logstash.
> the amount of data is low ("underwhelming") and from your last comment that it's a lot ("that much data").
I was remarking on the numbers with regard to generation, not consumption. Based on the article, my point was that generating 2.5TB/s of transactional logs and telemetry data using "millions" of machines would be technically possible but not reasonably practical... and thus likely not real ;). But you corrected my understanding: that number is based on a different use case.