by DarkContinent
703 days ago
> Batch is just a special case of streaming

No. Designing a system that is always up and running and processes small amounts of data constantly is a completely different problem from designing a system that runs occasionally on a lot of data. For one thing, your output formats are usually different in the latter case (maybe you're creating a PDF, for example). Also, the high-availability requirement alone changes things at the design level.

Finally, the author claims it's not hard to switch between batch and streaming. With a large volume of preexisting data, this is just not true. For example, if you make a REST API call for each document in a DB, loading it all can take days or months. If batching documents together isn't a possibility, how do you move data between stores easily? (This kind of data movement is often required when switching between batch and streaming.)
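
To put rough numbers on the one-REST-call-per-document point, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the document count, the per-call latencies, the batch size, and the helper name `estimate_load_days` are all made up, and it assumes calls are issued one after another with no parallelism.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate_load_days(num_docs: int, seconds_per_call: float, docs_per_call: int = 1) -> float:
    """Estimate wall-clock days to move num_docs using strictly sequential calls.

    Hypothetical helper: ignores retries, rate limits, and parallelism.
    """
    num_calls = math.ceil(num_docs / docs_per_call)
    return num_calls * seconds_per_call / SECONDS_PER_DAY


# Assumed workload: 100 million documents already sitting in the DB.
NUM_DOCS = 100_000_000

# Assumed 50 ms round trip when sending one document per REST call.
one_by_one_days = estimate_load_days(NUM_DOCS, seconds_per_call=0.05)

# Assumed 200 ms round trip when a (hypothetical) bulk endpoint takes 1,000 documents per call.
batched_days = estimate_load_days(NUM_DOCS, seconds_per_call=0.2, docs_per_call=1_000)

if __name__ == "__main__":
    print(f"one document per call: {one_by_one_days:.1f} days")
    print(f"1,000 documents per call: {batched_days * 24:.1f} hours")
```

Under those assumed numbers, the one-at-a-time load comes out to roughly 58 days, while the batched version finishes in under 6 hours. The exact figures don't matter much; the gap is the point, and it only closes if batching is actually available.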