Ask HN: Is ETL (data integration in batch processing mode) really dead?
9 points
by srigan
3483 days ago
I recently came across this presentation: https://www.infoq.com/presentations/etl-streams?utm_source=infoq&utm_medium=popular_widget&utm_campaign=popular_content_list&utm_content=homepage
Should every data integration or data processing pipeline be built on a stream processing architecture, even when there is no need for one from day zero? The argument I hear for doing so is that we might need real-time processing in the future. I would like to hear what others think.
SOAP and CSV are not sexy. They have plenty of shortcomings. However, those are the formats that are used in the real world today (and for some time to come).
Stream processing is a very useful design pattern, but like any design pattern it should be used carefully and only where appropriate (see: Microservices).
If I were to build a new complex ERP from the ground up, I'd be remiss not to use something like Kafka or Confluent for data processing.
If I want to communicate with legacy systems, though, that's an entirely different matter. The same applies when targeting SMBs. You'd have a hard time explaining to small business owners why they suddenly need a newfangled stream processing architecture when their old "Export CSV and load that into Excel" process has worked just fine.
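To make the batch side concrete, here is a rough sketch of the "export CSV and load it somewhere" kind of job, using only the Python standard library. Everything specific in it is an assumption for illustration: the column names (name, amount), the invoices table, and the helper names transform and run_batch are hypothetical, and SQLite stands in for whatever the real target would be.

```python
import csv
import io
import sqlite3


def transform(row):
    """Normalize one exported record (hypothetical columns: name, amount)."""
    return (row["name"].strip().title(), round(float(row["amount"]), 2))


def run_batch(csv_file, conn):
    """Extract rows from a CSV export, transform them, and load them in one transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS invoices (name TEXT, amount REAL)")
    rows = [transform(r) for r in csv.DictReader(csv_file)]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO invoices VALUES (?, ?)", rows)
    return len(rows)


if __name__ == "__main__":
    # Stand-in for a file exported by a legacy system.
    export = io.StringIO("name,amount\n  acme corp ,19.990\nbolt ltd,5\n")
    conn = sqlite3.connect(":memory:")
    print(run_batch(export, conn), "rows loaded")
    print(conn.execute("SELECT name, amount FROM invoices").fetchall())
```

Since transform only ever sees a single record, the same function could in principle be called from a stream consumer later if a real-time need does show up. The batch job itself stays a few dozen lines with nothing extra to operate.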