by zeta0x10
2304 days ago
I don't think he is that wrong. Putting Spark aside, the remaining terms are pretty broad. Data lakes: is this our fancy new term for "data"? What distinguishes a "data lake" from "data"? ETL: I would guess 95% of application programs take input, parse it (extract), do some data wrangling (transform), and save the result somewhere else (load). Validation: again, a broad term. Do you mean validation of statistical models? Without validation your predictions are worthless, so I guess it is a standard thing to do if you want to do any kind of machine learning. Schema management and data catalogs: standard DB stuff, I would say. We just like to define new job descriptions. It's the same with DevOps, which seems to be the new term for system administrator.