by aliqot 1413 days ago
None of this really drives home why a new term was necessary. I'm still seeing "dataset".
3 comments

Because "dataset" doesn't tell you whether the data was ETL'd or ELT'd; "data warehouse" and "data lake" do.

Now just wait till you hear someone reference "data lake-house" ...

A quick web search gave me:

> A data warehouse is a repository for structured, filtered data that has already been processed for a specific purpose.

Structured logs that have been filtered by their relevance to security really seem to fit the definition. If we must use newspeak, "log warehouse" then?

Think of it as a "raw dataset" where all the data lands first and only copies of it are modified.
Someone's processed data is someone else's raw input.

Here, these are logs that were already filtered by their relevance to security and exported as structured data. Considering them "raw unstructured data" because you haven't personally done ETL on them seems wrong.

That's because it's just marketing buzzword bingo.