


	by luckydata 2313 days ago
	I don't think anyone ever suggested that. The use case for a data lake is precisely the one you describe: it lets you start collecting data without doing a lot of work up front, before you know how you actually want to structure things. It allows for schema evolution too. It's not a panacea, just a way to avoid the inertia most large data projects have.
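
	To make the "collect first, structure later" point concrete, here is a minimal Python sketch of applying a schema at read time. The event records, field names, and the `read_with_schema` helper are all hypothetical, made up for illustration. A real lake would sit on object storage with a query engine on top, not an in-memory list.

```python
import json

# Hypothetical raw events, landed as-is with no upfront modelling.
# The second record carries a field ("country") that the first lacks,
# i.e. the schema evolved after collection started.
raw_lines = [
    '{"user": "a", "clicks": 3}',
    '{"user": "b", "clicks": 5, "country": "DE"}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time; fields absent from a record become None."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


# The structure is decided only now, when the data is actually queried.
rows = list(read_with_schema(raw_lines, ["user", "clicks", "country"]))
for row in rows:
    print(row)
```

	Older records stay readable after a new field shows up, which is the schema evolution part. Cleaning and joining the data is still work someone has to do later.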

1 comments

towelpluswater 2313 days ago

Nobody here suggested it, just something I see organizations doing quite often.

(edit: the rationale behind this tends to be that you can avoid the heavy lifting of ETL/transformation logic by just using a data lake - obviously not the case, as most of us know)


threeseed 2313 days ago

I've worked on nearly a dozen Data Lakes. I have never seen nor heard of anyone who said that Data Lakes meant you could avoid ETL. If anything it has necessitated more of it as users expect to join these disparate data sets.

There is, after all, a reason that the Data Engineer role became popular just as Data Lakes became popular.


towelpluswater 2312 days ago

Just means we have different anecdotal experience, then. Very little of mine has been in the tech industry.
