by mping 1681 days ago
A data lake is a system designed for ingesting, and possibly transforming, lots of data: a "lake" where you dump your data. This is different from, e.g., a Postgres db (the single source of truth for a CRUD app, for example), because it captures more data (e.g. events) and is normally not consistent with that single source of truth (the data may arrive in batches, be imported from other databases, etc.). Because the volume of data is normally huge, you need a cluster to store it and some way of querying it.
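
To make the "dump now, query later" idea concrete, here is a toy sketch in Python. It is not how any real product works: a local temp directory stands in for the cluster, and the names (ingest_batch, count_events_by_type, the source and batch ids, the event fields) are all made up for illustration. Batches from different sources land as raw JSON-lines files, and a query is just a scan that interprets the data at read time.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def ingest_batch(lake_root: Path, source: str, batch_id: str, events: list[dict]) -> Path:
    """Write one batch of raw events as a JSON-lines file under its source folder."""
    target_dir = lake_root / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{batch_id}.jsonl"
    with target.open("w", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")
    return target


def count_events_by_type(lake_root: Path) -> Counter:
    """Query by scanning every file; the shape of the data is only interpreted here."""
    counts = Counter()
    for path in sorted(lake_root.rglob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                counts[json.loads(line).get("type", "unknown")] += 1
    return counts


def demo() -> Counter:
    # A temp directory plays the role of the lake's storage (hypothetical layout).
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        # Events captured from an app, arriving as a batch.
        ingest_batch(lake, "web_events", "batch-001",
                     [{"type": "click", "user": 1}, {"type": "view", "user": 2}])
        # Rows imported from another database, with a different shape.
        ingest_batch(lake, "db_export", "batch-001",
                     [{"type": "order", "id": 10}, {"type": "click", "user": 3}])
        return count_events_by_type(lake)


if __name__ == "__main__":
    print(demo())
```

Note that nothing here keeps the lake in sync with the app's database; the files are whatever was dumped at ingest time, which is the consistency gap mentioned above.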

Snowflake and Databricks are companies that operate in this space, providing ways to ingest, transform, and analyze large volumes of data.