Hacker News new | ask | show | jobs
by nikita 39 days ago
Lakebase is referring to the fact that in addition to disaggregated storage s3 is authoritative storage for older data.

Since data is on s3 (or lake) you can perform direct to s3 type operations like data loading, reading this data by engines that are not Postgres and more

1 comments

  > in addition to disaggregated storage s3 is authoritative storage for older data
Suppose a person retrives cold data from another Object Storage protocol rather than S3. This is no longer a "Lakebase", so we have to come up with a different name to avoid confusion.

But if you say "Disaggregated Storage on S3" then you have the flexibility to change that to "Disaggregated Storage on FOOBAR" to avoid confusion.

> Suppose a person retrives cold data from another Object Storage protocol rather than S3. This is no longer a "Lakebase", so we have to come up with a different name to avoid confusion.

I've never seen "lake" or adjacent terminology refer to S3 specifically like that vs other object storage. A data lake on Ceph would still be a data lake.

(My quibble would be that "lake" often refers to inconsistent or unstructured, and itself has always been a bit handwavy compared to "warehouse," whereas this is very structured data on object storage.)

Yes.

Maybe I’m wrong, but AFAICT this is block (page) storage backed by S3, tuned for Postgres with some paxos-linked storage/caching servers sitting in front? Sounds good, but I’m not sure “lake” or “warehouse” is a word I’d choose… much closer to Litestream-with-reads, or the somewhat-famous “I ran out of RAM so I downloaded some more” blog article.