HN Mirror



	by akarve 2503 days ago
	Not quite ;) S3 is the primary data and metadata store, so the rest of the stack (including Elastic) is a pure function of the data in S3. We don't use git at all yet. We use S3 object versioning and capture each object's version ID, SHA-256, ETag, etc. in a JSONL-based manifest: https://open.quiltdata.com/b/quilt-example/tree/.quilt/packa.... That JSONL manifest is simply a "locked list" of all the S3 objects in the package. The same manifests can be checked into git to fork and merge data sets, but we're still exploring the right way to do that. I'll let Kevin answer the database fragments question.
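To make the "locked list" idea concrete, here is a minimal sketch of a JSONL manifest in Python. It is not Quilt's actual schema: the field names (`logical_key`, `physical_key`, `sha256`, `etag`, `size`), the helper names, and the bucket and version ID are all hypothetical, chosen only to show one row per pinned object version.

```python
import hashlib
import json


def manifest_entry(logical_key, bucket, key, version_id, data, etag):
    """Build one manifest row that pins a single S3 object version.

    All field names here are assumptions for illustration, not Quilt's schema.
    """
    return {
        "logical_key": logical_key,
        # The version ID in the URI is what "locks" the row to one object version.
        "physical_key": f"s3://{bucket}/{key}?versionId={version_id}",
        "sha256": hashlib.sha256(data).hexdigest(),
        "etag": etag,
        "size": len(data),
    }


def to_jsonl(entries):
    """Serialize entries as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in entries) + "\n"


def from_jsonl(text):
    """Parse a JSON Lines manifest back into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Because each row is a self-contained line of text, a manifest like this diffs cleanly, which is roughly why checking it into git for fork/merge is plausible.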

1 comments

raghava 2502 days ago

Neat. But wouldn't this build a dependency on S3's versioning and make it hard to port to other clouds?


akarve 2501 days ago

Not quite. Abstraction layers like min.io support versioning. More importantly, Quilt manifests only require a "fully qualified physical key" that points to the data. In theory, the manifest can work with any URI: S3, local disk, etc.
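As a rough illustration of why a URI-based physical key is portable, the sketch below dispatches on the URI scheme using only the Python standard library. The function name `describe_physical_key`, the returned dict shape, and the `versionId` query parameter convention are assumptions for the example, not Quilt's API.

```python
from urllib.parse import parse_qs, urlparse


def describe_physical_key(uri):
    """Split a physical key URI into the parts a storage backend would need.

    Hypothetical helper: supports s3:// and file:// only, as an example.
    """
    parts = urlparse(uri)
    if parts.scheme == "s3":
        # An optional versionId query parameter pins a specific object version.
        version_id = parse_qs(parts.query).get("versionId", [None])[0]
        return {
            "store": "s3",
            "bucket": parts.netloc,
            "key": parts.path.lstrip("/"),
            "version_id": version_id,
        }
    if parts.scheme == "file":
        return {"store": "local", "path": parts.path}
    raise ValueError(f"unsupported physical key scheme: {parts.scheme!r}")
```

Under these assumptions the manifest itself never needs to know which store it points at; only the code that resolves the key does.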
