by chrislusf 860 days ago
> How is that not mmap?

The allocated storage is append only. For updates, just allocate another blob. The deleted blobs would be garbage collected later. So it is not really mmap.
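
As a rough illustration of the append-only behaviour described above, here is a minimal Python sketch. It is a toy model, not SeaweedFS code: the class `AppendOnlyStore` and its methods `allocate`, `read`, `delete`, and `compact` are hypothetical names chosen for this example.

```python
class AppendOnlyStore:
    """Toy append-only blob store; hypothetical, not SeaweedFS code."""

    def __init__(self):
        self.log = bytearray()   # stands in for an append-only volume file
        self.index = {}          # blob id -> (offset, size) of live blobs
        self.next_id = 1

    def allocate(self, data):
        """Append a new blob and return its id; nothing is overwritten."""
        blob_id = self.next_id
        self.next_id += 1
        self.index[blob_id] = (len(self.log), len(data))
        self.log += data
        return blob_id

    def read(self, blob_id):
        """One index lookup, then one contiguous read."""
        offset, size = self.index[blob_id]
        return bytes(self.log[offset:offset + size])

    def delete(self, blob_id):
        """Drop the blob from the index; its bytes stay until compact()."""
        del self.index[blob_id]

    def compact(self):
        """Garbage collection: copy only live blobs into a fresh log."""
        new_log = bytearray()
        new_index = {}
        for blob_id, (offset, size) in self.index.items():
            new_index[blob_id] = (len(new_log), size)
            new_log += self.log[offset:offset + size]
        self.log, self.index = new_log, new_index


store = AppendOnlyStore()
v1 = store.allocate(b"hello")
v2 = store.allocate(b"hello, world")  # an "update" is just another blob
store.delete(v1)                      # the old version becomes garbage
store.compact()                       # space is reclaimed later
```

Unlike mmap, nothing here modifies bytes in place: an update produces a new blob id, and the old bytes only disappear at compaction.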

> Also what is the difference between a file, an object, a blob, a filesystem and an object store?

The answer would be too long to fit here. Maybe ChatGPT can help. :)

> Is all this just files indexed with sql?

Sort of yes.


I, too, am interested in your views on the last 2 questions, since your views, not ChatGPT's, are what informed the design. Part of learning from others' designs [0] is understanding what the designers think about their own design, and how they arrived at it.

Would you mind elaborating on them? HN gives a lot of space, and I'm confident you can find a way to summarize without running out, or sounding dismissive (which is what the response kind of sounds like now).

0 – https://aosabook.org/en/

The blob storage is what SeaweedFS is built on. Every blob access takes O(1) network and disk operations.

Files and S3 are higher layers above the blob storage. They require metadata to manage the blobs, and other metadata for directories, S3 access, etc.

This metadata usually sits on the same disks as the files. But in highly scalable systems, the metadata has dedicated stores, e.g., Google's Colossus, Facebook's Tectonic, etc. The SeaweedFS file system layer is built as a web application that manages the metadata of blobs.

Actually, the SeaweedFS file system implementation is just one way to manage the metadata. There are other possible variations, depending on requirements.
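
To make the layering concrete, here is a small Python sketch of a file layer that only manages metadata on top of a separate blob layer. It is a simplified illustration under my own assumptions, not the SeaweedFS implementation: `BlobStore`, `FileLayer`, and all their methods are hypothetical names.

```python
class BlobStore:
    """Stand-in for the blob layer: opaque ids mapped to bytes."""

    def __init__(self):
        self._blobs = {}
        self._next_id = 1

    def put(self, data):
        blob_id = self._next_id
        self._next_id += 1
        self._blobs[blob_id] = data
        return blob_id

    def get(self, blob_id):
        return self._blobs[blob_id]


class FileLayer:
    """Stand-in for the metadata layer: paths mapped to blob ids."""

    def __init__(self, blob_store):
        self.blob_store = blob_store
        self.entries = {}  # path -> blob id; could live in any database

    def write(self, path, data):
        self.entries[path] = self.blob_store.put(data)

    def read(self, path):
        return self.blob_store.get(self.entries[path])

    def rename(self, old, new):
        # Metadata-only operation: the blob itself is never touched.
        self.entries[new] = self.entries.pop(old)

    def listdir(self, directory):
        # Everything stored under the given directory prefix.
        prefix = directory.rstrip("/") + "/"
        return sorted(p for p in self.entries if p.startswith(prefix))


fs = FileLayer(BlobStore())
fs.write("/docs/a.txt", b"alpha")
fs.write("/docs/b.txt", b"beta")
fs.rename("/docs/a.txt", "/docs/c.txt")
```

Because the path table is independent of the blob layer, it could be swapped for a different metadata store without changing how blobs are read or written.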

There are a couple of slides on the SeaweedFS GitHub README page. You may find more details there.

Thank you, that was very informative. I appreciate your succinct, information-dense writing style, and after reviewing the documentation, I appreciate it there, too.
You made the claim:

> what makes it different is a new way of programming for the cloud era.

but you aren't even explaining how anything is different from what a normal file system can do, let alone what makes it a "new way of programming for the cloud era".

Sorry it was not so clear. Previously, fallocate just allocated disk space on a local server. Now SeaweedFS can allocate a blob on remote storage.
What is the difference between a blob and a file and what is the difference between allocating a blob on remote storage or a file on remote storage?
A large file can be chunked into blobs.
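
A minimal Python sketch of the chunking idea, assuming fixed-size chunks. The function name `split_into_blobs` and the chunk size are hypothetical and chosen only for illustration.

```python
def split_into_blobs(data, chunk_size):
    """Cut a byte string into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


chunks = split_into_blobs(b"abcdefghij", 4)
# Each chunk could be stored as its own blob; the file's metadata
# would then keep the ordered list of blob ids.
restored = b"".join(chunks)
```

Reading the file back means fetching the blobs in order and concatenating them.
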
> I really don't understand why you aren't eager to explain the differences and what problems are being solved.

Sorry, everybody has a different background. It is hard to know where the question comes from. I think https://www.usenix.org/system/files/fast21-pan.pdf may be helpful here.

> Sorry, everybody has a different background. It is hard to know where the question comes from.

They were straightforward questions. The paper you linked talks about blobs as a term for appending to files. Mostly it seems to be about wrapping and replicating XFS.

Is that why you are avoiding talking about specifics? Are you wrapping XFS?

> Why does a user need that? Filesystems already break up files into blocks / sectors. Why wouldn't a user just deal with files and let the filesystem handle it?

A blob has its own storage, which can be replicated to other hosts in case the current host is unavailable. It can scale independently of the file metadata.

Why does a user need that? Filesystems already break up files into blocks / sectors. Why wouldn't a user just deal with files and let the filesystem handle it?

I really don't understand why you aren't eager to explain the differences and what problems are being solved.