by ori_b 2519 days ago
> What does it mean to mmap a file on an NFS server?

It means you have issues around synchronization and performance if you use it as anything other than a private, read-only mapping.

And some things are just impossible, like a shared memory ringbuffer. Which is exactly what you do with the memory you mmap from a video card: submit commands to the command ringbuffer.
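For readers who have not written one, here is a minimal sketch of the kind of ringbuffer meant here, assuming a single producer and a single consumer. The names (`struct ring`, `ring_create`, `ring_push`, `ring_pop`) and the slot count are made up for illustration, and this is not how any particular video driver lays out its command ring. The point is that both sides only ever talk through plain loads and stores to the same mapped memory, and that only works because the hardware keeps those pages coherent.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define RING_SLOTS 256u   /* illustrative size; must be a power of two */

/* Hypothetical layout: one producer, one consumer. */
struct ring {
    _Atomic uint32_t head;        /* next slot to write; only the producer stores to it */
    _Atomic uint32_t tail;        /* next slot to read; only the consumer stores to it */
    uint32_t slots[RING_SLOTS];   /* the "commands" */
};

/* Map the ring MAP_SHARED so it stays shared across fork(). NULL on failure. */
struct ring *ring_create(void)
{
    void *p = mmap(NULL, sizeof(struct ring), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    struct ring *r = p;
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
    return r;
}

/* Producer side: returns false if the ring is full. */
bool ring_push(struct ring *r, uint32_t cmd)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false;
    r->slots[head % RING_SLOTS] = cmd;
    /* Release: the slot write must be visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
bool ring_pop(struct ring *r, uint32_t *cmd)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;
    *cmd = r->slots[tail % RING_SLOTS];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Nothing in here makes a system call after the initial `mmap`, so there is no point at which a file server could step in and forward anything.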

> So it's a better idea to not implement many of these things and instead simply return an error.

And now you need to start writing multiple code paths in user code, testing which calls work and which don't, one of which will be broken due to lack of testing. And when you guess wrong at the needed operations, software often goes off the rails instead of failing gracefully. Failure modes include blindly retrying forever, or assuming the wrong state of the system and destroying data.

Too many complicated abstractions break the ability to do interesting things with a system. It's death by a thousand edge cases.

On plan 9, you have 9p.

https://9p.io/magic/man2html/5/0intro

That, and process creation/control/namespace management, are the only ways to do anything with the system. There are few edge cases. Implementing a complete, correct server is a matter of hours, not weeks.

> like a [remote] shared memory ringbuffer.

Technically just as possible, only very slow... Performance is abstracted out by the VFS. You need to stay sane through other measures, like having your software configured right, etc.

> And now you need to start writing multiple code paths in user code

I don't think the number of paths is increased. Any software should handle calls that fail - if only by bailing out. That's acceptable for any operation that just can't complete due to failed assumptions - whether it's about file permissions or that the resource must be "performant" / not on an NFS share, etc.

> 9p.

Now what is the point? How's that different or better? They actually are much more into sharing resources over the network... which means less possible assumptions about availability/reliability/performance. I doubt they can make the shared ringbuffer work better.

> Technically just as possible, only very slow... Performance is abstracted out by the VFS.

How would you go about implementing the CPU cache coherency that allows you to do the cross machine compare and swap?
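For reference, the local operation in question looks roughly like this in C11. The helper name `try_claim` and the convention that 0 means "unowned" are assumptions for the sake of the sketch. On one machine this compiles down to a single atomic instruction whose correctness depends on the cache coherency protocol between cores.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: claim a slot by swapping 0 -> id.
 * Returns false if some other party claimed it first. */
bool try_claim(_Atomic uint32_t *owner, uint32_t id)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(owner, &expected, id);
}
```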

> I don't think the number of paths is increased. Any software should handle calls that fail - if only by bailing out.

If the software works fine without making a call, then you can just skip the extra work in the first place. Delete the call, and the checks around if the call fails. And if the call is important somehow, you need to find some workaround, or some alternative implementation, which is by definition never going to be very well tested.

> Now what is the point? How's that different or better? They actually are much more into sharing resources over the network... which means less possible assumptions about availability/reliability/performance. I doubt they can make the shared ringbuffer work better.

The tools to make a shared ringbuffer that depends on cache coherent operations simply aren't there -- it's not something you can write with those tools.

And that's the point: The tools needed simply don't work across the network. Instead of trying to patch broken abstractions, adding millions of lines of complexity to support things that aren't going to work anyway (and if they did work, they'd work poorly), pick a set of abstractions that work well everywhere, and skip the feature tests and guesswork.

Primitives that work everywhere, implement them uniformly, and stop special casing broken or inappropriate tools.

And then, it's a day of work to implement a 9p server, and everything works with it. So I can serve git as a file based API, DNS as a file API, fonts as a file based API, doom resources as a file API, or even json hierarchies as a file API, and not worry about whether my tools will run into an edge case. I can export any resource this way, and not need special handling anywhere.

Plan 9 doesn't have VNC; it has 'mount' and 'bind', which shuffle around which `/dev/draw` your programs write to, and which `/dev/mouse` and `/dev/kbd` they read from.

Plan 9 doesn't have NAT; it has 'mount' and 'bind', which shuffle around which machine's network stack your programs use.

Plan 9 doesn't have SSH proxying that applications need to know about: It has sshnet, which is a file server that provides a network stack that looks just like any other network stack.

From parsimony comes flexibility. You're not dragging around a manacle of complexity.

> How would you go about implementing the CPU cache coherency that allows you to do the cross machine compare and swap?

build it in the protocol!

And so on...

> The tools to make a shared ringbuffer that depends on cache coherent operations simply aren't there -- it's not something you can write with those tools. And that's the point: The tools needed simply don't work across the network.

Ok. In theory, we just need to build access to the tool in the network protocol and have the network server execute the magic on the remote machine.

Of course, one needs a way to map e.g. a CAS operation to a network request. I don't think today's CPUs let us do that.
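A rough sketch of what "have the server execute it" could look like, purely as an assumption: the message structs and `cas_serve` below are invented for illustration and are not part of 9P, NFS, or any real protocol. The server runs the atomic against its own memory and reports what it saw.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire messages; not part of any real protocol. */
struct cas_req  { uint32_t index; uint32_t expected; uint32_t desired; };
struct cas_resp { bool ok; uint32_t observed; };

/* Server side: run the CAS against the server's own memory. */
struct cas_resp cas_serve(_Atomic uint32_t *words, size_t nwords, struct cas_req req)
{
    struct cas_resp resp = { false, 0 };
    if (req.index >= nwords)
        return resp;                 /* out of range: reject */
    uint32_t expected = req.expected;
    resp.ok = atomic_compare_exchange_strong(&words[req.index], &expected, req.desired);
    resp.observed = expected;        /* on failure, the value actually found */
    return resp;
}
```

This only works if the client sends the request explicitly. It does nothing for a client that issues a plain CAS instruction on a mapped page, which is the missing piece described above.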

> Delete the call, and the checks around if the call fails.

    FILE *f = fopen(filepath, "rb");
    if (f == NULL)
        fatal("Failed to open file %s!\n", filepath);

There. I wouldn't remove a line, and I've magically handled whatever error condition it was, regardless of whether I've thought about network transparency issues or not.
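A self-contained version of that snippet, for anyone who wants to compile it. `fatal` is not a standard function, so the definition here is only a guess at the intended behaviour (print a printf-style message to stderr, then exit), and `open_or_die` is a made-up wrapper name.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed behaviour of fatal(): print the message to stderr, then exit. */
void fatal(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    exit(EXIT_FAILURE);
}

/* Hypothetical wrapper around the snippet above: never returns NULL. */
FILE *open_or_die(const char *filepath)
{
    FILE *f = fopen(filepath, "rb");
    if (f == NULL)
        fatal("Failed to open file %s!\n", filepath);
    return f;
}
```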

> Primitives that work everywhere, implement them uniformly, and stop special casing broken or inappropriate tools.

I've never seriously looked at 9p, but the page you linked strongly suggests to me that it's more abstraction if anything (your initial statement was that that's bad), and more vague (if anything) as a consequence. More like HTTP, and I don't think of HTTP as a sort of universal solution - it's rather a sort of bandaid to glue things together with minimal introspection (HTTP verbs, status codes...). And the fact that it tries to be universal also means that it doesn't match some problems very well, and people will basically just sidestep HTTP there (I'm not a web person, but I've heard of major services that always return HTTP 200 and just use HTTP as a transport for their custom RPC mechanism or whatever).

> Plan 9 doesn't have VNC ... NAT ... SSH

Great. I get it. 9p is a basic transport method that gives some introspection for free if you can model your problem domain as an object hierarchy. But it's far from a free solution for any problem. It might save you some parsing in some cases, but it doesn't compress your VNC stream for example. Nor define the primitives of any problem domain that it just can't know about.

> build it in the protocol!

You don't have access to the interprocess cache snooping in software. This is CPU interconnect internal shit, and you actually need access to the local memory bus for correctness. mmap in its full glory is only really worth having if you can share pages from the buffer cache.

And even if you did, you'd turn a ten nanosecond operation into a ten millisecond operation once you count the network packets you send (a factor of a million overhead). And without the assumption that all the peers are reliable and never fail, the abstraction still breaks. And if you assume all your peers are reliable in a distributed system, you're wrong. Damned either way.

> I've never seriously looked at 9p, but the page you linked strongly suggests to me that it's more abstraction if anything

No, it's a single abstraction, instead of dozens that step on each other's toes.

> Great. I get it. 9p is a basic transport method that gives some introspection for free

What introspection? It's just namespaces and transparent passthrough. Unless you're talking about file names.

Yes, I realized the CPU issue and already updated my comment. Technically we would need a way to catch the CAS operation and convert it into a network request - the way, for example, segfaults can be handled and converted into a disk load.
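For context, the segfault trick looks roughly like this on Linux. This is a sketch under assumptions: all the `lazy_*` names are invented, the `memset` stands in for the actual disk or network load, and calling `mprotect` from a signal handler is common practice on Linux but not guaranteed async-signal-safe by POSIX.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical globals describing one lazily filled region. */
static char *lazy_base;
static size_t lazy_len;
static size_t lazy_page;
static volatile sig_atomic_t lazy_faults;

/* Runs on SIGSEGV: make the touched page accessible and fill it in. */
static void lazy_on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr < lazy_base || addr >= lazy_base + lazy_len)
        _exit(1);   /* a genuine crash, not one of our pages */

    char *page = lazy_base + ((size_t)(addr - lazy_base) / lazy_page) * lazy_page;
    if (mprotect(page, lazy_page, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    memset(page, 'x', lazy_page);   /* stand-in for the disk or network load */
    lazy_faults++;
    /* Returning restarts the faulting instruction, which now succeeds. */
}

/* Reserve npages of inaccessible memory and install the handler. */
char *lazy_map(size_t npages)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = lazy_on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;

    lazy_page = (size_t)sysconf(_SC_PAGESIZE);
    lazy_len = npages * lazy_page;
    void *p = mmap(NULL, lazy_len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    lazy_base = p;
    return lazy_base;
}

/* Number of pages faulted in so far. */
int lazy_fault_count(void)
{
    return (int)lazy_faults;
}
```

Note that the handler only sees the first touch of each page. Once the page is accessible, every later load, store, or CAS goes straight to local memory without trapping, so this mechanism by itself gives no hook for forwarding individual atomic operations.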

And also we'd need to extend all the cache coherency stuff over the network.

> And if you assume all your peers are reliable in a distributed system, you're wrong. Damned either way.

Technically you have the reliability issues with all the components inside a single system, just as well. They are just more reliable. But I'm sure I have seen hard disks failing, etc.

--

Ok, let me think about that abstraction stuff. Thanks.

I'd also argue that if you need to turn a cache snoop into a network round trip (or several?), your abstraction is just as broken as if it returned the wrong value; it's unusably slow :)