


	by btown 2374 days ago
	This is more or less what https://capnproto.org/ does.


davedx 2374 days ago

I think this is also what Rust's bincode crate does. It's very sane and you can just open up files in a hex editor and see what's there.

https://crates.io/crates/bincode


omginternets 2374 days ago

From the capnproto docs:

> Isn’t this all horribly insecure?

> No no no! To be clear, we’re NOT just casting a buffer pointer to a struct pointer and calling it a day.

Isn't this a direct contradiction to your claim? Or have I misunderstood them?


jtolmar 2374 days ago

IIRC: capnproto generates messages that you could deserialize by casting them to the right struct, but refrains from actually doing it that way. Instead it generates a bunch of accessor methods that parse the data, as if you were reading something that's not basically a C struct, like a protobuf.


kentonv 2374 days ago

That's basically correct. Cap'n Proto generates classes with inline accessor methods that do roughly the same pointer arithmetic that the compiler would generate for struct access.

There are a couple of subtle differences:

* The struct is allowed to be shorter than expected, in which case fields past the end are assumed to have their schema-defined default values. This is what allows you to add new fields over time while remaining forwards- and backwards-compatible.

* Pointers are in a non-native format. They are offset-based (rather than absolute) and contain some extra type information (such as the size of the target, needed for the previous point). Following a pointer requires validating it for security.

(Disclosure: I'm the author of Cap'n Proto.)
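
To make the two differences above concrete, here is a minimal Python sketch. It is not Cap'n Proto's wire format or generated code: the 8-byte pointer layout, `POINT_FIELDS`, `PointReader`, and `follow_pointer` are hypothetical names and encodings invented for illustration. It only shows the idea of accessors that fall back to defaults for short structs, and of offset-based pointers that are bounds-checked before use.

```python
import struct

# Hypothetical schema for illustration: field name -> (byte offset, format, default).
POINT_FIELDS = {
    "x": (0, "<i", 0),
    "y": (4, "<i", 0),
    "z": (8, "<i", -1),  # imagined as added in a later schema version
}


class PointReader:
    """Accessor-style reader over a byte buffer; nothing is copied or cast."""

    def __init__(self, buf, start, size):
        # Validate the struct's extent once, before any field is read.
        if start < 0 or size < 0 or start + size > len(buf):
            raise ValueError("struct extends past end of message")
        self._buf, self._start, self._size = buf, start, size

    def get(self, name):
        offset, fmt, default = POINT_FIELDS[name]
        # A field past the end of a shorter (older) struct reads as its default.
        if offset + struct.calcsize(fmt) > self._size:
            return default
        return struct.unpack_from(fmt, self._buf, self._start + offset)[0]


def follow_pointer(buf, ptr_pos):
    """Resolve a toy pointer: (relative offset: int32, target size: uint32)."""
    if ptr_pos < 0 or ptr_pos + 8 > len(buf):
        raise ValueError("pointer itself is out of bounds")
    rel, size = struct.unpack_from("<iI", buf, ptr_pos)
    # The offset is relative to the end of the pointer, not an absolute address.
    return PointReader(buf, ptr_pos + 8 + rel, size)


# An "old" writer that only knows x and y produces an 8-byte struct.
old_msg = struct.pack("<iI", 0, 8) + struct.pack("<ii", 3, 4)
old = follow_pointer(old_msg, 0)

# A "new" writer produces all 12 bytes.
new_msg = struct.pack("<iI", 0, 12) + struct.pack("<iii", 3, 4, 5)
new = follow_pointer(new_msg, 0)
```

In this toy, a reader built against the newer schema sees `z` as its default when handed `old_msg`, and a pointer that claims more bytes than the message contains is rejected in `follow_pointer` instead of being read.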


asveikau 2374 days ago

Re-read the comment, I think. It doesn't say to cast a struct pointer. It says to put the members of the struct into network byte order on the wire. I read that as individually serializing each member in a portable, safe way.

Anyway, even if you do choose the struct pointer hack (which I do not see advocated here), it can be done relatively well, albeit with language extensions and a bit of care: pragmas and attributes to ensure zero padding and alignment between members, no pointer members, and checking sizes and offsets after a read (the hardest part).
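
A minimal Python sketch of the first reading, serializing each member individually in network byte order, might look like the following. The record layout (`id`, `flags`, `temperature`) and the `encode`/`decode` helpers are hypothetical, chosen only to illustrate the approach; the `!` prefix in the `struct` format string selects big-endian order with no padding between members.

```python
import struct

# Hypothetical record: id (uint32), flags (uint16), temperature (int16).
# "!" selects network (big-endian) byte order with no padding between members.
RECORD = struct.Struct("!IHh")


def encode(record_id, flags, temperature):
    # Each member is written individually in a fixed, portable byte order.
    return RECORD.pack(record_id, flags, temperature)


def decode(data):
    # Check the size after the read, before interpreting any bytes.
    if len(data) != RECORD.size:
        raise ValueError(f"expected {RECORD.size} bytes, got {len(data)}")
    return RECORD.unpack(data)


wire = encode(1, 0x0102, -5)
```

Because the byte order and padding are fixed by the format string instead of by the compiler and CPU, `wire` is the same 8 bytes on any machine, and a truncated read fails in `decode` before any member is interpreted.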


Animats 2374 days ago

"As of this writing, Cap’n Proto has not undergone a security review, therefore we suggest caution when handling messages from untrusted sources."

Something like that has to be rigorously tested or proven to be free of buffer overflows. It's so easy to attack with malformed messages. Parsers for remote messages are a classic source of vulnerabilities. It's hard to test this, because it's a code generator.

This looks promising as an attack vector for a big system built on microservices. If you can find an exploit in this that lets you overwrite memory, and can break into some service of a set of microservices by other means, you can leverage that into a break-in of other services that thought their input was a trusted source.

The "zero overhead" claim goes away as soon as you send variable length items. Then there has to be some marshaling.


kentonv 2374 days ago

> As of this writing, Cap’n Proto has not undergone a security review

This is outdated; I should remove it. Cap'n Proto has been reviewed by multiple security experts, though not in a strictly formal setting. I trust it enough to rely on it for security in my own projects, but yeah, I am cautious about making promises to others...

> Something like that has to be rigorously tested or proven to be free of buffer overflows.

I've done a bunch of fuzz testing with AFL and by hand. I've also employed static analysis via template metaprogramming to catch some bugs. See:

https://capnproto.org/news/2015-03-02-security-advisory-and-...

(That was... almost five years ago.)

> The "zero overhead" claim goes away as soon as you send variable length items. Then there has to be some marshaling.

Space for messages is allocated in large blocks. The contents of the message are allocated sequentially in that space and constructed in-place. So once built, the message is already composed of a small number of contiguous memory segments (usually, one segment), which can then be written out easily. Or, if you're mmapping a file, you can have the blocks point directly into the memory-mapped space and avoid copying at all -- hence, zero-copy.

So no, there is no marshaling.
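
The allocation scheme described above can be sketched in Python as a toy bump allocator. This is an illustration under stated assumptions, not Cap'n Proto's builder: the `Arena` class, its method names, and the length-prefixed text encoding are hypothetical, and the real format's word alignment and pointer encoding are ignored here.

```python
import struct


class Arena:
    """Toy message builder: one preallocated block, objects constructed in place."""

    def __init__(self, capacity=4096):
        self._block = bytearray(capacity)  # allocated once, up front
        self._used = 0

    def alloc(self, size):
        # Bump allocation: hand out the next `size` bytes of the block.
        if self._used + size > len(self._block):
            raise MemoryError("segment full (a real builder would add a segment)")
        offset = self._used
        self._used += size
        return offset

    def put_u32(self, value):
        offset = self.alloc(4)
        struct.pack_into("<I", self._block, offset, value)
        return offset

    def put_text(self, text):
        data = text.encode("utf-8")
        # Variable-length data: a length prefix, then the bytes, written in place.
        offset = self.alloc(4 + len(data))
        struct.pack_into("<I", self._block, offset, len(data))
        self._block[offset + 4 : offset + 4 + len(data)] = data
        return offset

    def segment(self):
        # A view of the used part of the block; no copy and no encoding pass.
        return memoryview(self._block)[: self._used]


arena = Arena()
arena.put_u32(7)
arena.put_text("hello")
seg = arena.segment()
```

Here `seg` is a view onto the same memory the fields were written into, so handing it to a socket or file write involves no separate encoding step, even though the message contains a variable-length item.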


lidHanteyk 2374 days ago

Cap'n Proto is better than C at struct layout. We are not, under any circumstances, going back to the 90s. We are moving forward and learning from mistakes.
