Libnop: C++ Native Object Protocols | HN Mirror


	Libnop: C++ Native Object Protocols (github.com)
	68 points by zbhojkiuy 2815 days ago

9 comments

nly 2815 days ago

These things (Protobufs, Flatbuffers, Cereal, Cap'n Proto, Bond, Apache Avro, Thrift, MessagePack etc etc) are now a dime a dozen. In fact, even if you want a C++-only solution like this, you're still spoiled for choice (any of Boost.Hana, Boost.Fusion, Cereal, or Boost.Serialization for reflection + a little code for binding to your choice of codec). Hopefully one day we'll get standardized compile-time reflection as a language feature.

Imho, code generators and schema files are a feature. If you don't care about forward or backward compatibility, or portability to other languages, then just avoid pointers and memcpy your structs to disk. There are even libraries like Boost.Interprocess that will help you do just that... even with complex multi-indexed data structures like hash tables and maps inside mmap()'d blocks.
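
A minimal sketch of the "avoid pointers and dump the struct" approach, assuming a trivially copyable record. The `Sample` type and the `SaveRaw`/`LoadRaw` helpers are hypothetical names for illustration. The resulting file is only readable by a build with the same layout, padding, and byte order, which is exactly the compatibility trade-off described above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <type_traits>

// Hypothetical record type: fixed-width fields, no pointers.
struct Sample {
  std::uint32_t id;
  std::int32_t value;
  double weight;
};
static_assert(std::is_trivially_copyable<Sample>::value,
              "raw byte copies require a trivially copyable type");

// Writes the raw bytes of one record to `path`. Returns true on success.
inline bool SaveRaw(const char* path, const Sample& s) {
  assert(path != nullptr);
  std::FILE* f = std::fopen(path, "wb");
  if (!f) return false;
  const bool ok = std::fwrite(&s, sizeof s, 1, f) == 1;
  return std::fclose(f) == 0 && ok;
}

// Reads one record back into `*out`. Returns true on success.
inline bool LoadRaw(const char* path, Sample* out) {
  assert(path != nullptr && out != nullptr);
  std::FILE* f = std::fopen(path, "rb");
  if (!f) return false;
  const bool ok = std::fread(out, sizeof *out, 1, f) == 1;
  std::fclose(f);
  return ok;
}
```

Calling `SaveRaw("sample.bin", s)` and later `LoadRaw("sample.bin", &t)` on the same platform should give back an identical struct; nothing here handles versioning or a different architecture.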

That all said, these days I use Flatbuffers, because it's super efficient, header-only, round-trips to JSON, the code gen is tolerable, and it doesn't lead to executable size bloat.

sytelus 2815 days ago

Code generators and schema files are not a good feature, IMO. One shouldn't have to duplicate everything in yet another "neutral" language and then maintain two copies of definitions.

The header-only libraries are good. Fewer dependencies is good. No need for a build system is good. Many of the existing solutions you mentioned have tons of dependencies. When code needs to be built reliably on 3 different OSes and interfaced with arcane, twisted build systems, dependencies become a nightmare. Boost is great overall, but I'd rather avoid this monster if I can.

CamouflagedKiwi 2815 days ago

You're using them wrong if you duplicate everything and maintain two copies. You should have one copy of the objects in the "neutral" language (protobuf or whatever) and compile the language-specific files out of that - but you don't maintain those any more than you maintain any other build output.

je42 2815 days ago

Code generators are good, if they are properly supported and the tooling is good. Have a look at the new work-in-progress on C++ metaclasses (P0707): https://youtu.be/80BZxujhY38?t=3141 This stuff looks very neat.

zvrba 2815 days ago

> Imho, code generators and schema files are a feature.

Yes, especially if data has to be consumed by different languages. Personally, I don't see a need for yet another C++-only serialization framework.

otabdeveloper2 2815 days ago

It's insane that with this proliferation of serialization libraries none of them are still usable for real-world work.

Here in 2018, we still have to roll our own.

Libnop has the right idea, but still fails on two important points:

a) Versioning is a must-have feature. (And no, some weird non-standard 'table' type is not a replacement.)

b) The wire protocol needs to be human-readable ASCII. (Or at least a human-readable ASCII option with no performance degradation must be available.)

coreytabaka 2815 days ago

The table approach is fairly flexible. Both Protos and FlatBuffers do more or less functionally similar things. What specific deficiency do you see?

Take a look at the binary format spec:

https://github.com/google/libnop/blob/master/docs/format.md

The format is designed specifically to provide enough structural information that the binary format can be parsed without knowing the original structure definitions. This makes it easy to write a binary-to-string converter that can grok any payload with minimal complexity. Message observability was a primary goal during development.

otabdeveloper2 2814 days ago

Proto/Flatbuffers are okay for message passing, but many times you don't want to pass messages, you want to serialize C++ structures. There's no way to represent your 1Gb in-memory hashtable as a Flatbuffer, and that is not okay.

Writing a binary-to-string converter sucks. You write code once, and debug it every day. Anything that makes debugging harder is humongous pain that shouldn't exist; binary-to-string converters can be written, but let's be real for a moment: this is something that is dead last on the list of business priorities. If your message can be parsed with ad-hoc shell or Python scripts, that's a huge win for everyone.

throwaway080383 2815 days ago

I can assure you that Protobufs and Thrift are used in the real world.

lemmyg 2815 days ago

In general this looks great!

I assume the reason why many functions take non-const pointers as arguments (rather than non-const references) is because it follows the google style guide? https://google.github.io/styleguide/cppguide.html#Reference_...

Cheeky question: have you Googlers thought about revising this guideline? It seemed weird to me when I read it years ago and it seems to be getting more and more unconventional. When I see a pointer argument in most C++ code now I would assume that passing a null ptr is not an error, but I see quite a lot of the code in this library doesn't check for null pointers inside the function body and would explode if you passed one in.

This doesn't mean I don't sympathize with the justification in the style guide: "References can be confusing, as they have value syntax but pointer semantics". If C++ had a non-null, non-reassignable pointer it might do a lot of what references do and be clearer, particularly for generic code. I don't really know, but references are what we have, they suit indicating the expectation of non-nullity, and it seems to me that the benefit of clearly communicating that expectation gets you more of a benefit than the confusion around semantics takes away.

coreytabaka 2815 days ago

This argument comes up a lot within Google. The style guide is revisited frequently (for example =delete in the public section is now the standard for disabling copy and/or move/assignment vs. the old macros in the private section).

Non-nullable types are helpful for implying pre-conditions. However, readability at the call site (e.g. Foo(&bar) might mutate bar, whereas Foo(bar) should not) is still considered more valuable in a large scale codebase. Passing nullptr as a pointer argument is generally assumed to not be okay unless explicitly documented as permitted -- this is opposite the assumption that you stated.

There are places exceptions are made, where the use of pointers is deemed more confusing than non-const references (e.g. move-maybe semantics in very specific cases). Ultimately, most code follows the default style guide.

Besides, nullptr dereferences are pretty easy to diagnose in a library like this. And more often than not everywhere else too.

jcelerier 2815 days ago

> However, readability at the call site (e.g. Foo(&bar) might mutate bar, whereas Foo(bar) should not) is still considered more valuable in a large scale codebase

But does it make it any more readable? If you use the pointer anywhere else and have it as a variable, then suddenly you can no longer distinguish between a pointer and a reference. It would frankly make more sense to have empty `#define in` and `#define out` macros and make a small clang plug-in that checks correct usage in your codebase, e.g.

    int foo(int x, const foobar& my_foobar, boo& my_boo);
    foo(x, in fb, out b); // ok
    foo(x, out fb, out b); // compile error

coreytabaka 2815 days ago

In most cases the context provides enough information:

  void Baz(Bar*);
  void Baz(const Bar&);

  void Foo(Bar* mutable_bar) {
    Baz(*mutable_bar); // Not mutated.
  }

  void Foo(Bar* mutable_bar) {
    Baz(mutable_bar); // Possibly mutated.
  }

  void Foo(const Bar& bar) {
    Baz(bar); // Not mutated.
  }

  void Foo(const Bar& bar) {
    Baz(&bar); // Compiler error.
  }

The rule doesn't perfectly eliminate ambiguity, but it does a pretty good job overall. The point is to address the general use cases with familiar constructs that work across multiple toolchains.

coreytabaka 2815 days ago

Primary author and maintainer here! Happy to answer any questions anyone has.

Game_Ender 2815 days ago

Have you done any speed comparisons to other libraries? Since there are so many of these libraries that is a big deciding point, especially in C++.

zbhojkiuy 2815 days ago

I'm considering using the library for an embedded project that communicates with a Linux host and I have some questions.

- Does serialization/de-serialization require dynamic memory allocation? From what I've seen it looks like it doesn't.

- How are you handling endianness? floating point numbers?

- How complete/tested is the library? Are you aware of projects using it?

arcticbull 2815 days ago

I suggest looking at nanopb to compare, that’s how I’ve solved this problem in the past. It’s just a couple kB on an AVR.

zbhojkiuy 2815 days ago

I've used nanopb in the past. It's a great library. The only drawback is that it requires a type compilation step that generates the sources.

coreytabaka 2815 days ago

Thanks for considering using libnop!

This library works well in an embedded context. I've personally used it on Cortex-M class micro controller firmware.

The serializer/deserializer does not require dynamic memory allocation. Whether or not dynamic allocation happens depends on how you use the library. If you avoid using data types that perform dynamic allocation (e.g. std::vector) and use (or write) Reader/Writer types that use static memory (e.g. nop::BufferReader/nop::BufferWriter) then you should be fine.

There are some nice tricks you can use to permit protocols that have convenient dynamic containers on the host and static versions on the embedded device:

https://github.com/google/libnop/blob/master/docs/getting-st... https://github.com/google/libnop/blob/master/docs/getting-st... https://github.com/google/libnop/blob/master/examples/shared...

Endianness is assumed to be little because the vast majority of hardware is little endian. There are utilities to convert here:

https://github.com/google/libnop/blob/master/include/nop/uti...

These are not currently used by the Reader and Writer types included with libnop for efficiency, but are available for you to use in your own Reader and Writer types if you really need it.

Floating point is a much stickier problem due to lack of standardization across hardware. The library does not address this automatically and just packs floating point types in machine order. However, the library provides tools to help address the issue. One approach that works well is to use a fixed point representation for the wire type -- a value wrapper type is convenient for automating this:

https://github.com/google/libnop/blob/master/docs/getting-st...

See the Fixed template in the example. This is especially handy if you have a micro controller without floating point support or you want to minimize the size of the payload at the cost of range and/or precision.
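
To make the fixed-point idea concrete, here is a minimal sketch of such a wrapper. It is not libnop's `Fixed` template from the linked example; `FixedPoint`, `FromRaw`, and `ToFloat` are hypothetical names, and the sketch assumes a signed 32-bit wire integer with no overflow handling.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point wire type (NOT libnop's Fixed template).
// Stores value * 2^FractionBits in a signed 32-bit integer, so only an
// integer ever needs to cross the wire.
template <unsigned FractionBits>
class FixedPoint {
  static_assert(FractionBits < 31, "need at least one integer bit");

 public:
  FixedPoint() = default;
  explicit FixedPoint(float value) : raw_(Encode(value)) {}

  // Rebuilds a value from the integer received on the wire.
  static FixedPoint FromRaw(std::int32_t raw) {
    FixedPoint f;
    f.raw_ = raw;
    return f;
  }

  float ToFloat() const { return static_cast<float>(raw_) / Scale(); }
  std::int32_t raw() const { return raw_; }

 private:
  static constexpr float Scale() {
    return static_cast<float>(1u << FractionBits);
  }

  static std::int32_t Encode(float value) {
    assert(std::isfinite(value));  // NaN/inf have no fixed-point form.
    // Rounds to nearest; values outside the representable range are
    // not checked in this sketch.
    return static_cast<std::int32_t>(std::lround(value * Scale()));
  }

  std::int32_t raw_ = 0;
};
```

With `FixedPoint<8>`, the value 1.5f is carried as the integer 384, and precision is limited to steps of 1/256, which is the range/precision cost mentioned above.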

The library has a complete suite of tests and 97.9% line coverage according to GCOV, with particular attention to conditional and error paths.

We use the library for a few internal embedded prototypes. I have not tracked its usage outside of Google since its recent release.

Best, C

kevincox 2815 days ago

> Endianness is assumed to be little because the vast majority of hardware is little endian ... libnop for efficiency

Do you have any evidence of this? I would be flabbergasted if a modern compiler couldn't optimize out a no-op endianness conversion.

coreytabaka 2815 days ago

For the most part the optimization works well. The more subtle issue is that the union trick used by the endian utilities is not compatible with constexpr expressions. There are some interesting use cases for constexpr serialization that fail if the conversions are interposed in the Reader/Writer types. The alternative is to use reinterpret_cast, which is also incompatible with constexpr expressions.

Moreover, endian conversion templates are unusually frustrating in the current C++ standard. I wish there were a better way, but one does not currently exist.
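
For unsigned integer types specifically, shifts and masks are one way to stay inside constexpr, since they operate on values rather than on memory. The sketch below is not taken from libnop's endian utilities; all three function names are hypothetical, and it does not address floating point or generic templates, which is where the frustration described above lies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers (not libnop's endian utilities). Shifts work on
// values, so the result does not depend on the host byte order.

// Returns byte `index` (0 = least significant) of `value`; index < 4.
constexpr std::uint8_t LittleEndianByte(std::uint32_t value,
                                        std::size_t index) {
  return static_cast<std::uint8_t>((value >> (8 * index)) & 0xFFu);
}

// Reassembles a 32-bit value from four little-endian bytes.
constexpr std::uint32_t FromLittleEndian(std::uint8_t b0, std::uint8_t b1,
                                         std::uint8_t b2, std::uint8_t b3) {
  return static_cast<std::uint32_t>(b0) |
         (static_cast<std::uint32_t>(b1) << 8) |
         (static_cast<std::uint32_t>(b2) << 16) |
         (static_cast<std::uint32_t>(b3) << 24);
}

// Both helpers are usable in constant expressions.
static_assert(LittleEndianByte(0x11223344u, 0) == 0x44, "low byte first");
static_assert(FromLittleEndian(0x44, 0x33, 0x22, 0x11) == 0x11223344u,
              "round trip");

// Runtime counterpart: writes `value` to `out[0..3]` in little-endian order.
inline void StoreLittleEndian(std::uint32_t value, std::uint8_t* out) {
  assert(out != nullptr);
  for (std::size_t i = 0; i < 4; ++i) out[i] = LittleEndianByte(value, i);
}
```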

zbhojkiuy 2815 days ago

Thanks for the detailed answer. Looks like it is a good fit for an embedded platform.

coreytabaka 2815 days ago

Supporting embedded platforms is one of the objectives of the library. Feel free to open a ticket on github if you have any further questions.

sytelus 2815 days ago

Why the C++14 dependency? A lot of other projects are still stuck on C++11 for a variety of reasons. It's sad to see this dependency :(.

coreytabaka 2815 days ago

I tried to keep it C++11, but generalized lambdas are critical in important use cases (see nop::Variant). Besides, GCC and Clang have solid C++14 support. C++17 is another story... ;-)

sytelus 2815 days ago

The problem is not compiler support but the fact that the library is going to be used with other code bases that are still on C++11. Typically a project's build system will enforce this, and it’s hard to change without a full evaluation of all dependencies. Is it possible to add support for C++11 and turn off the additional features that might not get used?

sytelus 2815 days ago

Serialization is part of the story. It would be great if someone can also write header-only dependency-free cross-platform RPC library :).

coreytabaka 2815 days ago

Since you mention it, check out the experimental RPC support:

https://github.com/google/libnop/blob/master/examples/interf... https://github.com/google/libnop/blob/master/include/nop/rpc...

I haven't documented it yet since it's not fully baked, but it's quite functional nonetheless. I have a working prototype of RPC over USB between a host PC and a Cortex-M micro controller. The ability to define constexpr dispatch tables is very convenient in a micro controller environment.

sytelus 2815 days ago

What transport do you use? HTTP? Do you use any dependency for handling the transport layer? I haven’t seen a good, clean, dependency-free, header-only approach for this yet.

coreytabaka 2814 days ago

The library is transport agnostic. There is a Reader/Writer abstraction to adapt to any transport you like.

https://github.com/google/libnop/blob/master/docs/getting-st...

CyberDildonics 2815 days ago

How is this better than cereal, which is already mature and very easy?

https://uscilab.github.io/cereal/

stochastic_monk 2815 days ago

I’m familiar with capnproto [0], which claims something like zero overhead.

Personally, I just write serialization functions using fwrite/fread/write/read, but I’ve used projects which depend on it.

[0] https://capnproto.org/

CyberDildonics 2815 days ago

I've done that, but it can be very error prone and therefore time consuming.

Instead of relying on serialization of existing data structures I actually try to use more general data structures that already exist in a single span of memory.

Then serialization is just a matter of writing them out from start to finish, or allocating them in a memory mapped file.
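
A rough sketch of that idea, assuming a linked list whose links are array indices rather than pointers. `FlatNode`, `ToBytes`, `FromBytes`, and `Sum` are hypothetical names; the same byte-order and layout caveats as any raw dump apply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical node layout: links are indices into the same array, so
// nothing needs fixing up after the bytes are reloaded elsewhere.
struct FlatNode {
  std::int32_t value;
  std::int32_t next;  // Index of the next node, or -1 for end of list.
};
static_assert(std::is_trivially_copyable<FlatNode>::value,
              "raw byte copies require a trivially copyable type");

// Copies the whole node array into a byte buffer, start to finish.
inline std::vector<unsigned char> ToBytes(const std::vector<FlatNode>& nodes) {
  std::vector<unsigned char> bytes(nodes.size() * sizeof(FlatNode));
  if (!bytes.empty()) std::memcpy(bytes.data(), nodes.data(), bytes.size());
  return bytes;
}

// Rebuilds the node array from a byte buffer produced by ToBytes.
inline std::vector<FlatNode> FromBytes(const std::vector<unsigned char>& bytes) {
  assert(bytes.size() % sizeof(FlatNode) == 0);
  std::vector<FlatNode> nodes(bytes.size() / sizeof(FlatNode));
  if (!bytes.empty()) std::memcpy(nodes.data(), bytes.data(), bytes.size());
  return nodes;
}

// Walks the list starting at index `head` and sums the values.
inline std::int64_t Sum(const std::vector<FlatNode>& nodes, std::int32_t head) {
  std::int64_t total = 0;
  for (std::int32_t i = head; i != -1;
       i = nodes[static_cast<std::size_t>(i)].next) {
    total += nodes[static_cast<std::size_t>(i)].value;
  }
  return total;
}
```

The byte buffer here could equally be a memory-mapped file region; the point of the sketch is only that index-based links survive the copy unchanged.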

gmueckl 2815 days ago

Can cereal handle user defined types without having to write custom serialization and deserialization functions for each? At first glance this seems to be the main difference.

nly 2815 days ago

Cereal requires a bit more copy & paste template boilerplate (it's a customization point) but they're basically the same.

* libnop uses a macro called NOP_STRUCTURE to create its key-value pairs.

* Cereal has a macro called CEREAL_NVP.

* Boost.Fusion has BOOST_FUSION_ADAPT_ASSOC_STRUCT

* Boost.Hana has BOOST_HANA_ADAPT_STRUCT

* Boost.Serialization has BOOST_SERIALIZATION_NVP

...all these things work the same way to get reflection
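
As a rough illustration of the general pattern (not the actual implementation of NOP_STRUCTURE or any of the other macros listed), a macro can expose the listed members as a tuple of references that generic code then walks. `SKETCH_STRUCTURE`, `Members`, and `VisitMembers` are hypothetical names, and the sketch assumes C++14.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical macro: adds a member function returning a tuple of
// references to the listed fields. Real libraries do considerably more.
#define SKETCH_STRUCTURE(...) \
  auto Members() { return std::tie(__VA_ARGS__); }

struct Point {
  int x;
  int y;
  std::string label;
  SKETCH_STRUCTURE(x, y, label)
};

namespace sketch_detail {
template <typename Tuple, typename Visitor, std::size_t... Is>
void VisitImpl(Tuple&& members, Visitor&& visit, std::index_sequence<Is...>) {
  // A braced initializer list is evaluated left to right, so fields are
  // visited in the order they were listed in the macro.
  int unused[] = {0, (visit(std::get<Is>(members)), 0)...};
  (void)unused;
}
}  // namespace sketch_detail

// Calls `visit(field)` once for every field listed in the macro.
template <typename T, typename Visitor>
void VisitMembers(T& object, Visitor&& visit) {
  auto members = object.Members();
  sketch_detail::VisitImpl(
      members, visit,
      std::make_index_sequence<std::tuple_size<decltype(members)>::value>{});
}
```

A serializer built on this would pass a visitor (for example a generic lambda taking `auto& field`) that writes each field to the chosen codec; the macro is the only per-type boilerplate.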

coreytabaka 2815 days ago

Yes! Hopefully C++20 or C++25-ish will provide better alternatives for compile-time reflection.

Don't forget that libnop also has NOP_EXTERNAL_STRUCTURE (and friends) to decouple the annotation from the structure definition. This is handy when you have a C ABI with C++ internal implementation. I don't recall seeing a similar facility in other libraries.

nly 2813 days ago

Most of the above macros work this way. Boost.Fusion's macro will let you specify arbitrary code for getting and setting attributes.

jhasse 2815 days ago

cereal supports shared_ptrs and unique_ptrs, this doesn't (pls correct me if I'm wrong).

coreytabaka 2815 days ago

unique_ptr and nop::Optional<unique_ptr<T>> are on the way. shared_ptr opens up the ability to create cycles, which are not supported yet.

ognarb 2815 days ago

How does it compare to serde in rust?

mgkimsal 2815 days ago

I had to pause to think of a reason to use a whole library for a "no-op", thinking this might be some left_pad shenanigans infiltrating the c++ world.

sephware 2815 days ago

> Note: This is not an officially supported Google product at this time.

Why is that? Is it kind of like the other project that was here recently, where it was created at FB but now independently maintained and not a corporate sponsored project (anymore)? Or is it deprecated and no longer recommended?

kentonv 2815 days ago

It usually means this is an experimental side project of some Google employee, and not an official part of any Google product strategy. Google likes for its employees' side projects to be released under Google's GitHub org, which you can variously interpret as good (Google wants to promote and support open source experimentation) or evil (Google wants to assert ownership of its employee's side projects) depending on how you feel about Google.

(I used to work at Google and had a lot of side projects. That was before Google moved everything to GitHub, but they liked for me to mark the code as copyright Google but "not an official Google product". I was fine with this arrangement.)

userbinator 2815 days ago

> or evil (Google wants to assert ownership of its employee's side projects)

I believe these are still worked on company time, in which case it is absolutely normal for Google to claim ownership.

kentonv 2815 days ago

Some are, some aren't. Google wanted me to assign copyright even for projects that I worked on entirely at home. There's an additional process you have to go through if you don't want to assign copyright. Personally I didn't care since it's under an unencumbered open source license anyway, so it hardly makes a difference.

(To be clear, I meant the good/evil thing to be tongue-in-cheek...)

userbinator 2815 days ago

> for projects that I worked on entirely at home

That's... disturbing, and all the more reason to keep your work and private life strongly isolated. Suppose outside of work you write scripts for various things and distribute them online to friends and so forth, or blog posts, etc. Your employer should never be able to claim ownership of that.

coreytabaka 2815 days ago

It means not currently an officially supported project. Most open source releases of internal code start out this way and may or may not become officially supported. It is different than deprecated, which was once supported but no longer.

tcbawo 2815 days ago

Looks great. The interface and markup macros look very similar to MessagePack's C++ interface (which is a great and useful library BTW).

coreytabaka 2815 days ago

Thanks! MessagePack is one of the influences.

vinkelhake 2815 days ago

User coreytabaka is providing answers here, but for some reason his comments are marked [dead].

sctb 2815 days ago

Thanks! We've recused them from the spam filter. It's best to email us at hn@ycombinator.com when this happens so we'll be sure to see it.

chc 2815 days ago

I think he somehow accidentally got spam-filtered. I vouched for him, so hopefully that's enough to stop the spam filtering.

gumby 2815 days ago

That's odd. I went and vouched for them all.