These things (Protobufs, Flatbuffers, Cereal, Cap'n Proto, Bond, Apache Avro, Thrift, MessagePack etc etc) are now a dime a dozen.
In fact, even if you want a C++-only solution like this, you're still spoiled for choice (any of Boost.Hana, Boost.Fusion, Cereal, or Boost.Serialization for reflection + a little code for binding to your choice of codec). Hopefully one day we'll get standardized compile-time reflection as a language feature.
Imho, code generators and schema files are a feature. If you don't care about forward or backward compatibility, or portability to other languages, then just avoid pointers and memcpy your structs to disk. There are even libraries like Boost.Interprocess that will help you do just that... even with complex multi-indexed data structures like hash tables and maps inside mmap()'d blocks.
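For the "avoid pointers and memcpy your structs to disk" route, a minimal sketch might look like the following. The Sample type, its fields, and the SaveRaw/LoadRaw helpers are hypothetical names made up for illustration, and the file is only readable by a build with the same struct layout and byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <type_traits>

// Hypothetical record type: fixed-width fields only, no pointers.
struct Sample {
  std::uint32_t id;
  std::int32_t temperature_centi;
  std::uint8_t flags[8];
};

// Raw byte copies are only well defined for trivially copyable types.
static_assert(std::is_trivially_copyable<Sample>::value,
              "Sample must be trivially copyable");

// Writes the raw bytes of `value` to `path`. Returns true on success.
inline bool SaveRaw(const char* path, const Sample& value) {
  std::FILE* file = std::fopen(path, "wb");
  if (file == nullptr) return false;
  const bool written = std::fwrite(&value, sizeof(value), 1, file) == 1;
  const bool closed = std::fclose(file) == 0;
  return written && closed;
}

// Reads the raw bytes back into `*value`. Returns true on success.
inline bool LoadRaw(const char* path, Sample* value) {
  assert(value != nullptr);
  std::FILE* file = std::fopen(path, "rb");
  if (file == nullptr) return false;
  const bool read = std::fread(value, sizeof(*value), 1, file) == 1;
  std::fclose(file);
  return read;
}
```

Nothing here survives a change to the struct definition, which is exactly the forward/backward compatibility trade-off mentioned above.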
That all said, these days I use Flatbuffers, because it's super efficient, header-only, round-trips to JSON, the code gen is tolerable, and it doesn't lead to executable size bloat.
Code generators and schema files are not a good feature, IMO. One shouldn't have to duplicate everything in yet another "neutral" language and then maintain two copies of definitions.
The header-only libraries are good. Fewer dependencies are good. Not needing a build system is good. Many of the existing solutions you mentioned have tons of dependencies. When code needs to be built reliably on 3 different OSes and interfaced with arcane, twisted build systems, dependencies become a nightmare. Boost is great overall, but I'd rather avoid this monster if I can.
You're using them wrong if you duplicate everything and maintain two copies. You should have one copy of the objects in the "neutral" language (protobuf or whatever) and compile the language-specific files out of that - but you don't maintain those any more than you maintain any other build output.
Code generators are good, if they are properly supported and the tooling is good. Have a look at the new work-in-progress on C++ metaclasses (P0707): https://youtu.be/80BZxujhY38?t=3141 this stuff looks very neat.
The format is designed specifically to provide enough structural information that the binary format can be parsed without knowing the original structure definitions. This makes it easy to write a binary-to-string converter that can grok any payload with minimal complexity. Message observability was a primary goal during development.
Proto/Flatbuffers are okay for message passing, but many times you don't want to pass messages, you want to serialize C++ structures. There's no way to represent your 1 GB in-memory hashtable as a Flatbuffer, and that is not okay.
Writing a binary-to-string converter sucks. You write code once, and debug it every day. Anything that makes debugging harder is humongous pain that shouldn't exist; binary-to-string converters can be written, but let's be real for a moment: this is something that is dead last on the list of business priorities. If your message can be parsed with ad-hoc shell or Python scripts, that's a huge win for everyone.
Cheeky question: have you Googlers thought about revising this guideline? It seemed weird to me when I read it years ago and it seems to be getting more and more unconventional. When I see a pointer argument in most C++ code now I would assume that passing a null ptr is not an error, but I see quite a lot of the code in this library doesn't check for null pointers inside the function body and would explode if you passed one in.
This doesn't mean I don't sympathize with the justification in the style guide: "References can be confusing, as they have value syntax but pointer semantics". If C++ had a non-null, non-reassignable pointer it might do a lot of what references do and be clearer, particularly for generic code. I don't really know, but references are what we have, they suit indicating the expectation of non-nullity, and it seems to me that clearly communicating that expectation is worth more than the confusion around semantics costs.
This argument comes up a lot within Google. The style guide is revisited frequently (for example =delete in the public section is now the standard for disabling copy and/or move/assignment vs. the old macros in the private section).
Non-nullable types are helpful for implying pre-conditions. However, readability at the call site (e.g. Foo(&bar) might mutate bar, whereas Foo(bar) should not) is still considered more valuable in a large scale codebase. Passing nullptr as a pointer argument is generally assumed to not be okay unless explicitly documented as permitted -- this is opposite the assumption that you stated.
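To make the call-site argument concrete, here is a small sketch of the convention as described above. Doubled and DoubleInPlace are hypothetical functions invented for illustration, not anything from the style guide or the library.

```cpp
#include <cassert>

// Input parameter: taken by const reference, so the callee cannot modify it.
inline int Doubled(const int& value) { return value * 2; }

// Output parameter: taken by pointer, so the caller has to write `&`.
// Null is treated as a precondition violation, not as a valid input.
inline void DoubleInPlace(int* value) {
  assert(value != nullptr);
  *value *= 2;
}
```

A reader skimming the caller sees Doubled(bar) and DoubleInPlace(&bar), and the `&` is the hint that bar may change.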
There are places exceptions are made, where the use of pointers is deemed more confusing than non-const references (e.g. move-maybe semantics in very specific cases). Ultimately, most code follows the default style guide.
Besides, nullptr dereferences are pretty easy to diagnose in a library like this. And more often than not everywhere else too.
> However, readability at the call site (e.g. Foo(&bar) might mutate bar, whereas Foo(bar) should not) is still considered more valuable in a large scale codebase
But does it make it any more readable? If you use the pointer anywhere else and have it as a variable, then suddenly you can't distinguish between a pointer and a reference anymore. It would frankly make more sense to have empty `#define in` and `#define out` macros and write a small Clang plug-in that checks correct usage in your codebase, e.g.
int foo(int x, const foobar& my_foobar, boo& my_boo);
foo(x, in fb, out b); // ok
foo(x, out fb, out b); // compile error
The rule doesn't perfectly eliminate ambiguity, but it does a pretty good job overall. The point is to address the general use cases with familiar constructs that work across multiple toolchains.
This library works well in an embedded context. I've personally used it in Cortex-M class microcontroller firmware.
The serializer/deserializer does not require dynamic memory allocation. Whether or not dynamic allocation happens depends on how you use the library. If you avoid using data types that perform dynamic allocation (e.g. std::vector) and use (or write) Reader/Writer types that use static memory (e.g. nop::BufferReader/nop::BufferWriter) then you should be fine.
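As a rough illustration of what "a Writer type that uses static memory" can mean, here is a sketch of a writer over caller-provided storage. FixedBufferWriter and its Write/size methods are hypothetical and are not libnop's Writer interface; check the library's headers for the real one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical writer over caller-provided storage; it never allocates.
class FixedBufferWriter {
 public:
  FixedBufferWriter(std::uint8_t* buffer, std::size_t capacity)
      : buffer_(buffer), capacity_(capacity), size_(0) {
    assert(buffer != nullptr || capacity == 0);
  }

  // Appends `length` bytes. Returns false and writes nothing if they
  // do not fit in the remaining space.
  bool Write(const void* data, std::size_t length) {
    if (length > capacity_ - size_) return false;
    std::memcpy(buffer_ + size_, data, length);
    size_ += length;
    return true;
  }

  // Number of bytes written so far.
  std::size_t size() const { return size_; }

 private:
  std::uint8_t* buffer_;
  std::size_t capacity_;
  std::size_t size_;
};
```

The storage can be a static or stack array, so the whole write path stays free of heap allocation.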
There are some nice tricks you can use to permit protocols that have convenient dynamic containers on the host and static versions on the embedded device:
These are not currently used by the Reader and Writer types included with libnop for efficiency, but are available for you to use in your own Reader and Writer types if you really need it.
Floating point is a much stickier problem due to lack of standardization across hardware. The library does not address this automatically and just packs floating point types in machine order. However, the library provides tools to help address the issue. One approach that works well is to use a fixed point representation for the wire type -- a value wrapper type is convenient for automating this:
See the Fixed template in the example. This is especially handy if you have a microcontroller without floating point support or you want to minimize the size of the payload at the cost of range and/or precision.
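For readers without the example at hand, the general idea of a fixed point wire type can be sketched as below. This FixedPoint class is a hypothetical stand-in written for this comment, not the library's Fixed template, and it assumes a signed Integer no wider than long.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical value wrapper: stores a float as a scaled integer so that
// only the integer needs to go over the wire.
template <typename Integer, unsigned FractionalBits>
class FixedPoint {
  static_assert(FractionalBits < 31, "scale must fit in a 32-bit shift");

 public:
  FixedPoint() : raw_(0) {}

  // Rounds to the nearest representable step (1 / 2^FractionalBits).
  explicit FixedPoint(float value) : raw_(0) {
    const long scaled = std::lround(value * Scale());
    assert(scaled >= std::numeric_limits<Integer>::min() &&
           scaled <= std::numeric_limits<Integer>::max());
    raw_ = static_cast<Integer>(scaled);
  }

  // Converts the stored integer back to a float.
  float ToFloat() const { return static_cast<float>(raw_) / Scale(); }

  // The integer that would actually be serialized.
  Integer raw() const { return raw_; }

 private:
  static constexpr float Scale() {
    return static_cast<float>(1u << FractionalBits);
  }

  Integer raw_;
};
```

With FixedPoint<std::int16_t, 8> the payload is two bytes per value, with a step of 1/256 and a range of roughly -128 to 128.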
The library has a complete suite of tests and 97.9% line coverage according to GCOV, with particular attention to conditional and error paths.
We use the library for a few internal embedded prototypes. I have not tracked its usage outside of Google since its recent release.
For the most part the optimization works well. The more subtle issue is that the union trick used by the endian utilities is not compatible with constexpr expressions. There are some interesting use cases for constexpr serialization that fail if the conversions are interposed in the Reader/Writer types. The alternative is to use reinterpret_cast, which is also incompatible with constexpr expressions.
Moreover, endian conversion templates are unusually frustrating in the current C++ standard. I wish there were a better way, but one does not currently exist.
I tried to keep it C++11, but generalized lambdas are critical in important use cases (see nop::Variant). Besides, GCC and Clang have solid C++14 support. C++17 is another story... ;-)
The problem is not compiler support but the fact that the library is going to be used alongside other code bases that are still C++11. Typically a project's build system enforces this, and it's hard to change without a full evaluation of all dependencies. Is it possible to add support for C++11 and turn off the additional features that might not get used?
I haven't documented it yet since it's not fully baked, but it's quite functional nonetheless. I have a working prototype of RPC over USB between a host PC and a Cortex-M microcontroller. The ability to define constexpr dispatch tables is very convenient in a microcontroller environment.
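To give a feel for why a constexpr dispatch table is convenient on a small target, here is a generic sketch. The Handler signature, the Entry struct, kDispatchTable, and Dispatch are all hypothetical names for illustration; this is not the library's RPC interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical handler signature: one integer in, one integer out.
using Handler = std::int32_t (*)(std::int32_t);

inline std::int32_t Echo(std::int32_t x) { return x; }
inline std::int32_t Negate(std::int32_t x) { return -x; }

struct Entry {
  std::uint8_t method_id;
  Handler handler;
};

// Built at compile time: no runtime registration step, and the table can
// live in read-only memory.
constexpr Entry kDispatchTable[] = {
    {1, &Echo},
    {2, &Negate},
};

// Looks up `method_id` and calls its handler. Returns false, leaving
// `*result` untouched, if no handler is registered for that id.
inline bool Dispatch(std::uint8_t method_id, std::int32_t argument,
                     std::int32_t* result) {
  assert(result != nullptr);
  for (const Entry& entry : kDispatchTable) {
    if (entry.method_id == method_id) {
      *result = entry.handler(argument);
      return true;
    }
  }
  return false;
}
```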
What transport do you use? HTTP? Do you use any dependency for handling the transport layer? I haven't seen a good, clean, dependency-free, header-only approach for this yet.
I've done that, but it can be very error prone and therefore time consuming.
Instead of relying on serialization of existing data structures I actually try to use more general data structures that already exist in a single span of memory.
Then serialization is just a matter of writing them out from start to finish, or allocating them in a memory mapped file.
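A toy version of that idea, assuming a container whose entire state is one contiguous byte buffer: FlatStringList and its layout (a native-endian 32-bit length followed by the bytes, repeated) are hypothetical and chosen only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat string list: no internal pointers, only one byte span.
class FlatStringList {
 public:
  FlatStringList() = default;

  // "Deserialization" is adopting a previously written span as-is.
  explicit FlatStringList(std::vector<std::uint8_t> bytes)
      : bytes_(std::move(bytes)) {}

  // Appends [length][bytes] to the end of the span.
  void Append(const std::string& text) {
    const std::uint32_t length = static_cast<std::uint32_t>(text.size());
    const std::size_t offset = bytes_.size();
    bytes_.resize(offset + sizeof(length) + text.size());
    std::memcpy(bytes_.data() + offset, &length, sizeof(length));
    std::memcpy(bytes_.data() + offset + sizeof(length), text.data(),
                text.size());
  }

  // Walks the span from the start and returns the index-th string.
  // An out-of-range index is a precondition violation.
  std::string Get(std::size_t index) const {
    std::size_t offset = 0;
    for (;;) {
      assert(offset + sizeof(std::uint32_t) <= bytes_.size());
      std::uint32_t length = 0;
      std::memcpy(&length, bytes_.data() + offset, sizeof(length));
      offset += sizeof(length);
      if (index == 0) {
        return std::string(
            reinterpret_cast<const char*>(bytes_.data() + offset), length);
      }
      offset += length;
      --index;
    }
  }

  // "Serialization" is writing these bytes out from start to finish.
  const std::vector<std::uint8_t>& bytes() const { return bytes_; }

 private:
  std::vector<std::uint8_t> bytes_;
};
```

The trade-off is linear-time lookup in this toy layout; a real design would usually add an offset index inside the same span.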
Can cereal handle user defined types without having to write custom serialization and deserialization functions for each? At first glance this seems to be the main difference.
Yes! Hopefully C++20 or C++25-ish will provide better alternatives for compile-time reflection.
Don't forget that libnop also has NOP_EXTERNAL_STRUCTURE (and friends) to decouple the annotation from the structure definition. This is handy when you have a C ABI with C++ internal implementation. I don't recall seeing a similar facility in other libraries.
> Note: This is not an officially supported Google product at this time.
Why is that? Is it kind of like the other project that was here recently, where it was created at FB but now independently maintained and not a corporate sponsored project (anymore)? Or is it deprecated and no longer recommended?
It usually means this is an experimental side project of some Google employee, and not an official part of any Google product strategy. Google likes for its employees' side projects to be released under Google's GitHub org, which you can variously interpret as good (Google wants to promote and support open source experimentation) or evil (Google wants to assert ownership of its employees' side projects) depending on how you feel about Google.
(I used to work at Google and had a lot of side projects. That was before Google moved everything to GitHub, but they liked for me to mark the code as copyright Google but "not an official Google product". I was fine with this arrangement.)
Some are, some aren't. Google wanted me to assign copyright even for projects that I worked on entirely at home. There's an additional process you have to go through if you don't want to assign copyright. Personally I didn't care since it's under an unencumbered open source license anyway, so it hardly makes a difference.
(To be clear, I meant the good/evil thing to be tongue-in-cheek...)
That's... disturbing, and all the more reason to keep your work and private life strongly isolated. Suppose outside of work you write scripts for various things and distribute them online to friends and so forth, or blog posts, etc. Your employer should never be able to claim ownership of that.
It means not currently an officially supported project. Most open source releases of internal code start out this way and may or may not become officially supported. It is different from deprecated, which means it was once supported but no longer is.