by saboot 1298 days ago
Still hoping for compile-time introspection/reflection for class serialization. Whichever language implements it first (C++ or other), I'm all in on. I come from a scientific background, where running code on data-gathering machines, writing the data out, and reading it back in later for analysis is 90% of what I do.
9 comments

I’ve done this with libclang: parsing C++ with clang.cindex in Python, walking the AST for structs with the right annotation, and generating code to serialize/deserialize. It's all integrated into a build system, so the dependency links are there. Obviously being built into the language would be way better, but if I were spending 90% of my time on this I would take any necessary steps.
Interesting, sounds similar to the dictionary that CERN ROOT generates. I'd like to be able to do the same, and a generic "dictionary maker" like the one you've described could be useful for supporting multiple formats.
Interested in sharing any code? This will be useful to many C++ devs who need any sort of reflection in their workflow (especially for gamedevs)
not op, but i've done this a couple times both through the python API:

    https://github.com/jcelerier/dynalizer
to automatically generate safe dlopen stubs for runtime dynamic library loading from header files

and through the C++ one (this one is an extremely quick and dirty prototype):

    https://github.com/ossia/score/blob/master/src/plugins/score-plugin-avnd/SourceParser/SourceParser.cpp
to pre-instantiate get<N>(aggregate), for_each(aggregate, f) and other similar functions in https://github.com/celtera/avendish because of how slow it is when done through TMP (doing it that way removed literally dozens of megabytes from my .o and had a positive performance impact even with -O3) ; so I weep a lot when I read that people in the committee object to pack...[indexing]
Have you checked out the PFR library (perfect flat reflection)? I've coupled this with the magic-enum library to good effect.

PFR can be rewritten in very little code, assuming c++14(?); magic-enum is long enough to just use.
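
For readers wondering what "very little code" could look like, here is a rough sketch of the general idea. It is not PFR's actual implementation: it assumes C++17 (structured bindings) instead of the C++14 route mentioned above, only handles aggregates with up to four plain scalar fields, and every name in it (`sketch`, `any_type`, `field_count`, `as_tuple`, `for_each_field`, `to_csv`, `Reading`) is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// Claims to convert to anything. Only ever used inside decltype, so the
// conversion operator is declared but never defined.
struct any_type {
    template <class T>
    operator T() const;
};

template <std::size_t>
using any_at = any_type;

// True when T{any, any, ...} with sizeof...(I) initializers is well-formed.
template <class T, class Seq, class = void>
struct is_brace_constructible : std::false_type {};

template <class T, std::size_t... I>
struct is_brace_constructible<T, std::index_sequence<I...>,
                              std::void_t<decltype(T{any_at<I>{}...})>>
    : std::true_type {};

// Counts fields by adding initializers until aggregate initialization fails.
// Assumption: no nested aggregates or arrays, which would skew the count.
template <class T, std::size_t N = 0>
constexpr std::size_t field_count() {
    if constexpr (is_brace_constructible<T, std::make_index_sequence<N + 1>>::value)
        return field_count<T, N + 1>();
    else
        return N;
}

// Exposes the fields as a tuple of references, one branch per field count.
template <class T>
auto as_tuple(T& value) {
    constexpr std::size_t n = field_count<std::remove_cv_t<T>>();
    if constexpr (n == 1) {
        auto& [a] = value;
        return std::tie(a);
    } else if constexpr (n == 2) {
        auto& [a, b] = value;
        return std::tie(a, b);
    } else if constexpr (n == 3) {
        auto& [a, b, c] = value;
        return std::tie(a, b, c);
    } else if constexpr (n == 4) {
        auto& [a, b, c, d] = value;
        return std::tie(a, b, c, d);
    } else {
        static_assert(n == 0, "sketch only supports up to 4 fields");
        return std::tuple<>{};
    }
}

// Calls f once per field, in declaration order.
template <class T, class F>
void for_each_field(T& value, F&& f) {
    std::apply([&](auto&... field) { (f(field), ...); }, as_tuple(value));
}

// Example consumer: comma-separated field values.
template <class T>
std::string to_csv(const T& value) {
    std::ostringstream out;
    bool first = true;
    for_each_field(value, [&](const auto& field) {
        if (!first) out << ',';
        out << field;
        first = false;
    });
    return out.str();
}

// Hypothetical record type, used only to exercise the sketch.
struct Reading {
    int channel;
    double voltage;
    long timestamp;
};

}  // namespace sketch
```

With this, `sketch::to_csv(sketch::Reading{3, 1.5, 42})` yields `3,1.5,42`. The real library covers far more cases (many more fields, nested aggregates, pre-C++17 compilers), which is where most of its code goes.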

I generally have one TU for just serialization, and don't let PFR and magic-enum "pollute" the rest of my code. This keeps compile times reasonable. (The other trick is to uniquely name the per-type serializer: C++'s overload resolution is O(n^2).) I then write a single-definition forwarding wrapper (a template) that desugars down to the per-type-name serializers. It strikes a good balance between hand-maintenance, automatic serialization support, and compile-time cost.
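
One possible reading of the "uniquely named serializer plus forwarding wrapper" layout, as a sketch only: the types `Event` and `Run`, the functions `serialize_Event` and `serialize_Run`, and the output format are all hypothetical, and the comment above does not say how the wrapper is actually written.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical record types.
struct Event { int id; int channel; };
struct Run   { int number; };

// Header part: one uniquely named function per type, so a call site never
// has to pick among a large overload set.
std::string serialize_Event(const Event& e);
std::string serialize_Run(const Run& r);

// Single forwarding wrapper: resolves to the per-type name at compile time.
template <class T>
std::string serialize(const T& value) {
    if constexpr (std::is_same_v<T, Event>) {
        return serialize_Event(value);
    } else if constexpr (std::is_same_v<T, Run>) {
        return serialize_Run(value);
    } else {
        static_assert(sizeof(T) == 0, "no serializer registered for this type");
    }
}

// Definitions: in the layout described above these would live in the one
// serialization TU, which is the only place that includes the heavy headers.
std::string serialize_Event(const Event& e) {
    return "Event(" + std::to_string(e.id) + "," + std::to_string(e.channel) + ")";
}

std::string serialize_Run(const Run& r) {
    return "Run(" + std::to_string(r.number) + ")";
}
```

Callers only ever write `serialize(x)`; for example, `serialize(Event{7, 2})` returns `Event(7,2)`. Adding a type means adding one declaration, one branch in the wrapper, and one definition in the serialization TU.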

This does look very interesting, thank you!
You can do this with Haskell (aeson package) and maybe with Rust (serde?)
Java annotations have enabled compile-time reflection since Java 6, and of course it has been used for serialization: https://github.com/square/moshi/#codegen
Rust basically supports this with pretty low complexity via serde, but I think many developed languages have at least something to do this, although in some it has to be hacked on.
Reflection is definitely a big topic of discussion, but I'm not sure whether it will make it in time for the finalization of the C++23 spec. I think this is the most recent iteration of the proposal:

https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2022/p12...

There are now two competing proposals; with luck, one of them might make it into C++26. Or maybe not.
I'm curious about how you could use https://celtera.github.io/avendish for this. I've developed it to enable very easily creating media processors with the little reflection we can do currently in c++20; in my mind data gathering would not be too dissimilar of a task.

It makes me really sad reading about the objections to pack indexing as this library needs it a LOT (and currently, doing it with std::get<> or similar is pretty pretty bad and does not scale at all past 200 elements in terms of build time, compiler memory usage & debug build size)

Compile-time introspection and reflection have been implemented in GHC Haskell as the Generic class. Basically the compiler synthesizes a representation of your data type in terms of basic operations like :+: or :*: (for sum types and product types) and you can easily operate on them. Is that what you mean by compile-time introspection?

It's already being used (for many years in fact) to implement JSON serialization and deserialization in aeson without depending on Template Haskell (kind of like macros).

What about making JSON that reflects the class structure and serializing that?
Well yes, but if I had reflection I could make a general 'serializer' routine that has backends for multiple formats (JSON, HDF5, CDF, ROOT, etc).
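
A sketch of the shape such a general serializer could take in today's C++, under loud assumptions: `visit_fields` is written by hand here and stands in for what reflection would generate, the two backends are toy text formats rather than HDF5 or ROOT writers, and all names (`Sample`, `JsonBackend`, `CsvBackend`, `serialize_with`) are invented for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record type.
struct Sample {
    int run;
    double energy;
};

// Hand-written stand-in for reflection: visits each field with its name.
template <class Visitor>
void visit_fields(const Sample& s, Visitor&& visit) {
    visit("run", s.run);
    visit("energy", s.energy);
}

// Backend 1: a flat JSON object. Only suitable for numeric fields, since
// string values would need quoting and escaping.
struct JsonBackend {
    std::ostringstream out;
    bool first = true;

    template <class T>
    void operator()(const char* name, const T& value) {
        out << (first ? "{" : ",") << '"' << name << "\":" << value;
        first = false;
    }

    std::string finish() { return first ? "{}" : out.str() + "}"; }
};

// Backend 2: one CSV row, field names ignored.
struct CsvBackend {
    std::ostringstream out;
    bool first = true;

    template <class T>
    void operator()(const char* /*name*/, const T& value) {
        if (!first) out << ',';
        out << value;
        first = false;
    }

    std::string finish() { return out.str(); }
};

// The general routine: the type describes its fields once, and the backend
// decides how they are written.
template <class Backend, class T>
std::string serialize_with(const T& object) {
    Backend backend;
    visit_fields(object, backend);
    return backend.finish();
}
```

Here `serialize_with<JsonBackend>(Sample{12, 2.5})` gives `{"run":12,"energy":2.5}` and `serialize_with<CsvBackend>(Sample{12, 2.5})` gives `12,2.5`. Reflection would remove the need to write `visit_fields` for every type by hand; the backends would stay as they are.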
I’m pretty sure D supports this.