Why Foreign Function Interfaces are not an easy answer (2012) (flameeyes.blog)
19 points by wheresvic4 1612 days ago
5 comments

>A C interface (a function) is exposed only through its name, and no other data; the name does not encode either the number or the type of parameters, which means that you can’t reflectively load the ABI based off the symbols in a shared object.

What prevents GCC and Clang (and co from Intel/MS) saving a sidecar file next to a program/.so with all the functions and their argument types, struct structures, etc. in some agreed upon format for easy parsing / use? Could even be emitted based on some annotation/pragma like #export before the function declaration.

Is it just that nobody really cares to coordinate such a solution?

They already have it, and they call it a header file.

Better FFIs generate their definitions from such header files, using so-called header parsers (based on some CPP code). This problem was solved in the '80s, but I guess it has been forgotten by now.

>They already have it, and they call it a header file

Some languages already use that for the purpose, but it's an ill fit. It's meant for the internal parser of the language, so you need to parse C code (even if it's just the header file syntax). And it wasn't created with FFI in mind, in the first place.
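
To make the "you need to parse C code" point concrete, here is a deliberately naive sketch of a header parser. The header text, the regular expression and the function name `parse_prototypes` are all made up for illustration; real header parsers run the C preprocessor and use a full C grammar.

```python
import re

# Hypothetical header contents, used only for this illustration.
HEADER = """
int add(int a, int b);
double scale(double value, double factor);
void reset(void);
"""

# Matches only simple one-line prototypes: <return type> <name>(<params>);
# It does NOT handle macros, typedefs, function pointers, attributes,
# multi-line declarations or preprocessor conditionals.
PROTOTYPE = re.compile(
    r"^\s*(?P<ret>[\w\s\*]+?)\s*(?P<name>\w+)\s*\((?P<params>[^)]*)\)\s*;\s*$"
)

def parse_prototypes(header_text):
    """Extract simple function prototypes from header text, line by line."""
    functions = []
    for line in header_text.splitlines():
        match = PROTOTYPE.match(line)
        if match is None:
            continue
        params = [p.strip() for p in match.group("params").split(",")]
        # "f(void)" and "f()" both mean "no parameters" for this sketch.
        if params in ([""], ["void"]):
            params = []
        functions.append({
            "name": match.group("name"),
            "returns": match.group("ret").strip(),
            "params": params,
        })
    return functions
```

Even this toy version needs special cases, and it breaks as soon as a header uses a macro or a typedef, which is roughly the argument for a format designed to be consumed by other languages.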

What I propose is a file intended for others wanting to FFI, and thus trivial to parse (could be a schema for a format with ubiquitous parsers, like XML, JSON, YAML, TOML, and co).

That would have the benefit of:

(1) being cross-language (every language that exposes functions with C calling conventions could produce such a file to allow others to interface with it).

(2) being trivial to parse and integrate with ffi facilities of different languages (e.g. ctypes)

(3) being extensible beyond what C header files give (e.g. metadata about usage, or valid int ranges, even function documentation, etc).
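
A rough sketch of how such a file could be consumed from Python's ctypes. The JSON layout (`library`, `functions`, `name`, `returns`, `args`), the type-name table and the helper names `signature` and `bind` are all assumptions invented here; no such agreed-upon format exists as far as this thread establishes.

```python
import ctypes
import json

# Hypothetical sidecar contents; this schema is invented for the sketch.
SIDECAR_JSON = """
{
  "library": "libm",
  "functions": [
    {"name": "pow", "returns": "double", "args": ["double", "double"]},
    {"name": "cos", "returns": "double", "args": ["double"]}
  ]
}
"""

# Map the type names used in the sidecar to ctypes types.
C_TYPES = {
    "void": None,
    "int": ctypes.c_int,
    "double": ctypes.c_double,
    "char*": ctypes.c_char_p,
    "size_t": ctypes.c_size_t,
}

def signature(entry):
    """Return (restype, argtypes) for one function entry of the sidecar."""
    restype = C_TYPES[entry["returns"]]
    argtypes = [C_TYPES[type_name] for type_name in entry["args"]]
    return restype, argtypes

def bind(lib, sidecar_text):
    """Annotate functions of an already loaded ctypes library object."""
    spec = json.loads(sidecar_text)
    bound = {}
    for entry in spec["functions"]:
        func = getattr(lib, entry["name"])
        func.restype, func.argtypes = signature(entry)
        bound[entry["name"]] = func
    return bound
```

With a math library loaded through `ctypes.CDLL`, `bind(lib, SIDECAR_JSON)["pow"](2.0, 10.0)` would call `pow` with declared argument and return types instead of ctypes' default `int` assumptions. No C parsing is involved, which is the point of benefit (2).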

(Article author here.)

Yup, there's also the more common approach of implementing extensions in C (or equivalent), possibly with an intermediate language.

For example, Cython is a much more scalable approach, in my experience, than attempting to describe the interface in Python and using a libffi-style binding.

The best FFIs generate even the C header files from a single parent source.

See: Vulkan's vk.xml

True. I also generate my dynamic API headers, code and test cases automatically, in LibreDWG from a readable spec file, because static code for 1000s of objects would be overkill. Unlike LiquidXML.

There is no need for a sidecar file; we could have an extra section in the binary (ELF or other formats), in the same vein as debug info (e.g. DWARF) and other metadata.

Sure, it doesn't have to be a sidecar (though a sidecar could also be added next to existing binaries that you have no upstream access to, or that are signed so you can't alter them).

On this note, there is also a talk from last year's CppCon:

"Making Libraries Consumable for Non-C++ Developers"

https://www.youtube.com/watch?v=4r09pv9v1w0

For deepstream.io we've gotten quite far with a mix of Node and C++, with both languages operating on the same buffer of typed arrays.

Our argument for this weird symbiosis was that we wanted the popularity of the NodeJS ecosystem at the time with the performance of C++ - and we achieved this to an at least satisfactory degree.

When it came to building binaries based on the mixed codebases, especially for Windows, things were a lot more challenging.

This is independent of the valid points of the article, but I have found much success in moving to only using types and constructs that adhere to the Apache Arrow specification.

To tie it back to the piece, it would be pretty easy to generate code that outputs function definitions from an Apache Arrow schema in both C and a target interpreted language such as Python (e.g. ctypes bindings).
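
A minimal sketch of that kind of generator, with loud caveats: it does not use the Arrow libraries at all. A plain list of `(field, type)` pairs stands in for a schema, and the convention "one pointer per column plus a length" is an assumption of this example, as are the names `FIELDS`, `TYPE_MAP`, `c_prototype` and `ctypes_binding`.

```python
# Stand-in for a schema: (field name, primitive type name) pairs.
# Hypothetical; a real Arrow schema carries much more information.
FIELDS = [("price", "float64"), ("quantity", "int32")]

# Primitive type name -> (C type, ctypes type expression as source text).
TYPE_MAP = {
    "int32": ("int32_t", "ctypes.c_int32"),
    "int64": ("int64_t", "ctypes.c_int64"),
    "float64": ("double", "ctypes.c_double"),
}

def c_prototype(func_name, fields):
    """Emit a C declaration taking one const pointer per column plus a length."""
    params = [f"const {TYPE_MAP[type_name][0]} *{name}" for name, type_name in fields]
    params.append("int64_t length")
    return f"void {func_name}({', '.join(params)});"

def ctypes_binding(func_name, fields):
    """Emit Python source that declares the same signature via ctypes."""
    argtypes = [f"ctypes.POINTER({TYPE_MAP[type_name][1]})" for _, type_name in fields]
    argtypes.append("ctypes.c_int64")
    return (
        f"lib.{func_name}.restype = None\n"
        f"lib.{func_name}.argtypes = [{', '.join(argtypes)}]"
    )
```

Calling `c_prototype("process", FIELDS)` and `ctypes_binding("process", FIELDS)` produces the two halves from one description, so the C side and the Python side cannot drift apart.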

Should be (2012)