by cheez 2397 days ago
Is this a neat trick or just standard operating procedure for calling C from <your favorite lang>? As it was billed as a trick, I was expecting some sort of runtime code generation to pass the data pointer and some jump instruction to jump to the right spot and unpack the data pointer.

Maybe I just overcomplicate things ;-)


I’d say that “standard procedure” would be to do it the same way as it would be done in C: define a struct, allocate one somewhere, then pass a pointer to it as the data pointer. Using the anonymous struct that represents the closure itself skips a step: the user doesn’t need to spell out which values are stored in the struct.
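For anyone who wants the C-style version spelled out, here is a rough sketch using Python's ctypes. The `Counter` struct, the `on_value` callback and the `for_each` function are all made up for illustration; `for_each` is a pure-Python stand-in for whatever C API would really be invoking the callback.

```python
import ctypes

# Hypothetical user-data struct, spelled out by hand as one would in C.
class Counter(ctypes.Structure):
    _fields_ = [("total", ctypes.c_int), ("step", ctypes.c_int)]

# C-style callback signature: void (*)(void *data, int value)
CALLBACK = ctypes.CFUNCTYPE(None, ctypes.c_void_p, ctypes.c_int)

@CALLBACK
def on_value(data, value):
    # Cast the opaque pointer back to the struct type we know we passed in.
    counter = ctypes.cast(data, ctypes.POINTER(Counter)).contents
    counter.total += value * counter.step

def for_each(values, callback, data):
    # Stand-in for a C API; a real library would make these calls itself.
    for v in values:
        callback(data, v)

state = Counter(total=0, step=2)
data_ptr = ctypes.cast(ctypes.pointer(state), ctypes.c_void_p)
for_each([1, 2, 3], on_value, data_ptr)
print(state.total)
```

Note that the caller has to name every field the callback needs, which is exactly the step the closure version skips.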
If the language supports closures which capture variables from their surrounding environment, there's no way around using "the closure itself" as your data object. After all, "the user" is not expected to "know" what any given closure is capturing from the environment; part of the point of closures is implementing a sort of information hiding.
It's standard procedure. I've done the exact same thing when wrapping C APIs into Python using Cython, several times. You pass the Python closure as the void *data and then register a shared generic callback which casts it and calls it. Easy. Getting the memory management right is slightly tricky, but not too bad.
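A minimal sketch of that pattern, written with ctypes rather than Cython so it runs without a compile step. It assumes CPython, where `id()` is the object's address; `trampoline`, `for_each` and `make_collector` are hypothetical names, and `for_each` again stands in for the real C API.

```python
import ctypes

# C-style callback signature: void (*)(void *data, int value)
CALLBACK = ctypes.CFUNCTYPE(None, ctypes.c_void_p, ctypes.c_int)

@CALLBACK
def trampoline(data, value):
    # Shared generic callback: recover the Python callable and invoke it.
    fn = ctypes.cast(data, ctypes.py_object).value
    fn(value)

def for_each(values, callback, data):
    # Stand-in for a C API that calls callback(data, v) for each value.
    for v in values:
        callback(data, v)

def make_collector(prefix):
    seen = []
    def collect(value):
        # Captures both `prefix` and `seen` from the enclosing scope.
        seen.append(f"{prefix}{value}")
    return collect, seen

collect, seen = make_collector("item-")
# CPython-specific assumption: id() is the object's address. This is a
# borrowed reference, so `collect` must stay alive while it is registered.
handle = ctypes.c_void_p(id(collect))
for_each([1, 2, 3], trampoline, handle)
print(seen)
```

The memory-management part is the `handle` line: nothing here stops `collect` from being garbage collected, so real bindings need to hold a reference for as long as the C side might call back.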

Fun fact: you can't safely do this with ctypes. Since the callback runs as pure Python, it cannot do watertight exception handling in a callback context (because even if you have a try/except block, an exception can always happen right before or after it), and ctypes provides no usable internal way of doing it - it just eats exceptions inside callbacks. This is what motivated me to rewrite Ceph's librbd bindings from ctypes to Cython.
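The exception-eating is easy to see in isolation. In this small sketch (`flaky` is a made-up callback, and it is invoked directly from Python instead of from C), the `ValueError` never reaches the caller; CPython reports it on stderr as an ignored exception and carries on.

```python
import ctypes

# C-style callback signature: int (*)(int value)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

@CALLBACK
def flaky(value):
    raise ValueError("boom")

raised = False
try:
    # The call goes through the C thunk that ctypes generated for `flaky`.
    result = flaky(7)  # the return value is not meaningful after a failure
except ValueError:
    raised = True

print("exception reached the caller:", raised)
```

The caller gets back an ordinary integer and has no way to tell that the callback failed, which is the problem described above.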

I thought as much, thanks for the confirmation :-)
It does seem quite similar to Haskell FFI code: https://github.com/bobfrank/hasqlite/blob/4e38801d969a43e88b...

The "neat" factor comes from how little type wrangling and unsafe code is needed.

I believe this actually JITs a trampoline with libffi, so only one code pointer is needed, not separate code and data pointers.
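Python's ctypes is also built on libffi closures, so the single-code-pointer behaviour can be poked at from there. This is only a sketch of the same idea, not the Haskell code linked above; `UNARY` and `make_adder` are hypothetical names.

```python
import ctypes

# C-style signature with no data pointer at all: int (*)(int)
UNARY = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

def make_adder(n):
    # Each call wraps a fresh closure in its own generated C thunk.
    return UNARY(lambda x: x + n)

add5 = make_adder(5)
add9 = make_adder(9)

# The thunk address is all that C code would need to receive.
addr5 = ctypes.cast(add5, ctypes.c_void_p).value
addr9 = ctypes.cast(add9, ctypes.c_void_p).value

print(add5(1), add9(1), addr5 != addr9)
```

Each closure gets its own code address, and the captured `n` travels with it, so the C side never sees a separate data argument.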

(Also hi, go contribute to Nixpkgs again!)