
A few months ago I created a little API to help with an obnoxious case in FFI [0] in an extremely esoteric language known as Python. It was straightforward, it had a fully typed signature, and I fully documented it. (And the entire implementation was only 50 lines or so of intentionally very straightforward code.) The LLM (Codex 5.2 IIRC) could not manage to call the function with the right arguments even after multiple rounds of prompting.

Sometimes I think LLMs are unbelievably, amazingly good at things. And sometimes I’m deeply suspicious that they’re really not very smart, and this was an example of the latter.

[0] Python calling into C, passing a callback function pointer and a void *opaque that C will pass back to the callback. Short of writing an extension module, this is pretty much forced to go through an inherently nasty JIT codegen process in libffi, which is sort of tolerable, but you really don’t want to redo it for each object that gets opacified to void *. Codex passed a lambda, which did the nasty JIT thing every time. I wrote a little shim using weakref. Apparently no one has done this before, so Codex wasn’t trained on it, and it couldn’t make itself call the function. Maybe I should post it to PyPI.