|
I'd measure twice before cutting. Almost everyone who isn't deep into cross-language interop and VM design assumes, incorrectly, that the FFI mechanism itself drives interop cost. In practice, that's almost never the case. Compiling a libffi signature to native code could be a win in principle, but it matters a lot less often than you'd think. Keep in mind that optimizing the call doesn't optimize the marshaling: even with an AOT-compiled FFI trampoline, if you're, say, sending a string from one place to another, you usually need to transform the string in some manner (copy it, change its encoding, add or remove length prefixes, etc.), and compiling the libffi parameter passing won't help you do the string work any faster. In fact, trying to AOT the connections can make your program worse, both by bloating it (adding cache pressure, likely small but still real) and by complicating your build and deployment process. libffi bytecode is good. I wouldn't bother with native code until I had a profile in hand showing the bytecode to be the bottleneck, and even then, I'd check it three or four times to make sure I hadn't gotten the profiling wrong. FFI is just seldom the problem in real-world systems. |
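If you want to see where the time goes before deciding, here is a rough sketch of the kind of measurement I mean. It uses Python's `ctypes` (which goes through libffi on CPython) purely as a stand-in for whatever FFI layer you actually have. The payload, the function names, and the choice of `abs` and `strlen` as callees are my assumptions for illustration, and it assumes a POSIX-like system where the C library can be loaded. It times a trivial foreign call, the string marshaling alone, and the two together, so you can compare them on your own machine instead of guessing.

```python
import ctypes
import ctypes.util
import timeit

# Assumption: POSIX-like system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Hypothetical payload: a non-ASCII string that must be encoded before C sees it.
text = "héllo wörld " * 1000
pre_encoded = text.encode("utf-8")


def trivial_call():
    """Foreign call with a scalar argument: mostly FFI call overhead."""
    return libc.abs(-1)


def marshal_only():
    """The string transformation alone, with no foreign call at all."""
    return text.encode("utf-8")


def call_only():
    """Foreign call on an already-marshaled buffer."""
    return libc.strlen(pre_encoded)


def marshal_and_call():
    """What a real call site usually does: marshal, then call."""
    return libc.strlen(text.encode("utf-8"))


def measure(fn, number=2000):
    """Best-of-five average seconds per invocation of fn."""
    return min(timeit.repeat(fn, number=number, repeat=5)) / number


if __name__ == "__main__":
    for fn in (trivial_call, marshal_only, call_only, marshal_and_call):
        print(f"{fn.__name__:18s} {measure(fn) * 1e6:8.3f} us/call")
```

The numbers will vary by machine and payload, which is the point: run it several times, change the payload size, and only then decide whether the call path or the marshaling deserves your attention.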