by lambda 700 days ago
Calling C from Rust can be quite simple. You just declare the external function and call it. For example, straight out of the Rust book https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#usin... :

  extern "C" {
      fn abs(input: i32) -> i32;
  }

  fn main() {
      unsafe {
          println!("Absolute value of -3 according to C: {}", abs(-3));
      }
  }
Now, if you have a complex library and don't want to write all of the declarations by hand, you can use a tool like bindgen to automatically generate those extern declarations from a C header file: https://github.com/rust-lang/rust-bindgen

There's an argument to be made that something like bindgen could be included in Rust, not requiring a third party dependency and setting up build.rs to invoke it, but that's not really the issue at hand in this article.

The issue is not the low-level bindings, but higher level wrappers that are more idiomatic in Rust. There's no way you're going to be able to have a general tool that can automatically do that from arbitrary C code.
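To make the distinction concrete, here is a minimal sketch of a raw declaration next to a hand-written idiomatic wrapper. `strlen` is the standard C function; the wrapper name `c_string_len` is made up for illustration. The `unsafe extern` spelling assumes Rust 1.82 or later; older toolchains write plain `extern "C"` as in the snippet above.

```rust
use std::ffi::{c_char, CStr};

// Raw binding: the kind of declaration a generator like bindgen can produce.
// (`unsafe extern` needs Rust 1.82+; older toolchains use plain `extern "C"`.)
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// Hand-written wrapper (hypothetical name). Taking `&CStr` guarantees a
// valid, NUL-terminated buffer, so callers need no `unsafe` of their own.
fn c_string_len(s: &CStr) -> usize {
    // SAFETY: `s.as_ptr()` points to a NUL-terminated string that outlives the call.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CStr::from_bytes_with_nul(b"hello\0").expect("valid C string");
    println!("strlen says: {}", c_string_len(s));
}
```

The declaration is mechanical. Choosing `&CStr` as the parameter type, and thereby deciding which invariants the caller must uphold, is the part that a generator cannot infer from the C header.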

3 comments

There's also cbindgen for going the other way around. https://github.com/mozilla/cbindgen
Passing integers around is easy; sharing structs or strings, or context pointers for callbacks that cross the language barrier, is typically much harder.
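
A rough sketch of the context-pointer pattern mentioned here, under loud assumptions: `for_each_index` is a stand-in written in Rust for a C library function that takes a callback plus a `void *` context, and all names are hypothetical. With a real library it would be an `extern "C"` declaration instead of a definition.

```rust
use std::ffi::{c_int, c_void};

// Stand-in for a C library function (hypothetical): calls `cb(ctx, i)` for
// every i in 0..n. A real binding would only declare this, not define it.
extern "C" fn for_each_index(n: c_int, cb: extern "C" fn(*mut c_void, c_int), ctx: *mut c_void) {
    for i in 0..n {
        cb(ctx, i);
    }
}

// Trampoline: recovers the Rust state from the opaque context pointer.
extern "C" fn push_index(ctx: *mut c_void, i: c_int) {
    // SAFETY: `ctx` was created from `&mut Vec<c_int>` in `collect_indices`
    // and is only used for the duration of that call.
    let out = unsafe { &mut *(ctx as *mut Vec<c_int>) };
    out.push(i);
}

// Safe wrapper that hides the pointer casts from callers.
fn collect_indices(n: c_int) -> Vec<c_int> {
    let mut out: Vec<c_int> = Vec::new();
    for_each_index(n, push_index, &mut out as *mut Vec<c_int> as *mut c_void);
    out
}

fn main() {
    println!("{:?}", collect_indices(3));
}
```

The hard part is not the syntax but the contract: nothing in the types says how long the library keeps `ctx`, so the wrapper has to encode that lifetime by hand.
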
For Rust code calling C, sharing structs is doable with #[repr(C)]. See https://doc.rust-lang.org/reference/type-layout.html#reprc-s...

(Nitpick: I don’t think it is technically correct to call this “The C representation”, as struct layout in C depends on the C compiler/ABI. I wouldn’t trust this to be good enough for serializing data between 32-bit and 64-bit systems, for example. For calling code on the same system, it’s good enough, though.)
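
A small sketch of sharing a struct by value, assuming the standard C `div` function and its `div_t` result type from `<stdlib.h>`. The Rust names `DivT` and `c_div` are made up, and the field order (`quot` before `rem`) is an assumption: the C standard leaves it unspecified, though common libcs use this order. The `unsafe extern` spelling assumes Rust 1.82 or later.

```rust
use std::ffi::c_int;

// Mirrors C's `div_t`. `#[repr(C)]` requests the platform's C layout rules.
// Field order is an assumption (quot, then rem), not guaranteed by the C standard.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct DivT {
    quot: c_int,
    rem: c_int,
}

unsafe extern "C" {
    // Standard C: div_t div(int numer, int denom);
    fn div(numer: c_int, denom: c_int) -> DivT;
}

// Safe wrapper (hypothetical name) that rejects the inputs C leaves undefined.
fn c_div(numer: c_int, denom: c_int) -> Option<DivT> {
    if denom == 0 || (numer == c_int::MIN && denom == -1) {
        return None;
    }
    // SAFETY: the undefined cases were excluded above.
    Some(unsafe { div(numer, denom) })
}

fn main() {
    println!("{:?}", c_div(7, 2));
}
```

This only works because both sides agree on the layout for the same target, which is exactly the caveat in the nitpick above.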

That's not really "simple"; it's on par with C FFI in just about any other language (except C++), with the same drawbacks.
It's on par with C++, too. In C++ you need an `extern "C"`, because C++ linkage isn't guaranteed to be the same as C linkage. You can get away with wrapping the C declarations in an `extern "C"` block guarded by a preprocessor conditional, but that's not all that much easier than Rust's bindgen.

A lot of C to C++ interop is actually done wrong without knowing it. Throwing a C++ static function as a callback into a C function usually works, but it's not technically correct because the linkage isn't guaranteed to be the same without an extern "C". In practice, it usually is the same, but this is implementation-defined, and C++ could use a different calling convention from C (e.g. cdecl vs. fastcall vs. stdcall; the Borland C++ compiler uses fastcall by default for C++ functions, which makes them illegal callbacks for C functions).

The major difference between C interop in Objective-C and C++ and C interop in other languages is that the other languages lack the C preprocessor. In Objective-C and C++, macros just work because they use the same preprocessor. That's really not easy to paper over in languages that can't speak the C preprocessor.

I think you're confusing some terms here.

> In C++ you need an `extern "C"`, because C++ linkage isn't guaranteed to be the same as C linkage.

`extern "C"` has nothing to do with linkage; all it does is disable name mangling, so you get the same symbol name as with a C compiler.

> Throwing a C++ static function as a callback into a C function usually works, but it's not technically correct because the linkage isn't guaranteed to be the same without an extern "C".

Again, linkage is not relevant here. Your C++ callbacks don't have to be declared as extern "C" either, because the symbol name doesn't matter. As you noted correctly, the calling conventions must match, but in practice this only matters on x86 Windows. (One notable example is passing callbacks to Win32 API functions, which use `stdcall` by default.) Fortunately, x86_64 and ARM did away with this madness and only have a single calling convention (per platform).

> `extern "C"` has nothing to do with linkage; all it does is disable name mangling, so you get the same symbol name as with a C compiler.

extern "C" also ensures that the C calling convention is used, which is relevant for callbacks. It's not just name mangling. This is the reason that extern "C" static functions exist. You can actually overload a C++ function by extern "C" vs extern "C++", and it will dispatch it appropriately based on whether the passed in function is declared with C or C++ linkage.

And I'm not sure the terms are confused, because that's how most documentation refers to it: https://learn.microsoft.com/en-us/cpp/cpp/extern-cpp?view=ms...

> In C++, when used with a string, extern specifies that the linkage conventions of another language are being used for the declarator(s). C functions and data can be accessed only if they're previously declared as having C linkage. However, they must be defined in a separately compiled translation unit.

And https://en.cppreference.com/w/cpp/language/language_linkage

The post you're replying to had it completely right. extern "C" is entirely about linkage, which includes calling convention and name mangling.

> As you noted correctly, the calling conventions must match, but in practice this only matters on x86 Windows.

Or if you want your program to actually be correct, instead of just incidentally working for most common cases, including on future systems.

If you're passing a callback to a C function from C++, it's wrong unless the callback is declared extern "C".

> extern "C" also ensures that the C calling convention is used, which is relevant for callbacks. It's not just name mangling.

I stand corrected. I didn't know that `extern "C"` enforces the C calling convention.

However, on modern platforms this doesn't really matter because, as I said, there is only a single calling convention (per platform). And I'm pretty sure that future platforms will keep it that way. Fortunately, if you try to pass a C++ callback of the wrong calling convention, you get a compiler error.

> If you're passing a callback to a C function from C++, it's wrong unless the callback is declared extern "C".

That's certainly not true because `extern "C"` is not the only way to specify the calling convention. In fact, you might need a different calling convention! As I mentioned, on x86 the Windows API uses stdcall for all API functions and callbacks, so `extern "C"` would be wrong. If you look at the Microsoft examples, you will see that they declare the callbacks as WINAPI (without `extern "C"`): https://learn.microsoft.com/en-us/windows/win32/procthread/c...

So I stand by my point that in practice you don't need `extern "C"` for passing C++ callbacks to C functions. You can pass a lambda function just fine, and when it doesn't work the compiler will tell you.

A couple big caveats here:

* cdecl is a platform-specific calling convention. There is no standard C ABI. cdecl is a Wintel thing, not the standard C calling convention. On Linux, for instance, the C calling convention is defined by the System V ABI. On Windows ARM, it's also not cdecl.

* Specifying a calling convention at all is a compiler-specific extension. There is no standard way of specifying a C calling convention without `extern`.

So specifying cdecl gets you the right calling convention on some platforms and ties your code to some specific compilers. The only portable way to specify C linkage in a C++ program is extern "C". You will always get the right ABI for your platform and it will work on every compiler.

> So I stand by my point that in practice you don't need `extern "C"` for passing C++ callbacks to C functions. You can pass a lambda function just fine, and when it doesn't work the compiler will tell you.

The compiler will very often not tell you. It will complain if the lambda can't be coerced to a function pointer (because it's a closure) or if the argument or return types are wrong. An incorrect ABI will usually be accepted and will just do the wrong thing or crash at runtime. The C++ standard says that language linkage is part of a function's type, but very few compilers actually support this.

Your position works sometimes for some compilers and some platforms. I assert that it's better to use standard C++ features and just work everywhere.
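
For comparison with the point that language linkage is part of a function's type: in Rust the ABI is part of the function pointer type and the compiler does check it. A minimal sketch using the standard C `qsort`; the helper names `cmp_i32` and `sort_with_c` are made up, and the `unsafe extern` spelling assumes Rust 1.82 or later.

```rust
use std::cmp::Ordering;
use std::ffi::{c_int, c_void};

unsafe extern "C" {
    // Standard C `qsort`. The comparator type spells out the C ABI.
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

// `extern "C"` makes this function use the platform's C calling convention.
// Dropping it turns the `qsort` call below into a type error, because
// `fn(..)` and `extern "C" fn(..)` are different pointer types.
extern "C" fn cmp_i32(a: *const c_void, b: *const c_void) -> c_int {
    // SAFETY: qsort only passes pointers to elements of the `i32` slice.
    let (a, b) = unsafe { (*(a as *const i32), *(b as *const i32)) };
    match a.cmp(&b) {
        Ordering::Less => -1,
        Ordering::Equal => 0,
        Ordering::Greater => 1,
    }
}

// Safe wrapper (hypothetical name) around the raw call.
fn sort_with_c(values: &mut [i32]) {
    // SAFETY: pointer, element count and element size all describe `values`.
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            std::mem::size_of::<i32>(),
            cmp_i32,
        );
    }
}

fn main() {
    let mut v = [3, 1, 2];
    sort_with_c(&mut v);
    println!("{:?}", v);
}
```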

How is that not simple? You just declare the function and then call it. I find it hard to imagine how it could be any more simple than that.
Now imagine a hundred or two functions, structures and callbacks, some of them exposed only as C preprocessor macros over an internal implementation. The PJSIP low-level API is one example.
But... that's what bindgen is for. Which I mentioned.

I said it "can be quite simple"; for simple use cases, just using extern and translating the declarations by hand is perfectly viable.

For more complex cases, you use bindgen.

Binding generators exist for most other languages, with the same limitations.

I would love to see how bindgen handles a function call defined as a preprocessor macro, like the ones I mentioned, because most likely it won't.
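
To illustrate the concern: a function-like macro has no symbol to link against, so there is nothing for an `extern` declaration to bind. As far as I know, bindgen translates simple constant-like macros but emits nothing for function-like ones, which leaves porting by hand. The header content and the name `mylib_make_version` below are hypothetical.

```rust
// Hypothetical C header content, shown only as a comment:
//
//   #define MYLIB_MAKE_VERSION(major, minor) (((major) << 16) | (minor))
//
// Hand-written Rust port of that macro. It has to be kept in sync with the
// header manually, which is the maintenance cost being discussed.
const fn mylib_make_version(major: u32, minor: u32) -> u32 {
    (major << 16) | minor
}

fn main() {
    println!("{:#x}", mylib_make_version(1, 2));
}
```

An alternative, also by hand, is a small C shim file that wraps each macro in a real function and is compiled alongside the Rust code.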

Can someone shed some light on why the parent comment (by varjag) is downvoted?
... And? Most languages make C interop simple.
They quickly become unwieldy on non-trivial APIs, with hundreds of definitions across dozens of files and with macros to boot. Naturally people would still get the job done, but it's far from simple.
That's what bindgen is for, as was mentioned in the original comment you replied to.
How well does it handle preprocessor macros in APIs?
I have used it successfully against header files for Win32 COM interfaces generated from IDL which include major parts of the infamous "windows.h". Almost every type is a macro.

This is an extremely well-understood space.

Just open the docs and do it.