by swiftcoder 39 days ago
> The converted unsafe rust segfaulted at the same place the C code did. It's compatible, but not safe

That is indeed the point of c2rust. It gives you a baseline that is semantically identical to the original codebase, and with that passing the full test suite, bug-for-bug, you can then start gradually adopting rusty idioms to improve the memory safety of the codebase.
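To make the "baseline first, idioms later" workflow concrete, here is a minimal hand-written sketch. It is not actual c2rust output: `sum_raw` only imitates the general shape of a mechanical translation (raw pointer plus length, caller-upheld contract), and `sum_slice` is one possible later refactor. Both function names and the choice of wrapping arithmetic are assumptions made for illustration.

```rust
/// Imitates the shape of a mechanically translated C function:
///     int sum(const int *data, size_t len);
///
/// # Safety
/// `data` must point to at least `len` readable, initialized `i32` values.
pub unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i: usize = 0;
    while i < len {
        // The bounds are the caller's responsibility, exactly as in C.
        total = total.wrapping_add(unsafe { *data.add(i) });
        i += 1;
    }
    total
}

/// A later, hand-written refactor: the slice carries its own length, so the
/// function no longer needs `unsafe` and cannot read out of bounds.
pub fn sum_slice(data: &[i32]) -> i32 {
    data.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

Because both versions can sit side by side in the same crate, the existing test suite can be run against each one while callers are migrated from the pointer-based signature to the slice-based one.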


What comes out of c2rust is not intended for human consumption. It's more verbose than the original and harder to work on, but no safer. You lose the C idioms that people understand, while not gaining Rust idioms. It's like working on compiler-generated assembly code by hand.

2022 discussion on HN.[1]

There's a DARPA-funded effort called TRACTOR (Translating All C to Rust), which has funded some efforts to develop a usable translator.[2] It's about 10 months after award, with no reported progress. I've been checking the personal sites of the academics involved, and they barely mention the project, although $5 million has been allocated to it.[3] The approach comes from U.C. Berkeley: let the LLM generate slop, then check it using formal methods.[4] I'm not expecting near-term results.

[1] https://news.ycombinator.com/item?id=30169263

[2] https://csl.illinois.edu/news-and-media/translating-legacy-c...

[3] https://chandrasekaran-group.github.io/

[4] https://metalift.pages.dev/

Here is a public report from the TRACTOR evaluation team: https://github.com/DARPA-TRACTOR-Program/Reports/blob/main/F...

There are also some papers being published that were funded by TRACTOR, such as https://homes.cs.washington.edu/~mernst/pubs/c-rust-macros-p...

Evaluations of six translators. That's real progress.

Here are the test cases for evaluation #1.[1] There's good coverage of the C language, but the individual tests are mostly simple exercises of one C feature. The next round of test cases will probably be closer to useful programs.

[1] https://github.com/DARPA-TRACTOR-Program/PUBLIC-Test-Corpus

> let the LLM generate slop, check it using formal methods

I'm much more bullish on the opposite approach. Perform the naive translation, let the LLM loose on cleaning it up...

> What comes out of c2rust is not intended for human consumption.

That doesn’t really mean anything.

> It's more verbose than the original and harder to work on, but no safer. You lose the C idioms that people understand, while not gaining Rust idioms.

Yes, and?

The value of c2rust is that you now have the entire codebase working with the Rust toolchain. You’re not juggling toolchains and you’re not managing a wavefront of FFI, only a wavefront of unsafe.

c2rust is not the end, it’s the start. It has never claimed to be anything more (the official website even mentions that Galois and Immunant are working on tooling to convert unsafe to safe / idiomatic Rust, though I don’t know if that got anywhere yet).

c2rust can generate UB in Rust even when there is no UB in the original C. It isn't bug-free, and C and Rust's undefined behavior semantics overlap but aren't identical.

One example: https://github.com/immunant/c2rust/issues/1678
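As a separate illustration of how the two languages' rules can differ (this is not a description of the linked issue, and not a claim about what c2rust emits), consider two pointers to the same `int` being written through alternately. That is well defined in C. In Rust it stays defined if both are raw pointers derived without an intermediate `&mut`, but a hypothetical translation that took `&mut x` twice and kept using the first pointer would, as far as I understand the current aliasing rules, be reported as UB by Miri. The function name below is made up for the sketch.

```rust
use std::ptr;

/// C equivalent (well defined in C):
///     int x = 0; int *a = &x; int *b = &x;
///     *a = 1; *b = 2; *a += 10; return x;
pub fn aliasing_via_raw_pointers() -> i32 {
    let mut x: i32 = 0;

    // Both raw pointers come from the same place expression and no `&mut`
    // reference is created in between, so interleaved writes are allowed.
    let a: *mut i32 = ptr::addr_of_mut!(x);
    let b: *mut i32 = a;

    // Hazardous variant (deliberately NOT compiled here):
    //     let a = &mut x as *mut i32;
    //     let b = &mut x as *mut i32; // second `&mut` invalidates `a`
    //     *a = 1;                     // use of `a` is the problem
    unsafe {
        *a = 1;
        *b = 2;
        *a += 10;
    }
    x
}
```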

I firmly believe the right way to port C and C++ (and Zig) programs to Rust is to do it module by module ("Ship of Theseus"). It needs scrutiny by folks who know both languages deeply, and you can port test cases too so you can detect UB at runtime (using tools like Miri). That's what the fish shell did, and their port has been quite successful.
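A minimal sketch of what one ported module might look like under that approach, assuming a made-up C function `count_byte(const char *, size_t, char)`. All names here are hypothetical. The safe function is the new implementation, the `extern "C"` shim is what not-yet-ported callers would keep using (a real build would also need to export the symbol), and the test stands in for a case carried over from the C test suite. With Miri installed on a nightly toolchain, `cargo miri test` runs such tests under the interpreter, which is where the `unsafe` shim gets checked.

```rust
/// New, safe implementation of the hypothetical ported function.
pub fn count_byte(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// Boundary shim for callers that still pass a pointer and a length.
///
/// # Safety
/// If `ptr` is non-null it must point to `len` readable, initialized bytes.
pub unsafe extern "C" fn count_byte_c(ptr: *const u8, len: usize, needle: u8) -> usize {
    if ptr.is_null() {
        // `slice::from_raw_parts` requires a non-null pointer even for len 0.
        return 0;
    }
    let haystack = unsafe { std::slice::from_raw_parts(ptr, len) };
    count_byte(haystack, needle)
}

// Ported from the (hypothetical) C test suite so that both implementations
// are held to the same expectations.
#[test]
fn ported_count_byte_cases() {
    let data = b"a,b,,c";
    assert_eq!(count_byte(data, b','), 3);
    assert_eq!(unsafe { count_byte_c(data.as_ptr(), data.len(), b',') }, 3);
}
```

As more modules move over, shims like `count_byte_c` disappear and the remaining `unsafe` shrinks with them.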

Blindly trusting the results of a machine translation is never a good idea. Especially when the translator has a temperature parameter.