by harpocrates 2904 days ago
The idea is that this is a first step towards safe Rust. First, you convert to unsafe (but semantically preserving) Rust, then you refactor. The refactor stage probably will involve changing some semantics (read: fixing bugs), or perhaps proving some properties with an SMT solver before applying certain transformations (converting a `libc::c_int` to an `i32`, or a `*const i32` to a `&i32`).
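To make the two stages concrete, here is a minimal sketch of that kind of refactor. The function names `add_raw` and `add_safe` are hypothetical, and `std::ffi::c_int` stands in for `libc::c_int` so the snippet needs no external crate; it assumes a target where `c_int` is 32 bits wide.

```rust
use std::ffi::c_int; // stand-in for libc::c_int (assumption: avoids an external crate)

/// Roughly the shape a mechanical translation might produce (hypothetical).
/// The caller must guarantee both pointers are valid and aligned.
unsafe fn add_raw(a: *const c_int, b: *const c_int) -> c_int {
    // Dereferencing raw pointers is only allowed in unsafe code.
    unsafe { *a + *b }
}

/// After the refactor: references instead of raw pointers, i32 instead of c_int.
/// The borrow checker now enforces the validity the caller used to promise.
fn add_safe(a: &i32, b: &i32) -> i32 {
    *a + *b
}

fn main() {
    let (x, y): (i32, i32) = (2, 40);
    // &i32 coerces to *const i32 at the call site.
    let raw = unsafe { add_raw(&x, &y) };
    let safe = add_safe(&x, &y);
    assert_eq!(raw, safe);
    println!("{raw} {safe}");
}
```

The `c_int` to `i32` step is only valid where the two types coincide, which is the sort of property that would need to be checked before applying the transformation.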
2 comments

C2Rust basically compiles C into a very low-level program in Rust. You've then lost the C idioms. Now you have to decompile the low-level Rust with pointer arithmetic into Rust abstractions. That's very hard, probably harder than converting C idioms to Rust idioms, checking to see if the result will be equivalent, and falling back to low-level compilation only when absolutely necessary.

The key to this is figuring out the comparable representation for data. Mostly this is a problem with arrays, since C's array/pointer system lacks size info. All C arrays have a size; it's just that the language doesn't know it. The trick is to figure out how the program is representing the size info. Somewhere, there was probably a "malloc" which set the size, and you may have to track backwards to find it. Then you can replace the C array with a Rust array that carries size information, and maybe eliminate variables which carry now-redundant size info.
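As a rough illustration of the payoff, here is a sketch of a pointer-plus-length function next to its slice-based equivalent. The names `sum_raw` and `sum_slice` are hypothetical, and a slice (`&[i32]`) is used as the size-carrying type; a `Vec<i32>` would play the same role for owned, heap-allocated data.

```rust
/// Low-level form (hypothetical): the length travels in a separate variable,
/// and nothing ties it to the pointer.
unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total = 0;
    for i in 0..len {
        // Pointer arithmetic; in bounds only if the caller passed a correct len.
        total += unsafe { *data.add(i) };
    }
    total
}

/// Idiomatic form: the slice carries its own length, so the separate
/// `len` parameter becomes redundant and disappears.
fn sum_slice(data: &[i32]) -> i32 {
    data.iter().sum()
}

fn main() {
    let values = vec![1, 2, 3, 4];
    let raw = unsafe { sum_raw(values.as_ptr(), values.len()) };
    let safe = sum_slice(&values);
    assert_eq!(raw, safe);
    println!("{raw} {safe}");
}
```

The hard part described above is not writing `sum_slice` but proving that `len` really is the size of the allocation behind `data` at every call site.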

That would produce readable Rust. But it requires whole-program analysis. That's OK, that's what gigabytes of RAM are for.

I suspect it's not possible for most interesting programs, even with whole program analysis. As soon as you start storing pointers behind other pointers, it's (very) hard to keep track of where they came from.

There's more discussion in the replies to https://news.ycombinator.com/item?id=17382464 that you may've missed.

Yes, I know; I started that discussion.

I see, I'm sorry. It just seemed like you might not have noticed additional replies to your comment, since you didn't reply there and essentially didn't change what you said.

I would really love to talk to you about this; we are doing some work on it for a project related to converting C to a safe C superset.