by Animats 2908 days ago
C2Rust basically compiles C into a very low-level Rust program. You've then lost the C idioms. Now you have to decompile the low-level Rust, pointer arithmetic and all, into Rust abstractions. That's very hard, probably harder than converting C idioms to Rust idioms directly, checking whether the result is equivalent, and falling back to low-level compilation only when absolutely necessary.

The key to this is figuring out a comparable representation for the data. Mostly this is a problem with arrays, since C's array/pointer system lacks size info. All C arrays have a size; it's just that the language doesn't know it. The trick is to figure out how the program is representing the size info. Somewhere, there was probably a "malloc" which set the size, and you may have to track backwards to find it. Then you can replace the C array with a Rust slice or Vec that carries size information, and maybe eliminate variables which carry now-redundant size info.
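
To make the contrast concrete, here is a minimal sketch of the two forms. It is only an illustration, not actual C2Rust output: the function names sum_raw and sum_slice are hypothetical, and the first function is just roughly what a mechanical translation of a C loop over (pointer, length) might look like.

```rust
// Low-level form: a raw pointer plus a separate length variable,
// mirroring a C signature such as `int sum(const int *data, size_t len)`.
// (Hypothetical example; not taken from real translator output.)
unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < len {
        // Pointer arithmetic: the caller must guarantee `len` is correct.
        total += unsafe { *data.add(i) };
        i += 1;
    }
    total
}

// Idiomatic form: the slice carries its own length, so the separate
// `len` parameter becomes redundant and the `unsafe` disappears.
fn sum_slice(data: &[i32]) -> i32 {
    data.iter().sum()
}

fn main() {
    let values = vec![1, 2, 3, 4];
    // Both forms compute the same result on the same buffer.
    let raw = unsafe { sum_raw(values.as_ptr(), values.len()) };
    let safe = sum_slice(&values);
    assert_eq!(raw, safe);
    println!("{raw} {safe}");
}
```

Getting from the first form to the second is the hard part: a tool has to establish that `len` really is the size of the allocation behind `data` at every call site.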

That would produce readable Rust. But it requires whole-program analysis. That's OK, that's what gigabytes of RAM are for.


I suspect it's not possible for most interesting programs, even with whole program analysis. As soon as you start storing pointers behind other pointers, it's (very) hard to keep track of where they came from.

There's more discussion in the replies to https://news.ycombinator.com/item?id=17382464 that you may've missed.

Yes, I know; I started that discussion.

I see, I'm sorry. It just seemed like you might not have noticed additional replies to your comment, since you didn't reply there and essentially didn't change what you said.