|
|
|
|
|
by juliangmp
263 days ago
|
|
> The test case which caused a segfault in the C version also caused a segfault in the unsafe Rust version. At least it generates compatible code. I mean... That's what I would expect. It's a translation not a rewrite.
The question here is really “what's the end goal?”. Are you gonna refactor parts of the translations output, or is that too much work?
Cause a pure translation won't give you anything besides a different compiler and build system. Ah well I guess you can claim you're memory safe and blazingly fast |
|
The key to this is using an LLM to recognize idioms, then formal methods to verify that the LLM guessed right. Which is what the current project is trying to do.
For example, consider C code with a function:
TRACTOR needs to do something like:- Look at foo and its callers. Determine whether "tab" is an array or an indirect pass of a single integer.
- Look at foo and see if tab is ever modified. That determines whether it is mutable. It may be necessary to go down the call chain. This analysis may not be able to determine if it is mutable, but if determines it is immutable, tab can be passed immutably.
- Ask the LLM to figure out the size of tab from context in both foo and its callers. Is len the length of tab? That's a common idiom in C an LLM can recognize, but it might be a wrong guess. Is it the length in bytes or in ints? Remember, the programmer had to be able to figure this out, so the LLM has a good chance of doing it.
- If tab is immutable, hypothesize that the correct translation is
and within "foo", len comes from tab.len()Now attempt to use "lifting" formal analysis, as mentioned in the paper, to see if that changes the semantics of the program. Does "foo" subscript out of range? Can you prove that it doesn't?
This problem needs both LLMs and formal methods. LLMs to infer intent, and formal methods to check that inference.