by cobbal
266 days ago
Describing a compiler involves 4 important components: the source language, the target language, and the meaning (semantics in compiler-speak) of each of those languages. We call a C->asm compiler "correct" if it turns every valid C program into an assembly program with equivalent meaning. The reason LLMs don't work like other compilers is not that they're non-deterministic, it's that the source language is ambiguous. LLMs can never be "correct" compilers, because there's no definite meaning assigned to English. Even if English had precise meaning, LLMs will never be able to accurately turn any arbitrary English description into a C program. Imagine how painful development would be if compilers produced incorrect assembly for 1% of all inputs.
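To make the "equivalent meaning" definition concrete, here is a rough sketch on a toy language rather than C. Everything in it is an assumption for illustration: the source language is nested arithmetic tuples, the target is a tiny stack machine, and the names `evaluate`, `compile_expr`, `run` and `is_correct_on` are made up. It only checks sample programs, so it is testing, not a proof of correctness.

```python
# Toy illustration only: all names and both "languages" are hypothetical.

def evaluate(expr):
    """Source semantics: an expression is an int or a tuple (op, left, right)."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b

def compile_expr(expr):
    """Compiler: translate an expression into a list of stack instructions."""
    if isinstance(expr, int):
        return [("push", expr)]
    op, left, right = expr
    last = ("add",) if op == "+" else ("mul",)
    return compile_expr(left) + compile_expr(right) + [last]

def run(code):
    """Target semantics: execute stack instructions, return the top of the stack."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack[-1]

def is_correct_on(programs):
    """Meaning is preserved if running the compiled code matches the source meaning."""
    return all(run(compile_expr(p)) == evaluate(p) for p in programs)

samples = [7, ("+", 1, 2), ("*", ("+", 1, 2), 4)]
```

The check is only possible because both `evaluate` and `run` are precisely defined. With English as the source language there is no `evaluate` to compare against.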
The LLM in this loop is the equivalent of a human, who also has an ambiguous source language if we’re going by your theory that English is ambiguous. So it sounds like you’re saying that if a human produces a C program, it is not verifiable and testable because the human used an ambiguous source language?
I guess for some reason people thought I meant that the compiler would go from LLM to machine code, when actually I meant the compiler would still take whatever language the LLM produces down to machine code. It's just that the code the LLM produces can be checked through things like TDD, a human reviewer, etc.
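A minimal sketch of what that checking step could look like, assuming the LLM's output arrives as a string of Python source that is supposed to define an `add` function. The helper `passes_tests` and the two candidate strings are hypothetical stand-ins, not output from any real model:

```python
def passes_tests(candidate_source, tests):
    """Load candidate code and return True only if every test case passes."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # only do this with code you trust or sandbox
        return all(namespace["add"](*args) == expected for args, expected in tests)
    except Exception:
        # Code that fails to load or crashes is rejected just like a wrong answer.
        return False

# Hypothetical test cases written by a human before any code is generated (TDD style).
tests = [((1, 2), 3), ((-1, 1), 0)]

# Stand-ins for two different LLM outputs for the same prompt.
good_candidate = "def add(a, b):\n    return a + b\n"
bad_candidate = "def add(a, b):\n    return a - b\n"
```

Here `passes_tests(good_candidate, tests)` accepts the first candidate and rejects the second, and whatever is accepted still goes through an ordinary deterministic compiler or interpreter afterwards.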