| My favourite litmus test for ‘can LLMs reason about code?’ is to make up a programming language with familiar syntax but weird semantics. E.g.: - all variables contain signed integers - all variable names have block scope - there is no variable declaration syntax: all variables are implicitly initialized at first use with the value 5 - all integer literals are expressions - the expression `a + b` means to subtract the value on the left from the variable on the right, returning the previous value of the variable - a program is a block - a block is a sequence of statements enclosed in braces and separated by semicolons, and executed from bottom to top - conditionals are introduced by the keyword `while`, followed by an expression, followed by a block that is executed only if the expression evaluates to 4 - loops are done by simply prefixing a block with an expression; if the expression evaluates to 0, the block will run indefinitely, otherwise the block will run a number of times indicated by the negation of the value Et cetera. Then I ask the LLM to write a simple program (e.g. FizzBuzz). Even with a lot of hand-holding, I've yet to get an LLM to do this successfully, or even to answer questions about a program written in the language. |
I don't think it's impossible for LLMs to write the kind of code you're after. Redefining common idioms may genuinely be harder than learning unfamiliar ones, but to be fair, people trip over that too:
https://en.m.wikipedia.org/wiki/Stroop_effect
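To make a few of the quoted rules concrete, here is a rough Python sketch of how I read them. It only covers implicit initialization to 5, the redefined `+`, bottom-to-top blocks, and the `while` conditional; block scoping and loops are left out. All the names (`Env`, `plus`, `run_block`, `while_`) are made up for illustration, and evaluating the left operand after reading the right-hand variable is my assumption, since the description doesn't specify an order.

```python
class Env:
    """Variable store: a name holds 5 the first time it is used."""

    def __init__(self):
        self.vars = {}

    def get(self, name):
        # Implicit initialization: first use creates the variable with value 5.
        return self.vars.setdefault(name, 5)


def value(env, operand):
    """An operand is either an integer literal or a variable name (str)."""
    return operand if isinstance(operand, int) else env.get(operand)


def plus(env, left, right_var):
    """`left + right_var`: subtract left from right_var, return its old value."""
    old = env.get(right_var)
    env.vars[right_var] = old - value(env, left)
    return old


def run_block(env, statements):
    """Statements in a block execute from bottom to top."""
    for stmt in reversed(statements):
        stmt(env)


def while_(env, cond, statements):
    """`while` is a conditional: the block runs once iff cond evaluates to 4."""
    if cond(env) == 4:
        run_block(env, statements)


# Hypothetical program, written top to bottom (concrete syntax is a guess):
#   { while 1 + x { 2 + y }; 1 + x }
env = Env()
program = [
    lambda e: while_(e, lambda e2: plus(e2, 1, "x"), [lambda e3: plus(e3, 2, "y")]),
    lambda e: plus(e, 1, "x"),
]
run_block(env, program)
print(env.vars)  # {'x': 3, 'y': 3}
```

The last statement runs first, taking `x` from 5 to 4. The `while` condition then sees the old value 4 (and drops `x` to 3), so its block runs and `y` becomes 3. Even in this tiny case you have to actively suppress what `+` and `while` normally mean, which is the Stroop-like part.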