| HN Mirror



	by dotancohen 387 days ago
	I realized this too, and it led me to the conclusion that LLMs really can't program. I did some experiments to find what a programming language would look like, instead of e.g. Python, if it were designed to be written and edited by an LLM. It turns out that it's extremely verbose, especially in variable names, function names, class names, etc. Actually, it turned out that classes were very redundant. But the real insight was that LLMs are great at naming things, and at performing small operations on the little things they named. They're really not good at any logic that they can't copy-paste from something they found on the web.

2 comments

short_sells_poo 387 days ago

Is this really a surprise? I'd hazard a guess that the ability to program and beyond that - to create new programming languages - requires more than just probabilistic text prediction. LLMs work for programming languages where they have enough existing corpus to basically ape a programmer having seen similar enough text. A real programmer can take the concepts of one programming language and express them in another, without having to have digested gigabytes of raw text.

There may be emergent abilities that arise in these models purely due to how much information they contain, but I'm unconvinced that their architecture allows them to crystallize actual understanding. E.g. I'm sceptical that there'd be an area in the LLM weights that encodes the logic behind arithmetic and gives rise to the model actually modelling arithmetic as opposed to just probabilistically saying that the text `1+1=` tended to be followed by the character `2`.
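To make the distinction concrete, here is a toy sketch in Python. It is not how an LLM works internally; it only contrasts "say whatever most often followed this text" with "actually compute the sum". The function names and the tiny training list are made up for illustration.

```python
from collections import Counter, defaultdict


def build_continuation_counts(training_lines):
    """Count which answer text followed each 'a+b=' prompt in the corpus."""
    counts = defaultdict(Counter)
    for line in training_lines:
        prompt, answer = line.split("=")
        counts[prompt + "="][answer] += 1
    return counts


def predict_by_frequency(counts, prompt):
    """Return the most frequently seen continuation, or None if never seen."""
    if prompt not in counts:
        return None
    return counts[prompt].most_common(1)[0][0]


def predict_by_arithmetic(prompt):
    """Parse 'a+b=' and compute the sum, whether or not it was seen before."""
    left, right = prompt.rstrip("=").split("+")
    return str(int(left) + int(right))


# Hypothetical corpus; note that it even contains one wrong line.
training_lines = ["1+1=2", "1+1=2", "1+1=3", "2+2=4"]
counts = build_continuation_counts(training_lines)
```

With this corpus, the frequency predictor answers `1+1=` correctly only because `2` was the most common continuation, and it has nothing to say about a prompt such as `3+4=` that never appeared. The arithmetic version handles both.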


weird-eye-issue 387 days ago

> I did some experiments to find what a programming language would look like, instead of e.g. python, if it were designed to be written and edited by an LLM.

Did your experiment consist of asking an LLM to design a programming language for itself?


dotancohen 387 days ago

Yes. ChatGPT 4 and Claude 3.7. They led me to similar conclusions, but they produced very different syntax, which led me to believe that they were not just regurgitating from a common source.


weird-eye-issue 387 days ago

Great, so your experiment just consisted of having an LLM hallucinate.

That's not really an experiment is it? You basically just used them to create a hypothesis but you never actually proved anything

They're great at writing text and code so the fact that the other LLM was able to use that syntax to presumably write code that worked (which you had no way of proving since you can't actually run that code) doesn't really mean anything

It would be similar to having it respond in a certain JSON format; they are great at that too. That doesn't really translate to a real-world codebase.


dotancohen 386 days ago

> That's not really an experiment is it? You basically just used them to create a hypothesis but you never actually proved anything

The experiment was checking how well another unrelated LLM could write code using the syntax. And then in the reverse direction in new sessions.

> They're great at writing text and code so the fact that the other LLM was able to use that syntax to presumably write code that worked (which you had no way of proving since you can't actually run that code) doesn't really mean anything

Of course I could check the code. I had no compiler for it, but "running" code in one's head without a compiler is something first-year students get very good at in their Introduction To C course. I also checked how the models edit and modify the code.

This isn't a published study, it was an experiment. And it influenced how I use LLMs for work, for the better. I'd even call that a successful experiment, now that I better understand the strengths and limitations of LLMs in this field.


weird-eye-issue 386 days ago

> And it influenced how I use LLMs for work, for the better

How so?


dotancohen 386 days ago

I let the LLM come up with all the boilerplate classes, functions, modules, etc. that it wants. I let it name things. I let it design the API. But what I don't let it do is design the flow of operations. I lay out the flow of operations as a flow chart and explain that to the LLM. Almost any if statement is a result of something I specifically mentioned.
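A rough sketch of that division of labour, in Python. Everything here is hypothetical (the invoice scenario, the function names, the action strings); the point is only that the small named helper is the kind of thing left to the LLM, while each `if` in the decision function maps to a box in a flow chart drawn by the human.

```python
def is_payment_overdue(days_since_invoice: int, payment_terms_in_days: int) -> bool:
    """Small, well-named helper: the kind of piece the LLM is left to write."""
    return days_since_invoice > payment_terms_in_days


def decide_next_action_for_invoice(
    invoice_is_paid: bool,
    days_since_invoice: int,
    payment_terms_in_days: int,
    reminders_already_sent: int,
    maximum_reminders_before_escalation: int,
) -> str:
    """Flow of operations specified by the human, one branch per flow-chart box."""
    # Box 1: paid invoices need no further action.
    if invoice_is_paid:
        return "archive"
    # Box 2: unpaid but still within the payment terms.
    if not is_payment_overdue(days_since_invoice, payment_terms_in_days):
        return "wait"
    # Box 3: overdue, and the reminder limit has been reached.
    if reminders_already_sent >= maximum_reminders_before_escalation:
        return "escalate"
    # Box 4: overdue, with reminders still available.
    return "send_reminder"
```

For example, `decide_next_action_for_invoice(False, 45, 30, 1, 3)` follows boxes 1, 2, 3 and lands on box 4.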


QuercusMax 387 days ago

Is there a reason you believe the models can accurately predict this sort of thing?


dotancohen 387 days ago

There wasn't, but after taking the syntax that I developed with one model to another model, and having it write some code in that syntax, it did very well. Same in the other direction.

LLMs need all their context within easy reach. An LLM-first (for editing) language still has code comments and docstrings. Identifier names are long, and functions don't really need optional parameters. Strict typing is a must.
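The experimental syntax itself isn't shown in this thread, so the following is only an approximation of those traits in ordinary Python: long identifier names, a docstring and comments kept next to the code, full type annotations, and no optional parameters. The function and its parameters are invented for illustration.

```python
from __future__ import annotations


def calculate_order_total_in_cents(
    unit_prices_in_cents: list[int],
    unit_quantities: list[int],
    sales_tax_rate_in_basis_points: int,
) -> int:
    """Return the order total in cents, including sales tax.

    Every parameter is required; there are no defaults, so each call site
    spells out all of the context the function depends on.
    """
    if len(unit_prices_in_cents) != len(unit_quantities):
        raise ValueError("prices and quantities must have the same length")

    # Subtotal: price times quantity, summed over all order lines.
    subtotal_in_cents: int = sum(
        price_in_cents * quantity
        for price_in_cents, quantity in zip(unit_prices_in_cents, unit_quantities)
    )
    # One basis point is 1/100 of a percent, hence the division by 10_000.
    sales_tax_in_cents: int = (
        subtotal_in_cents * sales_tax_rate_in_basis_points // 10_000
    )
    return subtotal_in_cents + sales_tax_in_cents
```

Python only checks these annotations with an external type checker, so "strict typing" here is weaker than what a language designed around it would enforce.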
