| I can't point to any evidence. Also I can't think of what direct evidence I could present that would be convincing, short of an actual demonstration? I would like to try to justify my intuition though: Seems like the key question is: should we expect AI programming performance to scale well as more compute and specialised training is thrown at it? I don't see why not, it seems an almost ideal problem domain? * Short and direct feedback loops * Relatively easy to "ground" the LLM by running code * Self-play / RL should be possible (it seems likely that you could also optimise for aesthetics of solutions based on common human preferences) * Obvious economic value (based on the multi-billion dollar valuations of vscode forks) All these things point to programming being "solved" much sooner than say, chemistry. |
Also, the reward functions that you mention don't necessarily lead to great code, only running code. The should be possible in the third bullet point does very heavy lifting.
At any rate, I can be convinced that LLMs will lead to substantially-reduced teams. There is a lot of junior-level code that I can let an LLM write and for non-junior level code, you can write/refactor things much faster than by hand, but you need a domain/API/design expert to supervise the LLM. I think in the end it makes programming much more interesting, because you can focus on the interesting problems, and less on the boilerplate, searching API docs, etc.
[1] https://ibb.co/pvm5DqPh