|
|
|
|
|
by pron
30 days ago
|
|
A workable C compiler is a ~10-50KLOC program, and a fairly simple one at that (batch, with no concurrency or interaction). That Anthropic's swarm of agents wrote 100KLOC before failing is a symptom of the problem. It's certainly possible that many programs are in the sub 5KLOC range, but it's definitely not "most software". Plus, almost no software has this level of detailed spec, ready-made tests, and a selection of existing implementations of the same spec. My first thought when reading Anthropic's description of the experiment was that it is unrealistically easy. It's hard to come up with realistic jobs in the 10-50KLOC range that would be this easy for an LLM. That it failed only shows how much further we still have to go. |
|