
I'm willing to bet 10% of my net worth on this. But my claim was not about any given untrained child (for instance, a child who does not want to program would do poorly): a fair bet would let me choose the child and you choose the LLM, would use a task and programming language of the child's choice, and would have a neutral third party familiar with that language judge which code is "better". (I would, of course, want to ensure that the judge used an appropriate rubric: RLHF can produce a sophisticated turd-polisher. Perhaps the evaluation process could involve modifications made to the program?)

It is (rightly) difficult to get hold of one uninvolved child, for safeguarding reasons, so it would be better to run it as a school (or interschool) competition, where multiple children may participate. For fairness, you may also provide multiple LLM participants (however you define that). The winner of the contest, as determined by the judge, would then determine the winner of the bet – unless the winning child had been trained, in which case we would fall back to the next-highest-ranked participant. The number of LLM candidates would be equal to the number of eligible children.
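
To make the fallback rule concrete, here is a minimal sketch of how the bet could be settled from the judge's ranking. Everything in it is my own illustration rather than an agreed protocol: the `Entry` type, the `trained` flag, and the `bet_winner` function are hypothetical names, and it assumes the judge hands back a single ordered list, best first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One contest submission (hypothetical structure)."""
    name: str
    kind: str              # "child" or "llm"
    trained: bool = False  # only meaningful for children


def bet_winner(ranking: list[Entry]) -> Optional[str]:
    """Return "child" or "llm" given the judge's ranking, best first.

    A trained child cannot settle the bet, so we skip them and fall
    back to the next-highest-ranked participant.
    """
    for entry in ranking:
        if entry.kind == "child" and entry.trained:
            continue  # ineligible: fall through to the next entry
        return entry.kind
    return None  # no eligible participant, so the bet is void
```

For example, if a trained child places first and an LLM second, `bet_winner` returns `"llm"`; if every entrant turns out to be a trained child, it returns `None` and nobody pays.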

However, I don't see a good way to allow each child to pick a programming language and task without leaving the competition results incomparable. So perhaps each child should be paired with an LLM, and the judge should determine which submission from each pair is better? But then, if I only need one victory to support my claim, this is clearly unfair. So each pair should be tested enough times to determine whether the child is consistently better than the LLM… but then we are demanding a lot of the child participants, for no real benefit to them.

If we can agree on a workable protocol, I can try to pull some strings and see if we can make this happen. I could use the money.