Hacker News new | ask | show | jobs
by pron 54 days ago
The problem is that the search space is so large that correcting errors via guardrails is only effective if the original error rate is low (how many Integer -> Integer functions are there? There's ~1 way to get it right and ~∞ ways to get it wrong).

Sure, we can help the easy cases, but that's because they're easy to begin with. In general, we know (or at least assume) that being able to check a solution tractably does not make finding the solution tractable, or we'd know that NP = P. So if LLMs could effectively generate a proof that they've found the correct Integer -> Integer function, either that capability will be very limited or we've broken some known or assumed computational complexity limit. As Philippe Schnoebelen discovered in 2002 [1], languages cannot reduce the difficulty of program construction or comprehension.

Of course, it is possible that machine learning could learn some class of problems previously unknown to be in P and find that it is in P, but we should understand that that is what it's done: realised that the problem was easy to begin with rather than finding a solution to a hard problem. This is valuable, but we know that hard problems that are of great interest do exist.

[1]: https://lsv.ens-paris-saclay.fr/Publis/PAPERS/PDF/Sch-aiml02...

1 comments

>As Philippe Schnoebelen discovered in 2002 [1], languages cannot reduce the difficulty of program construction or comprehension.

From a model-checking point of view. This is about taking a proof-theoretic approach...

Your last paragraph is also quite wrong: a machine learning could very well easily learn and solve an NP-complete problem, because this property does not say anything about average case complexity (and we should consider Probabilistic complexity classes, so the picture is even more "complex").

> From a model-checking point of view. This is about taking a proof-theoretic approach...

No. In complexity theory we deal with problems, and the model-checking problem is that of determining whether a program satisfies some property or not. If your logic is sound, you can certainly use an algorithm based on the logic's deductive theory (which could be type theory, but that's an unimportant detail) to decide the problem, but that can have no impact whatsoever on the complexity of the problem. The result applies to all decision procedures, be they model-theoretic or deductive (logic-theoretic).

> Your last paragraph is also quite wrong: a machine learning could very well easily learn and solve an NP-complete problem, because this property does not say anything about average case complexity

No. First, it's unclear what "average complexity" means here, but for any reasonable definition, the "average complexity" of NP-hard problems is not known to be tractable. Second, complexity theory approaches this issue (of "some instances may be easier") using parameterised complexity [1], and I'm afraid that the results for the model-checking problem - which, again, is the inherent difficulty of knowing what a program does regardless of how you do it - are not very good. I mentioned such a result in an old blog post of mine here [2]. (Parameterised complexity is more applicable than probabilistic complexity here because even if there were some reasonable distribution of random instances, it's probably not the distribution we'd care about.)

There is no escape from complexity limits, and the best we hope for is to find out that problems we're interested in have actually been easier than we thought all along. Of course, some people believe that the programs people actually write are somehow in a tractable complexity class that we've not been able to define - and maybe one day we'll discover that that's the case - but what we've seen so far suggests it isn't: If programs that people write are somehow easier to analyse, then we'd expect to see the size of programs we can soundly analyse grow at the same pace as the size of programs people write, and nothing can be further from what we've observed. The size of programs that can be "proven correct" (especially using deductive methods!) has remained largely the same for decades, while the size of programs people write has grown considerably over that period of time.

[1]: https://en.wikipedia.org/wiki/Parameterized_complexity

[2]: https://pron.github.io/posts/correctness-and-complexity#corr...