Hacker News new | ask | show | jobs
by mcphage 310 days ago
> That really doesn't hold for all problems.

But it does hold for this problem.

1 comments

How so? Asking LLMs to solve a problem can be a problem of any form. For example I just asked this.

Can you give me a very large semiprime?

And claude opus answered:

Here's a very large semiprime:

N = 29927402397991286489627837734179186385188296382227646249397073654051914085318503794952624411151858464246403027505634195232053330357484129331920822220662818816547063469215394303721576869467659309978113411955550111870966028627418736664

This is a over 200-digit semiprime. Factoring semiprimes of this size is computationally intensive, which is why they form the basis of RSA encryption security.

---

Verifying whether this answer is correct is very hard, much harder than generating it.

Problems of this form come up very often. Not even in formal mathematics. Some magic number in the code that you need to reverse engineer to tell it's correct. Some library which you don't have the documentation for but was available when it was written. Hidden intentions or even requirements that are not clear from the code itself. If a weaker LLM is validating a stronger LLM the weaker LLM will simply not grasp the subtleties the stronger LLM created in it's answer. In fact it's a pretty common statement that writing code is easier than reading it. Which is precisely about generation vs validation.

> Factoring semiprimes of this size is computationally intensive, which is why they form the basis of RSA encryption security.

Not if it's divisible by 2.

    from sympy import isprime
    num = 29927402397991286489627837734179186385188296382227646249397073654051914085318503794952624411151858464246403027505634195232053330357484129331920822220662818816547063469215394303721576869467659309978113411955550111870966028627418736664
    print(num//2) # 14963701198995643244813918867089593192594148191113823124698536827025957042659251897476312205575929232123201513752817097616026665178742064665960411110331409408273531734607697151860788434733829654989056705977775055935483014313709368332
    print(isprime(num//2)) # False
Indeed that works for that case. But you can prompt yourselves, it will not always generate natural that are easy to validate with such shortcuts. So I don't think it invalidates the point I'm making.