Hacker News new | ask | show | jobs
by kadoban 779 days ago
> This is where things start to get interesting. At first, I found the concept of probabilistic primality tests strange and tried to look for deterministic algorithms that could handle huge numbers. I did find two - APR-CL and ECPP. Both of these are so mathematically complex that I could not make sense of their research papers at all, and there isn't much accessible information about them on the internet for someone like me who is bad at math.

> After taking a look at discussions online, OpenSSL's source code and recommendations by NIST, I realized that almost everyone including RSA uses probabilistic algorithms. The catch is that if implemented properly, these algorithms have an extremely low error rate which is negligible.

For a given maximum number range, it's trivial to make Miller-Rabin actually deterministic. You just choose bases that have been proven to together exclude all pseudoprimes in the given range.

(It doesn't even end up being a long list, Miller-Rabin kicks ass)

2 comments

>For a given maximum number range, it's trivial to make Miller-Rabin actually deterministic. You just choose bases that have been proven to together exclude all pseudoprimes in the given range.

What are the bases for the range of 1024-bit numbers? I couldn't find an answer online.

You can find the bases for N < 3x10^21 on wikipedia [0] which make Miller-Rabin deterministic. 1024 bit numbers have ~300 digits and as far as i know, no known bases exist for that range.

[0]: https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality...

Just testing all the bases from 2 through 2 log(N)^2 doesn't actually seem to be too bad for N around 2^1024. It would be a little under 1007600 bases.

I've got a Miller-Rabin implementation in Python 3. Testing 2^1024 + 643, which is the first prime greater than 2^1024 with 65 random bases takes about 0.25 seconds on my Mac Studio. This is with no attempts at optimization or at using multiple cores.

At that rate testing 1007600 bases would take under 65 minutes.

Certainly way longer than you'd want to wait around for something like key generation, but if you didn't actually need your prime for an hour and really wanted to be certain it was a prime its not outrageous.

EDIT: I modified my Miller-Rabin to be the deterministic version, and made it take two parameters: cores and worker. I modified the loop that runs through the bases from 2 to floor(2 log(N)^2) to only actually test when "cores == 1 or base % cores == worker". I made my test program take cores and worker on the command line.

Then I ran this shell script, where prime.py is a program that checks 2^1024 + 643 for primality with the aforementioned deterministic Rabin-Miller:

  ./prime.py 8 0 &
  ./prime.py 8 1 &
  ./prime.py 8 2 &
  ./prime.py 8 3 &
  ./prime.py 8 4 &
  ./prime.py 8 5 &
  ./prime.py 8 6 &
  ./prime.py 8 7 &
  wait
Each of those 8 copies is testing a different 1/8th of the bases that need to be tested. My computer has 8 performance cores and the hope was that MacOS would run each on a different core. That did seem to be the case.

It took 392 seconds for them all to complete, with each correctly reporting that 2^1024 + 643 has passed for all their bases.

> Just testing all the bases from 2 through 2 log(N)^2 doesn't actually seem to be too bad for N around 2^1024. It would be a little under 1007600 bases.

Unfortunately this limit depends on the currently unproven conjecture (a subset of the generalized Riemann hypothesis). You have to check all n bases to be unconditionally correct.

Going up to n is surely excessive. You need a version of the Riemann hypothesis to get any sort of logarithmic bound, but a sqrt(n) bound can be achieved using unconditional Polya-Vinogradov type inequalities. (A successful Miller-Rabin test corresponds to a nontrivial value of a Dirichlet character, and Polya-Vinogradov ensures that the least such value cannot exceed sqrt(n) * ln(n), unconditionally without the need for any Riemann hypothesis.)

A reference for this material is "Explicit bounds for primality testing and related problems" by Eric Bach, Math. Comp. 55 (191), July 1990, pp. 355-380.

The best asymptotic bound is somewhere between n^(1/8sqrt(e) + eps) and n^(1/6.568sqrt(e) + eps), per [1].

[1] https://doi.org/10.1007/BFb0030409

You only need to try the _prime_ bases up to 2 log^2(n). So the total number of bases here would be 79060.
Here is prime.py if anyone wants to play around with it: https://pastebin.com/FFw4qrvU
Oh, woops. I thought the list went way higher, my bad.

I did read that I think if you do up to 2*ln(n)^2 you're always good, but that's not _that_ short of a list. Probably better off just keeping it probabalistic, negating my original post :(

Besides, when you're just looking for a prime, you can spot things that look like a prime and test them deterministically.