Hacker News new | ask | show | jobs
by abetusk 1927 days ago
OK, here is a brief overview for people:

To factor a number N (assumed to essentially be the product of two very large primes), find a 'short' lattice vector [0] using LLL [1] (and BKZ reduction? [2]) that finds many relations of the form:

    (u_i) = p_i,0 * p_{i,1} * ... * p_{i,n-1}
    (u_i - v_i * N) = q_{i,0} * q_{i,1} * ... * q_{i,n-1}
where p,q are small primes.

Numbers that have all their factors less than some prime, B, are said to be "B-smooth". In the above, both (u_i) and (u_i - v_i * N) are p_{i,n-1}-smooth and q_{i,n-1}-smooth, respectively.

Construct many u_i and (u_i - v_i * N), so much so that you can create a product of primes, r_i, of the form:

    r_0^{2 b_0} * r_1^{2 b_1} * ... * r_{n-1}^{2 b_{n-1}} = 1 mod N
where each b_i are integers.

Since all exponents (2b_i) are even, we have the potential to find the square root of 1 which has the potential to resolve to two different numbers since N is composite. One of those is the product of r_i^{b_i} and the other is -1. Since y^2 = 1 mod N, we get (y-1)(y+1) = 0 mod N. If (y-1) or (y+1) are not 0, then then must share a factor of N and we've successfully factored.

The trick is, of course, finding the smooth numbers. To do this, a lattice basis is made such that you find a short integer relation of the form

    a_0 ln(p_0) + a_1 ln(p_1) + ... + a_{n-1} ln(p_{n-1}) ~= ln(N)
where ~= means "approximately equal to".

u is chosen as the product of primes of all a_i > 0 and v is chosen to be the product of all primes where a_i < 0. The hope is that (u - v*N) is also p_{n-1}-smooth, which, as far as I understand, most of the math in the paper is trying to justify.

The main innovation here, as far as I can tell, is that Schnorr is fiddling with the 'weighting' of the main diagonal when constructing the lattice basis. I interpret this as basically trying to randomize the initial lattice basis so that the chances of getting a different integer relation (for eventual construction of u,v) is more probable.

I've been confused about this for over a decade as variants of this algorithm, and Schnorr's work in general, have been well published. For example, there's a paper from 2010 on "A Note on Integer Factorization Using Lattices" by Antonio Vera which discusses Schnorr's [3] construction.

Is Schnorr trying to shout louder so people will listen or is there something else fundamentally flawed with this type of algorithm?

Just a word of warning, LLL solves polynomial factorization in polynomial time (given a polynomial with integer coefficients, find it's factor polynomials also with integer coefficients) [4] and has been used to break other (now very old) cryptosystems [5]. If there's a candidate algorithm to solve integer factoring, lattice reduction (LLL, PSLQ, etc.) are it.

I know of fplll that's a stand alone (FOSS) implementation of LLL and some extensions (BKZ, etc.) [6].

[0] https://en.wikipedia.org/wiki/Lattice_reduction

[1] https://en.wikipedia.org/wiki/Lenstra%E2%80%93Lenstra%E2%80%...

[2] https://www.newton.ac.uk/files/seminar/20140509093009501-202...

[3] https://arxiv.org/pdf/1003.5461.pdf

[4] https://en.wikipedia.org/wiki/Factorization_of_polynomials#F...

[5] https://web.eecs.umich.edu/~cpeikert/lic13/lec05.pdf

[6] https://github.com/fplll/fplll

3 comments

OK, a little more context. Sophie Schmieg has a twitter thread discussing some of these issues:

https://twitter.com/SchmiegSophie/status/1367197172664389635

They mostly mirror the article this post is under, namely "show me the factors". Schnorr claims spectacular run times but it isn't clear he provides actual data from runs he's done. Schnorr also doesn't discuss complexity considerations in the detail they would deserve while focusing on basic details that I suppose wouldn't show up in a paper like this normally.

Thanks for summarizing, and talking about what's novel here.

In the paper Schnorr suggests that this algorithm factors 800-bit integers in ~10^11 operations [36 bits], whereas the Number Field Sieve uses ~10^23 [76 bits]. Does that 76-bit figure represent the current state of the art, more or less?

Also, since the paper talks only in terms of specific sizes of integers, I assume there's no claimed asymptotic speedup over existing methods?

LLL is effectively a polynomial time algorithm, though the exponent is large (it's like O(n^5) or O(n^6), also depending on bit length, etc.), so it might fall into the 'galactic algorithm' [0] territory.

The runtime depends on the frequency of primes, which is how you know you can run the algorithm and have a good chance of finding a "B-smooth" number. So that frequency might decrease too fast as to make it not a polynomial time or quasi-polynomial time algorithm.

In terms of my own opinion, in general practical large exponent runtime algorithms have a way of becoming increasingly more efficient so this isn't usually a big blocking point, especially not for long. For example the interior point method eventually lent itself to faster implementations. In terms of frequency of primes, I suspect this is also not a big slowdown factor.

Anyway, like I said, I've been confused about this point for a long time. It seems like this method is providing effectively a polynomial time algorithm for integer factoring. I read this paper and others by Schnorr as "this is 'efficient' in the theoretical sense but the polynomial exponent is so large as to make it computationally infeasible right now" but then I don't know why this hasn't been bigger news. Maybe because the algorithm has a randomness component to it?

I don't know. If anyone is wiser than me, I would appreciate some enlightenment.

[0] https://en.wikipedia.org/wiki/Galactic_algorithm

My impression was the big optimization was the success rate threshold to guide the enumeration processing more quickly to the actually correct vector, rather than wasting cycles on smaller iterative improvements, but I'm still digesting the paper though, and my linear algebra intuition is so rusty, I may need a tetanus shot from using it, and what little number theory I've got doesn't amount to enough to get a shoe shine in 1930.