On November 26, 1994 we (Scott Contini, Bruce Dodson, Arjen Lenstra,
and Peter Montgomery) completed the factorization of a 119-digit
cofactor of the 123-digit 13171th partition number p(13171) into two
primes of 52 and 67 digits using the general number field sieve factoring
method (GNFS). This beats the 116-digit GNFS record that was set on
July 18, 1994. We spent only slightly more computer time on this
factorization than we spent on the previous GNFS record and considerably
less time than we would have spent had we used the Quadratic Sieve
factoring method (QS).
p(13171)
= 7713594995040056283145175782904045025125105527410953685909012417\
46956340815495857571662159517039410048812861195353031293508
= 2 * 2 * 7 * 1871 * 5467135708383335903092891092416282702670639197573477
* 2693178626463799175226134527173162372360502737088674714323536124533
Details on this factorization will be given in [DL]. See [LL] or [LLMP]
for a description of GNFS.
We used the polynomial
229712700994770930 X^5 - 75244909504476954 X^4 - 349192234242831010 X^3
+ 213343035765348142 X^2 - 133732623127145009 X - 83887843530130136
and its root 144999999598687668083 modulo the 119-digit number to be
factored.
Sieving and trial division was done on workstations at Lehigh University
and Bellcore in approximately 250 mips years. This is about 2.5 times
faster than sieving for a comparable number would take using QS. This
suggests that the sieving for RSA-129 with GNFS could be completed in
about a quarter of the time that was spent on it using QS (cf. [GLM]).
The polynomial selection took three days on a DEC 3000 workstation.
The linear algebra took 2.5 days on a 16384 processor MasPar MP-1.
The square root took about one day per dependency using one processor
of an SGI Challenge. The factorization was found on the third dependency.
We used the implementation as described in [GLM] with a factor base
size of 460,002 (100,001 primes on the rational side, 360,001 on the
algebraic side) and a large prime bound of 2^30. It produced 38,741
full relations and 35,763,524 partial relations with at most four
large primes (the percentages are with respect to the total number
of partial relations):
Number of | Number of large primes on algebraic side
relations |------------------------------------------
----------------| 0 1 2
| --------------------------------------
| 0 | 38,741 459,082 1,254,636
| | 1.284% 3.508%
Number of large | |
primes on | 1 | 303,127 3,603,958 9,746,884
rational side | | 0.848% 10.077% 27.254%
| |
| 2 | 462,715 5,476,700 14,456,422
| | 1.294% 15.314% 40.422%
The partials led to 472,426 `cycles', where a cycle is a product of
partial relations in which all large primes occur an even number of
times. Since 38,741 + 472,426 - 460,010 = 51,157, we had more than
50,000 more fulls + cycles than we needed (where 460,010 = 460,002 + 8;
we used 8 columns for the quadratic signatures). There were on average
590 partials per cycle. Of the 35,763,524 partials 4,725,203 were
actually useful, i.e., occurred in one or more cycles. The breakdown
of these 4,725,203 partials is as follows (the first percentage
is with respect to the total number of partial relations of the same
type, the second percentage is with respect to the total number
of useful partial relations):
Number of | Number of large primes on algebraic side
relations |------------------------------------------
----------------| 0 1 2
| --------------------------------------
| 0 | 207,979 314,563
| | 45.303% 25.072%
| | 4.401% 6.657%
Number of large | |
primes on | 1 | 148,271 794,788 1,197,510
rational side | | 48.914% 22.053% 12.286%
| | 3.138% 16.820% 25.343%
| |
| 2 | 143,305 770,361 1,148,426
| | 30.970% 14.066% 7.944%
| | 3.038% 16.303% 24.304%
As in our 116-digit GNFS factorization we witnessed an explosion in
the number of cycles: when we had 30,101,922 partials relations there
were only 34,728 cycles, at 34,567,876 partials (14.8% more) the number
of cycles was 337,351 cycles (871.4% more).
Restriction to the 4,366,167 partials with at most one large prime per
side led to only 21,928 cycles, restriction to the 6,083,518 partials
with at most two large primes in total led to only 31,020 cycles,
restriction to the 21,307,102 partials with at most three large primes
in total led to only 59,925 cycles. So, the 14,456,422 partials with
two large primes on both sides led to an additional 412,501 cycles.
Restriction to the cycles of length at most 1374 led to 421,869 cycles.
Combined with the 38,741 fulls these sufficed. The average length of
these 421,869 cycles was 387.25.
Combination of the 38,741 fulls and the 421,869 `short' cycles would
lead to a not-so-sparse sparse 460,610 x 460,010 bit-matrix. We never
built this matrix. Instead we extended the factor base with 499,134
of the rational and 513,463 of the algebraic large primes that occurred
in the cycles, namely those large primes that occurred at least w=4
times in useful partial relations. As a result we got 609,166 extra full
relations (for a total of 647,907), but a factor base size of 499,134 +
513,463 + 460,010 = 1,472,607. The remaining large primes, those that
occurred less than w times in useful partial relations, were combined
into 875,857 cycles of average length 6.85. Notice that 647,907 +
875,857 - 1,472,607 = 51,157, the same number of excess relations that
we had before, as expected. Finally, restriction to the cycles of
length <= 21 led to 827,991 cycles, of average length 5.17, still
with sufficiently many excess relations (647,907 + 827,991 - 1,472,607 =
3,291).
Free relations are available, but they were not used. They would have
made the matrix somewhat sparser.
For w=3 or less the resulting matrix would have had substantially more
rows (about 2,280,000 for w=3) but it would have been sparser; while for
w>4 if would have had fewer rows (about 1,090,000 for w=5) but it would
have been denser. We used w=4 because this led to the matrix for which
the number of rows times the total weight was minimal, thus minimizing
the time needed to find a dependency using the blocked Lanczos algorithm
from [PM94]. Using the first experimental MasPar implementation of
this algorithm from [CL], it took 2.5 CPU days on a 16K MasPar MP-1 to
find 151 dependencies among the rows of our 1,475,898 x 1,472,607 matrix
with 79,661,432 non-zero entries, but disregarding the first (and
densest) 102 columns. These 151 dependencies were easily combined
into 49 true dependencies.
Our structured Gaussian elimination code would have reduced the
sparse 1,475,898 x 1,472,607 matrix to a dense 362,597 x 362,397
matrix. This dense matrix would have required more than 15GBytes
of storage and it would have taken about two CPU weeks on the MasPar
to find the dependencies.
The dependencies, consisting of about 2 million relations each, were
processed on an SGI workstation at the Centrum voor Wiskunde en
Informatica in Amsterdam. The program constructed a product
+- 1
P = product (a_i - b_i * alpha) ,
i
where alpha is a root of the earlier polynomial of degree 5. The
exponents +- 1 were chosen in an attempt to maximize the amount
of cancellation between the numerator and the denominator, if the
latter were reduced to ``lowest terms'' (a slight inaccuracy since
the algebra Z[alpha] is not a unique factorization domain). The
resulting fraction for P would have numerator and denominator
norms each 4.1 million decimal digits if it were written out.
After 43000 iterations, each of which reduced the numerator or
denominator of P by about 10^200, the algebraic square root was reduced
to the trivial sqrt(1). See [PM93] for a simplified version of the
algorithm. This took 25 CPU-hours on a single MIPS R4400 processor
per dependency. The third dependency resulted in the above factorization.
The time was later reduced to 20 CPU-hours by improving the strategy for
choosing +-1 exponents (so that the numerator and denominator of P had
10% fewer decimal digits initially), and by changing the reduction per
iteration from 10^200 to 10^350.
Compared to the 116-digit GNFS factorization, we still did not use
an intelligent method to choose the polynomial, we chose a much better
factor base size and hit upon the cycle explosion at about the same time
that we were done sieving, and we used a much better method to find a
dependency in the matrix.
Scott Contini, University of Wisconsin at Milwaukee, contini@bellcore.com
Bruce Dodson, Lehigh University, bad0@lehigh.edu
Arjen K. Lenstra, Bellcore, lenstra@bellcore.com
Peter L. Montgomery, CWI Amsterdam, pmontgom@cwi.nl
[CL] S. Contini, A.K. Lenstra, Implementations of blocked Lanczos
and Wiedemann algorithms, in preparation
[DL] B. Dodson, A.K. Lenstra, NFS with four large primes: an
explosive experiment, in preparation
[GLM] R. Golliver, A.K. Lenstra, K.S. McCurley, Lattice sieving
and trial division, Algorithmic number theory symposium,
proceedings, Cornell, 1994
[LL] A.K. Lenstra, H.W. Lenstra, Jr., The development of the
number field sieve, Lecture Notes in Math. 1554, Springer-
Verlag, Berlin, 1993
[LLMP] A.K. Lenstra, H.W. Lenstra, Jr., M.S. Manasse, J.M. Pollard,
The factorization of the ninth Fermat number,
Math. Comp. 61 (1993) 319-349.
[PM93] Peter L. Montgomery, Square roots of products of algebraic
numbers, in Proceedings of Symposia in Applied Mathematics,
Mathematics of Computation 1943-1993, Vancouver, 1993,
Walter Gautschi, ed.
[PM94] Peter L. Montgomery, A block Lanczos algorithm for finding
dependencies over GF(2), manuscript, November 1994