Integer factorization

In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors which, when multiplied together, equal the original integer.

When the numbers are very large, no efficient integer factorization algorithm is publicly known; an effort by several researchers, concluded in 2009, factored a 232-digit number (RSA-768) using hundreds of machines over a span of two years.[1] The presumed difficulty of this problem is at the heart of certain algorithms in cryptography such as RSA. Many areas of mathematics and computer science have been brought to bear on the problem, including elliptic curves, algebraic number theory, and quantum computing.

Not all numbers of a given length are equally hard to factor. The hardest instances of this problem (for currently known techniques) are semiprimes, the products of two prime numbers. When both primes are large, randomly chosen, and about the same size (but not too close), even the fastest prime factorization algorithms on the fastest computers can take enough time to make the search impractical.

Many cryptographic protocols are based on the difficulty of factoring large composite integers or a related problem, the RSA problem. An algorithm which efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.

Prime decomposition

Figure: the prime decomposition of 864. A shorthand way of writing the resulting prime factors is 2^5 \times 3^3.

By the fundamental theorem of arithmetic, every positive integer has a unique prime factorization. (A special case for 1 is not needed using an appropriate notion of the empty product.) However, the fundamental theorem of arithmetic gives no insight into how to obtain an integer's prime factorization; it only guarantees its existence.

Given an algorithm for integer factorization, one can factor any integer down to its constituent prime factors by repeated application of this algorithm.
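This reduction can be sketched in Python as follows. The sketch is illustrative only: the hypothetical helper `find_divisor` stands in for any factoring algorithm that returns one non-trivial divisor (or `None` when its input is prime), and is implemented here by naive trial division. `prime_factorization` applies it repeatedly until only primes remain.

```python
from collections import Counter
from math import isqrt


def find_divisor(n):
    """Hypothetical stand-in for a factoring algorithm.

    Returns a non-trivial divisor of n, or None if n is prime.
    Implemented by naive trial division purely for illustration.
    """
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d
    return None


def prime_factorization(n, split=find_divisor):
    """Split n repeatedly until every piece is prime.

    Returns a dict mapping each prime factor to its exponent.
    """
    factors = Counter()
    stack = [n] if n > 1 else []  # 1 has the empty factorization
    while stack:
        m = stack.pop()
        d = split(m)
        if d is None:
            factors[m] += 1            # m is prime: record it
        else:
            stack.extend((d, m // d))  # m = d * (m // d): split both parts
    return dict(factors)


# Agrees with the decomposition 864 = 2^5 * 3^3 given above.
assert prime_factorization(864) == {2: 5, 3: 3}
```

Any other splitting routine with the same contract can be passed as the `split` argument; the repeated-application loop does not depend on how a divisor is found.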

Current state of the art

A team at the German Federal Agency for Information Technology Security (BSI) previously held the record for factorization of semiprimes in the series proposed by the RSA Factoring Challenge sponsored by RSA Security. On May 9, 2005, this team announced factorization of RSA-200, a 663-bit number (200 decimal digits), using the general number field sieve.

The same team later announced factorization of RSA-640, a smaller, 640-bit number (193 decimal digits), on November 4, 2005.

Both factorizations required several months of computer time using the combined power of 80 AMD Opteron CPUs.

In January 2010, the factorization of RSA-768 was announced.[2]

Difficulty and complexity

If a large, b-bit number is the product of two primes that are roughly the same size, then no published algorithm can factor it in polynomial time, i.e., in time O(b^k) for some constant k. There are published algorithms that are faster than O((1+ε)^b) for all positive ε, i.e., sub-exponential.

The best published asymptotic running time is for the general number field sieve (GNFS) algorithm, which, for a b-bit number n, is:

O\left(\exp\left(\left(\frac{64}{9} b\right)^{1/3} (\log b)^{2/3}\right)\right).
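To give a rough feel for where this bound sits between polynomial and exponential growth, the following Python sketch compares the natural logarithms of three cost expressions for a few bit lengths. It is only an illustration: the expression is evaluated literally with b as the bit length, the constant factors hidden by the O-notation are ignored, and the function names and the choices of b^3 and 2^b as comparison points are assumptions made for this example.

```python
import math


def log_cost_polynomial(b, k=3):
    """Natural log of b**k (a polynomial cost, with k chosen for illustration)."""
    return k * math.log(b)


def log_cost_gnfs(b):
    """Natural log of exp(((64/9) * b)**(1/3) * (log b)**(2/3)).

    Evaluated literally from the expression above; hidden constants ignored.
    """
    return ((64 / 9) * b) ** (1 / 3) * math.log(b) ** (2 / 3)


def log_cost_exponential(b):
    """Natural log of 2**b (a fully exponential cost)."""
    return b * math.log(2)


for b in (512, 768, 1024, 2048):
    print(
        b,
        round(log_cost_polynomial(b), 1),
        round(log_cost_gnfs(b), 1),
        round(log_cost_exponential(b), 1),
    )
```

For each of these bit lengths the middle column lies strictly between the other two, which is the sense in which the bound is super-polynomial but sub-exponential. The numbers are not running-time predictions.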

For an ordinary computer, GNFS is the best published algorithm for large n (more than about 100 digits). For a quantum computer, however, Peter Shor discovered an algorithm in 1994 that factors integers in polynomial time. This will have significant implications for cryptography if a large quantum computer is ever built. Shor's algorithm takes only O(b^3) time and O(b) space on b-bit number inputs. In 2001, a 7-qubit quantum computer became the first to run Shor's algorithm, factoring the number 15.[3]
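The classical portion of Shor's algorithm, which turns the multiplicative order of a number modulo n into a factor of n, can be sketched in Python. This is an illustration under stated assumptions, not a quantum implementation: the hypothetical function `find_order` replaces the quantum order-finding step with a brute-force loop (exponential time in general), and none of the function names come from a library.

```python
from math import gcd


def find_order(a, n):
    """Brute-force stand-in for the quantum step (assumption for illustration).

    Returns the smallest r > 0 with a**r congruent to 1 mod n.
    Requires gcd(a, n) == 1.
    """
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r


def factor_via_order(n, a):
    """Try to split n using the order of a modulo n.

    Returns a sorted pair of non-trivial factors, or None if this choice
    of a fails and another a should be tried.
    """
    g = gcd(a, n)
    if g != 1:                      # lucky guess: a already shares a factor
        return tuple(sorted((g, n // g)))
    r = find_order(a, n)
    if r % 2 == 1:                  # odd order: this a is unusable
        return None
    y = pow(a, r // 2, n)           # y*y is 1 mod n, and y is not 1 mod n
    if y == n - 1:                  # y is -1 mod n: only trivial factors
        return None
    p = gcd(y - 1, n)
    return tuple(sorted((p, n // p)))


print(factor_via_order(15, 7))      # 7 has order 4 modulo 15
```

With n = 15 and a = 7 the order is 4, so y = 7^2 mod 15 = 4 and gcd(4 − 1, 15) = 3, giving the factors 3 and 5.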

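When discussing which complexity classes the integer factorization problem falls into, it is necessary to distinguish two slightly different versions of the problem: the function problem (given an integer N, find a non-trivial divisor of N, or conclude that N is prime) and the decision problem (given integers N and M, does N have a factor d with 1 < d < M?).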

It is not known exactly which complexity classes contain the decision version of the integer factorization problem. It is known to be in both NP and co-NP, because both YES and NO answers can be verified in polynomial time given the prime factors: their primality can be checked with the AKS primality test, and their product can be checked to equal N by multiplication. In fact, provided the factors are required to be listed in order, the fundamental theorem of arithmetic guarantees that only one possible string will be accepted; this shows that the problem is in both UP and co-UP.[4] It is known to be in BQP because of Shor's algorithm. It is suspected to be outside all three of the complexity classes P, NP-complete, and co-NP-complete. If it could be proved to be in either NP-complete or co-NP-complete, that would imply NP = co-NP. That would be a very surprising result, and therefore integer factorization is widely suspected to be outside both of those classes. Many people have tried and failed to find classical polynomial-time algorithms for it, and therefore it is widely suspected to be outside P.
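The verification argument can be illustrated with a short Python sketch. It assumes the decision version asks "does N have a factor d with 1 < d < M?" for 2 ≤ M ≤ N, and it uses trial division as a simple stand-in for the AKS test, so the sketch itself does not run in polynomial time; the function names are hypothetical.

```python
from math import isqrt, prod


def is_prime(p):
    """Trial-division primality check (a simple stand-in for AKS)."""
    if p < 2:
        return False
    return all(p % d != 0 for d in range(2, isqrt(p) + 1))


def verify_certificate(n, m, factors):
    """Check a claimed ordered prime factorization of n.

    Returns True or False as the answer to 'does n have a factor d with
    1 < d < m?' when the certificate is valid, and None when it is rejected.
    """
    if n < 2 or not factors:
        return None
    if list(factors) != sorted(factors):       # must be listed in order
        return None
    if not all(is_prime(p) for p in factors):  # every entry must be prime
        return None
    if prod(factors) != n:                     # product must equal n
        return None
    # The smallest divisor of n greater than 1 is its smallest prime factor,
    # so the same certificate settles both YES and NO answers.
    return factors[0] < m


print(verify_certificate(15, 4, [3, 5]))   # certificate accepted, answer YES
print(verify_certificate(15, 3, [3, 5]))   # certificate accepted, answer NO
```

Because the factors must be sorted, exactly one certificate is accepted for each N, which mirrors the uniqueness argument used for UP and co-UP.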

In contrast, the decision problem "is N a composite number?" (or equivalently: "is N a prime number?") appears to be much easier than the problem of actually finding the factors of N. Specifically, the former can be solved in polynomial time (in the number n of digits of N) with the AKS primality test. In addition, there are a number of probabilistic algorithms that can test primality very quickly in practice if one is willing to accept the small possibility of error. The ease of primality testing is a crucial part of the RSA algorithm, as it is necessary to find large prime numbers to start with.
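One widely used probabilistic test of this kind is the Miller–Rabin test, which the text above does not name; the Python sketch below is offered as an example of the idea rather than as the method any particular RSA implementation uses. The function names and the default of 20 rounds are assumptions, and Python's `random` module is not suitable for generating real cryptographic keys.

```python
import random


def is_probable_prime(n, rounds=20):
    """Miller-Rabin test: False means n is certainly composite,
    True means n is prime except with probability at most 4**(-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):       # quick check against small primes
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:                    # write n - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness that n is composite
    return True


def random_prime(bits):
    """Draw random odd candidates of the given bit length until one passes.

    Illustration only: not a cryptographically secure generator.
    """
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


print(is_probable_prime(2**61 - 1))      # a known Mersenne prime
print(random_prime(64).bit_length())
```

Each round costs one modular exponentiation, which is why such tests remain fast even for numbers with hundreds of digits.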

Factoring algorithms

Special-purpose

A special-purpose factoring algorithm's running time depends on the properties of the number to be factored or on one of its unknown factors: size, special form, etc. Exactly what the running time depends on varies between algorithms. For example, trial division is considered special-purpose because its running time is roughly proportional to the size of the smallest factor.
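The dependence of trial division on the smallest factor can be seen in a minimal Python sketch that counts the divisions performed. The function name and the two sample inputs are chosen for illustration only.

```python
def smallest_factor(n):
    """Trial division: return (smallest prime factor of n, divisions tried).

    Returns (n, divisions tried) when n itself is prime.
    """
    steps = 0
    d = 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            return d, steps
        d += 1
    return n, steps


# Two inputs of similar size but with very different smallest factors.
print(smallest_factor(1000002))   # even, so a single division suffices
print(smallest_factor(1018081))   # 1009 * 1009, so about a thousand divisions
```

Both inputs have seven decimal digits, yet the work differs by three orders of magnitude, because the loop stops as soon as it reaches the smallest factor.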

General-purpose

A general-purpose factoring algorithm's running time depends solely on the size of the integer to be factored. This is the type of algorithm used to factor RSA numbers. Most general-purpose factoring algorithms are based on the congruence of squares method.
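The idea behind congruence of squares is that if x^2 ≡ y^2 (mod n) while x is congruent to neither y nor −y modulo n, then gcd(x − y, n) is a non-trivial factor of n. The Python sketch below finds such a pair by naive search, which is far slower than the general-purpose algorithms built on this idea; it illustrates only the final gcd step. The function name, the `max_tries` parameter, and the sample input are assumptions, and n is assumed to be an odd composite that is not a perfect square.

```python
from math import gcd, isqrt


def congruence_of_squares(n, max_tries=100_000):
    """Search for x, y with x*x congruent to y*y mod n and x not congruent
    to y or -y mod n, then return the factors (gcd(x - y, n), cofactor).

    Returns None if no suitable pair is found within max_tries.
    """
    x = isqrt(n) + 1
    for _ in range(max_tries):
        r = (x * x) % n
        y = isqrt(r)
        # r must be a perfect square, and the congruence must be non-trivial
        if y * y == r and (x - y) % n != 0 and (x + y) % n != 0:
            d = gcd(x - y, n)
            return d, n // d
        x += 1
    return None


# 90^2 = 8100 = 8051 + 49, so 90^2 is congruent to 7^2 modulo 8051.
print(congruence_of_squares(8051))
```

For n = 8051 the search stops at x = 90 and y = 7, and gcd(90 − 7, 8051) = 83 gives the factorization 8051 = 83 × 97.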

Other notable algorithms
