Generating Prime Numbers in a Range (Was: LINQ Challenge), Part 2 – Simple Optimizations
There are two rather trivial optimizations to the problem of generating prime numbers that we saw in the previous post.
First there’s the observation that it suffices to test all potential divisors in the range {2…Floor[Sqrt[K]]} to ascertain that K is prime. This is not much harder to accomplish and is a significant running time optimization:
static int[] Sqrt(IEnumerable<int> range)
{
return range.Where(n =>
{
if (n == 2) return true;
int sqrt = (int)Math.Sqrt(n);
for (int i = 2; i <= sqrt; ++i)
if (n % i == 0)
return false;
return true;
}).ToArray();
}
The idea is still the same – we iterate over lots of numbers and check if own target is divisible by these numbers. However, this “lots of numbers” is much less than the “lots of numbers” we had earlier.
Specifically, we’re still not using any memory other than a constant amount, but the runtime complexity has improved significantly. This time, for each of the prime numbers in the range K we do at most Sqrt[K] iterations of the inner loop. Assuming as earlier that there are roughly 2Sqrt[N]/Log[N] numbers, the biggest possible number is N, and the amount of work per number is Sqrt[N], we obtain an upper bound of 2N/Log[N] on the amount of work. (This disregards the work needed for the composite numbers; this is an assumption we’ve made before.)
This is a big improvement – the difference between N^2 and N^(3/2)/Log[N] is quite telling, and so is the difference between N^(3/2)/Log[N] and N/Log[N]. Here’s a graph:
Now we can introduce another optimization. It won’t reap us as many fruits as the previous algorithmic improvement, but it’s still worthwhile. Specifically, we note that there’s already a special treatment of the number 2 in our method. Why don’t we automatically discard all even numbers before embarking on the Math.Sqrt call, which is quite expensive?
Furthermore, if we discard all the even numbers in advance, we can start the loop from 3 and advance through only the odd numbers. (Obviously if K is divisible by 2k then K is divisible by 2, hence there’s no need to test even divisors other than 2.)
This brings us to:
static int[] SqrtN2(IEnumerable<int> range)
{
return range.Where(n =>
{
if (n != 2 && n % 2 == 0)
return false;
int sqrt = (int)Math.Sqrt(n);
for (int i = 3; i <= sqrt; i += 2)
if (n % i == 0)
return false;
return true;
}).ToArray();
}
Not much complicated, but has a potential for running quite faster. For the even numbers, we’re not performing the expensive Math.Sqrt call at all; for prime numbers, we’re going over half as many potential divisors as we did earlier. This is not an improvement if you’re analyzing theoretical runtime complexity, but from a practical standpoint it’s quite an improvement indeed: [times in ms, “Full” stands for the algorithm from the previous post]
| N | Full | Sqrt | SqrtN2 |
| 9998 | 52 | 2 | 1 |
| 12998 | 81 | 3 | 1 |
| 16898 | 139 | 4 | 2 |
| 21968 | 277 | 7 | 3 |
| 28559 | 371 | 8 | 4 |
| 37127 | 603 | 13 | 7 |
| 48265 | 954 | 17 | 9 |
| 62745 | 1554 | 24 | 13 |
| 81569 | 3051 | 38 | 21 |
| 106040 | | 55 | 30 |
| 137852 | | 81 | 43 |
| 179208 | | 117 | 62 |
| 232971 | | 173 | 89 |
| 302862 | | 237 | 132 |
| 393721 | | 346 | 182 |
| 511837 | | 482 | 250 |
| 665388 | | 680 | 352 |
| 865005 | | 963 | 506 |
| 1124507 | | 1452 | 775 |
| 1461859 | | 1943 | 999 |
| 1900417 | | 2743 | 1493 |
Predictably enough, for values of N both small and large, the new version is almost twice as fast as the first one. Still, it takes us 1.5 seconds to generate less than 2,000,000 prime numbers. This scales linearly, but it’s still far from ideal.
The massive breakthrough can be obtained by using the Sieve of Eratosthenes, which is a completely different approach for generating prime numbers in a range. We’ll examine it in the next installment.