Generating Prime Numbers in a Range (Was: LINQ Challenge), Part 3 – Sieve Me One - All Your Base Are Belong To Us

# All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

# Generating Prime Numbers in a Range (Was: LINQ Challenge), Part 3 – Sieve Me One

It’s time for the big guns. We’ve already seen simple optimizations to the problem of generating the prime numbers in a range.

A completely different approach, the Sieve of Eratosthenes, is based on a very simple idea. Write down all the numbers in the range {1…N}. Mark 2 as prime. Now strike out every multiple of 2 as composite. Move on to the next number that is not marked as composite yet, and strike out every multiple of it as composite. (The next number, incidentally, is 3; then 5; then 7; and so on.)

After Sqrt[N] steps you have a list of all prime numbers; these are the numbers you did not strike out as composite. An implementation:

static int[] Sieve(IEnumerable<int> range)
{
int highBound = range.Last();
bool[] sieve = new bool[highBound];
int sqrt = (int)Math.Sqrt(highBound);
for (int i = 2; i <= sqrt; ++i)
{
if (sieve[i - 1])//this was already marked as composite
continue;

for (int j = 2 * i; j <= sieve.Length; j += i)
sieve[j - 1] = true;//composite
}
sieve[1 - 1] = true;//1 is not prime
List<int> numbers = new List<int>();
for (int i = 1; i <= sieve.Length; ++i)
if (!sieve[i - 1])
return numbers.ToArray();
}

The biggest drawback of using a sieve is its memory utilization. We’re using O(N) bytes of memory to store Boolean variables representing whether the number is prime or composite. [To save initialization costs, I assume that sieve[i] is false if the number is prime; otherwise, we’d have to initialize the sieve to true, which is a costly operation. BTW, it’s possible to use a single bit per integer instead of a whole bool. This is left as an exercise for the reader.]

As for the runtime complexity, well, we’re doing at most Sqrt[N] iterations through the sieve with values {2…Sqrt[N]}. For each value k in this range, we perform (N/k)-1 operations, marking the elements in the sieve as composite. One attempt to arrive to an upper bound is by saying that we have at most Sqrt[N] iterations and each iteration performs at most N/2 operations, hence we have ourselves an O(N^(3/2)) algorithm. However, this upper bound is very far from being tight, and the real upper bound is somewhere between O(N) and O(N(Log[Log[N]])).

Fortunately, the sieve fares very well in comparison to the other alternatives, because the whole purpose of our little exercise is to generate prime numbers in a big range. If we were concerned with primality testing of a small set of numbers, a sieve wouldn’t fit our purposes*; here, however, it scales much better than the iterative alternatives.

Here are some results: [run times in ms, “Full”, “Sqrt” and “SqrtN2” refer to the algorithms presented in the previous posts]

 N Full Sqrt SqrtN2 Sieve 9998 52 2 1 0 12998 81 3 1 0 16898 139 4 2 0 21968 277 7 3 0 28559 371 8 4 0 37127 603 13 7 1 48265 954 17 9 1 62745 1554 24 13 2 81569 3051 38 21 3 106040 55 30 4 137852 81 43 5 179208 117 62 7 232971 173 89 9 302862 237 132 12 393721 346 182 17 511837 482 250 23 665388 680 352 31 865005 963 506 41 1124507 1452 775 74 1461859 1943 999 76 1900417 2743 1493 133 2470542 2008 134 3211705 187 4175217 349 5427782 651 7056117 1048 9172952 1680 11924838 2224

Note that the sieve beats the iterative algorithms by a factor of 10 and sometimes more, making it a much more likely option for generating prime numbers in a large range (more than 11,000,000 numbers scanned in just a little more than 2 seconds). In our final installment, we’ll examine the Sieve of Atkin, a final refinement to the idea of generating primes using a sieve.

* The whole idea of a sieve is feasible if you’re testing a very large range of numbers. If the method was called to calculate prime numbers in a range {K…K+100} where K is very large, creating a sieve up to K+100 would be prohibitive. Hence, for the measurements I used 1-based ranges.

#### GalacticJello said:

Try the Sieve of atkin, I found it to be full of crunchy goodness.

private List<ulong> FindPrimes(ulong limit)

{

List<ulong> primes = new List<ulong>();

var isPrime = new bool[limit + 1];

var sqrt = Math.Sqrt(limit);

for (ulong x = 1; x <= sqrt; x++)

for (ulong y = 1; y <= sqrt; y++)

{

var n = 4 * x * x + y * y;

if (n <= limit && (n % 12 == 1 || n % 12 == 5))

isPrime[n] ^= true;

n = 3 * x * x + y * y;

if (n <= limit && n % 12 == 7)

isPrime[n] ^= true;

n = 3 * x * x - y * y;

if (x > y && n <= limit && n % 12 == 11)

isPrime[n] ^= true;

}

for (ulong n = 5; n <= sqrt; n++)

if (isPrime[n])

{

var s = n * n;

for (ulong k = s; k <= limit; k += s)

isPrime[k] = false;

}

for (ulong n = 5; n <= limit; n++)

if (isPrime[n])

return primes;

}

# January 14, 2010 5:16 PM

#### Igor Ostrovsky said:

As an optimization, you can start the inner loop at j = i * i.

# January 14, 2010 8:08 PM

#### Dew Drop – January 15, 2009 | Alvin Ashcraft's Morning Dew said:

Pingback from  Dew Drop &#8211; January 15, 2009 | Alvin Ashcraft&#039;s Morning Dew

# January 15, 2010 5:00 PM