DCSIMG
Generating Prime Numbers in a Range (Was: LINQ Challenge), Part 3 – Sieve Me One - All Your Base Are Belong To Us

All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

Generating Prime Numbers in a Range (Was: LINQ Challenge), Part 3 – Sieve Me One

It’s time for the big guns. We’ve already seen simple optimizations to the problem of generating the prime numbers in a range.

A completely different approach, the Sieve of Eratosthenes, is based on a very simple idea. Write down all the numbers in the range {1…N}. Mark 2 as prime. Now strike out every multiple of 2 as composite. Move on to the next number that is not marked as composite yet, and strike out every multiple of it as composite. (The next number, incidentally, is 3; then 5; then 7; and so on.)

After Sqrt[N] steps you have a list of all prime numbers; these are the numbers you did not strike out as composite. An implementation:

static int[] Sieve(IEnumerable<int> range)
{
    int highBound = range.Last();
    bool[] sieve = new bool[highBound];
    int sqrt = (int)Math.Sqrt(highBound);
    for (int i = 2; i <= sqrt; ++i)
    {
        if (sieve[i - 1])//this was already marked as composite
            continue;

        for (int j = 2 * i; j <= sieve.Length; j += i)
            sieve[j - 1] = true;//composite
    }
    sieve[1 - 1] = true;//1 is not prime
    List<int> numbers = new List<int>();
    for (int i = 1; i <= sieve.Length; ++i)
        if (!sieve[i - 1])
            numbers.Add(i);
    return numbers.ToArray();
}

The biggest drawback of using a sieve is its memory utilization. We’re using O(N) bytes of memory to store Boolean variables representing whether the number is prime or composite. [To save initialization costs, I assume that sieve[i] is false if the number is prime; otherwise, we’d have to initialize the sieve to true, which is a costly operation. BTW, it’s possible to use a single bit per integer instead of a whole bool. This is left as an exercise for the reader.]

As for the runtime complexity, well, we’re doing at most Sqrt[N] iterations through the sieve with values {2…Sqrt[N]}. For each value k in this range, we perform (N/k)-1 operations, marking the elements in the sieve as composite. One attempt to arrive to an upper bound is by saying that we have at most Sqrt[N] iterations and each iteration performs at most N/2 operations, hence we have ourselves an O(N^(3/2)) algorithm. However, this upper bound is very far from being tight, and the real upper bound is somewhere between O(N) and O(N(Log[Log[N]])).

Fortunately, the sieve fares very well in comparison to the other alternatives, because the whole purpose of our little exercise is to generate prime numbers in a big range. If we were concerned with primality testing of a small set of numbers, a sieve wouldn’t fit our purposes*; here, however, it scales much better than the iterative alternatives.

Here are some results: [run times in ms, “Full”, “Sqrt” and “SqrtN2” refer to the algorithms presented in the previous posts]

N Full Sqrt SqrtN2 Sieve
9998 52 2 1 0
12998 81 3 1 0
16898 139 4 2 0
21968 277 7 3 0
28559 371 8 4 0
37127 603 13 7 1
48265 954 17 9 1
62745 1554 24 13 2
81569 3051 38 21 3
106040   55 30 4
137852   81 43 5
179208   117 62 7
232971   173 89 9
302862   237 132 12
393721   346 182 17
511837   482 250 23
665388   680 352 31
865005   963 506 41
1124507   1452 775 74
1461859   1943 999 76
1900417   2743 1493 133
2470542     2008 134
3211705       187
4175217       349
5427782       651
7056117       1048
9172952       1680
11924838       2224

Note that the sieve beats the iterative algorithms by a factor of 10 and sometimes more, making it a much more likely option for generating prime numbers in a large range (more than 11,000,000 numbers scanned in just a little more than 2 seconds). In our final installment, we’ll examine the Sieve of Atkin, a final refinement to the idea of generating primes using a sieve.


* The whole idea of a sieve is feasible if you’re testing a very large range of numbers. If the method was called to calculate prime numbers in a range {K…K+100} where K is very large, creating a sieve up to K+100 would be prohibitive. Hence, for the measurements I used 1-based ranges.

Comments

GalacticJello said:

Try the Sieve of atkin, I found it to be full of crunchy goodness.

       private List<ulong> FindPrimes(ulong limit)

       {

           List<ulong> primes = new List<ulong>();

           var isPrime = new bool[limit + 1];

           var sqrt = Math.Sqrt(limit);

           for (ulong x = 1; x <= sqrt; x++)

               for (ulong y = 1; y <= sqrt; y++)

               {

                   var n = 4 * x * x + y * y;

                   if (n <= limit && (n % 12 == 1 || n % 12 == 5))

                       isPrime[n] ^= true;

                   n = 3 * x * x + y * y;

                   if (n <= limit && n % 12 == 7)

                       isPrime[n] ^= true;

                   n = 3 * x * x - y * y;

                   if (x > y && n <= limit && n % 12 == 11)

                       isPrime[n] ^= true;

               }

           for (ulong n = 5; n <= sqrt; n++)

               if (isPrime[n])

               {

                   var s = n * n;

                   for (ulong k = s; k <= limit; k += s)

                       isPrime[k] = false;

               }

           primes.Add(2);

           primes.Add(3);

           for (ulong n = 5; n <= limit; n++)

               if (isPrime[n])

                   primes.Add(n);

           return primes;

       }

# January 14, 2010 5:16 PM

Igor Ostrovsky said:

As an optimization, you can start the inner loop at j = i * i.

# January 14, 2010 8:08 PM

Dew Drop – January 15, 2009 | Alvin Ashcraft's Morning Dew said:

Pingback from  Dew Drop &#8211; January 15, 2009 | Alvin Ashcraft&#039;s Morning Dew

# January 15, 2010 5:00 PM
Leave a Comment

(required) 

(required) 

(optional)

(required) 


Enter the numbers above: