The primes are the atoms among numbers. But how can we find them all?
The prime numbers — so simple, yet so mysterious. Primes are those numbers that are divisible only by themselves and the number 1. They are the atoms amongst the integers because every whole number can be written as a product of a unique set of primes. We have known since the time of Euclid that there are infinitely many primes — there is no largest prime — but there is no general formula that generates all of them. Their distribution among the other numbers is still a mystery and motivates some of the biggest open questions in mathematics (see the Plus articles A whirlpool of numbers and Mathematical mysteries: the Goldbach conjecture). As the eighteenth century mathematical genius Leonhard Euler put it, "Mathematicians have tried in vain to this day to discover some order in the sequence of prime numbers, and we have reason to believe that it is a mystery into which the mind will never penetrate."
However, all is not lost: we do have algorithms that can find all the primes up to a given number. In this article we'll look at an example of such an algorithm, in which the primes pop out of a very simple, and highly regular, structure as if by magic. It's called the sieve of Sundaram, after an obscure East Indian mathematician by the name of S.P. Sundaram, who discovered it in the 1930s. Sundaram's algorithm is not a real sieve, of course, but a clever way of spotting numbers that are not prime, thus "sieving out" those that are.
Sundaram's sieve is based on an array of numbers formed from arithmetic progressions, in other words, sequences of numbers in which successive numbers are a given fixed distance apart.
We start with the infinite sequence in which successive numbers are exactly three steps apart, and which starts with the number 4:
Next up is the sequence starting with the number 7 and with successive numbers exactly 5 steps apart:
Now consider the sequence starting with the number 10 and with a difference of 7 between successive numbers:
You can see the pattern emerging: the starting point of each sequence is three steps on from that of the previous one, and the distance between successive numbers is the distance of the previous sequence plus 2. Writing the sequences beneath each other, we get the following doubly infinite array:
4 | 7 | 10 | 13 | 16 | 19 | 22 | 25 | ... |
7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 | ... |
10 | 17 | 24 | 31 | 38 | 45 | 52 | 59 | ... |
13 | 22 | 31 | 40 | 49 | 58 | 67 | 76 | ... |
16 | 27 | 38 | 49 | 60 | 71 | 82 | 93 | ... |
.. | .. | .. | .. | .. | .. | .. | .. | ... |
The array exhibits a lot of structure: the first row is equal to the first column, each row is an arithmetic progression, and the differences between entries in two consecutive rows form the sequence
Now pick any number in this array, double it, add 1, and ask yourself if the resulting number is a prime. Repeat this a few times and you'll find that your answer, no matter which number from the array you pick, is always "no". So what about the numbers not in the array, for example 5, 6, 8, etc? A few calculations should convince you that for a number N not in the array, the number 2N+1 is prime.
That is, the first set of calculations suggests that
We can roll the two statements into one as follows:
Euclid of Alexandria, ca 325 BC - 265 BC. He proved that there are infinitely many primes.
If this statement is indeed true for all numbers N, then it gives us a way of deriving the sequence of primes, with all its well-known irregularity, from an array with a high degree of mathematical structure, and the simplest structure as well: that of arithmetic progressions. Quite an amazing result!
But can we be sure that the statement is true for all N? We can indeed, because it is possible to prove it in its full generality. But rather than giving you the proof to read off the page, we challenge you to come up with your own version. It doesn't require any mathematical knowledge beyond a familiarity with multiplication and division, and a tiny bit of algebra. If you get stuck, the three steps below guide you through the process, and there are hints too, in case you need them.
Conclusion
If you've found your way through our three steps, or have found your own version of the proof, then you have shown that N lies in the array if and only if 2N+1 is not prime. This proves that Sundaram's sieve works: to find the kth prime, simply find the (k-1)st number not in the array, double it, and add 1. (It's the (k-1)st missing number, because the array doesn't generate the first prime, which is 2.)
Of course, there are other methods for "sieving out" primes, including the famous sieve of Eratosthenes (see the Plus article Goldbach revisited) and the lesser-known visual sieve (see the Plus article Catching primes). These methods are altogether different, although their workings are equally mysterious and the mathematics involved is similarly elementary.
About the author
Julian Havil is a prematurely retired mathematics teacher who for 33 years taught the subject at Winchester College. He spends his retirement as people should, enjoying the freedom it brings and trying to make good use of the time gifted to him: writing books and articles, giving talks, and travelling both for teaching and for leisure.
Havil's books Impossible?, Nonplussed!, and Gamma: Exploring Euler's Constant, have all been reviewed in Plus.
Havil's latest book has the working title The Irrationals. It will give an historical account of their discovery and development throughout the centuries, spanning ancient Greece to 19th century Europe. It will explore the importance of these numbers, as well as many significant results associated with them. The book will be aimed at good sixth form students and above, and will be published by Princeton University Press.
Comments
Sieve Error?
This is probably the clearest guide to understanding Sundaram's Sieve but I feel like I'm either interpreting the instructions incorrectly or their's a minor error on this post.
First off, you state that I start with 4 and increment by 3, then I start at 7 ( adding 3 ) and increment by 5 ( adding 2 ). What's the need for any further incrementation to produce a third or more array of numbers when, through the subtracting you described, we will still get the same array of numbers ( 3, 5, 7, 9, 11, 13, etc. )
You say that if I apply 2n + 1 to the array of numbers generated ( 3, 5, 7, 9, 11, 13, etc. ), the answer will always be no, i.e. the number won't be prime. However, if I apply 2n + 1 to, lets say, 3, I get 7 which IS a prime number.
I'd also like to point out that the first number you state this is "not in the array" ( 5 ), is actually in the array! Applying 2n + 1 to the numbers not in the array still gives composite numbers as well such as 4. 2(4) + 1 = 9, which isn't a prime number.
I'd be extremely grateful if the explanation was either fixed or if my error in interpretation could be corrected.
RE: Sieve Error
The fact that the difference of the numbers in consecutive rows forms the arithmetic progression (3, 5, 7, 9, 11,...) is a side note - those are not the numbers of primary interest.
The numbers of interest are the ones in the 2-d array:
4, 7, 10, 13, ...
7, 12, 17, 22, ...
10, 17, 24, 31, ...
...
For any number N in the 2-d array, 2N + 1 is composite. Similarly for any number N not in the array, 2N + 1 is prime.
sundaram sieve
An explanation plus the equation to test primality
By numbering by two Natural numbers, the first in odd numbers and the second in even numbers are arranged in two columns, giving rise to the following positions: p Odd Even
0 1 2
1 3 4
2 5 6
3 7 8
4 9 10
5 11 12
6 13 14
7 15 16
8 17 18
9 19 20
10 21 22
.....................
The following relation exists between the positions corresponding to the composed numbers and the numbers themselves:
n p n+p n+p+n
3 1 4 7 10 13 16 ......................................
5 2 7 12 17 22 27
7 3 10 17 24 31 38
9 4 13 22 31 40 49
11 5 16 27 38 49 60
13 6 19 32 45 58 71
15 7 22 37 52 67 82
17 8 25 42 59 76 93
19 9 28 47 66 85 104
.............................................................................................
opportunely reiterated it can develop, through simple programs, all the positions occupied by numbers composed up to a given n and, therefore, also all those missing corresponding to prime numbers:
Positions occupied by numbers composed by n between 3000001 and 3000100 obtained through the above report with repetitions:
1500000, 1500000, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500001, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500002, 1500003, 1500003, 1500003, 1500003, 1500003, 1500003, 1500003, 1500003, 1500004, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500005, 1500006, 1500006, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500007, 1500009, 1500009, 1500009, 1500009, 1500009, 1500009, 1500009, 1500009, 1500009, 1500010, 1500010, 1500010, 1500010, 1500010, 1500010, 1500011, 1500011, 1500011, 1500011, 1500011, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500012, 1500013, 1500013, 1500013, 1500013, 1500013, 1500015, 1500015, 1500016, 1500016, 1500017, 1500017, 1500017, 1500017, 1500017, 1500018, 1500018, 1500018, 1500018, 1500018, 1500018, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500019, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500020, 1500021, 1500022, 1500022, 1500022, 1500024, 1500025, 1500025, 1500025, 1500025, 1500025, 1500025, 1500025, 1500025, 1500025, 1500025, 1500025, 1500026, 1500027, 1500028, 1500028, 1500028, 1500028, 1500028, 1500029, 1500029, 1500031, 1500031, 1500031, 1500031, 1500032, 1500032, 1500032, 1500032, 1500032, 1500033, 1500033, 1500033, 1500033, 1500033, 1500034, 1500034, 1500035, 1500035, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500037, 1500039, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500040, 1500041, 1500041, 1500042, 1500042, 1500042, 1500042, 1500043, 1500043, 1500043, 1500043, 1500043, 1500043, 1500043, 1500045, 1500045, 1500046, 1500046, 1500046, 1500046,]
All positions between the two limits (without repetitions):
[1500001, 1500002, 1500003, 1500004, 1500005, 1500006, 1500007, 1500008, 1500009, 1500010, 1500011, 1500012, 1500013, 1500014, 1500015, 1500016, 1500017, 1500018, 1500019, 1500020, 1500021, 1500022, 1500023, 1500024, 1500025, 1500026, 1500027, 1500028, 1500029, 1500030, 1500031, 1500032, 1500033, 1500034, 1500035, 1500036, 1500037, 1500038, 1500039, 1500040, 1500041, 1500042, 1500043, 1500044, 1500045, 1500046]
For difference the positions occupied by prime numbers:
[set([1500036, 1500038, 1500008, 1500044, 1500014, 1500023, 1500030])]
Being n = 2p + 1 we can substitute n in the relation that binds the above positions and numbers with 2p + 1, we obtain:
n p n+p n+p+n
3p+1 3 1 4 7 10 13 16 19 ..................................................
5p+2 5 2 7 12 17 22 27 32
7p+3 7 3 10 17 24 31 38 45
9p+4 9 4 13 22 31 40 49 58
11p+5 11 5 16 27 38 49 60 71
13p+6 13 6 19 32 45 58 71 84
......................................................................................................................................
Tabella 1
As p varies from one to six in the figure, the table provides all but the composed numbers, we can also note that there is a relationship between 3 and 1, 5 and 2, and so on: if we call "a" the first term, the second will be equal to the whole part of division by 2 of a: ap + int (a / 2). It is natural to wonder if the positions other than those above, which we remember give us those of all and only the composed numbers, will give all the positions of the prime numbers, unfortunately as it was easy to expect is not so:
2p+1 3 5 7 9 11 13 15 17 .........................................................
4p+2 6 10 14 18 22 26 30 34
6p+3 9 15 21 27 33 39 45 51
8p+4 12 20 28 36 44 52 60 68
10p+5 15 25 35 45 55 65 75 85
12p+6 18 30 42 54 66 78 90 102
.........................................................................................................................................
It is interesting to note, however, that some of the positions belonging to the second table are not present in the first, while others are common to both. It is also obvious that a similar relation is valid for the second table: a'p + a '/ 2 between 2 and 1, 4 and 2 respectively, and when there is not an "a" for which the following linear relation, although parametric, is valid the position a'p + a '/ 2 will indicate a number definitely prime with a' belonging to the subset of the even numbers of the Naturals and a to the subset of the odd numbers of the Naturals, the position p can be arbitrarily fixed in a'p + a ' / 2 and vary appropriately in the set of Naturals in ap + int (a / 2):
ap+int(a/2)=a’p+a’/2
For example, we set p = 12 and a '=4 in a'p + a' / 2 we will have:
ap + int (a / 2) = 50 not verifiable for each a in the odd numbers of the Naturals and p in the Naturals, position 50corresponds in fact to the prime number 101
again, fixed p = 2 and a '= 4 we will have:
ap + int (a / 2) = 10 verifiable for a = 3 and p = 3, position 10 corresponds to the composed number 21.
Position to be tested:987654321111
141093474444.0 7 3
time taken 0.0160000324249 secondi
Position to be tested: 10**30+1
3.44827586207e+028 29 14
time taken 0.0160000324249 seconds
Position to be tested: 10**30+11
2.70929287456e+026 3691 1845
time taken 0.0469999313354 seconds
Position to be tested: 10**30+13
Probable position of prime number
time taken 120.006000042 seconds
Above we have some examples of tests of the position with the relative execution times with a basic laptop, the first number is the value of p, the second of a and the third the whole part of a / 2, the time was limited to 120 seconds but in fact the last issue tested 2000000000000000000000000000027 is not prime.
sundaram sieve
Dear Julian, I'm a mathematician for hobby and I was studying the prime numbers looking for some property I came across a nice relationship between the position and the number, so since it seemed not possible that no one had ever highlighted this relationship I wanted to do a search on the internet and so I discovered I had rediscovered the Sieve of Sundaram! If you are interested, I'd like to send you my small study, which if you think fit, could partially supplement your article giving the possible rational of the sieve. Your sincerely Silvio Gabbianelli