Coding with linear codes (or how to avoid being eaten by crocodiles)

Andrew Chambers Share this page

To guide you to the treasure your friend needs to tell you take one step North and one step West.

You find yourself lost on a dangerous island in search of a buried treasure. Luckily your friend sitting safely on your boat has the map. They will send you smoke signals to tell you the correct way. Be careful however, one false step and disaster awaits: there are crocodiles and land mines everywhere!

Your friend will use only two types of signal: a short puff of smoke and a longer one. We will write 0 for a short puff and 1 for a long one. This amounts to a binary code made up of only two symbols. The code you agree on is as follows:

Code 1

00 means North 01 means West
10 means East 11 means South

Is this a good code to choose? Well, your transmission system is not especially reliable – around 10% of the time you will mistake a short puff of smoke for a long one (a 0 for 1) or vice versa. The probability that your friend sends a 0 and you receive a 0 is $1-0.1 = 0.9$. The probability that your friend sends 00 and you also receive 00 is therefore $(0.9)^2 = 0.81$. Therefore, you have a probability of $1-0.81 = 0.19$, translating to a 19% chance, of interpreting 00 as a direction other than North. In order for you to arrive at the treasure you need to receive two messages correctly: North, West. 00, 01. (See the picture.) The probability of this is only $(0.9)^4 = 0.6561$. In other words you have around a 34% chance of encountering either a crocodile or land mine. Not great odds!

What is it about this code that makes it so dangerous for you? Well, it's very susceptible to errors – if there is any error in transmission you will receive completely the wrong information, turning (say) West rather than North. A better code is an error detecting code. For example:

Code 2

000 means North 011 means West
101 means East 110 means South

In this case if there was a single error in transmission, say 000 turned into 010, then you would immediately know that there was a problem, as 010 is not a code word. However in this case you would not know which the original code word was – 010 is 1 digit away from 000, 110 and 011 – so if we assume one error then it could be any of these three. Unfortunately you are not in a position to ask your friend to repeat their message.

Code 3

Even better still is a code that can actually correct an error, such as follows:

00000 means North 01101 means West
10110 means East 11011 means South


This is what you need to avoid.

If there was one error in the transmission of 00000, say to give 01000, you would still decode as 00000 because this is the closest code word to the message received. This new code can correct any single error, or detect two errors. The downside of this code is that it will take longer to transmit than the other codes. However when faced with crocodiles you are probably going to accept a loss of speed for greater accuracy.

This highlights the key considerations in designing codes. You want a code that can be transmitted quickly (a short length), that has a large number of different code words (to allow for more information to be conveyed) but that also has the capability to minimise errors in interpretation. This trade off between speed and accuracy will depend on what the data is to be used for. The branch of mathematics called coding theory is concerned with creating optimal codes for data transmission – and in our current digital age has become hugely important.

Linear codes

Binary strings of symbols come with a notion of addition. To add two strings, simply add corresponding digits, i.e. add the first digit of one to the first digit of the other, the second of one to the second of the other, and so on. However, to make sure the resulting strings only have 0s and 1s as their entries, we use the rule that 0+0=0, 0+1=1+0=1 and 1+1=0.

For example from code 2:

011 (West) + 101 (East) = 110 (South).

In this case the sum of two code words is also a code word. A binary code which has this property is an example of a linear code.

We can decode linear codes by arranging all the permutations of received messages into a standard array. Let's see how that works with code 3:

We start by writing the code words in the first row (starting with 00000):


Where's the treasure?

Next we choose another binary five digit string which has the smallest number of 1s possible from all our choices. (This is called minimum weight. Weight 0 means all the digits are 0s. Weight 1 means there is only one 1 etc.) We could choose 10000, 01000, 00100, 00010, 00001 because none of these are our code words, and all have weight 1. Let's choose 10000. We write this under 00000.


We then add 10000 to our remaining code words writing the result underneath the code word:


We then repeat the process, writing the other code words of weight 1 that haven't yet appeared in the array at the beginning of each row, adding them to the code words in the first row, and writing the result in the appropriate column:


What's the use of this so far? Adding a string of weight 1 to a code word has the effect of flipping one of the digits into its opposite. For example, adding 10000 flips the first digit, adding 01000 flips the second digit, and so on. The array as it stands so far therefore represents the correct code words (first row) and all those you get from flipping one of the five digits, that is, from making one mistake in transmission.

Since strings of weight 1 have been used up, let's continue with strings of weight 2 that have not yet occurred in the array, adding them one by one. In our example we can first choose 11000 and then 10001. Once we have done that the table contains all possible binary five digit strings. The last two rows of the array give you the strings you get when two errors are made in the locations specified by 11000 (first and second digit) and 10001 (first and last digit).

Standard array for code 3

To decode a string you simply locate it in the array and then decode it as the code word at the top of the column. For example if you receive 00111 (third column bottom row) then you decode it as 10110. Because we are dealing with a linear code there is no ambiguity: a string never appears in more than one column (see here for a proof).

Morse code

Morse code is a binary code in which the two symbols are dots and dashes.

This method doesn't guarantee that you decode each string you receive correctly. For example, it assumes that no more that two errors were made: it decodes the string 00111 as 10110 because it arises from 10110 by flipping the first digit and the last digit, but it could also have arisen from 00000 by flipping four digits. What is more, if we had changed the order of the strings we put in the first column we could have ended up with a different array and therefore a different way of decoding. However, this method is based on the general assumption that few errors are more likely than many errors and therefore gives us a good probability of decoding strings correctly. This is what we will move on to next.

First, let's quickly look at the standard array for code 2 from above, which is much easier to compile:

Standard array for code 2
000 011 101 110
100 111 001 010

Chances of success

Let’s work out the probability that we reach the treasure using our second and third codes. We’ll write $p$ for the probability that an individual digit is transmitted wrongly.


Computers work with the binary bits 0 and 1.

For code 3 the probability $P(correct)$ that a received vector is decoded correctly as the intended code word is the probability that errors were made as assumed by the array. It consists of several terms. First, there’s the probability that no error was made at all, which is $(1-p)^5$. Then there is the probability that one error was made in the position specified by the second row of the array. That probability is $p(1-p)^{4}$. Next there is the probability that one error was made in the position specified by the third row of the array. That probability is again $p(1-p)^{4}$. In fact, we get $p(1-p)^{4}$ a total of five times, since there are five rows corresponding to one error. The remaining possibilities are that two errors were made corresponding to the positions specified by the last two rows of the array. For each of these two rows this probability is $p^2(1-p)^3$. Adding these terms gives

  \[  P(correct) =(1-p)^5+5p(1-p)^4+2p^2(1-p)^3. \]    

Since we had set $p=0.1$ at the start of the article, this works out to

  \[ P(correct) =(0.9)^5 + 5(0.1)(0.9)^4+2(0.1)^2(0.9)^3 = 0.93312. \]    

Therefore the probability that an individual received vector is decoded incorrectly is $1-0.93312 = 0.06688.$ Given our treasure map requires us to get two code words successively correct, our chance of meeting a sticky end is $1-0.93312^2 = 0.129$ or around 13%.

In a similar way we can work out $P(correct)$ for code 2 from above:

  \[ P(correct) = (0.9)^3 + 0.1(0.9)^2 = 0.81. \]    

Therefore the probability of making at least one error in decoding two vectors is $1-(0.81)^2 = 0.3439.$

The general formula for P(correct)

Suppose our binary code involves code words of length $n$. We’ll write $a_1$ for the number of strings in the first column of the standard array that have weight 0, $a_1$ for the number of strings in the first column that have weight 1, $a_2$ for the number of strings in the first column that have weight 2, and so on. Write $p$ for the probability that an individual digit is transmitted wrongly. Then

  $\displaystyle P(correct)  $ $\displaystyle  =  $ $\displaystyle a_0p^0(1-p)^{n-0} $    
  $\displaystyle  $ $\displaystyle  + $ $\displaystyle a_1p(1-p)^{n-1} $    
  $\displaystyle  $ $\displaystyle + $ $\displaystyle a_2p^2(1-p)^{n-2} $    
  $\displaystyle  $ $\displaystyle + $ $\displaystyle  a_3p^3(1-p)^{n-3} $    
  $\displaystyle  $ $\displaystyle + $ $\displaystyle  ... $    
  $\displaystyle  $ $\displaystyle + $ $\displaystyle  a_ np^ n(1-p)^0 $    

Surprisingly this is exactly the same probability as in code 1. We increased our chance of decoding correctly through using the standard array, but increased the length of the code word, thus making an error more likely. This can be worked out algebraically. See here to find out how.

Using similar reasoning as we did for codes 2 and 3 we can work out $P(correct)$ for any binary code of any length $n$. It’s shown in the box on the right.

To summarise, code 1 has the benefit of being both simple and quick to transmit, but we have around a 34% chance of death by crocodile or land mine. The main benefit of code 2 is that we are able to detect an error. Unfortunately we have also shown that this knowledge won’t reduce our chance of death. Perhaps blissful ignorance might be best! Clearly the best code for this situation is code 3. It will take longer to transmit, but we have reduced our chances of tragedy to around 13%, and have an encouraging 87% chance of untold riches. Clearly it pays to be good at maths!

About the author

Andrew Chambers has an MSc in mathematics and is a teacher at the British International School Phuket, Thailand. He has written news and features articles for the Guardian, Observer and Bangkok Post, and was previously shortlisted for the Guardian's International Development Journalist of the Year Awards. If you enjoy cracking codes then you can try out your code breaking skills at our school website: www.schoolcodebreaking.com - over 10,000 students from countries around the world have made it onto the leaderboard - how about you? For those interested in a more detailed mathematical introduction to coding, you should try A First Course in Coding Theory by Raymond Hill - a very readable book which discusses in much more detail the ideas discussed above.



A very interesting article, Andrew. Thank you for sharing.