Nov 2001

A substitution cipher replaces each letter with a unique symbol. A message encoded with such a cipher can usually be broken as long as the encoded message, or cipher text, is long enough. This is possible because of the structure of the English language.

In a normal page of English text, some letters appear more often than others. The letters E, T, A, O, N and I occur most frequently, while the letters J, K, X, Q and Z seldom appear. To see that this is true, try taking a page from a book and doing a frequency analysis on it, that is, counting how often each letter appears.
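If you would rather let a computer do the counting, here is a minimal Python sketch. The function name `letter_frequencies` and the sample sentence are made up for illustration; paste in any page of text you like.

```python
from collections import Counter

def letter_frequencies(text):
    """Count how often each letter A-Z appears, ignoring case and non-letters."""
    letters = [ch.upper() for ch in text if ch.isascii() and ch.isalpha()]
    return Counter(letters)

# Illustrative sample only; a full page of text gives far more reliable counts.
sample = "The quick brown fox jumps over the lazy dog and then the hen"
counts = letter_frequencies(sample)

# Print the five most frequent letters with their counts.
for letter, count in counts.most_common(5):
    print(letter, count)
```

A single sentence is too short to show the pattern reliably, so try it on a longer passage.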

You can use frequency analysis in the same way to decode the message. Find which symbol appears most often in the cipher text: this symbol probably represents E. What is the second most frequent symbol? It is likely to be the letter T.
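The same idea can be sketched in Python: rank the symbols in the cipher text by how often they appear, then pair them with the frequent letters listed earlier. This is only a first guess, not a guaranteed key, and it assumes the cipher text contains nothing but symbols and spaces. The name `guess_by_frequency` is hypothetical.

```python
from collections import Counter

# The most frequent English letters, in the order given in the text.
COMMON_LETTERS = "ETAONI"

def guess_by_frequency(cipher_text):
    """Pair the most frequent cipher symbols with the most frequent letters."""
    symbols = [s for s in cipher_text if not s.isspace()]
    ranked = [sym for sym, _ in Counter(symbols).most_common(len(COMMON_LETTERS))]
    # zip stops at the shorter sequence, so short texts give fewer guesses.
    return dict(zip(ranked, COMMON_LETTERS))

# Toy cipher text: '#' appears most often, then '@'.
print(guess_by_frequency("#@# !# @#"))
```

On a short message the ranking is often wrong beyond the first one or two symbols, so treat each pairing as something to test rather than a fact.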

As mentioned on the previous page, you can also look out for common groups of symbols. The most frequently occurring group of three symbols is likely to be the word THE. Similarly, groups consisting of only one symbol should be either A or I.
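Looking for these groups can also be sketched in a few lines of Python. The sketch assumes the gaps between words are kept in the cipher text as spaces; `common_groups` is a made-up name.

```python
from collections import Counter

def common_groups(cipher_text):
    """Return the most frequent three-symbol group and all one-symbol groups."""
    words = cipher_text.split()
    three = Counter(w for w in words if len(w) == 3)
    # The most frequent three-symbol group is a candidate for THE.
    likely_the = three.most_common(1)[0][0] if three else None
    # One-symbol groups are candidates for A or I.
    singles = sorted({w for w in words if len(w) == 1})
    return likely_the, singles

likely_the, singles = common_groups("xyz q xyz abc q r xyz")
print("Candidate for THE:", likely_the)
print("Candidates for A or I:", singles)
```

If the spaces have been removed from the message, this approach does not apply directly and you would need to count runs of three symbols instead.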

Once you have figured out some of the symbols, you should have a good chance of decoding the whole message. Replacing the symbols with the appropriate letters throughout the cipher text as you discover them should make the values of the other symbols more obvious.
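As a rough illustration of this step, the Python sketch below fills in the symbols you already know and marks the rest with an underscore, so the gaps are easy to spot. The name `apply_partial_key` and the underscore placeholder are assumptions made for this example.

```python
def apply_partial_key(cipher_text, key, unknown="_"):
    """Replace known symbols with their letters; mark unknown ones."""
    out = []
    for ch in cipher_text:
        if ch.isspace():
            out.append(ch)                    # keep the gaps between words
        else:
            out.append(key.get(ch, unknown))  # known letter, or a placeholder
    return "".join(out)

# Suppose you have worked out that x, y and z stand for T, H and E.
key = {"x": "T", "y": "H", "z": "E"}
print(apply_partial_key("xyz xq", key))
```

Each time you work out another symbol, add it to `key` and run the function again to see more of the message.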
