Tuesday, June 14, 2011

How The Zodiac enciphered the Zodiac 408 cipher

I was looking through the Zodiac Killer ciphers the other day and woke in the middle of the night wondering how The Zodiac actually enciphered the first message (the one that was decoded).

The message consists of 408 symbols, drawn from 54 different symbols that stand for the letters of the alphabet, so some letters are represented by more than one symbol. This is a homophonic cipher: The Zodiac disguised the most common letters by giving each of them several symbols. For example, he used seven different symbols for the letter E.

I started to wonder how The Zodiac, as he wrote out the message, chose which symbol to use. It occurred to me that he might have used a really simple system: cycling through the symbols for each letter in the same order.

A quick look at the cipher showed that this simple scheme was likely used. I looked at the letters E and N and found that a simple repeating pattern was used for each.

Using a small program I wrote I identified the sequence used by The Zodiac for all the repeated letters (note that I've used lowercase letters for the reversed or upside-down capitals used by The Zodiac):
Plaintext   Cipher symbols in order
E           Z p W + ◉ N E
T           H I L
A           G S
I           △ P U k
O           X * T d
N           O ∧ D ⦶
S           F K ▣
H           M ⦵
R           t r \
D           f z
L           B
F           J Q
(In the above table * is The Zodiac's character that looks like pi with curved legs). The Zodiac mostly kept to this scheme with occasional errors (either deliberate or unintentional).
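
A minimal Python sketch of how such a check might look is given here. It is not the original program, and the names `encipher`, `symbols_by_letter`, `cycle_fit` and the toy key are all made up for illustration. It assumes the cipher has been transcribed so that each symbol is a single character and that the plaintext is known. For each letter it lists the symbols in order of use, guesses the cycle from the order of first appearance, and measures how often one symbol is followed by its successor in that cycle.

```python
from collections import defaultdict

def encipher(plaintext, key):
    """Toy homophonic encipherment: cycle through each letter's symbols in order."""
    counts = defaultdict(int)
    out = []
    for letter in plaintext:
        symbols = key[letter]
        out.append(symbols[counts[letter] % len(symbols)])
        counts[letter] += 1
    return "".join(out)

def symbols_by_letter(plaintext, ciphertext):
    """Map each plaintext letter to the cipher symbols used for it, in order of use."""
    used = defaultdict(list)
    for letter, symbol in zip(plaintext, ciphertext):
        used[letter].append(symbol)
    return dict(used)

def cycle_fit(sequence):
    """Return (guessed cycle, fraction of consecutive pairs that follow it).

    Assumption: the cycle is the distinct symbols in order of first appearance,
    which can be wrong if an error occurs before every symbol has been seen.
    """
    cycle = list(dict.fromkeys(sequence))
    if len(sequence) < 2:
        return cycle, 1.0
    successor = {s: cycle[(i + 1) % len(cycle)] for i, s in enumerate(cycle)}
    hits = sum(1 for a, b in zip(sequence, sequence[1:]) if successor[a] == b)
    return cycle, hits / (len(sequence) - 1)

# Toy data only; this is not the Zodiac key or message.
toy_key = {"E": "123", "T": "ab", "N": "xy", "S": "s", "A": "q"}
toy_plain = "TENNESSEETEA"
toy_cipher = encipher(toy_plain, toy_key)
for letter, seq in symbols_by_letter(toy_plain, toy_cipher).items():
    print(letter, cycle_fit(seq))
```

Counting consecutive pairs rather than absolute positions means a single skipped symbol costs one miss instead of throwing off everything after it, which suits a scheme followed "mostly" with occasional errors.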

This leads to a possible way to attack the unsolved Zodiac 340 cipher. If The Zodiac used a similar scheme, repeating the same sequence of symbols over and over again for each letter, it should be possible to find those sequences and reduce the decryption to something close to a classic substitution cipher.

An attack would consist of the following:

1. For each symbol in turn, split the Zodiac 340 cipher at every occurrence of that symbol, giving a list of sequences per symbol (a list of lists overall)

2. Remove any duplicated characters from each sequence (since we can assume that there will be none if the characters are being used in order as in the Zodiac 408)

3. Generate all combinations of symbols from the first sequence and look for the same subsequence occurring in the other sequences.
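
The three steps can be sketched in Python as follows. This is one reading of the steps rather than a tested attack: the function names are hypothetical, the cipher is assumed to be transcribed as a string with one character per symbol, "removing duplicates" is taken to mean keeping the first occurrence of each symbol, and the partial pieces before the first and after the last occurrence of the split symbol are dropped because they may not contain a full cycle.

```python
from itertools import combinations

def split_on(ciphertext, symbol):
    """Step 1: the pieces of ciphertext lying between occurrences of `symbol`.

    The leading and trailing pieces are discarded (an assumption of this sketch).
    """
    return ciphertext.split(symbol)[1:-1]

def dedupe(segment):
    """Step 2: keep only the first occurrence of each symbol in a segment."""
    return "".join(dict.fromkeys(segment))

def is_subsequence(candidate, segment):
    """True if the symbols of `candidate` appear in `segment` in the same order."""
    remaining = iter(segment)
    return all(symbol in remaining for symbol in candidate)

def candidate_cycles(ciphertext, symbol, length):
    """Step 3: ordered combinations from the first segment found in every other one."""
    segments = [dedupe(s) for s in split_on(ciphertext, symbol)]
    if not segments:
        return []
    first, rest = segments[0], segments[1:]
    found = []
    # combinations() preserves input order, so each combo is a subsequence of `first`.
    for combo in combinations(first, length):
        if all(is_subsequence(combo, other) for other in rest):
            found.append("".join(combo))
    return found

# Toy ciphertext (not Zodiac data): "1", "2", "3" cycle in order among noise a, b, c.
toy = "1a2b3c1c2a3b1b2c3a1"
print(candidate_cycles(toy, "1", 2))
```

On the toy string, splitting on `1` and searching for pairs leaves only `23`, the rest of the planted cycle. On real data, short segments and occasional errors in the cycling would produce false positives and misses, so requiring a match in most segments rather than all of them may be a more practical test.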

Looking at the Zodiac 340, it appears to split on average into deduplicated strings of between 20 and 30 symbols. If we look for subsequences of length, say, 6, then each search would require between 38,760 and 593,775 combinations to be generated (choosing 6 symbols from 20 and from 30 respectively). For 63 characters that sets an upper bound of roughly 37 million combinations. That leads me to think that this approach could be used.
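
The figures can be checked with a few lines of Python (the variable names are just for illustration; `math.comb` needs Python 3.8 or later):

```python
from math import comb

low = comb(20, 6)        # subsequences of length 6 from a 20-symbol string
high = comb(30, 6)       # subsequences of length 6 from a 30-symbol string
upper_bound = 63 * high  # one search per distinct symbol, worst case

print(low, high, upper_bound)  # 38760 593775 37407825
```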
