Tuesday, June 14, 2011

How The Zodiac enciphered the Zodiac 408 cipher

I was looking through the Zodiac Killer ciphers the other day and woke in the middle of the night wondering how The Zodiac actually enciphered the first message (the one that was decoded).

The message consists of 408 symbols, drawn from a set of 54 different symbols, so many letters of the alphabet are represented by more than one symbol. The Zodiac used a homophonic cipher, disguising the most common letters by giving each of them several symbols. For example, he used seven different symbols for the letter E.

I started to wonder how The Zodiac picked which symbol to use as he wrote out the message. And it occurred to me that he might have used a really simple system: cycling through the symbols for each letter in the same order.

A quick look at the cipher showed that it was likely that the simple scheme was used. Here I looked at the letter E and the letter N and discovered that a simple pattern was used for each.
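One way to make "a simple pattern was used" measurable is to take the symbols used for a single letter, in the order they appear in the message, and count how often each one is followed by the next symbol of a proposed cycle. The Python sketch below does that. The function name cycle_adherence and the toy symbols A, B and C are illustrative assumptions, not data from the cipher.

```python
def cycle_adherence(observed, order):
    """Fraction of consecutive symbol pairs in `observed` that follow `order`.

    `observed` is the list of symbols used for ONE plaintext letter, in the
    order they occur in the message. `order` is the proposed cycle.
    """
    # Map each symbol to the symbol expected to follow it (wrapping around).
    expected_next = {
        sym: order[(i + 1) % len(order)] for i, sym in enumerate(order)
    }
    pairs = list(zip(observed, observed[1:]))
    if not pairs:
        return 1.0  # nothing to contradict the cycle
    hits = sum(1 for a, b in pairs if expected_next.get(a) == b)
    return hits / len(pairs)


# Toy data (hypothetical): a perfect cycle, then one with a slip near the end.
perfect = cycle_adherence(list("ABCABC"), list("ABC"))
with_slip = cycle_adherence(list("ABCACB"), list("ABC"))
print(perfect, with_slip)
```

A score of 1.0 means the symbols were always used in strict rotation; occasional errors pull the score down without hiding the pattern.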

Using a small program I wrote I identified the sequence used by The Zodiac for all the repeated letters (note that I've used lowercase letters for the reversed or upside-down capitals used by The Zodiac):
Plaintext   Cipher symbols in order
E           Z p W + ◉ N E
T           H I L
A           G S
I           △ P U k
O           X * T d
N           O ∧ D ⦶
S           F K ▣
H           M ⦵
R           t r \
D           f z
L           B
F           J Q
(In the above table * is The Zodiac's character that looks like pi with curved legs). The Zodiac mostly kept to this scheme with occasional errors (either deliberate or unintentional).
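The cycling scheme can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of how The Zodiac actually worked: it uses only four rows of the table above, the names HOMOPHONES and encipher are made up for the example, and emitting "?" for letters outside that subset is purely an assumption of the sketch.

```python
from itertools import cycle

# Four rows taken from the table above (a subset, for brevity).
HOMOPHONES = {
    "E": ["Z", "p", "W", "+", "◉", "N", "E"],
    "T": ["H", "I", "L"],
    "A": ["G", "S"],
    "N": ["O", "∧", "D", "⦶"],
}


def encipher(plaintext, table=HOMOPHONES):
    """Encipher by rotating through each letter's symbols in a fixed order."""
    # One endlessly repeating iterator per plaintext letter.
    wheels = {letter: cycle(symbols) for letter, symbols in table.items()}
    out = []
    for ch in plaintext.upper():
        if ch in wheels:
            out.append(next(wheels[ch]))
        else:
            out.append("?")  # assumption: placeholder for letters not in the subset
    return out


print(" ".join(encipher("TENET")))  # H Z O p I
```

Each letter keeps its own position in its own cycle, so the second E in TENET gets the second E symbol regardless of what came in between.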

This leads to a possible way to attack the unsolved Zodiac 340 cipher. If The Zodiac used a similar scheme, repeating the same sequence of symbols over and over again for each letter, it should be possible to find those sequences and reduce the decryption to something close to a classic substitution cipher.

An attack would consist of the following:

1. Split the Zodiac 340 cipher on each of the symbols to obtain a list of lists of sequences

2. Remove any duplicated characters from the lists (since we can assume that there will be none if the characters are being used in order as in the Zodiac 408)

3. Generate all combinations of the first list and look for the same subsequence occurring in the other lists.
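The three steps above can be sketched in Python as follows. This is one reading of the steps rather than a tested attack on the real cipher: the function names are hypothetical, step 2 is interpreted as discarding any symbol that occurs more than once within a segment (it cannot share a strict cycle with the splitting symbol), and the ciphertext is a made-up toy string, not the Zodiac 340.

```python
from collections import Counter
from itertools import combinations


def split_on(cipher, symbol):
    """Step 1: the segments lying between consecutive occurrences of `symbol`."""
    positions = [i for i, c in enumerate(cipher) if c == symbol]
    return [cipher[a + 1:b] for a, b in zip(positions, positions[1:])]


def drop_repeats(segment):
    """Step 2 (as interpreted here): remove symbols seen more than once."""
    counts = Counter(segment)
    return [s for s in segment if counts[s] == 1]


def is_subsequence(candidate, segment):
    """True if `candidate` appears in `segment` in order (gaps allowed)."""
    remaining = iter(segment)
    return all(sym in remaining for sym in candidate)


def candidate_cycles(cipher, symbol, length):
    """Step 3: cycles of `length` symbols starting with `symbol`."""
    segments = [drop_repeats(s) for s in split_on(cipher, symbol)]
    if len(segments) < 2:
        return []  # too few occurrences to compare anything
    first, rest = segments[0], segments[1:]
    return [
        (symbol,) + combo
        for combo in combinations(first, length - 1)
        if all(is_subsequence(combo, seg) for seg in rest)
    ]


# Toy ciphertext: one letter cycles through 1, 2, 3 and another through a, b.
toy = "1a231b2312a31"
print(candidate_cycles(toy, "1", 3))  # [('1', '2', '3')]
```

Because the real cipher contains errors, a practical version would probably need to tolerate a few segments that do not contain the candidate subsequence.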

Looking at the Zodiac 340, it appears to split on average into deduplicated strings of between 20 and 30 symbols. If we look for subsequences of length, say, 6, then each search would require between 38,760 (20 choose 6) and 593,775 (30 choose 6) combinations to be generated. For 63 characters that sets an upper bound of roughly 37m combinations (63 × 593,775). That leads me to think that this approach is feasible.
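The figures in the previous paragraph can be checked with a few lines of Python. The variable names are just for illustration.

```python
from math import comb

low = comb(20, 6)        # 6-symbol subsequences of a 20-symbol string
high = comb(30, 6)       # 6-symbol subsequences of a 30-symbol string
upper_bound = 63 * high  # worst case repeated for each of the 63 symbols

print(low, high, upper_bound)  # 38760 593775 37407825
```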

12 comments:

doranchak said...

Nice post! If you are interested, here is some other info about homophone sequences in the Zodiac ciphers:

http://wiki.zodiac-ciphers.dreamhosters.com/wiki/Homophone_sequences

I tested for the presence of quadrant-based layouts in the 340 by measuring the quality of homophone sequences in millions of transformed versions of the 340:

http://wiki.zodiac-ciphers.dreamhosters.com/wiki/Quadrant_analysis_Part_2

I recommend the paper "An Algorithmic Solution of Sequential Homophonic Ciphers" by John C King for a description of an efficient attack that exploits sequential homophones to reduce homophonic substitution ciphers to simple substitution ciphers.

Finally, I have a brute-force homophone search in the CryptoScope: http://oranchak.com/zodiac/webtoy/stats.html

Scroll down to "Brute force search for sequential homophones". More detail about this can be read here: http://wiki.zodiac-ciphers.dreamhosters.com/wiki/CipherScope_Help#Discovering_sequential_homophones

Keep up the good work!

John Graham-Cumming said...

Ah. Fascinating. I guess I stumbled across a technique others had also found. No real surprise there.

Is there a non-$ version of the actual paper available?

doranchak said...

Yes - I will contact you via email.

Nick Pelling said...

Here's some more on this Z408 / Z340 homophone stuff you might find interesting...

TravisD. said...

Hi, I've actually been researching and studying the Zodiac for not too long now, and when I saw the Halloween card and envelope, it triggered a little something in my head. I believe the Zodiac changed his cipher to trip people up. I believe he is using sequences of different language alphabets from different times and countries.

I started doing more research on Roman, Greek, Phoenician, Babylos... etc. And all the symbols are there, including the one from the card, except I believe it to be two symbols put into one. I just see too many similarities between them all and I'm not dismissing it. I thought I would just say what I had to say, thank you. Travis.

TravisD. said...

He is believed to be the most intelligent killer of our time, tricky enough to throw the authorities, the public and even the FBI off. Was he really writing this last cipher in his own code? I've studied and observed ancient and current languages: Roman, Greek, Phoenician, Etruscan and Safaitic. Many writing systems at that time basically shared some of the same symbols across languages. We know Zodiac was more intelligent than anyone took him for. What if he wrote his last cipher out of multiple symbols from these languages? Google search "Zodiac Alphebet" and below on the page you'll see images; check them out. It's quite surprising and curious, but it's just a theory.

TravisD. said...

As I mentioned on another forum, I was curious whether The Zodiac, when he was always talking about taking his slaves to "paradice" (paradise), could have been talking about Paradise, CA, which is only about 144 miles NE of Vallejo. Could the Zodiac live or have lived there?

doranchak said...

The ciphers wiki has moved, so here are the corrected links from my previous comments:

http://zodiackillerciphers.com/wiki/index.php?title=Homophone_sequences

http://zodiackillerciphers.com/wiki/index.php?title=Quadrant_analysis_Part_2

http://zodiackillerciphers.com/wiki/index.php?title=CryptoScope_Help#Discovering_sequential_homophones

olejeek said...

I think some of the Zodiac characters look like the alternative keys on older keyboards (printed beside the normal alphabet), like the Commodore 64 PETSCII schematic
(http://en.wikipedia.org/wiki/PETSCII). Although this was released in the '80s, there may be some older typewriters or "computers" from the '60s that had the same alternative characters.
Just had a thought that the Zodiac may have used some '60s keyboard layout of the day to "decrypt" his messages; we just have to find out if something like this existed back then.
I mean, if no one ever found out his encrypting/decrypting method, who knows...

olejeek said...

Follow up to my previous post:
His letters and characters look like those from ASCII and Unicode tables.

Matt Brubach said...

I can't understand why the Harden Decipherment is taken seriously. Assume for a minute that it is a homophonic cipher - what scientific method did Harden use to determine that his was the correct combination? He didn't. He only looked for double symbols that he could attribute LL to, because he felt that the word "kill" would appear repeatedly. Then he built the message around those two letters. Double symbols could be: BB (abbreviation), CC (account), EE (Been), DD (add), LL (all), MM (ammonium Nitrate), OO (look), PP (application), RR (arrangement), SS (Assemble), and TT (attention).
- Harden went in with a preconceived idea of what the cipher said and randomly associated letters to symbols to fit that message. If you already think you know what a substitution cipher says, mathematically you have a very large chance of finding a near hit (although an incorrect hit). That's why you have to stick to proven scientific methods.
- Regardless, you can't assume that there are more symbols than letters in the alphabet, since most of those symbols don't represent letters. The "Map" that Zodiac made told you that it dealt with Radians. So take a look at the 408 cipher: The first character triangle (Trine 120˚ or 2.0944 radian), second character (Novile 40˚), // (parallel), thirteenth character is letter k rotated 90˚ clockwise (Quincunx 150˚), Letter "K" that is inverse of thirteenth letter (Semi-Sextile 30˚). The symbol "Q" is (Quintile 72˚). The symbol "˄" or upside down v (Vigintile 18˚), the simple square (square 90˚). The list goes on and on.
- Even to speculate that Zodiac used more than one symbol per letter doesn't matter, because you are basing your work on the Harden false decipherment. Why not look at Frank W. Bolle and Gene Mora? Better yet, the "Community For Life, Four Pi movement" in San Fran at that time. Membership included Father P, Charles Manson, Manson family members, KD, and had muscle from the Gypsy Jokers MC. The Four P Movement has a subchapter called the "Black Cross" that deals with assassinations, sacrifices, etc. Match that to the Ordo Templi Orientis O.T.O. (the zodiac symbol circle and cross for short) since they are affiliated and all of their positions are degrees...You may find that this is more than one man, it was a collaboration. I love how they played off of the Harden false decipherment though. "Slaves Paradice" is funny if you think about it. Pair is two and two dice are called "Die". Two will Die.

jennifer D said...

has anyone noticed that the errors in the solved cipher spell "rich"?