Saturday, May 17, 2008

Breaking the Fermilab Code

A story appeared on Slashdot about a mysterious fax received at Fermilab written in an unknown code. The full story is here. I looked at it and immediately noticed a few things:

1. The first part looked like ternary (base 3) with digits 1 (|), 2 (||) and 3 (|||).

2. The last part looked like binary with digits 1 (|) and 2 (||).

3. The middle bit looked like either a weird substitution code, or possibly machine code.

4. In the last part the digit 2 (||) never occurs more than once in a row, so perhaps it is actually a separator and the last part is not binary.

The first step was to convert the bars into numbers. Here's a copy of my marked-up printout:



The first part gave these numbers (or so I thought at first):

323233331112132
333231322123312
111331132312233
333212123213113
311333313331111
211333323232211
232313331121231
33231312

Noticing that this had 113 digits (which is a prime number), I went off on a wild goose chase around primes, and then around interpreting the number in hexadecimal as a string in ASCII, Unicode or binary. It was a waste of time.

Then I started thinking about ternary again and wrote down the largest ternary numbers that can be expressed with 1, 2, 3, ... digits:

2₃ = 2₁₀
22₃ = 8₁₀
222₃ = 26₁₀
2222₃ = 80₁₀

One of those stood out: with three digits the maximum number is 26 and there are 26 letters in the alphabet! Then the only question was how to map the three digits used in the code (1, 2, 3) to the three ternary digits (0, 1, 2).
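The pattern behind those maximum values is that n ternary digits top out at 3^n - 1. A quick sketch to check this (the name max_ternary is made up for this illustration and is not part of the program below):

```perl
use strict;
use warnings;

# Largest value that fits in $n ternary digits: every digit is 2,
# which works out to 3**$n - 1.
sub max_ternary {
    my ($n) = @_;
    return 3**$n - 1;
}

# Print the same table as above for 1 to 4 digits.
for my $n ( 1 .. 4 ) {
    printf "%s (base 3) = %d (base 10)\n", '2' x $n, max_ternary($n);
}
```

With three digits the values run from 0 to 26, which leaves room for 26 letters plus one spare value.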

To simplify things I wrote a small Perl program that tries out all the possible mappings and outputs the ternary interpreted as a string (with 001 = A, etc.):

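use strict;
use warnings;

# The coded digits (1, 2, 3) are passed as the first argument.
my $top = $ARGV[0];

# Rename the digits to placeholders so they cannot clash with the
# ternary digits substituted below: 3 -> a, 2 -> b, 1 -> c.
$top =~ tr/321/abc/;

# Split the message into groups of three symbols; any leftover is dropped.
my @chunks;

while ( $top =~ s/^([abc]{3})// ) {
    push @chunks, $1;
}

my @digits = ( '0', '1', '2' );

# Try all six assignments of ternary digits to the placeholders a, b, c.
foreach my $d0 (@digits) {
    foreach my $d1 (grep {!/$d0/} @digits) {
        foreach my $d2 (grep {!/[$d0$d1]/} @digits) {
            print "($d0$d1$d2) ";
            foreach my $c (@chunks) {
                my $v = 0;
                my $m = 1;
                # Evaluate the group as base 3, least significant digit first.
                foreach my $d (reverse split( //, $c )) {
                    $d =~ s/a/$d0/;
                    $d =~ s/b/$d1/;
                    $d =~ s/c/$d2/;
                    $v += $d * $m;
                    $m *= 3;
                }
                # 1 prints as A, 26 as Z; 0 prints as @.
                print chr( 64 + $v );
            }
            print "\n";
        }
    }
}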

With my initial interpretation of the top part of the coded message I got the following output:

(012) CIBYS@KDUGZBSGI@PUOXH@FBZQ@CJQJFBWKAF
(021) FRANK@SHOEMAKER@WOULD@CAMV@FTVTCAPSBC
(102) JDNXUMEISOZNUODMFSGYQMPNZHMJCHCPNTELP
(120) PVLBEMUQGK@LEKVMTGSAIMJL@RMPWRWJLFUNJ
(201) THYLOZGRKUMYOUHZCKENVZWYMDZTFDFWYJGXW
(210) WQXAGZOVES@XGSQZJEKBRZTX@IZWPIPTXCOYT

Aha! The 021 block (which corresponds to the mapping 3 -> 0, 2 -> 2, 1 -> 1) seems to have a partial message: FRANK@SHOEMAKER@WOULD and then it's garbage. (The @ is what the program prints for 000, so it acts as a space.) Going back to the original message I realized that 113 is not divisible by three and that I'd either missed a symbol, or had two too many.

After much fiddling around I discovered that the correct interpretation of the top block is that two of the threes are wrapped from one line to the next (there appears to be some indentation in the message that indicates this; take a look at the original, but it could just be random).

323 233 331 112 132
333 231 322 123 312
111 331 132 312 233
333 212 123 213 113
311 333 313 331 113
113 333 232 322 133
231 333 112 123 133
231 312
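As a sanity check on the regrouping, the corrected rows should now contain a digit count that divides evenly by three. A minimal sketch, with the rows copied from the grouping above (the variable names are just for illustration):

```perl
use strict;
use warnings;

# Corrected rows of the top block, as grouped above.
my @rows = (
    '323 233 331 112 132',
    '333 231 322 123 312',
    '111 331 132 312 233',
    '333 212 123 213 113',
    '311 333 313 331 113',
    '113 333 232 322 133',
    '231 333 112 123 133',
    '231 312',
);

# Collect the three-digit groups from every row.
my @groups;
push @groups, split( ' ', $_ ) for @rows;

# Join them back into one digit string to count the digits.
my $digits = join '', @groups;

printf "%d digits, %d groups, remainder %d\n",
    length($digits), scalar(@groups), length($digits) % 3;
```

That gives 111 digits in 37 groups, two fewer digits than the 113 in my first reading.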

Rerunning my Perl program output the full message:

(012) CIBYS@KDUGZBSGI@PUOXH@FBXX@JDRK@YURKG
(021) FRANK@SHOEMAKER@WOULD@CALL@THIS@NOISE
(102) JDNXUMEISOZNUODMFSGYQMPNYYMCIVEMXSVEO
(120) PVLBEMUQGK@LEKVMTGSAIMJLAAMWQDUMBGDUK
(201) THYLOZGRKUMYOUHZCKENVZWYNNZFRQGZLKQGU
(210) WQXAGZOVES@XGSQZJEKBRZTXBBZPVHOZAEHOS

So much for the first part. The second part took me off into Z-80, 6502 and 6809 machine code wondering if it was a program and then nowhere. I still don't understand what this part is trying to say.

The third part looked initially like binary, but on closer examination I decided that the 2s (||) were actually separators and that the message should be read as a series of numbers separated by 2s, each number being the count of 1s (|) between separators. That yields:

31211112111312
32213123123331
12213111332312
23333333233123
12313123332311
33223232312312
112
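The separator reading can be sketched in a few lines of Perl. This assumes a hypothetical transcription in which each single bar is written as 1 and each double bar as 2; the function name runs_to_digits is made up for this illustration.

```perl
use strict;
use warnings;

# Assumed input: '1' for a single bar (|), '2' for a double bar (||).
# Each '2' is treated as a separator, and every run of '1's between
# separators becomes one digit equal to the length of the run.
sub runs_to_digits {
    my ($bars) = @_;
    return join '', map { length $_ } split /2/, $bars;
}

# Three bars, separator, one bar, separator, two bars.
print runs_to_digits('11121211'), "\n";    # prints 312
```
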

(Once again there was a wrapping 'problem' in the message where a run of 8 |s was actually 3 |s then 1 || and 3 more |s.) Using the little Perl program reveals:

(012) GZWXUNGG@YOZAGI@ABKKG@KRLJGGY
(021) EMPLOYEE@NUMBER@BASSE@SIXTEEN
(102) OZTYSBOOMXGZLODMLNEEOMEVACOOX
(120) K@FAGXKKMBS@NKVMNLUUKMUDYWKKB
(201) UMJNKAUUZLEMXUHZXYGGUZGQBFUUL
(210) S@CBELSSZAK@YSQZYXOOSZOHNPSSA

So, the same mapping between digits is used.
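With the mapping settled, the brute-force search is no longer needed. Here is a minimal sketch of a direct decoder that hard-codes 3 -> 0, 2 -> 2, 1 -> 1; the function name decode and the choice to show 000 as a space (rather than the @ the search program prints) are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Decode a string of digits 1-3 using the mapping 3 -> 0, 2 -> 2, 1 -> 1,
# reading each group of three as a base-3 number (001 = A ... 222 = Z).
# Assumption: 000 is shown as a space. Trailing digits that do not fill
# a complete group are ignored.
sub decode {
    my ($digits) = @_;
    $digits =~ tr/3/0/;    # only the 3s change; 1 and 2 map to themselves
    my $text = '';
    while ( $digits =~ /\G([012]{3})/g ) {
        my ( $hi, $mid, $lo ) = split //, $1;
        my $v = $hi * 9 + $mid * 3 + $lo;
        $text .= $v ? chr( 64 + $v ) : ' ';
    }
    return $text;
}

# First row of the top block.
print decode('323233331112132'), "\n";    # prints FRANK
```
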

That leaves some final questions:

1. Who is Frank Shoemaker?
2. Why is base spelt incorrectly?
3. Is the extra S in BASSE a reference to the middle section, where three symbols start with S?
4. If #3 is correct, then those three symbols could be interpreted as FC₁₆, which is 252. Could this be the employee number of the author?
5. Why is the letter A missing from the middle section when all the other hexadecimal digits are there?
