Skip to main content

Down the GCHQ rabbit hole (or I think there really is a part 4)

Update: Confirmation that this is correct

Early this morning I blogged that I thought there was a hidden part 4 to the GCHQ Code Challenge because of the amount of non-random data in part 2. Through someone who knows someone who knows someone at GCHQ I got the response the data was "a random filler and that they ran out of time to do anything interesting".

Well, I don't believe that's the whole story. There's a distinct pattern worth investigating.

Firstly, here are the three blocks of data that are untouched in part 2:
0150: 7d 1f 15 60 4d 4d 52 7d 0e 27 6d 10 6d 5a 06 56   }..`MMR}.'m.mZ.V
0160: 47 14 42 0e b6 b2 b2 e6 eb b4 83 8e d7 e5 d4 d9   G.B.............
0170: c3 f0 80 95 f1 82 82 9a bd 95 a4 8d 9a 2b 30 69   .............+0i
0180: 4a 69 65 55 1c 7b 69 1c 6e 04 74 35 21 26 2f 60   JieU.{i.n.t5!&/`
0190: 03 4e 37 1e 33 54 39 e6 ba b4 a2 ad a4 c5 95 c8   .N7.3T9.........
01a0: c1 e4 8a ec e7 92 8b e8 81 f0 ad 98 a4 d0 c0 8d   ................
01b0: ac 22 52 65 7e 27 2b 5a 12 61 0a 01 7a 6b 1d 67   ."Re~'+Z.a..zk.g

0200: 37 7a 07 11 1f 1d 68 25 32 77 1e 62 23 5b 47 55   7z....h%2w.b#[GU
0210: 53 30 11 42 f6 f1 b1 e6 c3 cc f8 c5 e4 cc c0 d3   S0.B............
0220: 85 fd 9a e3 e6 81 b5 bb d7 cd 87 a3 d3 6b 36 6f   .............k6o
0230: 6f 66 55 30 16 45 5e 09 74 5c 3f 29 2b 66 3d 0d   ofU0.E^.t\?)+f=.
0240: 02 30 28 35 15 09 15 dd ec b8 e2 fb d8 cb d8 d1   .0(5............
0250: 8b d5 82 d9 9a f1 92 ab e8 a6 d6 d0 8c aa d2 94   ................
0260: cf 45 46 67 20 7d 44 14 6b 45 6d 54 03 17 60 62   .EFg }D.kEmT..`b

0270: 55 5a 4a 66 61 11 57 68 75 05 62 36 7d 02 10 4b   UZJfa.Whu.b6}..K
0280: 08 22 42 32 ba e2 b9 e2 d6 b9 ff c3 e9 8a 8f c1   ."B2............
0290: 8f e1 b8 a4 96 f1 8f 81 b1 8d 89 cc d4 78 76 61   .............xva
02a0: 72 3e 37 23 56 73 71 79 63 7c 08 11 20 69 7a 14   r>7#Vsqyc|.. iz.
02b0: 68 05 21 1e 32 27 59 b7 cf ab dd d5 cc 97 93 f2   h.!.2'Y.........
02c0: e7 c0 eb ff e9 a3 bf a1 ab 8b bb 9e 9e 8c a0 c1   ................
02d0: 9b 5a 2f 2f 4e 4e 00 00 00 00 00 00 00 00 00 00   .Z//NN..........
I say three (rather than two blocks), because as you'll see below there's a real pattern that indicates that we are looking at three blocks of data encrypted using the same method and the same key.

The decryption scheme used for the other parts of part 2 (that actually get decrypted) is based on calculating the following where xn is the nth byte (0 based) of the block to be descrypted: f(xn) = xn xor ( a * n + b ). a and b are constants. The first decryption uses a = 1 and b = 0xAA. The second decryption uses a = 3 and b = 0x32.

Working under the assumption that a similar scheme was used for these three blocks I ran various tests and noticed a peculiar rhythmic pattern where whole runs of random ASCII characters would appear followed by runs of non-printable characters. It occurred to me that this could imply that (a) the underlying data is in ASCII (using printable characters) and that (b) the pattern I was seeing was a pattern in the top bit of key. If the key is following the scheme above it would go through runs of 0s followed by 1s and if the plain text is ASCII the top bit is always 0 and so for that one bit the 'key' comes through.

A quick calculation of just the top bits of the three blocks above showed the following pattern:
0150: 
000000000000000000001111111111111111111111111
0000000000000000000000000011111111111111111111111111
000000000000000
0200: 
000000000000000000001111111111111111111111111
0000000000000000000000000011111111111111111111111111
000000000000000
0270: 
000000000000000000001111111111111111111111111
0000000000000000000000000011111111111111111111111111
00000
Immediately, it's apparent this this data is not random (at least in the top bit), and that it follows a pattern of 0s and 1s alternating with the following counts:
150: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 15 (0s)
200: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 15 (0s)
270: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 15 (0s)
(Note that the last block is short because the data is truncated and ends with 0s). It also might indicate that there are three separate blocks of data encrypted using the same key.

Theorizing that that was generated using the f(xn) function above meant finding a and b that fit the bill. It's not hard to calculate this spotting that the 25 occurs when the 0xFF boundary is passed and that the 20 occurs because we start in the middle of a 26 block. The actual coefficents are a = 5 and b = 0x1F. That generates the following sequence to use as the key:

1f 24 29 2e 33 38 3d 42 47 4c 51 56 5b 60 65 6a 6f 74 79 7e 83 88 8d 92 97 9c a1 a6 ab b0 b5 ba bf c4 c9 ce d3 d8 dd e2 e7 ec f1 f6 fb 00 05 0a 0f 14 19 1e 23 28 2d 32 37 3c 41 46 4b 50 55 5a 5f 64 69 6e 73 78 7d 82 87 8c 91 96 9b a0 a5 aa af b4 b9 be c3 c8 cd d2 d7 dc e1 e6 eb f0 f5 fa ff 04 09 0e 13 18 1d 22 27 2c 31 36 3b 40 45 4a 4f 54 59 5e 63 68 6d 72 77 7c 81 86 8b 90 95 9a 9f a4 a9 ae b3 b8 bd c2 c7 cc d1 d6 db e0 e5 ea ef f4 f9 fe 03 08 0d 12 17 1c 21 26 2b 30 35 3a 3f 44 49 4e 53 58 5d 62 67 6c 71 76 7b 80 85 8a 8f 94 99 9e a3 a8 ad b2 b7 bc c1 c6 cb d0 d5 da df e4 e9 ee f3 f8 fd 02 07 0c 11 16 1b 20 25 2a 2f 34 39 3e 43 48 4d 52 57 5c 61 66 6b 70 75 7a 7f 84 89 8e 93 98 9d a2 a7 ac b1 b6 bb c0 c5 ca cf d4 d9 de e3 e8 ed f2 f7 fc 01 06 0b 10 15 1a 1f

Which has the required sequence of 0s and 1s and begins: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 25 (0s). So, at this stage I've identified a pattern and a way to generate that sequence using the encryption scheme used in part 2.

The surprising thing that happens when you decrypt those three blocks using that key stream is that not only is it 7 bit (which was bound to happen because of the way the key was generated) but the entire thing is printable ASCII. No characters below 0x20 and no 0x7F. Here are the three decrypted blocks in ASCII (I added line breaks):

b;<N~uo?Ik<F6:c<(`;p5:?t|("(|Uac|4I["Z_xZyU{a+5cE}|K?SD.Y85sjvz:
\*^[email protected],Dd=83;?e0bnP3R$ZF:V,L~O 5wS&[km?6x5M;7A+X-(^.?,%Ugu;O4x;"?
<Dh<uy<tTPYcO|ui:9S-5YhY0!vU(k3e`rL.5ms;C`~o`6hW]TA[fqh_k4smCk}{
$a;gY9_y?z76gZ'n0AOi3eY6Li\b8W%(J~cHR)j*2I3`&bu!gV;L9j4pA%^eB::{
0%qjE)RcVax:/xsk}*.=u[\[email protected]/N7aHpA_$5H'LCW76XHtRA*krs|WZxu|U;
d^&!]V"',16;@EJ

The fact that this decrypts as printable ASCII reinforces my belief that this is correct. Of course, it looks like gibberish, but it's printable. There are 89 characters used from the 95 possible. The missing characters are # 1 > G Q l.

It seems unlikely that the sequence of steps I have taken would have resulted in a block of perfectly printable ASCII characters had this not been the correct thing to do.

Perhaps this is a base-89 encoding of some other binary data. Whatever it is it has high-entropy and could be encrypted with some robust scheme, or it could be the result of compression. Of course, it could be that this data is random and was encrypted by GCHQ as I specified as a sort of dead end.

PS If anyone from GCHQ is reading... can you email me a simple 'carry on' or 'stop wasting your time'. Need to sleep...

Comments

Michael said…
This is likely something you've already noticed, but it became apparent in my feed reader: the least-significant byte (big-endian) repeats every 16 bytes:

1f 24 29 2e 33 38 3d 42 47 4c 51 56 5b 60 65 6a
6f 74 79 7e 83 88 8d 92 97 9c a1 a6 ab b0 b5 ba
bf c4 c9 ce d3 d8 dd e2 e7 ec f1 f6 fb 00 05 0a
0f 14 19 1e 23 28 2d 32 37 3c 41 46 4b 50 55 5a
5f 64 69 6e 73 78 7d 82 87 8c 91 96 9b a0 a5 aa
af b4 b9 be c3 c8 cd d2 d7 dc e1 e6 eb f0 f5 fa
ff 04 09 0e 13 18 1d 22 27 2c 31 36 3b 40 45 4a
4f 54 59 5e 63 68 6d 72 77 7c 81 86 8b 90 95 9a
9f a4 a9 ae b3 b8 bd c2 c7 cc d1 d6 db e0 e5 ea
ef f4 f9 fe 03 08 0d 12 17 1c 21 26 2b 30 35 3a
3f 44 49 4e 53 58 5d 62 67 6c 71 76 7b 80 85 8a
8f 94 99 9e a3 a8 ad b2 b7 bc c1 c6 cb d0 d5 da
df e4 e9 ee f3 f8 fd 02 07 0c 11 16 1b 20 25 2a
2f 34 39 3e 43 48 4d 52 57 5c 61 66 6b 70 75 7a
7f 84 89 8e 93 98 9d a2 a7 ac b1 b6 bb c0 c5 ca
cf d4 d9 de e3 e8 ed f2 f7 fc 01 06 0b 10 15 1a
1f

(Apologies for formatting; Blogger does not allow code or pre tags)
sep332 said…
Well the first byte also is repeated, 3x at a time. 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, ... This could very well be random filler.
@Mikester and @sep332. You are looking at the key and not the plain text.
sep332 said…
oh, i did think it was a bit odd >< sorry :)
Kid said…
I've seen this result here: http://www.r00t.cz/Misc/CanYouCrackIt

An additional observation was: "One firmware value is 0xd2ab1f05 = 1F=31 05=5 -> values for above algorithm. Coincidence?"

The other two bytes could indicate where to start and loop how many times?
Kid said…
I've seen this result here: http://www.r00t.cz/Misc/CanYouCrackIt

An additional observation was: "One firmware value is 0xd2ab1f05 = 1F=31 05=5 -> values for above algorithm. Coincidence?"

The other two values could be the starting point and how many times to loop?
Kid said…
I've seen this result here: http://www.r00t.cz/Misc/CanYouCrackIt

An additional observation was: "One firmware value is 0xd2ab1f05 = 1F=31 05=5 -> values for above algorithm. Coincidence?"

The other two values could be the starting point and how many times to loop?

Popular posts from this blog

Your last name contains invalid characters

My last name is "Graham-Cumming". But here's a typical form response when I enter it:


Does the web site have any idea how rude it is to claim that my last name contains invalid characters? Clearly not. What they actually meant is: our web site will not accept that hyphen in your last name. But do they say that? No, of course not. They decide to shove in my face the claim that there's something wrong with my name.

There's nothing wrong with my name, just as there's nothing wrong with someone whose first name is Jean-Marie, or someone whose last name is O'Reilly.

What is wrong is that way this is being handled. If the system can't cope with non-letters and spaces it needs to say that. How about the following error message:

Our system is unable to process last names that contain non-letters, please replace them with spaces.

Don't blame me for having a last name that your system doesn't like, whose fault is that? Saying "Your last name …

Importing an existing SSL key/certificate pair into a Java keystore

I'm writing this blog post in case anyone else has to Google that. In Java 6 keytool has been improved so that it now becomes possible to import an existing key and certificate (say one you generated outside of the Java world) into a keystore.

You need: Java 6 and openssl.

1. Suppose you have a certificate and key in PEM format. The key is named host.key and the certificate host.crt.

2. The first step is to convert them into a single PKCS12 file using the command: openssl pkcs12 -export -in host.crt -inkey host.key > host.p12. You will be asked for various passwords (the password to access the key (if set) and then the password for the PKCS12 file being created).

3. Then import the PKCS12 file into a keystore using the command: keytool -importkeystore -srckeystore host.p12 -destkeystore host.jks -srcstoretype pkcs12. You now have a keystore named host.jks containing the certificate/key you need.

For the sake of completeness here's the output of a full session I performe…

More fun with toys: the Ikea LILLABO Train Set

As further proof of my unsuitability to be a child minder (see previous post) I found myself playing with an Ikea LILLABO 20-piece basic set train.


The train set has 16 pieces of track (12 curves, two straight pieces and a two part bridge) and 4 pieces of train. What I wondered was... how many possible looping train tracks can be made using all 16 pieces?

The answer is... 9. Here's a picture of the 9 different layouts.


The picture was generated using a little program written in Processing. The bridge is red, the straight pieces are green and the curves are blue or magenta depending on whether they are oriented clockwise or anticlockwise. The curved pieces can be oriented in either way.

To generate those layouts I wrote a small program which runs through all the possible layouts and determines which form a loop. The program eliminates duplicate layouts (such as those that are mirror images of each other).

It outputs a list of instructions for building loops. These instructions con…