Monday, December 05, 2011

Is there another GCHQ Code Challenge?

Update: The answer is Yes, sort of.

Last week GCHQ issued a 'code challenge' as a way of attracting candidates via a web site called Can You Crack It?. I did the challenge, which actually consists of three parts and has more to do with assembly language and reverse engineering skills than cryptography (in fact, the only time you come across a cryptographic function, you can completely ignore it).

The cat is completely out of the bag now because some spoilsport published complete details on the web showing how to break it, but I think there may be one more mystery worth investigating.

Part 2 of the challenge involves writing a virtual machine that executes some code to decrypt an in memory buffer containing an HTTP command. Here's a dump of the memory of the virtual machine after execution:
0000: 31 04 33 aa 40 02 80 03 52 00 72 01 73 01 b2 50   1.3.@...R.r.s..P
0010: 30 14 c0 01 80 00 10 10 00 00 00 00 00 00 00 00   0...............
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0100: 32 00 75 0c 31 08 33 32 40 02 80 03 52 00 72 01   2.u.1.32@...R.r.
0110: 73 03 b2 00 c3 b0 00 30 1b c0 01 ff 00 00 00 00   s......0........
0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0130: 00 00 75 10 01 00 00 00 00 00 00 00 00 00 00 00   ..u.............
0140: cc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0150: 7d 1f 15 60 4d 4d 52 7d 0e 27 6d 10 6d 5a 06 56   }..MMR}.'m.mZ.V
0160: 47 14 42 0e b6 b2 b2 e6 eb b4 83 8e d7 e5 d4 d9   G.B.............
0170: c3 f0 80 95 f1 82 82 9a bd 95 a4 8d 9a 2b 30 69   .............+0i
0180: 4a 69 65 55 1c 7b 69 1c 6e 04 74 35 21 26 2f 60   JieU.{i.n.t5!&/
0190: 03 4e 37 1e 33 54 39 e6 ba b4 a2 ad a4 c5 95 c8   .N7.3T9.........
01a0: c1 e4 8a ec e7 92 8b e8 81 f0 ad 98 a4 d0 c0 8d   ................
01b0: ac 22 52 65 7e 27 2b 5a 12 61 0a 01 7a 6b 1d 67   ."Re~'+Z.a..zk.g
01c0: 47 45 54 20 2f 64 61 37 35 33 37 30 66 65 31 35   GET /da75370fe15
01d0: 63 34 31 34 38 62 64 34 63 65 65 63 38 36 31 66   c4148bd4ceec861f
01e0: 62 64 61 61 35 2e 65 78 65 20 48 54 54 50 2f 31   bdaa5.exe HTTP/1
01f0: 2e 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00   .0..............
0200: 37 7a 07 11 1f 1d 68 25 32 77 1e 62 23 5b 47 55   7z....h%2w.b#[GU
0210: 53 30 11 42 f6 f1 b1 e6 c3 cc f8 c5 e4 cc c0 d3   S0.B............
0220: 85 fd 9a e3 e6 81 b5 bb d7 cd 87 a3 d3 6b 36 6f   .............k6o
0230: 6f 66 55 30 16 45 5e 09 74 5c 3f 29 2b 66 3d 0d   ofU0.E^.t\?)+f=.
0240: 02 30 28 35 15 09 15 dd ec b8 e2 fb d8 cb d8 d1   .0(5............
0250: 8b d5 82 d9 9a f1 92 ab e8 a6 d6 d0 8c aa d2 94   ................
0260: cf 45 46 67 20 7d 44 14 6b 45 6d 54 03 17 60 62   .EFg }D.kEmT..b
0270: 55 5a 4a 66 61 11 57 68 75 05 62 36 7d 02 10 4b   UZJfa.Whu.b6}..K
0280: 08 22 42 32 ba e2 b9 e2 d6 b9 ff c3 e9 8a 8f c1   ."B2............
0290: 8f e1 b8 a4 96 f1 8f 81 b1 8d 89 cc d4 78 76 61   .............xva
02a0: 72 3e 37 23 56 73 71 79 63 7c 08 11 20 69 7a 14   r>7#Vsqyc|.. iz.
02b0: 68 05 21 1e 32 27 59 b7 cf ab dd d5 cc 97 93 f2   h.!.2'Y.........
02c0: e7 c0 eb ff e9 a3 bf a1 ab 8b bb 9e 9e 8c a0 c1   ................
02d0: 9b 5a 2f 2f 4e 4e 00 00 00 00 00 00 00 00 00 00   .Z//NN..........
02e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
02f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

There are three colors used there to highlight areas: memory in blue was executed, memory in red was both read and written, memory in green was read, written and executed. As you can see the program starts running at 0x0000. There there's small program that decrypts the block of memory between 0x100 and 0x14F and jumps to 0x100.

The decrypted program decrypts the memory at 0x1C0 to reveal the HTTP command that is used to execute part 3 of the GCHQ challenge.

But what about all the rest of the unused memory? No other part of the GCHQ challenge wastes bytes, they are all used for something, but here there's a ton of memory that's filled with data and that isn't referred to anywhere else.

Is this a hidden fourth part of the GCHQ challenge? Perhaps even the real challenge?

If you look into the first block of data that's been decrypted there are two intriguing pieces of additional information. At 0x132 there are three bytes: 75 10 01. Those three bytes are actually valid code in the virtual machine. They mean
75 10   add   ds, 0x10
01      jmpr  r1

This seems like perfectly reasonable code. It moves the data segment forward by 0x10 and then jumps to R1. At the end of execution of the original program the data segment is at 0x1c. Thus moving it forward puts it at 0x2c (i.e. the block starting at 0x02c0 above). An alternative explanation is that based on the initial conditions of the VM that data segment would now be 0x20 (i.e. pointing immediately after the decrypted HTTP command at 0x0200).

Also at the end of running the program R1 contains 0x08. Doing the jump would drop straight into the decryption loop just past the initialization and into the XOR portion. That could be totally valid. The jump would take the instruction pointer to:
10:08 movm r0, [ds:r2]
10:0a xor r0, r3
10:0c movm [ds:r2], r0
10:12 cmp r2, 00
10:14 jmpe r3
10:15 cmp r0, 00
10:17 movr r0, 1b
10:19 jmpe r0
10:1a jmpr r1
`
The only other clue is that there's the byte 0xCC in the decrypted memory.

And one more thing. Notice the 10:14 jmpe r3. That never gets taken because r2 is never 00. But if you investigate the circumstances under which r2 would be 00 you find that it's when r3 would be 0x32 (i.e. when that jump would take you right to the 75 10 01 sequence that's been decrypted).

I haven't had enough free time to investigate this further. Perhaps it's a red herring, but it looks awfully suspicious. Especially given that the main loop will take that jump instruction when it's completely exhaused 0x100 bytes of data. This little 'subroutine' then moves the data segment on by 0x10 (i.e. 0x100 bytes) and the decryption will continue until a 00 byte is written. So it looks valid and is designed to cope with not having hit a 00 before the end of a segment.

PS It's been pointed out to me that the Can You Crack It? web site has been altered since the beginning to add the words "The challenge continues".

Labels:

If you enjoyed this blog post, you might enjoy my travel book for people interested in science and technology: The Geek Atlas. Signed copies of The Geek Atlas are available.

Phlash said...

Hi John,

According to my understanding, the code at offset 0x132 is already used as part of the 2nd decryption loop:

0x100: movr r2,#0
0x102: add r5,#12 <-- move ds to encrypted block
0x108: movm r0,[ds:r2] <-- return from jump here
0x10A: xor r0,r3
0x10C: movm [ds:r2],r0
0x10E: add r2,#1 <-- increment pointer
0x110: add r3,#3 <-- increment XOR
0x112: cmp r2,#0 <-- rollover?
0x114: jmpe [r3] <-- when r2==0, r3==50, thus cs:50 == 0x132 :)
0x115: cmp r0,#0 <-- terminate at zero decrypted value
0x117: movr r0,#27
0x119: jmpe [r0]
0x11A: jmp [r1]
0x11B: hlt
...