Wednesday, September 18, 2013

Why is the fifth part of the GCHQ "Can you find it?" challenge so easy

UPDATE: Read the comments below for the solution.

If you've worked your way through the GCHQ Can you find it? recruitment challenge you'll have reached the last part and discovered that you have to do... nothing. The code word is accepted and you get to give your details for the competition. But you didn't do anything in part five.

That bothers me.

And there's another thing that bothers me. The second part includes a very obvious link to a web site (apparently correct) hidden inside an RSA private key, but the private key is messed up and messed up in a very specific manner.

Here's the key file dumped out:

$ openssl rsa -in comp1.key -text -noout
Private-Key: (1022 bit)
modulus:
    37:c0:04:af:3e:8e:80:cb:75:b1:53:0c:9f:b2:dc:
    f4:d3:ce:4a:82:8b:52:f6:a8:48:e0:c5:d8:35:8b:
    26:6c:84:94:de:29:47:24:49:85:72:28:17:8e:06:
    d0:77:17:0c:2a:5d:56:ba:88:d1:07:25:e2:c5:7b:
    01:44:ea:e9:44:38:87:1a:b5:5a:75:d5:98:34:89:
    b3:1f:9e:a4:e2:bd:b7:7a:b7:cf:f3:dc:ac:ea:ac:
    59:2c:83:dc:50:8a:27:0c:69:cb:66:4e:a1:64:9b:
    ca:e8:e4:e0:dc:d8:d4:d0:cc:c8:c4:c0:bc:b8:b4:
    b0:ac:94:13:82:39:51:f1
publicExponent: 65537 (0x10001)
privateExponent:
    13:5b:5d:85:07:60:6d:41:b7:9c:99:2c:61:ea:b5:
    a3:60:43:59:45:98:60:76:fa:19:4b:ca:05:f7:19:
    58:7f:07:4d:b5:11:79:fd:14:75:fc:1c:05:89:af:
    be:04:0b:81:92:d8:13:bb:f2:b3:39:1b:23:70:d3:
    f3:ad:dd:2e:4c:26:d3:1b:a8:56:f1:83:ca:d9:13:
    95:38:e7:80:30:77:a4:f0:d9:77:f9:25:b9:c1:d7:
    8f:2a:e5:b0:31:d8:c3:0e:3a:b1:5c:39:ec:f9:90:
    b5:77:60:a9:cf:95:7e:c7:ed:b3:9c:e6:0b:d1:bb:
    04:29:e8:b4:b1:69:7b:2d
prime1:
    37:c0:04:af:3e:8e:80:cb:75:b1:53:0c:9f:b2:dc:
    f4:d3:ce:4a:82:8b:52:f6:a8:48:e0:c5:d8:35:8b:
    26:6c:84:94:de:29:47:24:49:85:72:28:17:8e:06:
    d0:77:17:0c:2a:5d:56:ba:88:d1:07:25:e2:c5:7b:
    01:44:ea:e8:55:4a:2a:2b:e4:71:8f:02:b1:61:b0:
    e4:34:bf:da:1b:d4:d0:95:ec:ff:0c:f7:da:8d:e1:
    7a:65:99:7f:f1:b3:4e:47:81:00:95:87:d6:8c:5a:
    d8:a8:a4:a0:9c:98:94:90:8c:88:84:80:7c:78:74:
    70:6c:53:d2:41:f9:3b:e4
prime2:
    77:77:2e:77:68:74:73:69:73:69:6c:67:75:6f:65:
    63:74:73:72:65:68:73:72:69:2e:65:6f:63:75:2e:
    2f:6b:6c:62:74:65:68:63:65:6c:20:79:20:20:20:
    20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:
    20:20:0b:8f
exponent1:
    13:a5:24:9d:fc:2e:52:20:40:1b:50:f9:3e:65:80:
    1d:b7:b3:98:57:36:b2:ed:58:80:89:ab:a4:86:4b:
    7e:fe:c2:46:fa:6f:06:98:79:c0:2b:22:df:f6:88:
    71:df:f6:88:71:df:f6:88:71:df:f6:88:71:df:f6:
    b2:8a:b2:4f
exponent2:
    08:79:f2:58:12:97:40:a1:18:c9:40:21:cf:19:4a:
    4e:56:32:e2:c9:03:32:3d:c9:ec:ba:d1:be:72:d0:
    06:19:4f:25:65:30:d4:c9:48:a6:f5:5e:e2:c2:a4:
    c4:e2:c2:a4:c4:e2:c2:a4:c4:e2:c2:a4:c4:e2:c2:
    a4:c4:e1:4d
coefficient:
    14:89:f3:4e:c0:0e:91:ab:96:dd:ca:dd:d5:77:f1:
    32:1c:62:b5:49:1a:a5:d4:2a:97:0b:c5:85:9b:a8:
    b8:d2:32:6d:f1:0e:7d:6e:96:92:3b:60:84:10:f2:
    a9:fe:74:70:41:56:5c:c2:7b:56:4f:26:af:a7:30:
    4e:8b:0f:bd:82:94:55:72:94:09:b9:6b:7a:d2:d3:
    79:4f:79:4e:56:e4:a6:b8:b3:3e:4c:be:fb:96:fb:
    a5:0b:92:8b:79:a9:2c:c8:be:e9:58:2f:72:34:ed:
    85:f5:cf:60:d8:36:26:32:69:82:6b:5e:0b:87:de:
    95:82:ff:d8:54:c0:99:3f

When I did this it was clear that prime2 contained printable ASCII just from looking at it. All well and good if you've solved the puzzle, but what about the other parts?

Specifically,

1. modulus should be prime1 * prime2. It isn't.

2. In fact, the modulus and prime1 are very close to each other. Just compare the bytes starting at the start of each (in italics). Odd.

3. Looking a little deeper with bc I find:

$ bc
obase=16
ibase=16

modulus=37C004AF3E8E80CB75B1530C9FB2DCF4D3CE4A828B52F6A848E0C5D83
58B266C8494DE29472449857228178E06D077170C2A5D56BA88D10725E2C57B01
44EAE94438871AB55A75D5983489B31F9EA4E2BDB77AB7CFF3DCACEAAC592C83D
C508A270C69CB664EA1649BCAE8E4E0DCD8D4D0CCC8C4C0BCB8B4B0AC94138239
51F1

prime1=37C004AF3E8E80CB75B1530C9FB2DCF4D3CE4A828B52F6A848E0C5D835
8B266C8494DE29472449857228178E06D077170C2A5D56BA88D10725E2C57B014
4EAE8554A2A2BE4718F02B161B0E434BFDA1BD4D095ECFF0CF7DA8DE17A65997F
F1B34E4781009587D68C5AD8A8A4A09C9894908C8884807C7874706C53D241F93
BE4

modulus-prime1
EEEE5CEED0E8E6D2E6D2D8CEEADECAC6E8E6E4CAD0E6E4D25CCADEC6EA5C5ED6D
8C4E8CAD0C6CAD840F24040404040404040404040404040404040414040160D

(modulus-prime1)/2
77772E776874736973696C67756F656374737265687372692E656F63752E2F6B6
C6274656863656C2079202020202020202020202020202020202020A0200B06

prime2=77772E776874736973696C67756F656374737265687372692E656F6375
2E2F6B6C6274656863656C2079202020202020202020202020202020202020202
00B8F

(modulus-prime1)/2 - prime2
7FFFFF77

So, another way of finding the answer to part two is to calculate (modulus-prime1)/2 to get the same printable ASCII containing a URL.

I wonder if GCHQ originally wanted to make part two harder and then changed their mind?

And then there's the relationship between exponent1 and exponent2. There are sections with repeating patterns (and oddly those sections seem to align with the 0x20 padding bytes in prime2 (see bolded sections above)). df:f6:88:71 repeats in exponent1 and e2:c2:a4:c4 repeats in exponent2

Unfortunately, why those repeating patterns are present (indication of a simple XOR key used against repeating plaintext?) and how they are used to get further information elude me. Also, why are there some 'random' bytes after each repeating section?

Last time GCHQ had a challenge I did find something interesting which was confirmed (with help from The Register) by GCHQ. 

I reached out to the same people via a dead letter box at the tomb of Thomas Bayes in Bunhill Fields cemetery and searched for a response via the personal ads in The London Review of Books but have heard nothing.

So... has anyone else looked into this?





2 comments:

Phil Ronan said...

Yes, it was a bit of an anticlimax. I got stuck on this for a while because openssl kept choking on the private key.

We know prime1 is bogus because it's even. But if we assume the modulus value to be correct, then we can calculate the "proper" value of prime1 as (modulus / prime2). This turns out to be very close to the value of prime2:

(bc:)
prime1a = modulus / prime2
prime1a
77772E776874736973696C67756F656374737265687372692E656F63752E2F6B6C62\
74656863656C207920202020202020202020202020202020202120200A7F


Both prime1a and prime2 are probable primes according to GMP, which is encouraging.

And apparently the original (i.e., faked) value of prime1 is {modulus - prime1a - prime2 + 1)

(bc:)
prime1 - (modulus - prime1a - prime2)
1


My hunch is that the repeating patterns followed by 'random' bytes in the exponent values are just a reflection of the repeated 0x20 bytes in the two primes (followed by the additional digits needed to make them prime).

John Graham-Cumming said...

Phil:

That makes a lot of sense. So the GCHQ folks probably encoded the URL in a 512 bit number and then found two primes that were close to it (prime1a and prime2) and got the modulus from it.

They then decided to mess with prime1 to break the key itself so that OpenSSL etc. couldn't read it. Since modulus, prime1a and prime2 are all odd the +1 makes sense because it makes prime1 even (i.e. definitely not prime)

That also, as you say, explains the repeated patterns appearing in the modulus (artifact of multiplication).

Also if you calculate the totient value from prime1a and prime2 then you find that the privateExponent in the RSA key file is good (i.e. privateExponent * publicExponent mod totient = 1). Further evidence of the validity of your prime1a.

And, further, the values for exponent1 and exponent2 have patterns (as you suggest) because they are the remainder after doing mod prime1a-1 and prime2-1. I've checked them and that's where the pattern comes from.

So, I think this gets to the bottom of it. Thanks for the comment!

That only leaves me wondering why they set prime1 to modulus - prime1a - prime2 + 1.