A back channel confirms that I'm right... sort of

Through a little circuitous back channel I received an unofficial follow-up to a blog post about the unused machine code in the GCHQ Code Challenge part 2. The follow-up assured me that the unused code was left over after some clean up was done, and that the rest of the data in the file was random filler (as I'd already heard).

But as I'd already determined that it was not actually random (at least at one level) I sent a carrier pigeon out with a question about my follow-up blog post.

Listening for secret messages transmitted during The Archers I received a reply indicating that I had indeed 'broken' the encryption on that stream of ASCII data (by guessing the algorithm and reverse engineering the key stream), but that the underlying data was actually random ASCII generated and encrypted using the following Python code:
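from random import randint  # import needed by the next line
b = ''.join([chr(randint(0x20, 0x7f)) for i in range(0, 16 * 7)])
c = codec(b, 0x1f, s=5)  # codec() is not defined in this snippet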
The first part generates 112 bytes of random ASCII text and the codec function is apparently the encryption function with the key I had identified (see the 0x1f and 5).

There are a couple of oddities about this code. The randint call is inclusive at both ends, so it could generate 0x7F (DEL), which is non-printable (and doesn't appear in the decrypted text), and the code generates precisely 112 bytes of data (whereas part 2 actually contains two blocks of 112 bytes and one of 102).
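To put a rough number on the first oddity, here is a minimal sketch. It assumes every byte of all three blocks (112 + 112 + 102) was drawn independently with the same randint call; the helper name random_ascii and the seed are mine, not part of the code I was sent.

```python
from random import Random

LOW, HIGH = 0x20, 0x7f       # bounds passed to randint in the quoted code
BLOCK_LEN = 16 * 7           # 112 bytes per block
TOTAL = 112 + 112 + 102      # bytes across the three blocks in part 2

def random_ascii(rng, n=BLOCK_LEN):
    """Hypothetical helper mirroring the quoted generator line."""
    # randint includes both end points, so 0x7f (DEL) can be produced.
    return ''.join(chr(rng.randint(LOW, HIGH)) for _ in range(n))

rng = Random(2011)           # arbitrary seed, only for repeatability
block = random_ascii(rng)
print(len(block))            # one block of generated text

# Chance that 0x7f never shows up in TOTAL independent draws.
candidates = HIGH - LOW + 1  # 96 possible values, one of which is DEL
p_no_del = ((candidates - 1) / candidates) ** TOTAL
print(round(p_no_del, 3))
```

Under those assumptions the chance of seeing no 0x7F at all comes out at about 3%, so its absence is unusual but not impossible.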

I questioned that by leaving a note in a tree in St. James's Park and received a response by exchanging briefcases at King's Cross Station to the effect that the 102-byte block had simply been truncated to make the code look interesting.

Of course, that could all be completely non-suspicious, but having been told that first it was random filler and then that actually it was blocks of encrypted random printable ASCII filler it wouldn't surprise me if the truth was even more complex (and perhaps rather mundane).

But why go to all this trouble on 'random filler' text? And why update the web site to say "The Challenge Continues"?

I am left slightly baffled, which I'd imagine is just how people working in secret places would like it to be.


ste9an05 said…
Some of us are considering possible steganography in images/codebreaker.jpg as it has symmetry, which is a little unexpected
Steve said…
I thought I'd drop you a comment on your interesting blog entries.

First up, I was bothered too about the unused data in part 2. I've independently verified that the "h75 h10 h01" sequence at h0132 makes sense as it allows decryption to continue beyond the end of the segment, should the message have been longer than the remaining 4 segments (64 bytes). I was a little more perplexed by "hCC" at h0140 - but, perhaps, that invalid byte is there to mark a segment that should never be executed... but, if so, the zero-bytes that follow it make no sense. Similarly, it makes little sense that the data it decodes should start at h01c0 - as the unencrypted byte-code in the first two segments decodes only segments h10-h14.

I concur with your view that the remaining sections (h0150-h01bf and h0200-h02ff) contain non-random data. In addition to the cyclic top-bit pattern you've documented, I note:

* The 'premature' end to non-zero data at h02d6 is reminiscent of the end-of-message marker that is preserved in memory from h01f3-h01ff.
* I'm suspicious about the fact that there are 26 trailing zeros - for two reasons... firstly because it matches the equal number of top-bit-clear;top-bit-set bits in your analysis of the three sections... and also because (simply) 26 is the number of letters in our alphabet - possibly hinting some alphabetic code.
* The vast majority of the data has no two successive bytes equal. But, in the last 11 non-zero bytes there are three adjacent equal pairs... h9e at h02cb-h02cc and h2f at h02d2-h02d3 and h4e at h02d4-h02d5. It is also unexpected that the non-zero data terminates with two repeated pairs. This observation makes it very hard for me to conclude that a random source was encoded and truncated arbitrarily as 'filler'.

An avenue I explored today was to note, as you did, that the byte code decodes by exclusive or with a sequence which could be generated from init+step*i, where i indicates the index of the byte and init and step are parameter bytes... the first decode can be parametrised init=hAA step=1 and the second decode init=0 and step=3. I brute-force searched this space looking for strings with significant sequences of printable ASCII... but found nothing interesting. I concluded that, if these sections do contain messages that can be decoded, I don't think it uses the same encryption scheme.
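A minimal sketch of that kind of search, assuming the keystream byte at index i is (init + step*i) mod 256 and that "interesting" means a long run of printable ASCII. The function names, the min_run threshold and the test message are illustrative only, not the code actually used and not challenge data.

```python
def xor_decode(data: bytes, init: int, step: int) -> bytes:
    """XOR each byte with (init + step * i) mod 256 (assumed scheme)."""
    return bytes(b ^ ((init + step * i) & 0xFF) for i, b in enumerate(data))

def longest_printable_run(data: bytes) -> int:
    """Length of the longest run of bytes in 0x20..0x7e."""
    best = run = 0
    for b in data:
        run = run + 1 if 0x20 <= b < 0x7f else 0
        best = max(best, run)
    return best

def search(data: bytes, min_run: int = 16):
    """Try all 256 * 256 (init, step) pairs and keep the promising ones."""
    hits = []
    for init in range(256):
        for step in range(256):
            plain = xor_decode(data, init, step)
            if longest_printable_run(plain) >= min_run:
                hits.append((init, step, plain))
    return hits

# Self-check on a made-up message encoded with init=0xAA, step=1.
secret = b"a made-up test message, not challenge data"
cipher = xor_decode(secret, 0xAA, 1)
found = search(cipher, min_run=len(secret))
print([(hex(i), s) for i, s, p in found if p == secret])
```

Because XOR is its own inverse, the same function both encodes and decodes. The search can report more than one (init, step) pair for a short input, so any hit still needs checking by eye.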

Like yourself, I'm a bit baffled. I don't think the data is random; it seems odd to include these sections where all the rest of the information in stages one and two are used by the end of stage three. I can't believe that the data is there to make it harder to identify the position of the message - as the location of the message can be immediately obtained by identifying the first string of zero bytes.
Thanks John. Bit disappointed at the challenge, in a way -- but at least, like you, I can now get some sleep and return to the day job. :).

Was hoping it might offer a number of different routes, rather than a "one track" approach, with differing resultant keywords, to make it more inter-disciplinary and to sort out the high fliers from the 99.9% of candidates who -- like me -- were "also rans".

Got totally the wrong skill set for them. helped a bit; php, js/ajax, mysql, firebird, apache .... Not much use here.

Was also hoping that the exe itself (which can be overwritten, of course, to get it to work, or fetched from localhost) would manipulate the "supposed" keyword sent in the clear ... Nada.
Steve said…
I previously said: "I'm suspicious about the fact that there are 26 trailing zeros" - but, evidently, I can't count - there are 42. This completely undermines my 'alphabetic cypher' hypothesis - but the other points, I think, were valid.
Scrub that last comment. Got the stage 2 VM working in PHP on my server, and ste9an05 found a great link to a graphical implementation in JS. Don't know why that idea totally passed me by? :)
Junk said…
Hello guys,
I'm amazed by your technical skills. It's a really beautiful example of hacker thinking. Here's what caught my eye...

What can these quests tell us about the author?
I watched the videos from Dr Gareth Owen, and the knowledge needed to crack this is far beyond that of a typical technical person.
- Use of everything from low-level assembler to high-level Java programming. (Such wide programming language skills are very rare)
- The whole contest is in English. (No Chinese, no Russian. Seeking an English person)
- Use of VM, which is quite high tech
- No brute force needed to solve quests
- No cloud quest (Interesting since there is trend in using it)
- Presence of Facebook, Twitter, Google+. (The contest is set to be spread)
- All quests are connected through web (Which is major fault)

Some of you may have seen the movie Mercury Rising: this quest may not be about getting a job, but about how many people can crack it.

Different approach to get to the end

As I said above, there is a major fault in that all quests are linked via the web. This means that compromising the server would get you to the end, even without solving the quests.
- By entering dummy code on you will get an /index.asp hint. This tells you that the server is running Microsoft ASP scripting. And we all know that it's hackable.
- The next step might be to run an eEye Retina scan on, find vulnerabilities and get to the web.

The next flaw is that the results on the web are static and all solutions lead to one link.

By putting into the W3C validator you will get invalid code... To my surprise!

What's more, by doing a Google search for "" you can find links to all quests (1st page) (1st page) (3rd page)
- No matter how stupid this is... The easiest solution is often the best.

It is also interesting that there is a robots.txt, but the site is indexed by Google. Possibly they added it later.

This leads me to think that there are 2 teams working on this task.
a) Quest team - very high knowledge assembler guy, java guy, VM guy, Wireshark(LAN) guy
b) Web team - which is very sloppy

And finally you don't even need to complete the quest or do the Google search to get to the end.
- On the GCHQ site click on "Careers" then "Click here to visit our recruitment portal"
- Then "Jobs" - "Cyber Security Specialists" and you are there!

The upshot is that this is all just PR... so sad :-(
NivagSwerdna said…
Having solved this myself over the weekend I noted with interest your observation that there might be further data. Along the same lines as Steve I note that the encryption is a simple XOR with a linear increase to the key; by scanning the bytes in the VM memory and looking for suitable candidate combinations of 3 letters I find only the previously discovered Part 2 plaintext. The absence of a 42 42 42 42 signature implies that it doesn't warrant a return to the deadbeef either.
I think I conclude that there is no further message. I hope I'm proven wrong in a few days.
