Thursday, December 22, 2011

A simple illustration of the use of goroutines and channels in Google Go

For a long time I've been meaning to spend some quality time with Google Go and I finally have the chance. For the little home project I'm working on I needed a way to generate unique IDs. In Go there's a really nice and simple way of doing this: assign a single goroutine as an ID generator and use a channel as the way to grab a new ID.

Here's the code:
    idMaker := make(chan string)

    go func() {
        var counter int64 = 0
        for {
            idMaker <- fmt.Sprintf("%x", counter)
            counter++
        }
    } ()
The first line makes a channel that will be used to communicate strings. In this case, I'd decided to have unique string IDs.

Then there's a go function (in this case an anonymous function) that contains a 64 bit integer counter that is incremented every time an ID is generated. When an ID is generated it is formatted as a hex number and made available on the channel. It just loops around forever providing IDs and only updating the counter when an ID has been used.

Any other part of the program can grab an ID like this:
    id := <-idMaker
Because reading from a channel is atomic and only one part of the program can read from the channel at once this eliminates any headaches about threads or processes needing to share some unique ID generating code. In fact, multiple goroutines can use exactly the same channel to get unique IDs trivially.

This works because Google Go takes its channel synchronization idea from CSP where communication is how synchronization is done. There's no need for any sort of locking around the var counter because it only gets updated in one place (inside the goroutine) and it only gets updated when its value has been consumed by reading from the channel.

Thursday, December 15, 2011

A pre-Christmas Plan 28 update

Despite the lack of public announcements from Plan 28 on the project to build Charles Babbage's Analytical Engine there's plenty going on behind the scenes.

The big news is the Plan 28's registration as a charity in the UK is now complete and has been published by the Charity Commission. Plan 28 is now registered charity number 1145043.

Every charity has to have 'objects' which are a statement of its aims. Plan 28's objects are as follows:
3. Objects

3.1 The Charity's objects ("The Objects") are for the public benefit

3.1.1 The advancement of education, in particular in the fields of science and the history of science, through the construction of Charles Babbage's Analytical Engine; and

3.1.2 The advancement of science, in particular through:
a) Research in the fields of computer science, engineering and mathematics; and
b) The raising of public awareness of Charles Babbage's Analytical Engine and related scientific theories and developments.

3.2 Article 3.1 may be amended by special resolution but only with the prior written consent of the commission.
What this means is that early in the New Year Plan 28 will enter a fundraising phase and all those folks who pledged money will now be able to donate it. It also means that Plan 28 will be eligible for Gift Aid for UK donors.

Here's to getting the Engine project moving in early 2012.

PS A big thank you to the solicitors at Stone King who worked on the charity registration process with me. They've been simply brilliant to work with and I would highly recommend them for other people starting a charity.

Tuesday, December 13, 2011

The Sainsbury's Wheel of Death

The British supermarket Sainsbury's labels its own food products with something that resembles a Trivial Pursuit game piece containing colored slices showing the 'healthiness' of the food. They call this the Multiple traffic-light labeling. I prefer to refer to it as the "Wheel of Death".

There are five categories on the wheel of death: total sugars, calories, fat, saturated fat and salt. When visiting Sainsbury's it's fun to see if you can find a full, all red 'Wheel of Death' on a product. I haven't managed to photograph one yet.

Here's a partial Wheel of Death:

And getting a bit closer to a full Wheel of Death on a chocolate cake mix packet:

And one more on a ready made dessert:


But has anyone seen an all red Wheel of Death? A product that combines high salt, fat, saturated fat, calories and sugar? If so, please email me.

Another piece of Apple design that drives me nuts

The Apple wireless keyboard and its side loading battery slot. Here's mine. The batteries needed replacing.

To get at the batteries you open a little circular cover on the right hand side and slide the batteries out. Or not. If one of the batteries has split open (even a small amount) it becomes jammed inside the slot and there's so little room that it's very hard to get it out.

There's a battery in there. Poking something into the hole doesn't help because you end up jamming the battery in further. The only viable option is to hit the keyboard on a hard object (e.g. the table) repeatedly and violently until the battery comes out.

It was a painful experience taking a piece of Apple hardware and whacking it on the table.

Tuesday, December 06, 2011

A back channel confirms that I'm right... sort of

Through a little circuitous back channel I received an unofficial follow up to a blog post about the unused machine code in the GCHQ Code Challenge part 2. The follow up assured me that the unused code was left over after some clean up was done, and that the rest of the data in the file was random filler (as I'd already heard).

But as I'd already determined that it was not actually random (at least at one level) I sent a carrier pigeon out with a question about my follow up blog post.

Listening for secret messages transmitted during The Archers I received a reply indicating that I had indeed 'broken' the encryption on that stream of ASCII data (by guessing the algorithm and reverse engineering the key stream), but that the underlying data was actually random ASCII generated and encrypted using the following Python code:
b = ''.join([chr(randint(0x20, 0x7f)) for i in range(0, 16 * 7)])
c = codec(b, 0x1f, s=5)
The first part generates 112 bytes of random ASCII text and the codec function is apparently the encryption function with the key I had identified (see the 0x1f and 5).

There are a couple of oddities about this code. The randint could generate an 0x7F which is non-printable (and doesn't appear in the decrypted text) and it generates precisely 112 bytes of data (whereas the part 2 actually contains two blocks of 112 and one of 102).

I questioned that by leaving a note in a tree in St. James's Park and received a response by exchanging briefcases at King's Cross Station to the effect that the 102 byte block had simply been truncated to make the code look interesting.

Of course, that could all be completely non-suspicious, but having been told that first it was random filler and then that actually it was blocks of encrypted random printable ASCII filler it wouldn't surprise me if the truth was even more complex (and perhaps rather mundane).

But why go to all this trouble on 'random filler' text? And why update the web site to say "The Challenge Continues"?

I am left slightly baffled, which I'd imagine is just how people working in secret places would like it to be.

Monday, December 05, 2011

Down the GCHQ rabbit hole (or I think there really is a part 4)

Update: Confirmation that this is correct

Early this morning I blogged that I thought there was a hidden part 4 to the GCHQ Code Challenge because of the amount of non-random data in part 2. Through someone who knows someone who knows someone at GCHQ I got the response the data was "a random filler and that they ran out of time to do anything interesting".

Well, I don't believe that's the whole story. There's a distinct pattern worth investigating.

Firstly, here are the three blocks of data that are untouched in part 2:
0150: 7d 1f 15 60 4d 4d 52 7d 0e 27 6d 10 6d 5a 06 56   }..`MMR}.'m.mZ.V
0160: 47 14 42 0e b6 b2 b2 e6 eb b4 83 8e d7 e5 d4 d9   G.B.............
0170: c3 f0 80 95 f1 82 82 9a bd 95 a4 8d 9a 2b 30 69   .............+0i
0180: 4a 69 65 55 1c 7b 69 1c 6e 04 74 35 21 26 2f 60   JieU.{i.n.t5!&/`
0190: 03 4e 37 1e 33 54 39 e6 ba b4 a2 ad a4 c5 95 c8   .N7.3T9.........
01a0: c1 e4 8a ec e7 92 8b e8 81 f0 ad 98 a4 d0 c0 8d   ................
01b0: ac 22 52 65 7e 27 2b 5a 12 61 0a 01 7a 6b 1d 67   ."Re~'+Z.a..zk.g

0200: 37 7a 07 11 1f 1d 68 25 32 77 1e 62 23 5b 47 55   7z....h%2w.b#[GU
0210: 53 30 11 42 f6 f1 b1 e6 c3 cc f8 c5 e4 cc c0 d3   S0.B............
0220: 85 fd 9a e3 e6 81 b5 bb d7 cd 87 a3 d3 6b 36 6f   .............k6o
0230: 6f 66 55 30 16 45 5e 09 74 5c 3f 29 2b 66 3d 0d   ofU0.E^.t\?)+f=.
0240: 02 30 28 35 15 09 15 dd ec b8 e2 fb d8 cb d8 d1   .0(5............
0250: 8b d5 82 d9 9a f1 92 ab e8 a6 d6 d0 8c aa d2 94   ................
0260: cf 45 46 67 20 7d 44 14 6b 45 6d 54 03 17 60 62   .EFg }D.kEmT..`b

0270: 55 5a 4a 66 61 11 57 68 75 05 62 36 7d 02 10 4b   UZJfa.Whu.b6}..K
0280: 08 22 42 32 ba e2 b9 e2 d6 b9 ff c3 e9 8a 8f c1   ."B2............
0290: 8f e1 b8 a4 96 f1 8f 81 b1 8d 89 cc d4 78 76 61   .............xva
02a0: 72 3e 37 23 56 73 71 79 63 7c 08 11 20 69 7a 14   r>7#Vsqyc|.. iz.
02b0: 68 05 21 1e 32 27 59 b7 cf ab dd d5 cc 97 93 f2   h.!.2'Y.........
02c0: e7 c0 eb ff e9 a3 bf a1 ab 8b bb 9e 9e 8c a0 c1   ................
02d0: 9b 5a 2f 2f 4e 4e 00 00 00 00 00 00 00 00 00 00   .Z//NN..........
I say three (rather than two blocks), because as you'll see below there's a real pattern that indicates that we are looking at three blocks of data encrypted using the same method and the same key.

The decryption scheme used for the other parts of part 2 (that actually get decrypted) is based on calculating the following where xn is the nth byte (0 based) of the block to be descrypted: f(xn) = xn xor ( a * n + b ). a and b are constants. The first decryption uses a = 1 and b = 0xAA. The second decryption uses a = 3 and b = 0x32.

Working under the assumption that a similar scheme was used for these three blocks I ran various tests and noticed a peculiar rhythmic pattern where whole runs of random ASCII characters would appear followed by runs of non-printable characters. It occurred to me that this could imply that (a) the underlying data is in ASCII (using printable characters) and that (b) the pattern I was seeing was a pattern in the top bit of key. If the key is following the scheme above it would go through runs of 0s followed by 1s and if the plain text is ASCII the top bit is always 0 and so for that one bit the 'key' comes through.

A quick calculation of just the top bits of the three blocks above showed the following pattern:
0150: 
000000000000000000001111111111111111111111111
0000000000000000000000000011111111111111111111111111
000000000000000
0200: 
000000000000000000001111111111111111111111111
0000000000000000000000000011111111111111111111111111
000000000000000
0270: 
000000000000000000001111111111111111111111111
0000000000000000000000000011111111111111111111111111
00000
Immediately, it's apparent this this data is not random (at least in the top bit), and that it follows a pattern of 0s and 1s alternating with the following counts:
150: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 15 (0s)
200: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 15 (0s)
270: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 15 (0s)
(Note that the last block is short because the data is truncated and ends with 0s). It also might indicate that there are three separate blocks of data encrypted using the same key.

Theorizing that that was generated using the f(xn) function above meant finding a and b that fit the bill. It's not hard to calculate this spotting that the 25 occurs when the 0xFF boundary is passed and that the 20 occurs because we start in the middle of a 26 block. The actual coefficents are a = 5 and b = 0x1F. That generates the following sequence to use as the key:

1f 24 29 2e 33 38 3d 42 47 4c 51 56 5b 60 65 6a 6f 74 79 7e 83 88 8d 92 97 9c a1 a6 ab b0 b5 ba bf c4 c9 ce d3 d8 dd e2 e7 ec f1 f6 fb 00 05 0a 0f 14 19 1e 23 28 2d 32 37 3c 41 46 4b 50 55 5a 5f 64 69 6e 73 78 7d 82 87 8c 91 96 9b a0 a5 aa af b4 b9 be c3 c8 cd d2 d7 dc e1 e6 eb f0 f5 fa ff 04 09 0e 13 18 1d 22 27 2c 31 36 3b 40 45 4a 4f 54 59 5e 63 68 6d 72 77 7c 81 86 8b 90 95 9a 9f a4 a9 ae b3 b8 bd c2 c7 cc d1 d6 db e0 e5 ea ef f4 f9 fe 03 08 0d 12 17 1c 21 26 2b 30 35 3a 3f 44 49 4e 53 58 5d 62 67 6c 71 76 7b 80 85 8a 8f 94 99 9e a3 a8 ad b2 b7 bc c1 c6 cb d0 d5 da df e4 e9 ee f3 f8 fd 02 07 0c 11 16 1b 20 25 2a 2f 34 39 3e 43 48 4d 52 57 5c 61 66 6b 70 75 7a 7f 84 89 8e 93 98 9d a2 a7 ac b1 b6 bb c0 c5 ca cf d4 d9 de e3 e8 ed f2 f7 fc 01 06 0b 10 15 1a 1f

Which has the required sequence of 0s and 1s and begins: 20 (0s), 25 (1s), 26 (0s), 26 (1s), 25 (0s). So, at this stage I've identified a pattern and a way to generate that sequence using the encryption scheme used in part 2.

The surprising thing that happens when you decrypt those three blocks using that key stream is that not only is it 7 bit (which was bound to happen because of the way the key was generated) but the entire thing is printable ASCII. No characters below 0x20 and no 0x7F. Here are the three decrypted blocks in ASCII (I added line breaks):

b;<N~uo?Ik<F6:c<(`;p5:?t|("(|Uac|4I["Z_xZyU{a+5cE}|K?SD.Y85sjvz:
\*^p@,Dd=83;?e0bnP3R$ZF:V,L~O 5wS&[km?6x5M;7A+X-(^.?,%Ugu;O4x;"?
<Dh<uy<tTPYcO|ui:9S-5YhY0!vU(k3e`rL.5ms;C`~o`6hW]TA[fqh_k4smCk}{
$a;gY9_y?z76gZ'n0AOi3eY6Li\b8W%(J~cHR)j*2I3`&bu!gV;L9j4pA%^eB::{
0%qjE)RcVax:/xsk}*.=u[\[email protected]/N7aHpA_$5H'LCW76XHtRA*krs|WZxu|U;
d^&!]V"',16;@EJ

The fact that this decrypts as printable ASCII reinforces my belief that this is correct. Of course, it looks like gibberish, but it's printable. There are 89 characters used from the 95 possible. The missing characters are # 1 > G Q l.

It seems unlikely that the sequence of steps I have taken would have resulted in a block of perfectly printable ASCII characters had this not been the correct thing to do.

Perhaps this is a base-89 encoding of some other binary data. Whatever it is it has high-entropy and could be encrypted with some robust scheme, or it could be the result of compression. Of course, it could be that this data is random and was encrypted by GCHQ as I specified as a sort of dead end.

PS If anyone from GCHQ is reading... can you email me a simple 'carry on' or 'stop wasting your time'. Need to sleep...

Is there another GCHQ Code Challenge?

Update: The answer is Yes, sort of.

Last week GCHQ issued a 'code challenge' as a way of attracting candidates via a web site called Can You Crack It?. I did the challenge, which actually consists of three parts and has more to do with assembly language and reverse engineering skills than cryptography (in fact, the only time you come across a cryptographic function, you can completely ignore it).

The cat is completely out of the bag now because some spoilsport published complete details on the web showing how to break it, but I think there may be one more mystery worth investigating.

Part 2 of the challenge involves writing a virtual machine that executes some code to decrypt an in memory buffer containing an HTTP command. Here's a dump of the memory of the virtual machine after execution:
0000: 31 04 33 aa 40 02 80 03 52 00 72 01 73 01 b2 50   1.3.@...R.r.s..P
0010: 30 14 c0 01 80 00 10 10 00 00 00 00 00 00 00 00   0...............
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0100: 32 00 75 0c 31 08 33 32 40 02 80 03 52 00 72 01   2.u.1.32@...R.r.
0110: 73 03 b2 00 c3 b0 00 30 1b c0 01 ff 00 00 00 00   s......0........
0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0130: 00 00 75 10 01 00 00 00 00 00 00 00 00 00 00 00   ..u.............
0140: cc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0150: 7d 1f 15 60 4d 4d 52 7d 0e 27 6d 10 6d 5a 06 56   }..`MMR}.'m.mZ.V
0160: 47 14 42 0e b6 b2 b2 e6 eb b4 83 8e d7 e5 d4 d9   G.B.............
0170: c3 f0 80 95 f1 82 82 9a bd 95 a4 8d 9a 2b 30 69   .............+0i
0180: 4a 69 65 55 1c 7b 69 1c 6e 04 74 35 21 26 2f 60   JieU.{i.n.t5!&/`
0190: 03 4e 37 1e 33 54 39 e6 ba b4 a2 ad a4 c5 95 c8   .N7.3T9.........
01a0: c1 e4 8a ec e7 92 8b e8 81 f0 ad 98 a4 d0 c0 8d   ................
01b0: ac 22 52 65 7e 27 2b 5a 12 61 0a 01 7a 6b 1d 67   ."Re~'+Z.a..zk.g
01c0: 47 45 54 20 2f 64 61 37 35 33 37 30 66 65 31 35   GET /da75370fe15
01d0: 63 34 31 34 38 62 64 34 63 65 65 63 38 36 31 66   c4148bd4ceec861f
01e0: 62 64 61 61 35 2e 65 78 65 20 48 54 54 50 2f 31   bdaa5.exe HTTP/1
01f0: 2e 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00   .0..............
0200: 37 7a 07 11 1f 1d 68 25 32 77 1e 62 23 5b 47 55   7z....h%2w.b#[GU
0210: 53 30 11 42 f6 f1 b1 e6 c3 cc f8 c5 e4 cc c0 d3   S0.B............
0220: 85 fd 9a e3 e6 81 b5 bb d7 cd 87 a3 d3 6b 36 6f   .............k6o
0230: 6f 66 55 30 16 45 5e 09 74 5c 3f 29 2b 66 3d 0d   ofU0.E^.t\?)+f=.
0240: 02 30 28 35 15 09 15 dd ec b8 e2 fb d8 cb d8 d1   .0(5............
0250: 8b d5 82 d9 9a f1 92 ab e8 a6 d6 d0 8c aa d2 94   ................
0260: cf 45 46 67 20 7d 44 14 6b 45 6d 54 03 17 60 62   .EFg }D.kEmT..`b
0270: 55 5a 4a 66 61 11 57 68 75 05 62 36 7d 02 10 4b   UZJfa.Whu.b6}..K
0280: 08 22 42 32 ba e2 b9 e2 d6 b9 ff c3 e9 8a 8f c1   ."B2............
0290: 8f e1 b8 a4 96 f1 8f 81 b1 8d 89 cc d4 78 76 61   .............xva
02a0: 72 3e 37 23 56 73 71 79 63 7c 08 11 20 69 7a 14   r>7#Vsqyc|.. iz.
02b0: 68 05 21 1e 32 27 59 b7 cf ab dd d5 cc 97 93 f2   h.!.2'Y.........
02c0: e7 c0 eb ff e9 a3 bf a1 ab 8b bb 9e 9e 8c a0 c1   ................
02d0: 9b 5a 2f 2f 4e 4e 00 00 00 00 00 00 00 00 00 00   .Z//NN..........
02e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
02f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
There are three colors used there to highlight areas: memory in blue was executed, memory in red was both read and written, memory in green was read, written and executed. As you can see the program starts running at 0x0000. There there's small program that decrypts the block of memory between 0x100 and 0x14F and jumps to 0x100.

The decrypted program decrypts the memory at 0x1C0 to reveal the HTTP command that is used to execute part 3 of the GCHQ challenge.

But what about all the rest of the unused memory? No other part of the GCHQ challenge wastes bytes, they are all used for something, but here there's a ton of memory that's filled with data and that isn't referred to anywhere else.

Is this a hidden fourth part of the GCHQ challenge? Perhaps even the real challenge?

If you look into the first block of data that's been decrypted there are two intriguing pieces of additional information. At 0x132 there are three bytes: 75 10 01. Those three bytes are actually valid code in the virtual machine. They mean
75 10   add   ds, 0x10
01      jmpr  r1
This seems like perfectly reasonable code. It moves the data segment forward by 0x10 and then jumps to R1. At the end of execution of the original program the data segment is at 0x1c. Thus moving it forward puts it at 0x2c (i.e. the block starting at 0x02c0 above). An alternative explanation is that based on the initial conditions of the VM that data segment would now be 0x20 (i.e. pointing immediately after the decrypted HTTP command at 0x0200).

Also at the end of running the program R1 contains 0x08. Doing the jump would drop straight into the decryption loop just past the initialization and into the XOR portion. That could be totally valid. The jump would take the instruction pointer to:
10:08 movm r0, [ds:r2]
10:0a xor r0, r3
10:0c movm [ds:r2], r0
10:0e add r2, 01
10:10 add r3, 03
10:12 cmp r2, 00
10:14 jmpe r3
10:15 cmp r0, 00
10:17 movr r0, 1b
10:19 jmpe r0
10:1a jmpr r1
The only other clue is that there's the byte 0xCC in the decrypted memory.

And one more thing. Notice the 10:14 jmpe r3. That never gets taken because r2 is never 00. But if you investigate the circumstances under which r2 would be 00 you find that it's when r3 would be 0x32 (i.e. when that jump would take you right to the 75 10 01 sequence that's been decrypted).

I haven't had enough free time to investigate this further. Perhaps it's a red herring, but it looks awfully suspicious. Especially given that the main loop will take that jump instruction when it's completely exhaused 0x100 bytes of data. This little 'subroutine' then moves the data segment on by 0x10 (i.e. 0x100 bytes) and the decryption will continue until a 00 byte is written. So it looks valid and is designed to cope with not having hit a 00 before the end of a segment.

PS It's been pointed out to me that the Can You Crack It? web site has been altered since the beginning to add the words "The challenge continues".