Monday, September 30, 2013

Mothballed

I've mothballed my blog.

All the posts are here; comments are disabled.

Taking a break from personal online things.

Monday, September 23, 2013

Debugging for High-altitude Balloon Enthusiasts (and others!)

If a person who debugs code is a debugger then the person who wrote the bugs must be a... 

I gave a talk at the UKHAS 2013 Conference in September on debugging for high-altitude balloon enthusiasts. It applies more generally than that and may be interesting to others (especially if you are relatively inexperienced at programming and debugging).


The talk covers the basics of debugging without any specific programming language and is intended to be humorous but also useful. The folks at UKHAS were super organized and recorded the talks and synchronized the slides. You can listen to me give this talk by going to the BATC File Archive and then select HAB2013: my talk is HAB2013 03 Debugging.

If you care less about debugging and more about other high-altitude balloon stuff then check out my talk from last year, called HAB Software Woes, and my GAGA-1 high altitude balloon flight. Here's a shot of the moon taken by GAGA-1:







Thursday, September 19, 2013

Slides from Go London User Group September 2013 meeting: Go Memory

I gave a talk at the Go London User Group meeting in September on Go Memory. Here's the presentation itself:



It's worth reading the CloudFlare blog post Recycling memory buffers in Go as it goes into more detail.

Wednesday, September 18, 2013

Why is the fifth part of the GCHQ "Can you find it?" challenge so easy

UPDATE: Read the comments below for the solution.

If you've worked your way through the GCHQ Can you find it? recruitment challenge you'll have reached the last part and discovered that you have to do... nothing. The code word is accepted and you get to give your details for the competition. But you didn't do anything in part five.

That bothers me.

And there's another thing that bothers me. The second part includes a very obvious link to a web site (apparently correct) hidden inside an RSA private key, but the private key is messed up and messed up in a very specific manner.

Here's the key file dumped out:

$ openssl rsa -in comp1.key -text -noout
Private-Key: (1022 bit)
modulus:
    37:c0:04:af:3e:8e:80:cb:75:b1:53:0c:9f:b2:dc:
    f4:d3:ce:4a:82:8b:52:f6:a8:48:e0:c5:d8:35:8b:
    26:6c:84:94:de:29:47:24:49:85:72:28:17:8e:06:
    d0:77:17:0c:2a:5d:56:ba:88:d1:07:25:e2:c5:7b:
    01:44:ea:e9:44:38:87:1a:b5:5a:75:d5:98:34:89:
    b3:1f:9e:a4:e2:bd:b7:7a:b7:cf:f3:dc:ac:ea:ac:
    59:2c:83:dc:50:8a:27:0c:69:cb:66:4e:a1:64:9b:
    ca:e8:e4:e0:dc:d8:d4:d0:cc:c8:c4:c0:bc:b8:b4:
    b0:ac:94:13:82:39:51:f1
publicExponent: 65537 (0x10001)
privateExponent:
    13:5b:5d:85:07:60:6d:41:b7:9c:99:2c:61:ea:b5:
    a3:60:43:59:45:98:60:76:fa:19:4b:ca:05:f7:19:
    58:7f:07:4d:b5:11:79:fd:14:75:fc:1c:05:89:af:
    be:04:0b:81:92:d8:13:bb:f2:b3:39:1b:23:70:d3:
    f3:ad:dd:2e:4c:26:d3:1b:a8:56:f1:83:ca:d9:13:
    95:38:e7:80:30:77:a4:f0:d9:77:f9:25:b9:c1:d7:
    8f:2a:e5:b0:31:d8:c3:0e:3a:b1:5c:39:ec:f9:90:
    b5:77:60:a9:cf:95:7e:c7:ed:b3:9c:e6:0b:d1:bb:
    04:29:e8:b4:b1:69:7b:2d
prime1:
    37:c0:04:af:3e:8e:80:cb:75:b1:53:0c:9f:b2:dc:
    f4:d3:ce:4a:82:8b:52:f6:a8:48:e0:c5:d8:35:8b:
    26:6c:84:94:de:29:47:24:49:85:72:28:17:8e:06:
    d0:77:17:0c:2a:5d:56:ba:88:d1:07:25:e2:c5:7b:
    01:44:ea:e8:55:4a:2a:2b:e4:71:8f:02:b1:61:b0:
    e4:34:bf:da:1b:d4:d0:95:ec:ff:0c:f7:da:8d:e1:
    7a:65:99:7f:f1:b3:4e:47:81:00:95:87:d6:8c:5a:
    d8:a8:a4:a0:9c:98:94:90:8c:88:84:80:7c:78:74:
    70:6c:53:d2:41:f9:3b:e4
prime2:
    77:77:2e:77:68:74:73:69:73:69:6c:67:75:6f:65:
    63:74:73:72:65:68:73:72:69:2e:65:6f:63:75:2e:
    2f:6b:6c:62:74:65:68:63:65:6c:20:79:20:20:20:
    20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:
    20:20:0b:8f
exponent1:
    13:a5:24:9d:fc:2e:52:20:40:1b:50:f9:3e:65:80:
    1d:b7:b3:98:57:36:b2:ed:58:80:89:ab:a4:86:4b:
    7e:fe:c2:46:fa:6f:06:98:79:c0:2b:22:df:f6:88:
    71:df:f6:88:71:df:f6:88:71:df:f6:88:71:df:f6:
    b2:8a:b2:4f
exponent2:
    08:79:f2:58:12:97:40:a1:18:c9:40:21:cf:19:4a:
    4e:56:32:e2:c9:03:32:3d:c9:ec:ba:d1:be:72:d0:
    06:19:4f:25:65:30:d4:c9:48:a6:f5:5e:e2:c2:a4:
    c4:e2:c2:a4:c4:e2:c2:a4:c4:e2:c2:a4:c4:e2:c2:
    a4:c4:e1:4d
coefficient:
    14:89:f3:4e:c0:0e:91:ab:96:dd:ca:dd:d5:77:f1:
    32:1c:62:b5:49:1a:a5:d4:2a:97:0b:c5:85:9b:a8:
    b8:d2:32:6d:f1:0e:7d:6e:96:92:3b:60:84:10:f2:
    a9:fe:74:70:41:56:5c:c2:7b:56:4f:26:af:a7:30:
    4e:8b:0f:bd:82:94:55:72:94:09:b9:6b:7a:d2:d3:
    79:4f:79:4e:56:e4:a6:b8:b3:3e:4c:be:fb:96:fb:
    a5:0b:92:8b:79:a9:2c:c8:be:e9:58:2f:72:34:ed:
    85:f5:cf:60:d8:36:26:32:69:82:6b:5e:0b:87:de:
    95:82:ff:d8:54:c0:99:3f

When I did this it was clear that prime2 contained printable ASCII just from looking at it. All well and good if you've solved the puzzle, but what about the other parts?

Specifically,

1. modulus should be prime1 * prime2. It isn't.

2. In fact, the modulus and prime1 are very close to each other. Just compare the bytes starting at the start of each (in italics). Odd.

3. Looking a little deeper with bc I find:

$ bc
obase=16
ibase=16

modulus=37C004AF3E8E80CB75B1530C9FB2DCF4D3CE4A828B52F6A848E0C5D83
58B266C8494DE29472449857228178E06D077170C2A5D56BA88D10725E2C57B01
44EAE94438871AB55A75D5983489B31F9EA4E2BDB77AB7CFF3DCACEAAC592C83D
C508A270C69CB664EA1649BCAE8E4E0DCD8D4D0CCC8C4C0BCB8B4B0AC94138239
51F1

prime1=37C004AF3E8E80CB75B1530C9FB2DCF4D3CE4A828B52F6A848E0C5D835
8B266C8494DE29472449857228178E06D077170C2A5D56BA88D10725E2C57B014
4EAE8554A2A2BE4718F02B161B0E434BFDA1BD4D095ECFF0CF7DA8DE17A65997F
F1B34E4781009587D68C5AD8A8A4A09C9894908C8884807C7874706C53D241F93
BE4

modulus-prime1
EEEE5CEED0E8E6D2E6D2D8CEEADECAC6E8E6E4CAD0E6E4D25CCADEC6EA5C5ED6D
8C4E8CAD0C6CAD840F24040404040404040404040404040404040414040160D

(modulus-prime1)/2
77772E776874736973696C67756F656374737265687372692E656F63752E2F6B6
C6274656863656C2079202020202020202020202020202020202020A0200B06

prime2=77772E776874736973696C67756F656374737265687372692E656F6375
2E2F6B6C6274656863656C2079202020202020202020202020202020202020202
00B8F

(modulus-prime1)/2 - prime2
7FFFFF77

So, another way of finding the answer to part two is to calculate (modulus-prime1)/2 to get the same printable ASCII containing a URL.

I wonder if GCHQ originally wanted to make part two harder and then changed their mind?

And then there's the relationship between exponent1 and exponent2. There are sections with repeating patterns (and oddly those sections seem to align with the 0x20 padding bytes in prime2 (see bolded sections above)). df:f6:88:71 repeats in exponent1 and e2:c2:a4:c4 repeats in exponent2

Unfortunately, why those repeating patterns are present (indication of a simple XOR key used against repeating plaintext?) and how they are used to get further information elude me. Also, why are there some 'random' bytes after each repeating section?

Last time GCHQ had a challenge I did find something interesting which was confirmed (with help from The Register) by GCHQ. 

I reached out to the same people via a dead letter box at the tomb of Thomas Bayes in Bunhill Fields cemetery and searched for a response via the personal ads in The London Review of Books but have heard nothing.

So... has anyone else looked into this?





Friday, August 02, 2013

SIGINT Reading List

Recent news stories about how NSA and GCHQ operate have brought to light the world of SIGINT: signal intelligence. And many people are surprised that these secret organizations gather so much information looking for 'bad guys'. Putting aside the ethical or political considerations it's worth understanding that what's come to light does not appear to be particularly new. And so the intelligent reader might want to know more about this secret world.

There are a number of books to recommend on the subject.

Spycatcher

Although Spycatcher, Peter Wright's memoir, is mostly about his suspicion that Sir Roger Hollis was a Soviet spy, the book contains quite a bit of detail about electronic eavesdropping. Here's a small part about dealing with The Troubles:
The only major recommendation I made was that we should devise a system of tapping the telephone lines of the Irish Republic. Lines across the border were well covered, but vital Provisional IRA communications flowed back and forth from the west coast of the Republic to Dublin. I devised a scheme for intercepting the microwaves from the attic of the British Embassy in Dublin using a device no larger than a packing case, but although MI5 endorsed the plan, the Foreign Office vetoed it.

The Puzzle PalaceBody Of Secrets, The Shadow Factory

These are James Bamford's books on the history of the NSA and should be read in that order. They give a fascinating insight into the work of the NSA from its beginnings to the present day. Much of what has been recently revealed is talked about in The Shadow Factory or could be inferred from it.

One interesting US operation is Project Shamrock. Bamford writes about this in Body of Secrets:
Many of the intercepts to and from foreign embassies in Washington were acquired as a result of secret agreements between the NSA and the major US telecommunications companies, such as Western Union. Under the NSA program codenamed Shamrock, the companies agreed to illegally hand over to NSA couriers, on a daily basis, copies of all the cables sent to, from and through the US.

GCHQ

Similar in intent to James Bamford's books this covers the British equivalent of NSA, GCHQ, from its beginnings up until very recently. Here's the book discussing the work of GCHQ Bude:
One of GCHQ's largest ventures into the world of vacuuming up telephone calls was launched in Cornwall in 1967. At Goonhilly Downs on the Lizard peninsula there was a satellite receiving station for one of the world's first commercial communications satellites, Intelsat. Displaying a certain amount of barefaced cheek, GCHQ built a duplicate receiving station about sixty miles down the road, near the village of Morwenstow, on the site of a former RAF wartime airfield. Here it could scoop up the same telephone traffic by simply collecting the ‘spillage’ as commercial satellites beamed messages down to earth. This station, with its distinctive domes and satellite dishes littered along the Cornish clifftops, was initially called CSO Morwenstow, and later changed its name to GCHQ Bude. Morwenstow was a classic Anglo-American intelligence venture. NSA paid for most of the infrastructure and the technology, while GCI-IQ contributed the land and paid for the staff and running costs. The massive flow of intelligence it received was shared and processed jointly.
If you read those five books you'd have a good sense of the work of these organizations.

Friday, July 26, 2013

OSCON 2013 Keynote: Turing's Curse

O'Reilly have put up a YouTube video of my keynote from today entitled "Turing's Curse". It's a short talk about the history of computing and about how history repeats itself.


Wednesday, July 03, 2013

Your test suite is trying to tell you something

A few weeks ago I started wondering about 'the test that occasionally and randomly breaks' in a large test suite at my job. The test, called 'overloaded origin', tests a situation where a web server becomes overwhelmed with requests and a proxy server (the code being tested) has to handle the situation gracefully.

The test works by having a dummy web server that could randomly decide to (a) return a normal web page for a request (b) read the HTTP headers and then do nothing for 30 seconds and (c) read the HTTP headers, wait 30 seconds and then send a valid response. The proxy server is hit by 5,000 clients simultaneously requesting the same URL.

And sometimes, every now and again, this test failed.

And like many engineers I'd ignored it for a long time. But it kept worrying me because it must have meant something: computers are deterministic, after all. I was spurred to action by a colleague suggesting that the test be disabled because it was 'flaky'.

It took me two days of continuous work to find out what was wrong and it explained other occasional problems that had been seen with the code. And it made the test suite 100% stable on all platforms. That 'randomly failing test' was really 'a genuine bug in the code'.

But getting to that point was tricky because this was a system level test with clients, servers and the proxy and a memcached server in the middle. It turned out that the memcached server was the problem. In the end, I had to implement my own memcached server (a simple one) so that I had complete control over the environment. In doing so, I discovered the root cause of the problem.

The program has a timeout used to stop it waiting for memcached if it doesn't respond quickly (within 100s of ms).  Here are the lines of code that handle the memcached timeout (this is from inside the proxy server being tested).
var Timeout time.Duration
Timeout = time.Duration(conf("timeout", 100)) * time.Millisecond

cache := memcache.New(Servers...)
cache.Timeout = Timeout * time.Millisecond
The first two lines read the timeout value from a configuration file (with a default of 100) and convert that to a time.Duration in ms. The following lines (later in the code) use that value to set the timeout on the memcached connection

Oops!

There's another * time.Millisecond there. So, 100ms, for example, would become something much larger. To find out what you just need to know what a time.Duration is: it's a value representing a number of nanoseconds. 

So, the initial value of Timeout is 100,000,000ns (since 1ms is 1,000,000ns). Then when the second multiply happens Timeout becomes 100,000,000,000,000ns which is close to 28 hours.

The test would fail because occasionally the connection to memcached would fail resulting in the timeout starting. Instead of gracefully failing at 100ms the program was prepared to wait 28 hours.

And examination of the source code control log showed that this bug had always been there, right from time those lines of code were written.

By me.

Write good commit messages

Over the years I've become more and more verbose in commit messages. For example, here's a recent commit message for something I'm working on at CloudFlare (I've obscured some details). This is actually a one line change to a Makefile but gives a good example of what I'm aiming for.
commit 6769d6679019623a6749783ea285043d9449d009
Author: John Graham-Cumming
Date:   Mon Jul 1 13:04:05 2013 -0700

    Sort the output of $(wildcard) as it is unsorted in GNU Make 3.82+

    The Makefile was relying on the output of $(wildcard) to be sorted. This is
    important because the XXXXXXXXXXXX rules have files that are numbered and
    must be handled in order. The XXXXXXX relies on this order to build the rules
    in the correct order (and set the order attributes in the JSON files). This
    worked with GNU Make 3.81

    In GNU Make 3.82 the code that globs has been changed to add the GLOB_NOSORT
    option and so the output of $(wildcard) is no longer ordered and the build
    would break. For example,

       make clean-out && make

    would fail because the XXXXXXXXXXXXXXXX (which is used for the XXXXX action)
    which appears in XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX would not have been
    parsed before the XXXXX action was used in some other XXXXX file. That would
    generate a fatal error.

    The solution is simple: wrap the $(wildcard) in $(sort). The actual change
    uses a $(foreach) loop because it's necessary to keep the directories in the
    order specified in the Makefile and the files within the directory are sorted
    by the $(sort $(wildcard ...)). The directory order is important because
    XXXXXXXXXXXX must be processed before the other rule directories because it
    contains XXXXXXXXXXXXXXXXXXXXXXXXXXX which sets the XXXXXXXXXX thresholds.
The first line gives a brief summary of the commit. But the rest explains in detail why this change was made (a change in GNU Make 3.82 in this case), why the change in GNU Make 3.82 caused a problem, verification of what actually changed that caused the problem, how to reproduce the problem and finally a note about the specific implementation. The final note is there so that someone looking at the commit later can understand what I was thinking and assumptions that went into the change.

I've come to like long commit messages for a number of reasons.

Firstly, I tend to forget why I changed things. And I certainly forget detailed reasoning about a change.

Secondly, I'm not working alone. Other people are looking at my code (and its changes) and need to be able to understand how the code evolves.

And these long commit messages overcome the problem of code comments that get out of date. Because the commit message is tied to a specific diff (and hence state of the code) it never gets out of date.

There's another interesting effect. These log messages take just a minute or two to write, but they force me to write clearly what I've been doing. Sometimes this causes me to stop and go "Oh wait, I've forgotten X". Some part of writing down a description of what I'm doing (for someone else to read) makes my brain apply different neurons to the task.

Here's another example from a different project:
commit 86db749caf52b20c682b3230d2488dad08b7b7fe
Author: John Graham-Cumming
Date:   Mon Jul 1 10:14:49 2013 -0700

    Handle SIGABRT and force a panic

    It can be useful to crash XXXXXXX via a signal to get a stack trace of every
    running goroutine. To make this reliable have added handling of SIGABRT.

    If you do,

       kill -ABRT 

    A panic is generated with message "panic: SIGABRT called" followed by
    a stack trace of every goroutine.

Tuesday, July 02, 2013

The Plain Mail Snail: One way to make people switch to using encrypted email

Due to revelations about access to private email (and other electronic communication) by the NSA and GCHQ some people have been suggesting that we all need to start using encrypted email. I've had PGP/GPG keys since about 1995 and I have only ever received a handful of encrypted mails.

So, how do you make people send you encrypted mail? I think an 'economic' incentive is necessary.

If you send me an unencrypted email it will be delayed by 12 hours before it is delivered. Encrypted email will be delivered immediately.

This is actually pretty easy to accomplish. An SMTP server can examine the contents of an incoming email and determine if it is encrypted or not. If it's not encrypted it can be placed in a delay queue and delivered after the appropriate delay; at the same time the server can send a message warning the sender of the delay and perhaps educating them about how to send encrypted mail.

This scheme could be called the Plain Text Tarpit (PTT) or perhaps the Plain Mail Snail.

If PTT were implemented then mail clients would quickly be upgraded to automatically handle email encryption.

PS What about mailing lists?

Either they accept the 12 hour delay or they find the public key of the people they are sending to.


Wednesday, June 26, 2013

The hollow triangular numbers are divisible by three

You might be familiar with the triangular numbers: the number of objects that form equilateral triangles like this:

The sequence is 3, 6, 10, 15, 21, ... The first one, 3, is 1 + 2, the second one, 6, is 1 + 2 + 3, the third one, 10, is 1 + 2 + 3 + 4 and so on. For a triangle with a base of \(n\) blobs the number of blobs in the triangle is \(n * (n + 1) / 2\) (which can be proved by various means and was famously figured out by the mathematician Gauss while a school boy).

But what about the hollow triangular numbers? The ones where you take the middle blobs out of the triangles and just leave the border. Like this:
The pattern there is very simple. They are all multiples of three. And the formula would be that the hollow triangle with a base of size \(n\) has \(3 * (n-1)\) blobs in it (and since that's a multiple of three so is the number of blobs in a hollow triangle).

Here are three ways to prove that that formula is correct.

Subtracting the middle

Because we already have a formula for the filled in triangle (see above) we can use it to figure out the formula for the hollow triangle. Look at the triangle with a base of 6 blobs.

You can see that the number of blobs in the hollow triangle with a base of six is the number of blobs in the filled in triangle of base six minus the number of blobs in the filled in triangle of base three. Or, more, generally the number of blobs in a hollow triangle of base \(n\) is

\[\begin{aligned}
&n * (n+1) / 2 - (n-3) * (n-3 + 1) / 2 \\
&= 1/2 * (n^2 + n - n^2 + 5n - 6) \\
&= (6n - 6)/2 \\
&= 3 * (n - 1) \\
\end{aligned}\]
Which is the formula above.

Counting around the outside

Another way is to follow around the edge of a triangle and observe that, for example, for the hollow triangle with a base of six the total number of blobs is \(6 + 5 + 4\) (the blue blobs, plus the red blobs, plus the yellow blobs).

And the more general formula would be that for a hollow triangle of base \(n\) the total number of blobs is \(n + (n-1) + (n - 2)\) which simplifies to \(3 * (n - 1)\).

Mathematical induction

But a real mathematician is likely to prove this in a different way using mathematical induction. That works by beginning with some starting case that can be seen to meet the required statement and a way of going from one case to another.

For example, it's pretty clear that the hollow triangle with four blobs in its base (see above) has nine blobs in total and since nine is divisible by three the statement we are trying to prove 'all hollow triangular numbers are divisible by three' is at least true for it.

Then you just need a way of working up from there. For example, if there was a way to prove that the next hollow triangular number is divisible by three if we know that the previous one was, it would be possible to know that they all were. That's because you'd start with the hollow triangle with four blobs in the base and know that the one with five blobs was divisible by three and then you'd know that the one with six blobs was divisible by three and so on... forever.

So, all you need is to examine one step in this infinite chain of reasoning.

First let's say that \(hollow_n\) is the number of blobs in the hollow triangle with a base containing \(n\) blobs. And let's assume that it is divisible by three. For mathematical induction to be applicable we just need to show that \(hollow_{n+1}\) (the next hollow triangular number) is divisible by three. So, we just need to know how to go from \(hollow_n\) to \(hollow_{n+1}\).

Take a look at the following diagram showing the hollow triangular number with base six (on the left).

You can think of going from \(hollow_6\) to \(hollow_7\) as the same adding a row of seven blobs (on the right) and removing the inside of the old bottom row (coloured in red). So, \(hollow_7 = hollow_6 + 7 - 4\) which simplifies to \(hollow_7 = hollow_6 + 3\). Since we've added three and know that \(hollow_6\) is divisible by three we know that \(hollow_7\) is also.

That can easily be generalized \(hollow_{n+1} = hollow_n + (n+1) - (n-2)\) (added on a row of \(n+1\) at the bottom and removed the interior \(n-2\) blobs leaving just the two blobs on the outside). That simplifies to \(hollow_{n+1} = hollow_n + 3\). And so if we know that \(hollow_n\) is divisible by three then so is \(hollow_{n+1}\) and thus by mathematical induction we know that all hollow triangular numbers are divisible by three.

Back to the triangular numbers

Now that we've got the idea of mathematical induction it's possible to return to a claim a made at the start: for the filled in triangles the number of blobs is \(n * (n + 1) / 2\) when the base has \(n\) blobs. That too can be proved by induction.

The starting case is a base of two blobs. Just by looking at it we can see that it has three blobs in the triangle and \(2 * (2 + 1) / 2 = 3\). So the formula works for that case.

Then consider going from a triangle with a base of \(n\) blobs to one with \(n+1\). All that happens there is that \(n+1\) blobs get added (the new bottom row). So, if the triangle with a base of \(n\) blobs has \(n * (n + 1) / 2\) in total, the triangle with base \(n +1\) has \(n * (n + 1) / 2 + (n + 1)\) which can be simplified as follows:

\[\begin{aligned}
&n * (n + 1) / 2 + n + 1 \\
&= 1/2(n * (n +1) + 2(n+1)) \\
&= 1/2(n^2 + n + 2n + 2) \\
&= 1/2(n^2 + 3n + 2) \\
&= 1/2(n+1)(n+2) \\
&= (n+1) * ((n+1)+1)/2
\end{aligned}\]
Which is just what was expected.

PS I spotted all this staring at the duvet cover the other night. But then again there was that flight I took back in 2008.

PPS In the comments reader tz points out another way to see that the number of blobs in a hollow triangle is a multiple of three. There are three corners and three equal length sides (when ignoring the corners).


Monday, June 03, 2013

Some things I've learnt about public speaking

In the past I've written a couple of blog posts titled Some things I've learnt about programming and Some things I've learnt about writing. People seem to have enjoyed those posts (judging by the page view counts and comments) and so I thought I'd write something about public speaking and specifically about conference speaking.

0. It really does help to know what you are talking about

If you know your subject well then you will be able to talk well within the limits of your knowledge and be interesting and informative. You'll also be more comfortable and your comfort will come across to the audience.

If, on the other hand, you are giving a talk about a subject you know little about you are likely to come across poorly. This is partly because you won't have confidence in what you are saying and partly because it helps to leave audience wanting more. Also, when you are ill informed it's easy to be repetitive and therefore tedious.

I like when there's a very restricted amount of time for the talk, it forces me to figure out what's interesting to say. It's good if I don't have enough time to say everything. I can always continue with a Q&A session, individual chats with audience members or a blog post.

A common mistake amongst speakers is trying to say too much. There's no need: be interesting and you'll get the opportunity to say much more at another time.

1. It really does help to practice

Last year I gave a TEDx talk titled The greatest machine that never was. It was about Charles Babbage's Analytical Engine. As a speaker you are under pressure at a TEDx event because the audience expects you to be interesting, and because the organizers keep to a rigid timetable.

One thing that made my talk enjoyable for the audience was that the organizers of that TEDx forced me to do the talk for them live beforehand. I had been dreading doing that, but it turned out to be key to making the talk flow smoothly. With the slides (which were mostly just single images) printed on paper, I ran through the talk and was able, instantly, to feel what did and did not work. It also helped to cement the flow of the talk in my mind.

And practicing also gave me the freedom to ad lib. On the day I said a few things that were not planned; I was able to do that because the rest of the talk was solid. An entirely ad libbed talk would likely have been tedious and made me seem erratic.

And now I make a point to practice, even if it's just me talking to the teddy.

2. It's good to tell stories

Even if you are giving a technical talk about the internals of the JVM stories help the audience. They provide a framework for the points you are making and they link slides together. Without a story talks can sometimes be a jarring sequence of slides with tenuous connections. Stories act as mental lubricant that helps keep your listeners' minds working.

At StrataConf London last year I gave a talk titled The Great Railway Caper: Big Data in 1955. The talk is intended to be about Big Data and helping to define what that term means (a rather boring topic). But the talk itself is entirely about the Lyons' Electronic Office and one particular program that was written for it in the 1950s. The story of that program tells a story about Big Data today.

And the LEO story starts with another short story. The introductory story is used to tell the audience what I'm going to talk about.

3. Unless you are fascinating don't talk about yourself

You probably aren't fascinating. And the audience probably doesn't care about you or what you've done. They care about your subject matter. It's rare that your subject matter will be you.

Imagine for a moment that you are Neil Armstrong. A talk by Armstrong about the experience of being on the Moon would be more interesting that a talk by Armstrong about himself.

And, ironically, the less you talk about yourself the more interesting you'll be and the more interested the audience will be in you.

4. Passion doesn't mean shouting at the audience

Genuine passion for a topic comes across effortlessly. Pick a subject you really care about and talk about that.

Sometimes you'll see a speaker faking passion by speaking emphatically and shouting at the audience. It's often a sign that they don't have much to say; that's probably why the emphatic, over-the-top style is so popular amongst 'motivational speakers'.

5. Watch yourself

I cringe every time I watch myself giving a talk, but it's important that I do.

When you watch yourself, you can see all the mistakes you make and the mannerisms you'd rather not know you have. You can then be mindful of those problems when you speak.

6. Don't waste words on a "Table of Contents"

If you have a story to guide your talk there's little point having any "table of contents" slides. They just bore the audience and provide an unnecessary structure. The real structure of your talk should be a narrative that leads the audience through your points. Listing the points may be useful as a summary, but I'd even be wary of that. Tell a good story and they'll remember what you said and think you're a great speaker.

Wednesday, May 15, 2013

Winner of the minimum coin problem

I posted a puzzle about coins and change and promised a prize (a copy of The Geek Atlas) to one of the solutions picked by me. The following people provided solutions:

1. Matt

2. Michael Bauer

3. Asger Drewsen

4. vext01

5. Dubya

6. Josh

7. Anatoli Plotnikov

8. Matt Hickford

As that's a convenient 8 people I've chosen a winner by the roll of a d8.


 So the winner is #7, Anatoli Plotnikov.

Monday, May 13, 2013

Twitter's odd use of the word 'followed'

I logged into the Plan 28 Twitter account and it suggested that there might be some people that @plan28 would like to follow:


A couple of things are pertinent here. Firstly, @plan28 does not follow anyone (and certainly not me) but oddly Twitter is recommending a number of people to follow and indicating that they are followed by John Graham-Cumming. Clicking on my own name I get taken to my @jgrahamc Twitter account.

So, I'm not sure why Twitter thought @plan28 would care who @jgrahamc follows, but that's not the oddest part.

@jgrahamc does not follow BBC News (UK), TechCrunch, BBC Technology, Stephen Wolfram, or Tim O'Reilly. In fact, from that list @jgrahamc only follows @causata (my former employer). Scrolling further down I see that Twitter claims that @jgrahamc follows Time Magazine, Programming Wisdom, someone called Mikhail, someone called Alex Jamieson, the Mars Curiosity Rover, The Onion, and Douglas Coupland. 

I don't follow any of them.

I suspect that Twitter is keeping a list of people I followed in the past (I at least recognize having briefly followed Douglas Coupland and the Mars Curiosity Rover in the past; although I don't recall ever showing interesting in Time Magazine or Stephen Wolfram).

If I'm right that I did follow those people at some point in the past and Twitter has remembered then it highlights, yet again, one of the important facts about Internet companies: their idea of 'delete' or 'stop' or 'sign out' and most people's are not the same.

Because of this it's worth thinking of things you do on the Internet as public and irrevocable.

Sunday, May 12, 2013

The Four Power Days

Today, May 5 is written 5/12 in the US and 512 is a power of 2, it's \(2^9\). I tweeted that and received a rapid reply from Peter Inglesby that in UK-style dates it's 12/5 which is 125 which is \(5^3\). Which made me wonder whether there were other dates that had the same property in the US and UK on the same day.

Peter wrote a quick script:
and determined that there are four such dates:

March 24: US 3/24; \(324 = 18^2\). UK 24/3; \(243 = 3^5\)

May 12: US 5/12; \(512 = 2^9\). UK 12/5; \(125 = 5^3\)

June 25: US 6/25; \(625 = 5^4\). UK 25/6; \(256 = 2^8\)

December 5: US 12/5; \(125 = 5^3\). UK 5/12. \(512 = 2^9\)

Now I just need to think up a mystical significance to this, start a religion and retire on the proceeds.

Sunday, May 05, 2013

A home made periodic table

One of my slow burn projects has been to make and display a periodic table of elements using element samples that I have been able to find or make myself. This is what it currently looks like:


The poster itself is actually a blown up version of the poster that comes with the quirky book Wonderful Life with the Elements: The Periodic Table Personified by Bunpei Yorifuji. Each element is represented by a person (or robot) and characteristics of the person reflect the element itself (such as its state at 'room temperature', its radioactivity, and when it was discovered).

Having got the poster enlarged I had a custom frame made using the cheapest online source I could find. To store each element I'm using 1/2 dram clear glass vials with black polypropylene caps that are lined with polyvinyl. This vial are very small (they are pictured here with my international object sizing tool) so that the overall periodic table is not too large.


It is possible to buy samples of the elements quite easily, but I wanted to obtain them from the world around me for the fun of learning more about them and to be cheap.

I extracted nitrogen from a can of Illy coffee where it is used to pack the coffee, hydrogen came by reacting a little NaOH with Al, oxygen via electrolysis of water, tungsten from a broken halogen light bulb, iron fillings I had lying around, lead came from my roof, copper from wire I had, germanium from an old diode, etc.

It's rather slow to build the table like this, but the original motivation came from reading Oliver Sacks' Uncle Tungsten and a desire to learn more about the elements themselves. For example, even simple old iron has lots of interesting things to know about such as its allotropes and the Curie temperature.

Thursday, May 02, 2013

The two problems I had to solve in my Oxford interview

Back in the 1980s I went to Oxford University and studied Mathematics and Computation (this was almost the entire Mathematics course plus Computer Science added on; the degrees offered today are a little different).  Having sat the mathematics entrance exam and gone through all the mathematics interviews I had interviews in the Programming Research Group (now the Department of Computer Science). During those interviews two specific programming/algorithm design questions were posed. Here they are (I made up names for them).

The Z Machine

A computer is constructed with a simple memory layout. It has an unlimited amount of memory and each memory location is numbered so that a program can refer to it. Each memory location can store a single number or be uninitialized. In the following diagram memory locations that are blank are uninitialized some other memory locations have numbers in them.


The computer's CPU only has three instructions Z, I and J as follows:

Z. This instruction zeroes a memory location. For example, Z2 sets memory location 2 to 0, Z42 sets memory location 42 to 0.

I. This instruction adds one to the contents of a memory location. For example, I3 adds 1 to whatever is currently stored in location 3.

J. This instruction examines the contents of two memory locations and branches if the contents are different. For example J18,19 would compare the contents of memory locations 18 and 19, if they are the same the program continues with the next instruction, if different it branches. The branch destination is just specified by drawing an arrow to the instruction you want to go to.

When there are no more instructions the program stops.

For example, here's a loop that keeps adding one to memory location 4 until it equals memory location 20.
1. The operator of the machine places two numbers (one each) in memory locations 0 and 1. Here, for example, the operator has put 3 in location 0 and 4 in location 1. Write a program using the Z, I and J instructions to add those (arbitrary) numbers together and put the result in memory location 2.


2. Under what circumstances does this program fail?

The One-eyed Robot

(If you've studied computer science you may recognize this problem)

On the ground along Keble Road there are a line of n buckets, in each bucket is a single ball. The balls are red, green and blue. A robot is to be programming to sort the balls so that there's one ball in each bucket still but the colours are now in the order red, green, blue. So, working from left to right an observer will see all the red balls in buckets, then all the green balls in buckets, and finally all the blue balls.
The robot is only allowed to examine one ball at a time by peering into its bucket and is only allowed to examine each ball once. When it looks at a ball it must decide what to do with it. The robot has two arms and the only ball manipulation it can do is swap the balls in two buckets.

1. Program the robot to sort the balls.

2. Write out a proof that the proposed algorithm works.

Note

Of course, a whole load of other questions were asked about program running time, correctness and what I later realized was the lambda calculus. But these were the two main 'programming' tasks.

Wednesday, May 01, 2013

From the ground up (or how to encourage a school boy)

Following on from my popular blog post about coding by hand in 1985 I dug into my pile of old stuff to look at how myself and another boy at school reverse engineered the Research Machines CHAIN network and built everything from networking protocols to a network management system in assembly language.

In 1982 Research Machines in the UK launched the LINK 480Z Z80 based machine that had an optional 800kbps proprietary network called CHAIN. My upper school got a small network of them linked to a file server running MP/M. The 480Z's would boot from the file server across the network.


Unfortunately, the entire network protocol was undocumented by Research Machines and unpublished. Myself and another boy, P, decided to reverse engineer because we wanted boot control and we wanted network access. Disassembling the running operating system (often using its front panel, which was on screen and not via flashing lights) we were able to determine that the Z80 RST 8 instruction was used to send/receive packets on the network.

The CHAIN network supported up to 255 machines (each machine had an 8-bit DIP switch to set its address) and was similar in operation to Ethernet in that packets could be lost when collisions occurred. The built in operating system provided access to files on the file server; we wanted much more. Specifically, we wanted reliable remote control of any machine on the network.

Once we had decoded how RST 8 worked we quickly realized that (a) we could see every packet on the network and so could build a protocol analyzer (we built that and called it NETSCOOP) and then built a program just to watch file system access by all other users and see what files they were reading/writing and grab their contents.


But that was just passive so we went a bit further. P came up with the idea of building a reliable protocol on top of the unreliable CHAIN network. His idea was to send a sequence number in each packet and then acknowledge the packet (I don't recall whether he read about this idea or came up with it alone). If we didn't get the acknowledgement then after a delay we'd resend. He wrote up some Pascal pseudocode and I coded what we called 'safe network transactions' in assembly language:


Once we'd got that coded we were able to start building up a network OS of sorts. The first step was to build something we called NMBOS which gave us basic functions across the network to any machine running it.


The most interesting feature of that is that it operated using what we called 'worms': executable code sent directly to the machine and executed in the packet buffer. We had all this working in 1984, but it appears I added nice comments to the code and printed everything out in 1986.

The word 'worm' is interesting because this all precedes the famous Morris Worm by a few years and I'm pretty sure we got the name from John Brunner's Shockwave Rider.

This worked by simply assembling some function you wanted executed and then sticking it in the send buffer and reliably sending it to the remote machine which would simply execute it. By using self-modifying code we could prepare routines to be executed remotely and modify them before a send. Here's the actual receive handler. In the middle you can see the call into the packet buffer. We only had (if memory serves) about 128 bytes of receive space, but it was plenty.


Having built that we created the NMIOS which used the worm functionality to provide slightly higher level function (such as writing characters to the screen, insert characters into the keyboard buffer, and even reading and writing arbitrary memory locations on the remote machine!). The NMIOS could also be called from Pascal or assembler.


From there we were able to build the manager system called NETCON which allowed the teacher to control all the machines simultaneously and included a 'virtual blackboard' mode where they could write to the screens of all machines. Other functions allows them to freeze machines (to get the pupils to pay attention). After that we went on to write two-way chat programs, peer-to-peer file transfer and a nice program that linked two CHAIN networks together of a 1200 baud modem (essentially a router).


We also hacked the boot mechanism for the entire network so that our NMIOS was included in the boot image and even added password protection on top (P did that work managing to stuff the whole thing into a 'space' in the boot image). We had different levels of password protection for pupils, teachers and ourselves.

During 1984 I wrote to Research Machines and asked them for assistance. I was hoping they'd give me documentation to RST 8 so that I could have the official information not what we'd reverse engineered. I received the most wonderful reply.


That's the spirit! And very encouraging to a school boy.

Monday, April 29, 2013

How I coded in 1985

Back in 1985 I worked on the computerization of a machine designed to stick labels on bottles. The company that made the machines was using electromechanical controls to spool labels off a reel and onto products (such as bottles of shampoo) passing by on a conveyor. The entire thing needed to work with mm accuracy because consumers don't like labels that aren't perfectly aligned.

Unfortunately, electromechanical controls aren't as flexible as computer controls and so the company had contracted a local technical college (where I was studying electronics) to prototype computer control using a KIM-1. Another student had put together the machine with a conveyor, a mechanism for delivering the labels, control of stepper motors, and infrared sensors for detecting labels and products.


My job was to write the software in 6502 assembly. Unfortunately, there wasn't an assembler and the KIM-1 just had a hex keypad and small display. So, it meant writing the code by hand, hand assembling and typing it in.  The code looked like this:


It was immediately obvious that computer control was going to be more flexible. The program first did automatic calibration: it measured the length of labels on the spool itself, it measured the distance between labels itself and it enabled an operator to quickly set up the 'overhang' distance (how much of the label is sticking out so the product can catch onto it.


While running it could automatically detect how fast the conveyor was moving and compensate, and spot when a label was missing from the supply spool (which happened when one peeled off by accident).

Of course, writing code like this is a pain. You first had to write the code (the blue), then turn it into machine code (the red) and work out memory locations for each instruction and relative jumps. At the time I didn't own a calculator capable of doing hex so I did most of the calculations needed (such as for relative jumps in my head).

But it taught me two things: to get it right the first time and to learn to run code in my own head.  The latter has remained important to this day. I continue to run code in my head when debugging, typically I reach for the brain debugger before gdb or similar. On the KIM-1 there were only the most basic debugging functions and I built a few into the program, but most of the debugging was done by staring at the output (on the hex display), the behaviour (of the steppers) and by running the code through in my head.

Here's the full program for the curious.

PS A number of people have pointed out that in 1985 the KIM-1 was far from the state of the art and we had lots of niceties like compilers etc. True. In fact, prior to this I had been programming using BASIC and ZASM (Z80 assembler) under CP/M, but you go to war with the army you have: the technical college had a KIM-1 to spare, it had good I/O and it thus it made a fine prototyping system for an embedded controller.

Tuesday, April 23, 2013

The minimum coin problem (with prize)

I have in my pocket the following British coins: one £2, two £1, two 50p, three 20p, one 10p, two 5p, two 2p and two 1p. I wish to buy a copy of today's paper at a cost of £1.70.

What is the maximum number of coins I can use to pay for the paper? With the restriction that I pay in a reasonable manner (e.g. I could give exactly £1.70 or an amount greater than that but without giving extraneous coins: e.g. giving the seller all the coins would not be acceptable, but giving him one £1 an two 50ps and expecting 30p in change is OK). The reasonable manner is simply: if I give the seller a set of coins he should not be able to return any to me without the sum of the remaining coins dropping below £1.70.

Looks like it's: two 50p, three 20p, one 5p, two 2p and one 1p. That gives away 9 coins from the 15 coins I have.

The general statement of this problem is as follows.  There are \(n\) coins in a currency: \(d_0\) to \(d_{n-1}\) are the denominations of the coins (normalize all to a single unit, e.g. a £2 coin would have denomination 200) where \(d_0 < d_1 < ... < d_{n-1}\). I have \(x_0\) to \(x_{n-1}\) of each coin (\(x_i \ge 0\)). I wish to pay for an item costing \(P\).

Problem 1 (giving away a maximum number of coins with the exact amount): Find amounts \(g_0\) to \(g_{n-1}\) where \(g_i \le x_i\) and

$$\sum_{i=0}^{n-1} g_i \times d_i = P$$

and minimizes

\[\sum_{i=0}^{n-1} (x_i - g_i)\]

Problem 2 (ending up with the minimum number of coins when change is optionally given): Find amounts \(g_0\) to \(g_{n-1}\) where \(g_i \le x_i\) and

\[\sum_{i=0}^{n-1} g_i \times d_i \ge P\]

and minimizes

\[\sum_{i=0}^{n-1} (x_i - g_i) + c_i\]

where \(c_i\) are the number of coins of change given by the newspaper seller in each denomination \(d_i\).

Bonus: take into account the weight of each coin and minimize the weight in my pocket not the number of coins.

Prize: a free copy of my book, The Geek Atlas, to be picked by me on May 15, 2013 from algorithms and implementations posted in the comments of this blog post.


Monday, April 22, 2013

Update on Go 1.1 crypto performance

Some time ago I integrated OpenSSL's MD5, SHA1 and RC4 functions into Go because Go's native implementations were slow. The Go team has made great strides in fixing this (for 386 and AMD64) with Go 1.1 with faster RC4, SHA1 and MD5.

Rerunning my tester program shows the following:

MD5SHA1RC4
Native Go (pre-1.1)404 MB/s123 MB/s111 MB/s
Via OpenSSL607 MB/s636 MB/s744 MB/s
Go 1.1602 MB/s383 MB/s662 MB/s
Go 1.1 vs. OpenSSL0.99x0.6x0.89x

So, the Go team has essentially reached parity with OpenSSL on MD5, is close on RC4, but could still do work to get the SHA1 performance close. (All testing of Go 1.1 was done with 84e21a8c2137 on Ubuntu 12.04 2.4Ghz Intel Core i7; 4 cores with 2GB of RAM).

As a further comparison Go's own benchmark tests shows the following.

MD5, via OpenSSL: BenchmarkHash8K (593 MB/s), BenchmarkHash8KUnaligned (581 MB/s); MD5, Go 1.1: BenchmarkHash8K (584 MB/s), BenchmarkHash8KUnaligned (589 MB/s)

SHA1, via OpenSSL: BenchmarkHash8K (606 MB/s); SHA1, Go 1.1: BenchmarkHash8K (373 MB/s)

RC4, via OpenSSL: BenchmarkRC4_8K (740 MB/s), RC4, Go 1.1: BenchmarkRC4_8K (681 MB/s).

So, that puts Go 1.1 MD5 at parity with OpenSSL, RC4 at around 90% of the speed and SHA1 at around 60%.

PS Reader Luit points to the fact that AES performance has been greatly improved as well: 6x to 7x faster than before on AMD64.

Thursday, April 18, 2013

The importance of open code

Last February myself, Professor Darrel Ince and Professor Les Hatton had a paper published in Nature arguing for openness in the code used for scientific papers. The paper is called The Case for Open Computer Programs.

In a coda to that piece Darrel wrote the following:
Our intent was not to criticise; indeed we have admiration for scientists who have to cope with the difficult problems we describe. One thesis of the article is that errors occur within many systems and does not arise from incompetence or lack of care. Developing a computer system is a complex process and problems will, almost invariably, occur. By providing the ability for code to be easily perused improvement will happen. This is the result detailed in both the boxes in the article: the Met Office data is more accurate, admittedly by a small amount, and because of feedback to developers the geophysical software was considerably improved.
Recently, an important paper in economics has been in the news because its conclusions turn out to be inaccurate for a number of reasons. One of those reasons is a programming error using the popular Microsoft Excel program. This error, in an unreleased spreadsheet, highlights just how easy it is to make a mistake in a 'simple' program and how closed programs make reproducing results difficult.

The original paper by Reinhart and Rogoff is Growth in a Time of Debt and it concludes the following:
[...] the relationship between government debt and real GDP growth is weak for debt/GDP ratios below a threshold of 90 percent of GDP. Above 90 percent, median growth rates fall by one percent, and average growth falls considerably more.
They point to a serious problem with growth rates once the debt/GDP ratio is above 90%. As this is an important economic topic at the moment other economists have attempted to replicated their findings from the original data. One such reproduction is Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff which finds:
Herndon, Ash and Pollin replicate Reinhart and Rogoff and find that coding errors, selective exclusion of available data, and unconventional weighting of summary statistics lead to serious errors that inaccurately represent the relationship between public debt and GDP growth among 20 advanced economies in the post-war period. They find that when properly calculated, the average real GDP growth rate for countries carrying a public-debt-to-GDP ratio of over 90 percent is actually 2.2 percent, not -0:1 percent as published in Reinhart and Rogo ff. That is, contrary to RR, average GDP growth at public debt/GDP ratios over 90 percent is not dramatically different than when debt/GDP ratios are lower.
The coding error referred to there is a mistake in an Excel spreadsheet that excluded data for certain countries. And the original authors have admitted that that this reproduction is correct:

On the first point, we reiterate that Herndon, Ash and Pollin accurately point out the coding error that omits several countries from the averages in figure 2.  Full stop.   HAP are on point.   The authors show our accidental omission has a fairly marginal effect on the 0-90% buckets in figure 2.  However, it leads to a notable change in the average growth rate for the over 90% debt group.
All this brought to mind my own discovery of errors in code (first error, second error) written by the Met Office. Code that was not released publicly.

There's a striking similarity between the two situations. The errors made by the Met Office and by Reinhart and Rogoff were trivial and in the same type of code. The Met Office made mistakes calculating averages, as did Reinhart and Rogoff. Here's the latter's spreadsheet with the error:


The reality of programming is that it is very easy to make mistakes like this. I'll repeat that: very easy. Professional programmers do it all the time (their defense against this type of mistake is to have suites of tests that double check what they are doing). We should expect errors like this to be occurring all the time.

What's vital is that scientists (including the dismal kind) consider their code (be it in Excel or another language) as an important product of their work. Publishing of data and code must become the norm for the simple reason that it makes spotting errors like this very, very quick.

If Herndon, Ash and Pollin had had access to the original Excel spreadsheet along with the data they would have very quickly been able to see the original authors' error. In this case Excel even highlights for you the cells involved in the average calculation. Without it they are forced to do a ground-up reproduction. In this particular case they couldn't get the same results as Reinhart and Rogoff and had to ask them for the original code.

An argument against openness in code is that bad code may propagate. I call this the 'scientists protecting other scientists from themselves' argument and believe it is a bad argument. It is certainly the case that it's possible to take existing code and copy it and in doing so copy its errors, but I believe that the net result of open code will be better science not worse. Errors like those created by the Met Office and Reinhart and Rogoff can be quickly seen and stamped out while others are reproducing their work. 

A good scientist will do their own reproduction of a result (including writing new code); if they can't reproduce a result then, with open code, they can quickly find out why (if the reason is a coding error). With closed code they cannot and science is slowed.

It is vital that papers be published with data and code for the simple reason that even the best organizations and scientists make rudimentary errors in code that are hard to track down when the code is closed.

PS It's a pity that one year after the Met Office argued that for open data and code the code to reproduce CRUTEM4 is yet to be released. I hope, one day, that when papers are published the code and data will be available at the same time. We have the networking technology and storage space to do this.

Tuesday, April 16, 2013

PDF and eBook version of "A non-mathematical introduction to one-way functions and their use in securing passwords"

My four blog posts on this subject (starting here) have been very popular with well over 100,000 page views in four days. Because of their popularity, and because some people may wish to share the story of Alice and Bob in a different format I've put together an 11 page PDF and eBook containing and edited, illustrated and slightly expanded version of the posts.

You can buy it for $1.99 as a PDF or as ePub. It's 11 pages of detailed explanation for less than a latte!


It's for sale, rather than free, as a way of supporting this blog. There are no ads here and I intend to keep it that way.

Bob's wife Carla lends a speedy mind

This is part 4 of a series of blog posts about one-way functions and their use in securing passwords. If you have not read the other parts start here.

Bob doesn't usually let his wife Carla help him with crosswords. That's because, due to some unusual quirk of brain topology, Carla is a savant and she's memorized the entire Shorter Oxford English Dictionary. So, she's able to recall the definition of any word almost instantly. But, like many people with similar abilities, her talent is limited, she can't perform the opposite: given a definition she's just as slow as her husband at finding the corresponding word.

Carla's ability means that she can calculate Alice's one-way function from crossword solution through five definitions very quickly. She's about 100 times faster than Bob because she has no need to find words in the physical book. But she's no faster at performing the reverse.

Alice calls and Bob challenges her for the solution to 25A. Alice knows that it's WITTINESS but Bob does not. As before Alice picks some salt, she chooses 4,3,4,5,1 and works out the word given by her one-way function (it's SINGLE, WITTINESS -> QUICK -> DOING -> PARTICULAR -> INDIVIDUAL -> SINGLE). Alice thinks her salt value means that Bob will have a hard time figuring out that the answer to 25A is WITTINESS.

But she's forgotten about Carla. Carla sits quietly and runs through the entire dictionary in her head. Starting from each word she performs Alice's one-way function using the salt 4,3,4,5,1 until (sometime after lunch) getting the answer WITTINESS. She hasn't worked backwards, she's simply worked through every word in the dictionary performing Alice's one-way function (with salt) on each word in turn.

Aside: this is the current state of the art in password cracking. Carla is replaced by a computer (or more than one computer) capable of calculating the one-way function of a long list of passwords (usually starting with the poor passwords, like 123456, that people commonly pick and then making up passwords from mixtures of letters, numbers and symbols). Even with salt it's possible to crack passwords because computers are fast, and many real, mathematical one-way functions can be computed very quickly.

In fact, many real, mathematical one-way functions were designed to be quick to compute (in one direction) and impossible to reverse. This speed of computation makes them a poor choice for password protection because a fast computer can run through millions of password and salt combinations to crack passwords. They were not actually designed for the task of protecting passwords: they are used because their one-way nature means that password databases never store actual passwords.

While Alice is on the phone she overhears Carla whispering in the background. It doesn't take her long to realize that Bob has asked for Carla's assistance and that, because her one-way function is quite quick to work out, Carla will be able to run through the dictionary and determine crossword clues from the supposedly secure words Alice gives to Bob.

Luckily, Alice has a simple solution to this. She just needs to extend the time it takes Carla to work through each individual word in the dictionary. She does that by adding a wrinkle to her one-way function: she tells Bob that the one-way function must be performed multiple times. Given that Carla took half a day to crack a single word, she tells Bob that for any word she gives (with its corresponding salt) he must perform her one-way function 10 times.

So, for WITTINESS she would give Bob the salt 4,3,4,5,1 and the end word (which is now CHEMICAL). Bob must first follow WITTINESS to SINGLE, and then start again with the same salt and work forward from SINGLE to INCLUDING, and then start again with the same salt and work forward from INCLUDING to HAVING, and so on ten times ending up at CHEMICAL.

This forces Carla to do ten times as much work; so it will take her days to crack a single word. But it only slows Bob down a little. If he actually knows the answer is WITTINESS and just wants to verify that Alice has solved the same clue, he just has to perform Alice's one-way function 10 times starting from WITTINESS (Carla's forced to do the same thing for the entire dictionary).

And that is the current state of the art in password protection. A password is entered by a user, some random salt is chosen and then a one-way function of the password and salt combination is computed many times (often 10,000 or 100,000 times since computers are fast). Or a specially designed one-way function with a 'cost': a number that tells it how hard to work in computing its result (the cost can be increased as computers get faster) is used.

This causes a small delay when you enter your password at a web site or on your smartphone since a one-way function must be computed over and over again. But that small delay adds up to a huge delay for anyone who wants to crack passwords because for every password they try they incur the small delay of repeated calculations.

And that's where this series ends. I hope it's been enjoyable and informative.

PS From time to time you hear of companies that have had their password database stolen. You might be very surprised to learn that many do not use the state of the art in password storage. Be on the look out when a password database is reported to stolen to see if the press reports that the passwords were 'salted' and 'hashed' (the term typically used for the one-way function). And even when they are salted and hashed it's worth asking (or wondering) whether they used a scheme that could beat Carla's computer equivalent by hashing multiple times. (A comment on Hacker News points out that I should really be educating people to ask if a key derivation function was used, instead of salting and hashing, since those have been designed for this purpose).

PPS Don't pick a poor password (such as secret, 123456 or iloveyou). The first thing a competent password cracker will try is a list of common passwords. Even with the best salted and repeatedly hashed password systems a password like 123456 is going to get broken.

This entire series of blog posts is available for purchase as an illustrated PDF or eBook for $1.99.

Monday, April 15, 2013

Alice strikes back against Bob's 'reverse dictionary'

This is part 3 of a series of blog posts about one way functions and their use in securing passwords. If you haven't read the first two parts start here.

Now Alice is angry. She realizes that Bob has defeated her clever one way function with his reverse dictionary and that now that Bob's created it he can use it over and again. All his initial effort will be amortized day after day as he steals crossword answers from Alice.

And so Alice strikes back.

The following day she calls Bob and says that she's making a small change to her one way function. It'll still involve looking up five words in the dictionary and following the definitions, but the order will change. In fact, she'll get to pick the order.

In the original scheme the first word of the first definition was used, the second word of the second and so on. Alice proposes to verify that she knows the answer to 4D (TANGLE) by giving Bob the final word (just as before) but also five numbers indicating which word to pick from each definition. In the original scheme the order was 1,2,3,4,5 (indicating the first word from the first definition, the second word from the second definition and so on).

But today she tells Bob that the answer is HUMAN and the order is 2,2,4,5,1. So Bob has to start from TANGLE (if he knows it!) follow the second word of the first definition, the second word of the second definition, the fourth word of the third, the fifth word of the fourth and take the first word of the fifth definition (which should be HUMAN).

This renders Bob's reverse dictionary useless. It's only good for the original order; 1,2,3,4,5. Now Alice is free to pick any five numbers from 1 to 5 before working out her one way function. Bob can still easily verify that Alice knows TANGLE, as long as she gives him the order, because he too can follow the definitions from TANGLE to HUMAN. But, of course, he has to have solved TANGLE by himself!

And five numbers in the range 1 to 5 gives Alice 5 * 5 * 5 * 5 * 5 = 3,125 possible orders in which to follow definitions. That would force Bob to create 3,125 reverse dictionaries for each of the possible combinations that Alice could use. He doesn't have the room in his house for that.

And Alice is free to pick a new sequence for every word Bob wants to verify. So Bob has to either work backwards through the one way function (which is really hard/almost impossible) or go through the entire dictionary trying to find which word Alice started with.

Because Alice can pick a different order each time she wants to disguise a word she can even make the same word turn into completely different words. With the order 2,2,4,5,1 TANGLE became HUMAN, but with 1,4,5,3,2 TANGLE becomes AUTOMATIC. So, Bob won't even be able to learn from past words.

In the real, mathematical world of one way functions and passwords the extra piece of information that Alice has chosen (the numbers) is called salt. It is typically chosen randomly and either added to the password to be stored or used as a parameter to the encryption used to store the password. It need not be numbers; in fact it's often just random characters.

The password database will contain both the salt and the result of the one way function applied to the salt and password combined. It's still possible to check a password when the user types it in (the web site computes the one way function of the salt and typed in password combination and compares it with what's stored in the database).

But if the password database is stolen an attacker is frustrated because each password will have a different salt and cracking passwords will be greatly slowed down. Pre-calculated 'reverse dictionaries' or rainbow tables will be useless and each password will have to be cracked individually (without salt if two or more people have the same password it only has to be cracked once). This means an attacker has to run through all possible passwords for each entry in the database as the salt is different for each one.

The salt value that goes along with each password is not a secret. All it needs to be is long and different for each password stored in the database. Just as Alice tells Bob the salt 'order', a well designed password system will not relying on the secrecy of the salt: the strength of the system relies on the strength of the one way function (i.e. how hard it is to calculate 'backwards').

Fortunately, or unfortunately, this isn't the end of the story. Even with salt it's possible to crack passwords.

PART 4: Bob's wife Carla lends a speedy mind

PS There's actually a way round the need to create 3,125 dictionaries which is entirely separate from what Carla's going to do! Either see if you can figure this out yourself or read this comment.

This entire series of blog posts is available for purchase as an illustrated PDF or eBook for $1.99.