Tuesday, September 14, 2010

Fooled by pseudorandomness

When people talk about codes and ciphers they get very excited about the cryptographic algorithms: everything from Enigma through DES to AES and elliptic curves excites the popular imagination. Oddly the Achilles' Heel of many secure system is much more mundane and simpler to understand.

Cryptographic systems require good random numbers. And by good I mean unpredictable. That means that whatever your source of random numbers is, I shouldn't be able to predict the next number it's going to give. And I definitely shouldn't be able to do that after seeing a few numbers it has come up with.

Back in the Second World War, Nazi Germany had a lovely cryptographic system called Lorenz (which the British referred to as Fish). It relied on generating random numbers. But, unfortunately, the numbers were predictable and the British were able to generate the same sequence of random numbers and break the code.

More recently there have been examples of poor use of random numbers that make the headlines (at least in the places I read headlines). There's the story of the US game show contestant on Press Your Luck who realized that a sequence of lights wasn't random and memorized it to win >$100k.

Another story relates how a team discovered how to predict the order of cards in online poker because of poor randomization.

And not long ago I pointed out a theoretical attack of the web site Hacker News which someone subsequently independently discovered and exploited. And the list goes on.

Fundamentally, these attacks happen because the random numbers generated are predictable (either by looking at a sequence of numbers, or by knowing how the random sequence got started). That's because common computer random number generators are based on mathematical formulae that follow a sequence of apparently random numbers. These are called pseudorandom number generators. They are great if you are doing something like Monte Carlo simulation, but bad for keeping secrets.

And they are a disaster for game shows, online poker, Nazi Germany and anything cryptographic. If you need random numbers (and you need them a lot for computer security for things like generating keys) you need a Cryptographically Secure Pseudorandom Number Generator (CSPRNG). A CSPRNG will give you unpredictability. Otherwise you are in big trouble because if you can predict the underlying random numbers used for your cryptography your entire security scheme is likely in jeopardy.

One of the first places I look for vulnerabilities is in the use of random numbers. People screw up here all the time. Which is why I wasn't surprised by something nasty lurking in a technical description that's come out with the current Haystack storm.

The main developer wrote in a private forum:

We use the Boost implementation of Mersenne Twister for our CSPRNG, which we use to generate session keys, nonces, and general purpose random numbers, though the specific CSPRNG used is subject to change, at least before release. We seed the CSPRNG with entropy from the system entropy source (CryptGenRandom under Windows and /dev/random under Unix systems).

The Mersenne Twister is emphatically not a CSPRNG. That alone makes the foundations of Haystack look very shaky (unless they are doing something else to take that output and make it into a CSPRNG).

If you are doing any work where you need random numbers, use a CSPRNG. Don't invent your own, or rely on something weak. All platforms have them available.


Anonymous said...

Very good post, with a lot of intriguing examples. I can see why one would think seeding a pseudo random number generator with something from /dev/random would make it 'more random' (for lack of a better word) - is that not the case?

John Graham-Cumming said...

The problem is that even if it starts at a random point, if the PRNG is not a CSPRNG then it's possible to predict its output by observing a number of numbers that it generates.