Thursday, September 30, 2010

Beautiful Simplicity

As a child I was fascinated by a pair of switches in our house that controlled a light upstairs. There was one switch at the bottom of the stairs and another at the top that controlled the same light. Flipping either switch could turn the light on or off. For a long time I couldn't figure out how this worked: how did the switch downstairs "know" the light was on or off and "know" how to change the light? How did the two switches "communicate" with each other?

After much experimentation I unscrewed the switch downstairs and looked at the wiring. I was surprised to discover three wires (and three upstairs as well) and finally figured out that the switches were wired as follows, with the two switches (in blue) each able to flip between two poles. Flipping either switch changes the state of the lamp from on to off, or off to on.

Later I visited a friend's house and discovered that they had three switches controlling a single lamp. That requires a more complex kind of switch that flips between two states, connecting the two wires either straight through or in an X pattern, like this:

You might like to stop here, look at those two diagrams, and convince yourself that flipping any switch can turn the lamp on or off. Once you've done that, notice that there aren't really two different sorts of switch at all: you could do this with three switches that flip between the straight-through and X positions (just with some wires left unconnected).

Once you make that step, it doesn't take a lot of imagination to realize that this scheme will work for any number of switches. You can connect an arbitrary number of switches together in this fashion to control a single lamp. Here are four such switches connected together. You might like to pause again and convince yourself that this would work:

Now you can use a mathematical technique, induction, to prove that this will work for any number of switches. The idea is that if you can prove something true for a base case (here, that it works with one switch), and you can prove that if it works for n switches it also works for n+1 switches, then you know it works for any number of switches.

First, take the base case. Does a single X/straight switch control the lamp so that flipping it changes the lamp's state? Clearly, yes, this is a simple case.

Now suppose that there are n switches joined together and that flipping any switch will change the state of the lamp. If we add a switch to the end of the row, wiring it as in the examples above, is it still true that flipping any switch will take the lamp from one state to the other? It is.

To see that, consider the two terminals at the end of the row of n switches. One has power coming out of it; the other does not. Flipping the new switch just swaps which terminal the power comes from, and thus lights or extinguishes the lamp.
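The argument is easy to check by brute force as well. Here's a small Python sketch (my own model of the wiring above: each switch is either straight-through, 0, or crossed, 1, and the lamp lights when power emerges on the wire it's connected to) that verifies every configuration for chains of up to five switches:

```python
from itertools import product

def lamp_is_on(switches):
    """Trace power through a chain of two-wire switches.

    Each switch is 0 (straight through) or 1 (crossed, the X position).
    Power enters on wire 0; the lamp is connected to wire 0 at the far
    end, so it lights exactly when the power emerges on wire 0."""
    wire = 0
    for s in switches:
        if s:              # the X position swaps the two wires
            wire = 1 - wire
    return wire == 0

# For chains of 1 to 5 switches, check every configuration: flipping
# any single switch must always change the lamp's state.
for n in range(1, 6):
    for config in product([0, 1], repeat=n):
        before = lamp_is_on(config)
        for i in range(n):
            flipped = list(config)
            flipped[i] ^= 1
            assert lamp_is_on(flipped) != before
print("flipping any one switch always toggles the lamp")
```

Note that the lamp state is just the parity of the crossed switches, which is why flipping any one of them toggles it.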

Amusingly, my doctoral thesis on computer security uses a light controlled by three switches to illustrate the security properties of a system:

Turns out that the things that you learn as a child affect the rest of your life. The big lesson I learnt is that many things of apparent complexity are built from repeated simple building blocks. And also that you can find an opportunity to apply mathematics everywhere, even in your household lighting.

PS If you need a bit more beautiful simplicity, go and read about NAND logic and realize that NAND gates are all you're ever going to need.
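As a taste of that, here's a toy Python sketch (mine, not from any particular text) building NOT, AND, OR and XOR out of nothing but NAND. XOR is the interesting one here, because the multi-switch lamp circuit above is exactly a mechanical XOR: the lamp's state is the parity of the switch positions.

```python
def nand(a, b):
    """The only primitive gate we allow ourselves."""
    return 0 if (a and b) else 1

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor(a, b):
    # The classic four-NAND XOR construction.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

# Verify every gate against Python's own boolean operators.
for a in (0, 1):
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
        assert xor(a, b) == (a ^ b)
print("all gates correct, built from NAND alone")
```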

More technical opinions on the Analytical Engine project

I had emailed Tim Robinson who's been working on Meccano versions of the Analytical Engine and Difference Engine. He graciously replied:

My website is quite out of date and I have done a bit more since I last updated it. I have a new and much better version of the control mechanism (operation cards & barrel) which is much more reliable than the one described on the website, though still probably not good enough for a real engine. It also incorporates a critical feature which was missing from the first version (the ability to branch at the last moment on a result generated in the current cycle).

I have a working demonstration of an anticipating carriage on 12 digit numbers.

My current project is a store, but I have a long way to go. The sheer quantity of stuff needed to build this is one big problem, together with the tedium of making huge numbers of identical assemblies.

I'm sure you have realized that the biggest single problem in proposing to actually build an AE, is deciding exactly what it is!

The initial difficulty in building the engine will be around deciding what constitutes a Babbage AE. The initial approach will involve a research project and computer simulation of the AE hardware. From there the project can continue to the actual physical device.

Wednesday, September 29, 2010

How my blog looks to readers in Iran

A reader wrote with a screen shot of my blog as seen from inside Iran.

Something in Iran is messing with the Blogger iframe at the top and replacing it with part of the page Iranians see when they visit a forbidden web site. I speculate that the domain serving that iframe may be banned in Iran and hence that small part of my site is blocked.

This results in my blog being headlined with the words "In the name of God" in Arabic.

It's a pity that the computer-rendered version of Arabic doesn't capture the beauty of calligraphy. Arabic script can be used to produce beautiful patterns (that are still readable to someone with a good knowledge of the language). Here, for example, is the same phrase that appears on my web site, written in calligraphy:

Which can be roughly pronounced "Bismillah al-Rahman al-Rahim" (In the name of God, Most Gracious, Most Merciful).

Tuesday, September 28, 2010

What to use instead of a pie chart

Some time ago I complained about pie charts without giving a real alternative. But the alternative is actually quite clear: it's not pie, it's sausage.

Just like in iTunes:

The length of each slice of the sausage is proportional to its percentage. Since length is linear, it's much clearer and easier to judge than the angle of a pie segment.
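If you want to roll your own, here's a tiny Python sketch that draws a text-mode "sausage" (the categories and numbers are invented, iTunes-style; they're not real data):

```python
def sausage(data, width=50):
    """Render a dict of {label: value} as one proportional bar.

    Supports up to five slices (one fill character each)."""
    total = sum(data.values())
    fills = "█▓▒░·"
    bar, legend = "", []
    for (label, value), ch in zip(data.items(), fills):
        cells = round(width * value / total)
        bar += ch * cells
        legend.append(f"{ch} {label} {100 * value / total:.0f}%")
    return bar + "\n" + "  ".join(legend)

# Hypothetical iTunes-like disk usage breakdown.
print(sausage({"Audio": 55, "Apps": 25, "Photos": 12, "Free": 8}))
```

Because each slice's length maps linearly to its value, comparing two slices is a simple length comparison rather than an angle or area comparison.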

And naturally, at Causata we use the same style in our own application.

How I handle my mail

If you mail me, your mail is routed to a GMail account and then some magic happens. Your mail is read by my automatic mail-classifying code and automatically labeled. Here's a shot of my GMail labels:

All those labels with a + as a prefix are automatic. Here's what happens: every five minutes a service I've created logs into my GMail account using IMAP and OAuth (the service doesn't know my password; it's authorized via OAuth to access my mail). The service searches for labels with a + prefix and synchronizes its own list with them.

Then it looks for new messages in my Inbox (and All Mail) and uses machine learning to apply one of the + labels to the message. Next it takes a look through the labels themselves to see if new messages have been labeled manually. It does that so it can learn.

When first set up, the service looked through all the messages under the + labels and built a machine learning classifier from their contents, then immediately started classifying mail. Naturally, it's not 100% perfect and sometimes it makes mistakes (puts the wrong label on a message). Happily, all I have to do is change the label in GMail and the service will spot the change the next time it logs in and update its classifier.

The entire interface to the automatic classifier is through GMail. In fact, there's no interface at all: it watches my actions and learns. All I have to do is set up labels with a + at the beginning and my mail is magically labeled. If a labeling error occurs I just relabel.
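The post doesn't say which learning algorithm the service uses, but the classic choice for mail (it's what POPFile uses) is a naive Bayes text classifier. Here's a toy sketch of the idea, with made-up labels and training messages:

```python
import math
import re
from collections import Counter, defaultdict

class LabelClassifier:
    """Tiny naive Bayes text classifier: a sketch of the kind of
    learner that could pick a + label for a message."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> message count

    def _words(self, text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, label, text):
        self.label_counts[label] += 1
        self.word_counts[label].update(self._words(text))

    def classify(self, text):
        """Return the label with the highest (log) posterior probability,
        using Laplace smoothing for unseen words."""
        words = self._words(text)
        total = sum(self.label_counts.values())
        best, best_score = None, float("-inf")
        for label in self.label_counts:
            counts = self.word_counts[label]
            size = sum(counts.values())
            vocab = len(counts) + 1
            score = math.log(self.label_counts[label] / total)
            for w in words:
                score += math.log((counts[w] + 1) / (size + vocab))
            if score > best_score:
                best, best_score = label, score
        return best

# Hypothetical labels and training mail.
c = LabelClassifier()
c.train("+work", "meeting agenda for the quarterly budget review")
c.train("+ballooning", "helium launch window and the GPS tracker")
print(c.classify("budget meeting moved to Tuesday"))  # → +work
```

Relabeling a message in GMail would simply become another call to train, which is why correcting mistakes in the mail client is all the feedback such a classifier needs.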

Anyone else OCD enough to want this service? (If you are serious about wanting a service like this please email me and I'll add you to a little mailing list for when I get round to implementing it for public consumption).

A quick chat with Doron Swade

I managed to get hold of Doron Swade who led the build of Difference Engine No. 2. He graciously replied to my random email about the Analytical Engine.

I won't reproduce the entire reply because it's long, and because the most important thing is something that he didn't say. He didn't say I was nuts! In fact, he too has been thinking about the best approach to building an Analytical Engine.

The conclusion that he had come to (as had I), is that the best way to approach building the Analytical Engine would be to begin with a physical simulation using a 3D graphics program with a physics engine so that the motion could be studied. My opinion is that this simulation would be a vehicle for raising the money to build the real device.

But even building the simulation is very complex because Babbage's plans for the Analytical Engine are incomplete and there are multiple versions. Getting to the simulation would itself be a research project to decide what would constitute an authentic Analytical Engine as Babbage would have conceived it.

So to build the engine there would be three major steps (the first two would be iterative):

1. Decide on the design of an Analytical Engine from Babbage's plans.

2. Build a computer simulation of the working engine to verify operation.

3. Build the physical machine.

At all stages money would be needed. First to pay for the research on the authentic machine, and second for the building of the simulation. Finally, money would be needed for the full build. Nevertheless, I think there's a significant community component as well: much of the simulation could be built by volunteers once the plans had been studied.

Estimating the cost will be difficult, but I can give a lower bound: the Difference Engine No. 2 build cost about £250,000 in the late 1980s. If inflation is to be believed that's around £390,000 today (which is about $620,000). The Analytical Engine would be bigger and more complex and hence more expensive and more research is needed.

So, I'm guessing (and this will need to be verified) that to complete a physical machine with historical accuracy would cost a small number of millions of £.

Monday, September 27, 2010

If only Skype worked

I have a love hate relationship with Skype. On the one hand it's an amazing service giving high quality voice and video for free anywhere in the world. I've used it to keep in contact with family and friends from all over the place (and when I was working for 6 months away from home it was a treasured application).

But I also had a really nasty experience with Skype. Five years ago I was working as a consultant from home with clients all around the world. I decided that I would use Skype to save money and I signed up for SkypeIn numbers in the US and UK (I was based in France), and I used SkypeOut to minimize my overseas calling.

Despite having paid Skype lots of money, I gave up using it for business. It was simply too unreliable. It's unacceptable to have what looks like a fixed-line phone number fail on you mid-call. Skype's call quality goes from marvellous (the codecs give far better voice quality than a traditional phone) to bad to unusable in seconds. And the thing that kills Skype for business is its unpredictability: you never know which you're going to get.

It's OK for a mobile phone to have poor call quality because it's understood that the user is moving about in an uncertain environment. The same does not apply to Skype: the expectation is that it'll work.

At the time, France Telecom had an 'all you can eat' monthly calling plan that cost about €70 (it's now apparently €39) and gave me unlimited calls to anywhere in Europe and North America. I wondered at the time if the one good thing Skype had done was driven down the cost of a fixed phone line. And France Telecom never let me down: it simply worked. And simply worked was well worth €70 a month.

Now, lest you think 'five years ago' was the problem, fast forward to today. We use Skype extensively in the office for video conferencing (room and desktop) and it hasn't improved. I hear people constantly having call quality problems, jerky video. And we've paid for high quality headsets, cameras and speaker phones. We have a very fast Internet connection and good quality switches.

All Skype calls begin with "can you hear me?" and many end with "I'll just call you, this isn't working".

Skype: if only it worked reliably.

Paddington Bear doesn't wear a yellow hat

In William Gibson's otherwise enjoyable Zero History there's an inexcusable inaccuracy. Gibson sets a great deal of the book in the UK and thus likes to talk about things that are unique to the UK. And because this is a William Gibson novel many of them are contemporary (such as Caffè Nero and Waterstones). To be honest the contemporary references get a bit tiring.

But he needs to watch it because he says:

... into the Waterstone's shopping bag, where he saw at least two Paddington Bear fuzzy toys, with their iconic yellow hats.

Only in the US, Mr Gibson.

Paddington is always depicted wearing a hat, but it is only yellow in the US, where the toy was made that way. In the UK the hat is red or blue depending on the era and manufacturer, and it would be extremely rare to find a Paddington with a yellow hat.

Saturday, September 25, 2010

Plan 28

I've created a domain specially for my investigations of the possibility of building an Analytical Engine to (close to) Babbage's plans. It's called Plan 28. It references the fact that Babbage's most detailed plans of the Analytical Engine are in the form of two plans numbered 28 and 28a and stored in the Science Museum in London.

These plans depict a machine with a mill (the CPU, capable of doing addition, subtraction, multiplication and division), a store (the memory), and the barrels (microcode for the mill operations). It's the closest thing to a computer as we know it (although the program and the data are stored separately and programs have no way of accessing anything other than fixed memory locations: i.e. there are no references or pointers).

If you are following along and want some reading then go with the following:

  1. The Difference Engine by Doron Swade. It's an easy, quick read introduction to Babbage, the Engines and the reconstruction of Difference Engine No. 2. The latter reconstruction demonstrates that the construction techniques available to Babbage were sufficient and would have enabled the construction of the machine. The reconstruction cost £250,000 in the late 1980s.

  2. The Little Engines that Could've. This is a PhD thesis by Bruce Collier and goes into a lot of detail about the operation of Babbage's engine. I turned those web pages into PDF and had it bound so I could sit and read it. The PDF is here.

  3. Charles Babbage's Analytical Engine, 1838. IEEE article by the man who probably understood Babbage's Engines better than anyone in the 20th century, Allan Bromley. This article introduces the state of the machine as of 1838.

  4. Babbage's Analytical Engine Plans 28 and 28a-The Programmer's Interface. Also by Allan Bromley this paper covers the Analytical Engine in 1847 when Babbage stopped working on it. In particular, it details the instruction set that would have been available to the programmer.

  5. The Evolution of Babbage's Calculating Engines. More historical context from Allan Bromley putting the design of the Analytical Engine in context with the Difference Engines and modern computers.

  6. Babbage's Analytical Engine. Babbage's son, Major-General H. P. Babbage, describes his construction of part of the Analytical Engine.

  7. The Analytical Engine. Also by Babbage's son, this long paper describes the operation of the Analytical Engine as he understood it.

  8. Of The Analytical Engine. Charles Babbage's own description of the Analytical Engine from his autobiography.

  9. Sketch of the Analytical Engine. The famous paper translated by Ada Lovelace on which much of her fame rests.

And as a bonus, here are some pictures I took at the weekend in the Science Museum in London.

The first shows a trial portion of the Difference Engine:

Here are some punched cards prepared by Babbage that would have been used for programming the Analytical Engine:

And here's the trial section of the Analytical Engine that Babbage built:

And, finally, the mill of the Analytical Engine. This was built by Babbage's son after his father's death.

Friday, September 24, 2010

On being a nerd

I was reminded by flicking through an old copy of Doron Swade's The Difference Engine of a letter Charles Babbage wrote to Alfred, Lord Tennyson concerning an error in one of the poet's, then recently published, poems:

In your otherwise beautiful poem one verse reads,

Every moment dies a man,
Every moment one is born

If this were true the population of the world would be at a standstill. In truth, the rate of birth is slightly in excess of that of death. I would suggest that the next edition of your poem should read:

Every moment dies a man
Every moment 1 1/16 is born

Strictly speaking the actual figure is so long I cannot get it into a line, but I believe the figure 1 1/16 will be sufficiently accurate for poetry.

I can't help smiling at this because it illustrates the great difficulty persons of great computational ability (which I shall refer to as nerds) have in overlooking small matters of inaccuracy. It's not uncommon to see computer folk arguing over fine points of semantics or mathematics, or deliberately playing with words and puns, or laughing about the minutiae of some program, machine or situation.

This comes about because nerds spend all their time worrying about details. Computers are exceedingly finicky things. They do precisely what they are told (with heavy emphasis on precisely). Thus anyone who works with them (and by that I mean anyone who actually deals with computers rather than mere users) ends up training themselves to spot minute details that are incorrect or out of place.

Unfortunately, that attention to tiny detail seems to change the brain so that it hunts for such detail everywhere. I think that's because it takes great mental effort to hold the details of any computer system in your head and be able to spot problems. Often you are looking for the tiniest needle in an enormous haystack of information.

By the time Babbage wrote that letter he had spent 20 years on his calculating/computing engines often working 11 hour days. It is no wonder that Tennyson's error stood out to him. I have seen Babbage's letter described as insulting, humorous and over-zealous. To me it is none of these things. Tennyson was a great supporter of 19th century science and Babbage was merely pointing out a technical inaccuracy.

PS Sadly I have not been able to find in any of Alfred, Lord Tennyson's letters a reply to Babbage.

Thursday, September 23, 2010

Another technical opinion on building the Analytical Engine

I wrote to Tony Hoare to ask his opinion on this idea and he was kind enough to reply:

Would you be interested in doing an animation of the engine first? I've much enjoyed computer animations of the old Elliott 803, with the original tape decks, paper tape readers and punches, and all making the right motions and the right noises! Of course I also like revisiting the real machine, even if stationary.

The simulation would be a good way of collecting money and enthusiasm for the real Engine. But the trouble with the real thing is that if you demonstrate it, it will wear out. OK for a private owner, but a bit disappointing as a museum piece -- unless supported by simulations.

I assure you that even if the real thing is built, the simulation will be essential to check the design before construction can begin. And it will be the main intellectual challenge in the project -- and great fun.

Clearly, he's correct. The first step in building the Analytical Engine would be to create a working prototype on a computer so that bugs in its design (which will need to be reconstructed from fragments and incomplete descriptions) can be worked out and a viable machine created in some CAD package.

From that a real machine could be created. Looking at the plans for the Analytical Engine, the minimum machine would likely be quite large: around the size of a steam locomotive.

So, any experts on building this type of simulation out there?

If you want to get press mention Iran

Over the last couple of days there's been a lot of noise about the Stuxnet worm. Most of that noise has been because of the claim that it was designed to attack the Iranian nuclear reactor at Bushehr and speculation that the worm was written by Israel. Unfortunately, that part covers up the most interesting part of the story: this worm was really sophisticated and designed to attack industrial control systems that could have real-world impact.

But the mention-Iran strategy has worked for many people in the past: see the Scacco/Beber affair and the Haystack mess. (Hmm. I've complained about bogus information about Iran enough times that it's starting to look like I'm an Iranian agent :-)

Put aside the target of Stuxnet and there's a much more interesting side to the story. Stuxnet really does look like it could have been created by a nation to attack someone. i.e. it could be that Stuxnet is a weapon. Oddly, the press is reporting people saying things like:

"What we're seeing with Stuxnet is the first view of something new that doesn't need outside guidance by a human – but can still take control of your infrastructure," says Michael Assante, former chief of industrial control systems cyber security research at the US Department of Energy's Idaho National Laboratory. "This is the first direct example of weaponized software, highly customized and designed to find a particular target."

Perhaps it's the first example of this in the wild (though I doubt even that), but it's not something that's gone unimagined. In fact, Richard Clarke wrote a fascinating book on the subject called Cyber War: The Next Threat to National Security and What To Do About It in which he details exactly this type of virus attacking industrial control systems.

He even mentions that the US tested the ability to destroy a turbine via software from the Internet. He writes:

To test whether a cyber warrior could destroy a generator, a federal government lab in Idaho set up a standard control network and hooked it up to a generator. In the experiment, code named Aurora, the test's hackers made it into the control network from the Internet and found the program that sends rotation speeds to the generator. Another keystroke and the generator would have severely damaged itself.

(Aside: the book is fascinating because it combines technical information like that with policy recommendations.) And this sort of cyber-attack has been going on for years: Clarke's book relates the story of the Siberian pipeline sabotage by the CIA.

The oddest part of the Stuxnet story is the claim that it's attacking Iran. There doesn't seem to be much evidence presented for this. Nothing in the story about how Stuxnet picks the systems it attacks suggests that it knows the fingerprint of some system in Iran. The only evidence appears to be a map Microsoft produced of Stuxnet infections, showing that there were lots in India, Indonesia and Iran. I suppose the Iran narrative simply makes the most interesting story.

No one should be surprised that a computer worm or hacking could damage equipment in the real world. But I suppose they are. Perhaps Stuxnet will be a wake up call and make people realize that cyberwar is actual war: i.e. it can have the same effects as so-called kinetic weapons like bombs.

And here's the danger with cyberweapons: they are a form of asymmetric warfare. Some countries are more vulnerable to cyberattack than others. For example, the US is highly vulnerable because of the density of its computer networks and computer control systems and, for cultural reasons, the lack of controls inside those networks. On the other hand, China is less vulnerable because it can monitor its entire Internet, cut it off, and take control of it. Thus cyberwar is advantageous for China over the US.

Clarke's book expands on that theme and talks about how to deal at a policy level with this threat.

PS There are two papers about Stuxnet at the Virus Bulletin 2010 conference next week. These are last minute submissions and should have gory details. Let's wait until then to really understand what it is and is not. The papers are An in depth look at Stuxnet and Unraveling Stuxnet.

Wednesday, September 22, 2010


Seriously, I write this blog for two simple reasons: the freebies and the groupies. Without them it wouldn't be worth it. And what great freebies: a DisplayLink USB video adapter and more RelaxZen than you can shake a stick at.

And the groupies. Well I have the photos, but I'm not posting them.

Which brings me to this week's awesome freebie: a copy of John Calcote's Autotools: A Practitioner's Guide to GNU Autoconf, Automake, and Libtool. Yeah, that's what you get for living the GNU Make life and writing a self-published book about it.

If there was ever a tool that needed a book it's Autoconf (and related scripts). Happily, Calcote has done a great job of describing, in detail, a collection of tools that can seem opaque at first glance (also at second glance). This book is vital for anyone who needs to work with Autotools and I wish I'd had it years ago.

The final chapter (A catalog of tips and reusable solutions for creating great projects) is fantastic because it dishes up a collection of practical, real solutions to problems users of Autotools will encounter. Above all, the book shows that it was written by someone who truly understands the set of tools, and thankfully is able to write clearly.

He doesn't shy away from getting into difficult details (like the M4 macro language) and chapters 8 and 9 are an exposition of the use of Autotools for an actual, large project showing what a real-world use of the tools looks like. Those 50 pages are probably the most valuable in the entire book.

Highly recommended for anyone who needs to use Autotools.

Babbage machines in Meccano

Here's a model of the Difference Engine No. 2

And here's part of the Analytical Engine

More from the creator of these lovely devices.

First opinion on whether the Analytical Engine could be built

I asked John Walker who maintains a lovely web site of information about the Analytical Engine whether he knew of any serious attempt to build it. He was kind enough to reply.

To my knowledge, nobody has made a serious attempt to build the Analytical Engine or even a seriously scaled down version of it. I think the general consensus (which, in part, informed the various British commissions which decided not to fund the project) is that it is unlikely in the extreme that a machine which would be necessarily so large would not fall victim to "tolerance creep", where tolerances in individual components would eventually add to make large scale interfaces (for example, between the Mill and the Store) unreliable.

Babbage was aware of this problem and addressed it in his papers. His solution was to design the machinery so that it would jam in case of error, but then the question is, how often would it jam? If it jammed every second and a typical computation took several hours, a room full of people with log tables could out-compute the Analytical Engine.

I'd think it would be beyond crazy to try to raise the funds to construct the complete Analytical Engine. After the Singularity, when we're all 10^16 times as wealthy as at present and can build diamondoid machinery with atomic precision, I'd say go for it, but then tens of thousands of people will have done so within the first 24 hours after the Transition.

Gen. Henry P. Babbage's description of an attempt to build just a component of the Engine is instructive of the problems of mechanical tolerances. We can build things much more precisely than in his day (although much of the progress in our technology has been in coping with sloppiness, not improving precision), but ultimately large scale computation depends upon robust digital storage which is immune to noise:

Any macroscopic mechanical system has at best a modest level of noise immunity, and when you imagine a machine the size of an auditorium with hundreds of thousands of parts, the challenge seems overwhelming.

I'm a balding engineer, and I've seen many great ideas founder on the rocky shore of reality. I think the British were *right* not to fund the Analytical Engine; it was a superb idea a century before its time.

So go prove me wrong.

(And ask yourself, as I often do, "What are the superb ideas we have today which are a century before their time?")

I shall continue to investigate. If you care about this topic you can follow the label babbage. If you have an informed technical opinion on this please contact me.

Tuesday, September 21, 2010

It's time to build the Analytical Engine

Normal people have a small part of their brain that acts as a sort of automatic limiter. They get some crazy idea like writing a book or campaigning for a government apology or calculating the number of legal track layouts for a cheap train set and their limiter goes: "Don't be ridiculous" and they go back to normal life.

Unfortunately, I was born with that piece missing.

So, it's not without trepidation that I say that it's time Britain built the Analytical Engine. After the wonderful reconstruction of the Difference Engine we need to finish Babbage's dream of a steam-powered, general-purpose computer.

The Analytical Engine has all the hallmarks of a modern computer: a program (on punched cards), a CPU (called the 'mill') for doing calculations, and memory. Of course, it's not electric; it's powered by steam. But the principles that underlie the Analytical Engine are the same ones that underlie the computer I'm writing this on.

From Flickr user csixty4

What a marvel it would be to stand before this giant metal machine, powered by a steam engine, and running programs fed to it on a reel of punched cards. And what a great educational resource so that people can understand how computers work. One could even imagine holding competitions for people (including school children) to write programs to run on the engine. And it would be a way to celebrate both Charles Babbage and Ada Lovelace. How fantastic to be able to execute Lovelace's code!

From Flickr user gastev

Of course, Babbage and his family only ever made parts of the engine (see the picture above). But that shouldn't stop us from constructing it now. All that's needed is money. I'd imagine there are plenty of people who'd want to work on the project.

Unfortunately, I think it would cost a lot of money. The construction of a second Difference Engine for Nathan Myhrvold is said to have cost $1m and that was after all the hard work of figuring out how to make it was completed. It also took years.

But that shouldn't hold us back.

If sufficient money could be raised I'd jump at the chance to run this project as a charity that would donate the completed machine to either London's Science Museum or the National Museum of Computing. Clearly, I can't do this in my free time (and nor could others) so sufficient money would need to be raised to pay a reasonable salary to those involved. And I'd imagine that the materials cost would be very large as well.

Am I mad? Would you donate to make the Analytical Engine an oily, steamy, brass and iron reality? Can we live up to Lovelace's words when she wrote: "We may say most aptly, that the Analytical Engine weaves algebraical patterns just as the Jacquard-loom weaves flowers and leaves."

PS A commenter asked about pledging money to the project. I'm not quite ready to start accepting cash! :-) But people can pledge by either sending me an email or simply writing a comment here. That'll give me an idea of interest in doing this.

PPS UPDATE. Please visit Plan 28 for more on this topic.

Monday, September 20, 2010

GAGA-1 Source Code Repository

To keep things clean and manageable I've created a source code repository on Github for this project. You can find it here.

The only thing committed so far is a skeleton directory structure and the Lua code for running the camera.

Note that everything in the repository is covered by the GNU General Public License, v2.

Top blog content for August 2010

The following blog stories were hits around the web:

  1. Is The Times making you stupid? 24% of page views

  2. Shut up and ship 20% of page views

  3. Percentage of top grossing US films by decade that depict killing 10% of page views

  4. How many clicks does it take? 7% of page views

  5. The bandwidth of a fully laden 747 5% of page views

"We've read your code"

Some years back when I was still actively working on POPFile and speaking at the Spam Conference at MIT I was invited to be on the technical advisory board of a company doing anti-spam work.

I flew up to the company's headquarters and met everyone and sat down for an interview with the head of engineering. At no time did he ask me a technical question. We discussed what was working and what wasn't working in the fight against spam. We talked about The Spammers' Compendium. But no one asked about my code.

I found out why towards the end of the interview when he said: "We've read your code".

Because I'd been doing all the development of POPFile in the open the company had evaluated my technical ability just by reading my code. And there's the great advantage of being open: others can evaluate you and choose you without you needing to apply for anything. And if you do apply then make sure people can find your contributions.

I've been lucky because "John Graham-Cumming" is close to a unique ID. There is one other who's a Detective Constable in Manchester (which is odd, because I seriously considered entering the police force after my doctorate). But, if you are "John Smith" then it's worth making sure that you have a clear online identity.

For years, I've cultivated my own identity through my personal web site. I registered the domain back in 1997 to make sure I had my own place on the web, and to make sure that my email address would always be valid. And more recently I've built up profiles on Twitter and LinkedIn that potential employers (and others) can use to find out about me.

It's worth doing the same if you want to be Googleable. And my bet is that you will want to be, because if you don't then your online identity will consist of the things you haven't cultivated. Your identity will only look its best if you curate it.

And while places like Github and LinkedIn are important parts of your identity, there's nothing more important than a web site that you control. If you haven't registered your own domain, do it today.

Sunday, September 19, 2010

GAGA-1: Capsule paint job

So with a couple of hours to work on GAGA-1 this weekend I did one simple thing: painted the capsule fluorescent yellow. Most of the other high-altitude balloon projects that I've read about have used fairly simple capsules that are covered in duct tape, or simply left white.

I decided to make the GAGA-1 capsule really visible for two reasons:

1. I want to be able to find it. Since the balloon will most likely come down somewhere in Norfolk or Suffolk (both of which are largely rural), it's going to end up in a field somewhere. The flight and recovery computers should tell me where, but I still want to be able to see it clearly when I get close.

2. If the capsule with its parachute should come down near people or vehicles, I want them to be able to see it coming. I'll be using a brightly coloured parachute, fluorescent rope and, now, a fluorescent capsule. (I'll do another blog post on safety later.)

The capsule is made of expanded polystyrene, which is light and insulating, but a pain to cut and paint. It's not possible to use any solvent-based paint because it eats polystyrene in seconds. Here's a shot of a polystyrene ball sprayed briefly with solvent-based paint. Note the untouched ball next to it for comparison.

That damage was done in under five minutes.

So, to paint polystyrene you need to do two things: use a water-based paint and get that paint to stick. To get it to stick the polystyrene should be prepared by sanding it. I tried two things: sanding with fine wire wool (a bad idea because the wire wool sticks to the polystyrene and is impossible to remove) and sanding with 240 grit sandpaper.

The sandpaper worked great. In no time at all the surface was ready for painting. I used unthinned acrylic paint from a local art shop.

And here's the final result. These pictures don't capture the full 'my eyes, my eyes!' brightness of the fluorescent paint. Here you can see the yellow but not how fluorescent it is. With this paint job it should be visible.

Of course, painting the capsule adds to the weight, but I think that's going to be offset by the holes that I need to cut in it for the camera lens and antennae.

Once the paint has had time to really dry I am going to stencil on contact information so that if the capsule is lost someone can get in contact if they find it.

Friday, September 17, 2010

A tale of two cultures

In recent days there's been much discussion of Digg and reddit that has concentrated on the technical and community aspects of the two sites. This commentary has overlooked a vital component of the difference between these two social news sites: company culture.

I had first-hand experience of how company culture affects users when I 'hacked' both Digg and reddit in 2006/2007. The reactions of the people at the two sites were instructive.

1. Digg

On July 26, 2006 I realized that I could create a loop between a Digg story and a reddit story because the URL that would be assigned to a Digg story was predictable and reddit didn't check links for 404 status when submitting. So I was able to submit a story to reddit (that didn't yet exist, but with a predicted Digg URL) and then submit the reddit generated URL to Digg.

I used this to create a silly 'recursion' prank. The Digg story is Recursion defined (see reddit) and the reddit story is Recursion defined (see Digg). They point to each other.

It was just a bit of silliness. It got me banned from Digg and I ended up having to appeal to Kevin Rose (via an acquaintance) to get my account restored.

I was banned because I was apparently a spammer. When I submitted the story to Digg I made a mistake in the title and created the wrong URL. So I immediately used the Bury functionality of Digg to kill it and I resubmitted with the perfect title to make the hack work. Unfortunately, the folks at Digg took this to mean I was a spammer.

They killed my account and then a Digg employee went public with libelous claims about me: "The problem was, then he submitted the story multiple times and then created multiple fake accounts and dugg his own stories." The multiple fake accounts claim was simply untrue. The only time I created a new account was after 'jgrahamc' was killed.

Through Leo Laporte I reached Kevin Rose who restored my old account. At the same time reddit honored me with a Golden Reddit and my very own reddit logo.

2. reddit

Looking at reddit's markdown one day I realized that it would be possible to implement an entire tic tac toe game inside reddit comments by drawing ASCII boards in comments, linking between them, and enumerating every possible board position. The vestiges of this game can be found here in the test comment I used. The idea was that one would start at an empty board with x ready to place a tile. The user would click on a board position, a different comment would be displayed showing the x in place and then the mouse would be handed to the person playing o and the game would continue.

There are fewer than 20,000 (I don't have the actual figure in front of me, but it's less than 3^9 since not all possible combinations of x, o or blank are reachable) different valid, reachable board positions in tic tac toe, so all I needed to do was create all of them as separate comments on reddit. I wrote a small program that used WWW::Mechanize to perform the comment posting. It worked its way from all the possible winning board positions back up to the starting blank board.
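The enumeration can be sketched with a breadth-first walk of the game tree (this is just a sketch, not the original WWW::Mechanize script): start from the empty board, alternate marks, and stop expanding any board that has a winner.

```python
# Count every valid, reachable tic-tac-toe position by walking the game
# tree from the empty board. Won boards are counted but not expanded,
# since the game ends there.
WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
        (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in WINS:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def reachable_positions():
    start = ' ' * 9
    seen = {start}
    frontier = [start]
    while frontier:
        board = frontier.pop()
        if winner(board):
            continue  # game over: no further moves from this board
        # x moves first, so x plays whenever the counts are equal
        mark = 'x' if board.count('x') == board.count('o') else 'o'
        for i in range(9):
            if board[i] == ' ':
                nxt = board[:i] + mark + board[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen

print(len(reachable_positions()))
```

Stopping at won boards, this counts 5,478 distinct positions (including the empty board), comfortably under the 20,000 bound above.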

This killed reddit.

Or rather, it killed the process that was doing indexing of reddit comments. Here's where things got interesting in the Digg vs. reddit battle. reddit didn't ban me. They killed all the comments I'd created because it was slowing them down, they temporarily limited my IP address and they sent me an email.

I received a friendly email from Steve Huffman telling me what they'd done and asking me what I was up to. When I explained he replied that 'reddit likes hacks', and that my little game had exposed a problem they'd been aware of with comment performance that they'd been meaning to fix.

I never did go back and create the full game, but I was left with a very different impression of Digg and reddit.

Ultimately, I think the cultural difference comes down to a difference between hackers and pseudo-hackers. Reddit's beginnings were very much about being a hacker, and the open, liberal culture that goes with that continues to infect the site. Digg's beginnings are bound up in the frat-boy, dark-tipper "Kevin Rose" persona. If you want to change Digg you probably have to get rid of "Rose".

PS I put "Kevin Rose" in quotes there because I don't know him personally and I'm referring specifically to a persona that is visible in things like Diggnation, not to the man himself.

PPS For those who love the details. Here's the actual email exchange between myself and Steve Huffman:

Date: Sun, 14 Jan 2007 11:56:13 -0800
From: "Steve Huffman"
To: "John Graham-Cumming"
Subject: tic-tac-toe and such

I removed the tic-tac-toe thing this morning. I wouldn't normally, but
the 5000 comment thread was killing the server.

I would have just asked you to stop, but I couldn't reach you quick enough.

I didn't mean to break your account, I'll fix that right now.


Date: Sun, 14 Jan 2007 20:59:31 +0100
From: "John Graham-Cumming"
To: "Steve Huffman"
Subject: Re: tic-tac-toe and such

On 1/14/07, Steve Huffman wrote:
> I removed the tic-tac-toe thing this morning. I wouldn't normally, but
> the 5000 comment thread was killing the server.

Omigod. I had no idea that a 5,000 comment thread would mess up your
server. For that, I am truly sorry. I rate limited the code so that
it would take a long time to make all 5,000 (and it was done in a
single thread) with one new comment about every 7 seconds.

I feel very bad about messing with the server.


Date: Sun, 14 Jan 2007 12:02:16 -0800
From: "Steve Huffman"
To: "John Graham-Cumming"
Subject: Re: tic-tac-toe and such

No problem, I just saw automated comments coming in and the only thing
I could do at the moment was kill that thread. My intention was just
to kill that thread, not mess up your user.

Once I get your user fixed, I'll bug you about what you were actually
trying to do :)


Date: Sun, 14 Jan 2007 12:20:12 -0800
From: "Steve Huffman"
To: "John Graham-Cumming"
Subject: Re: tic-tac-toe and such

On 1/14/07, John Graham-Cumming wrote:
> Steve Huffman wrote:
> > Are things better now?
> Yes, that looks great.
> Thanks!
> And sorry I caused you all that trouble.

No prob. We don't mind pranks. Just be aware that sometimes your user
might get whacked in the process.

The way we do comment threading is we keep the entire thread in
memory, and precompute all the sorts whenever someone comments. That
huge thread was making the (usually fairly quick) sorting take forever.

We're going to be rewriting things in the near future, and I think
this is one of the problems that'll go away.


Date: Sun, 14 Jan 2007 12:24:52 -0800
From: "Steve Huffman"
To: "John Graham-Cumming"
Subject: Re: tic-tac-toe and such

On 1/14/07, John Graham-Cumming wrote:
> Steve Huffman wrote:
> > The way we do comment threading is we keep the entire thread in
> > memory, and precompute all the sorts whenever someone comments. That
> > huge thread was making the (usually fairly quick) sorting take
> > forever.
> Interesting. Was the situation made worse by the fact that my comments
> were replies to other comments in the same thread, or was it just the
> sheer number?

Both, I presume. But a portion of the algorithm is recursive, so the
deep nests were probably extra deadly. I'll dig into it at some point
and let you know if it becomes prank-safe.

Date: Sun, 14 Jan 2007 21:27:57 +0100
From: "John Graham-Cumming"
To: "Steve Huffman"
Subject: Re: tic-tac-toe and such

On 1/14/07, Steve Huffman wrote:
> Both, I presume. But a portion of the algorithm is recursive, so the
> deep nests were probably extra deadly. I'll dig into it at some point
> and let you know if it becomes prank-safe.

Don't go to extra work on my account... after all I could just email
you the appropriate INSERT INTO statements and the thread could get
built in a couple of minutes :-)


After the Kevin Rose intervention I got the following from Digg:

Date: Fri, 28 Jul 2006 21:27:03 +0200
From: Digg Abuse
To: John Graham-Cumming
Subject: Re: banning of jgrahamc for misuse

Your account has been unbanned. Your account was banned for violating
digg Terms Of Use, submitting the exact same story in less then 3
minutes time, that's what spammers usually do on digg. As a consequence,
we banned your account. Your account was NOT banned for linking to or for submitting a joke.

-The Digg Watch Team.

Thursday, September 16, 2010

What happened to the Open Graph Protocol?

Back in April, Facebook announced a new 'protocol' (actually metadata markup for web pages) called Open Graph Protocol. The 'protocol' was yet another set of metadata for marking up web pages (akin to RDFa and microformats and HTML5 microdata) to give them some machine readable semantic information (BTW How long before the term 'semantic web' goes the way of 'artificial intelligence'?).

In its initial incarnation the Open Graph Protocol had one major use: enabling the sprinkling of Facebook Like buttons all over the web. And the metadata proposed in the 'protocol' was very Facebook specific. For example, a person could be identified as one of 'actor, athlete, author, director, musician, politician, or public_figure'. So, no butcher, baker or candlestick maker.
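The markup itself is just a handful of meta tags in a page's head. Here's a minimal example; og:title, og:type, og:url and og:image are the protocol's four required properties, and the values are invented for illustration:

```html
<head>
  <!-- Minimal Open Graph Protocol markup: the four required properties. -->
  <!-- The URLs and title here are made up. -->
  <meta property="og:title" content="Example Article" />
  <meta property="og:type" content="article" />
  <meta property="og:url" content="http://example.com/article" />
  <meta property="og:image" content="http://example.com/thumb.jpg" />
</head>
```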

And here's where every single metadata/semantic web system runs into trouble: how do you define a schema for the meaning of things? How do you decide what's meaningful and what the right taxonomy or ontology is? Whilst Open Graph Protocol proposed a simple and easy to implement markup syntax for web pages, it didn't attack the real problem: what is the metadata itself? In fact, any fool can come up with a markup language; that's the easy part.

One really well developed metadata standard is the Dublin Core (which Open Graph Protocol is 'inspired by'). It's been worked on since 1994. Sixteen years into Dublin Core there's a workable set of metadata for describing things like the title of a page, or the author of a book. The lesson is that if you mess with metadata you're in for a long haul: getting it right is hard.

Another metadata standard is the Dewey Decimal System, which tries to give a numerical meaning to any book. Here are the subcategories under '200 Religion':

210 Philosophy & theory of religion
220 The Bible
230 Christianity & Christian theology
240 Christian practice & observance
250 Christian pastoral practice & religious orders
260 Christian organization, social work & worship
270 History of Christianity
280 Christian denominations
290 Other religions

This is precisely the problem that Open Graph Protocol has today: its metadata is heavily influenced by the thinking of its creators. In the Dewey Decimal System it's clear that Christianity was important and everything else was an "Other religion". For Facebook, metadata that matters to Facebook is what's important.

Unfortunately, a read of the Open Graph Protocol mailing list (which contains just 390 messages in total) shows that the problem of defining good metadata isn't being attacked. And the mailing list's traffic is faltering:

Admittedly, halfway through September there have been 13 messages (more than the 8 in August).

My take is that Open Graph Protocol hasn't been widely adopted because its underlying real goal is not better semantic data for web sites, but better semantic data for Facebook.

Wednesday, September 15, 2010

Babbage's Debugger

In 1826 Charles Babbage realized that understanding the internal state of his engines was an arduous task, and in an extensive paper on the subject (On a method of expressing by signs the action of machinery) he wrote:

In the construction of an engine, [...], I experienced great delay and inconvenience from the difficulty of ascertaining from the drawings the state of motion or rest of any individual part at any given instant in time: and if it became necessary to enquire into the state of several parts at the same moment the labour was much encreased.
The difficulty of retaining in mind all the cotemporaneous and successive movements of a complicated machine, and the still greater difficulty of properly timing movements which had already been provided for, induced me to seek for some method by which I might at the glance of the eye select any particular part, and find at any given time its state of motion or rest, its relation to the motions of any other part of the machine, and if necessary trace back the sources of its movement through all its successive stages to the original moving power.

In the paper he goes on to develop a notation that allows him to draw something similar to a sequence diagram for a machine. But his diagram is at a very low level: it describes the motion of individual parts of a machine.

And he uses the notation to analyze the operation of a clock, drawing a large picture of the motion of each part of the clock and how motion of one piece influences another. From the diagram he is able to trace back the movement of the clock's minute hand to its original source of power.

He concluded by saying:

The signs [...] will form as it were an universal language; and to those who become skilful in their use, they will supply the means of writing down at sight even the most complicated machine, and of understanding the order and succession of the movements of any engine of which they possess the drawings and the mechanical notation. In contriving machinery, in which it is necessary that numerous wheels and levers, deriving their motion from distant part of the engine, should concur at some instant of time, or in some precise order, for the proper performance of a particular operation, it furnishes important assistance; and I have myself experienced the advantages of its application to my own calculating engine, when all other methods appeared nearly hopeless.

Since, at that time, Babbage was concerned with creating non-programmable machines such as the Difference Engine, his notation is the closest thing possible to a debugger. It allowed him to understand the state of the machine at any moment in time and trace back how that state was reached.

Clearly, that's not quite the same thing as the way debuggers are used today, but Babbage needed to debug prior to building the machine. He was using a form of static analysis to ensure that a machine would work.

The Myth of the Boy Wizard

I woke up this morning to a media storm of reports about the demise of Haystack. It's unfortunate that the subtext of many of these articles will be that Haystack's leader, Austin Heap, pulled the wool over the media's eyes. The real story should be the media's reaction to the software in the first place.

Haystack and Heap were lauded by the press. Newsweek wrote about the pair in glowing terms in August and the BBC also reported the same story. Earlier in the year The Guardian made Heap Innovator of the Year and posted a cringeworthy video interview with him. They wrote one story under the subhead

Austin Heap, the programmer from California, explains how he created Haystack, the software that broke the grip of Iran's censors after the disputed 2009 election

All these stories are light on actual facts and opinions from anyone other than Heap himself. And Heap does seem to have skirted the truth (or at least done a good publicist's job of exaggerating). He allowed The Guardian to write "Heap is the creator of Haystack, a piece of software which was a key technology used by Iranians to disseminate information outside the country in the protests that followed the disputed election result in June 2009" (and say the same thing in the video interview) without correcting them.

We know today that Haystack doesn't yet exist. There's a test version that's been disseminated to a small number of people in Iran, and that test version doesn't live up to the hype (from the media and from Heap himself). But it's worth putting aside Heap's narcissism and looking at the media response. Journalists deal with people pushing stories and press releases every single day. Part of their job is to look through the claims and dig out the reality. That didn't happen for Haystack.

It's notable that the news stories around Haystack focus on interviews with Heap, and don't include quotes or opinions from any reputable computer security folks. Or anyone who's worked on the problem of hiding Internet traffic in the past. That's simply a failure to be a journalist. There's no excuse for not asking a computer security expert for an opinion before publishing.

Today the press is reporting the opinion of computer security experts; why didn't they ask when the story first broke?

The answer, I think, comes in the form of The Myth of the Boy Wizard. For the media the David/Goliath story of a cocky kid who takes on a government is irresistible. The media loves these "14 year old writes winning iPhone app", "16 year old Indian boy invents solar panel" stories. The story is not the fact that Austin Heap took on Iran; the story is Austin Heap.

The Boy Wizard is a potent image for the media because tied up in it are our own fears of aging and our hopes for salvation. The idea that the young are smarter than the old, and that the young will somehow save the old from their own problems, makes a wonderful subtext that draws readers in. Who hasn't read a story about a youthful genius and shaken their head and thought: "He's so young!" or "I could never have done that" or even "I wish I had the free time to do that"?

And so the media ignored people like me with their greying hair and ran with the story. At the same time Heap was ignoring us also. We know now that many people who know something about computer security tried to contact him and offer help. They were ignored or rebuffed. When I first saw the story of Haystack in Newsweek my BS detector went wild and I blogged my response and I wrote directly to Newsweek saying:

In your article "Computer Programmer Takes On the World's Despots" you appear to have taken the author of the supposed Haystack program at his word. There are no quotes from people who've used the software, nor from people who've seen the software. How do we know that Austin Heap is telling the truth, and, more importantly, how do we know that the software works as advertised?

Surely, it's very basic journalism to have talked to more than one person about this subject.

I received no response.

Had I known about The Guardian's articles back in March I would have been banging on them as well. It's not just bad journalism to take someone at their word and publish glowing articles, in this case it's downright dangerous. Real people inside Iran could have been endangered by this over-hyped piece of software.

The Guardian, and others, need to think hard about their actions, and not simply publish "we were duped" stories in response. Any reputable computer security person would have told you of their doubts about Haystack, but it looks like many media outlets, The Guardian included, just didn't bother to ask.

Tuesday, September 14, 2010

Fooled by pseudorandomness

When people talk about codes and ciphers they get very excited about the cryptographic algorithms: everything from Enigma through DES to AES and elliptic curves excites the popular imagination. Oddly, the Achilles' heel of many secure systems is much more mundane and simpler to understand.

Cryptographic systems require good random numbers. And by good I mean unpredictable. That means that whatever your source of random numbers is, I shouldn't be able to predict the next number it's going to give. And I definitely shouldn't be able to do that after seeing a few numbers it has come up with.

Back in the Second World War, Nazi Germany had a lovely cryptographic system called Lorenz (which the British referred to as Fish). It relied on generating random numbers. But, unfortunately, the numbers were predictable and the British were able to generate the same sequence of random numbers and break the code.

More recently there have been examples of poor use of random numbers that make the headlines (at least in the places I read headlines). There's the story of the US game show contestant on Press Your Luck who realized that a sequence of lights wasn't random and memorized it to win >$100k.

Another story relates how a team discovered how to predict the order of cards in online poker because of poor randomization.

And not long ago I pointed out a theoretical attack on the web site Hacker News, which someone subsequently independently discovered and exploited. And the list goes on.

Fundamentally, these attacks happen because the random numbers generated are predictable (either by looking at a sequence of numbers, or by knowing how the random sequence got started). That's because common computer random number generators are based on mathematical formulae that deterministically produce a sequence of apparently random numbers. These are called pseudorandom number generators. They are great if you are doing something like Monte Carlo simulation, but bad for keeping secrets.
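To see how bad, here's a sketch of the simplest pseudorandom generator, a linear congruential generator, using the classic ANSI C constants. An attacker who sees a single output can compute every subsequent one:

```python
# Linear congruential generator with the classic ANSI C / glibc-style
# parameters: x' = (a*x + c) mod m. Each output fully determines the next.
a, c, m = 1103515245, 12345, 2**31

def lcg(seed):
    """Yield the LCG's output stream starting from `seed`."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(42)
observed = next(gen)                 # an attacker sees just one output...
predicted = (a * observed + c) % m   # ...and computes the next one exactly
assert predicted == next(gen)
```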

And they are a disaster for game shows, online poker, Nazi Germany and anything cryptographic. If you need random numbers (and you need them a lot in computer security, for things like generating keys) you need a Cryptographically Secure Pseudorandom Number Generator (CSPRNG). A CSPRNG will give you unpredictability. Otherwise you are in big trouble: if an attacker can predict the random numbers underlying your cryptography, your entire security scheme is likely in jeopardy.
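The contrast is easy to demonstrate. In Python, the random module is a Mersenne Twister (predictable), while the secrets module exposes the operating system's CSPRNG; secrets is a modern standard-library addition, used here purely as an illustration:

```python
import random
import secrets

# Mersenne Twister: deterministic, so a known seed reproduces the whole
# stream. Fine for simulation; fatal for keys.
random.seed(1234)
stream_a = [random.getrandbits(32) for _ in range(3)]
random.seed(1234)
stream_b = [random.getrandbits(32) for _ in range(3)]
assert stream_a == stream_b  # fully predictable given the seed

# The OS CSPRNG via the secrets module: what you want for keys and nonces.
key = secrets.token_bytes(16)    # 128-bit key
nonce = secrets.token_hex(12)    # 96-bit nonce, hex-encoded
```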

One of the first places I look for vulnerabilities is in the use of random numbers. People screw up here all the time. Which is why I wasn't surprised by something nasty lurking in a technical description that's come out with the current Haystack storm.

The main developer wrote in a private forum:

We use the Boost implementation of Mersenne Twister for our CSPRNG, which we use to generate session keys, nonces, and general purpose random numbers, though the specific CSPRNG used is subject to change, at least before release. We seed the CSPRNG with entropy from the system entropy source (CryptGenRandom under Windows and /dev/random under Unix systems).

The Mersenne Twister is emphatically not a CSPRNG. That alone makes the foundations of Haystack look very shaky (unless they are doing something else to take that output and make it into a CSPRNG).

If you are doing any work where you need random numbers, use a CSPRNG. Don't invent your own, or rely on something weak. All platforms have them available.

Haystack project responds to 'security concerns', looks like it's falling apart

In a rather ranty post of mine I criticized the Haystack project for a lack of openness. Happily, there's an official blog post indicating that they are stopping testing because of security concerns:

Recently, there has been a vigorous debate in the security community regarding Haystack’s transparency and security. We believe that many of the points made in this debate were valid. As a result, and in order to ensure Haystack’s security, we have halted ongoing testing of Haystack in Iran pending a security review. We have begun contacting users of Haystack to tell them to cease using the program. We will not resume testing until this third party review is completed and security concerns are addressed in an open and transparent way.

It would be nice if they pointed to this debate, talked about which points they found valid and told us who was doing the third-party review etc. They really need to engage people who've been involved in this sort of thing to make sure that their code is going to work.

Roll on the openness and transparency.

Update: Oh wait, a read of Jacob Applebaum's Twitter feed makes it look like he's analyzed Haystack and the results are not good at all. And here's what he appears to have to say:

Hi - I have analyzed Haystack. It is total garbage and Austin Heap has pulled one over on the world.

I spoke with Heap on Friday and he promised that the network was disabled before we spoke on Friday. I was very sad to need to prove to a few specific people that it was still on late Sunday evening.

My findings are the reason that the Haystack network has now been shut off, his lead developer apparently turned the network down and locked him out of the machines. His advisory board has resigned as of today according to my sources

An ugly situation. Probably not good that Danny O'Brien wrote the following on Twitter:

never been angrier than right now. I can't actually describe how broken @haystacknetwork is, because to do so would put people at risk.

And the main developer has apparently quit:

What I am resigning over is the inability of my organization to operate effectively, maturely, and responsibly. We have been disgraced. I am resigning over dismissing pointed criticism as nonsense. I am resigning over hype trumping security. I am resigning over being misled, and over others being misled in my name.

Update: Here's a good summary of the situation. And here's a great summary of all the glowing media at the time.

Wonder if BBC, Newsweek, The Guardian etc. will apologize? They should. It's shameful to see this sort of reporting. Shameful.

Monday, September 13, 2010

GAGA-1: Weight Budget

A quick tally of various components and their weights.

I need to keep the entire thing under 1kg so that the balloon I plan to use (a 1200g latex sounding balloon) can give sufficient lift for a fast ascent: 5m/s.

Capsule 200g
Camera 215g
Arduino Duemilanove 28g
Telit GSM862 34g
3.7V LiPo Battery 28g
4 AA Batteries 88g
GSM Antenna 46g
2 x GPS Antenna 40g
Lassen IQ GPS 20g

That's roughly 700g leaving me 300g for bits of wiring, a PCB for the flight computer, the radio module, and some discrete components and the rope.
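The tally is easy to sanity-check (weights straight from the table above):

```python
# GAGA-1 weight budget: component weights in grams, from the table above.
components = {
    "Capsule": 200,
    "Camera": 215,
    "Arduino Duemilanove": 28,
    "Telit GSM862": 34,
    "3.7V LiPo battery": 28,
    "4 AA batteries": 88,
    "GSM antenna": 46,
    "2 x GPS antenna": 40,
    "Lassen IQ GPS": 20,
}

budget = 1000                       # 1kg total limit for a 5m/s ascent
total = sum(components.values())
print(total, budget - total)        # → 699 301
```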

Everything about this project is a challenge: weight, extreme temperatures, dealing with the wind, a combination of radio and computer work, an autonomous camera, ...

GAGA-1: The Capsule

The other thing that I did over the weekend was some initial work on the capsule. It consists of an expanded polystyrene box from Ferribox. Specifically, the XPS-2 which is a 245mm cube with an interior 150mm cube (i.e. very thick walls!) and weighs 200g (I'll dedicate another post to the weight budget but the whole thing needs to weigh less than 1kg). It has a fitted lid with a lip to keep it in place. Here's what it looks like:

A quick sanity check of space inside shows that the cube will easily contain the recovery GPS (bottom right), main flight computer (top right) and camera (left). Clearly I'm not showing all the wiring, antennae, probes and power, but there's plenty of room:

I plan to paint the capsule (with a non-solvent-based paint; solvent-based paints eat polystyrene) a very bright colour (probably fluorescent yellow) for easy recovery when it lands. The capsule will be attached to the balloon/parachute/radar reflector assembly by some nice nautical rope at each of the four corners. I have some strong, lightweight rope like this:

Since the stratosphere is at -55C there's clearly some temperature change to be worried about, hence the insulated box. Like everything else in GAGA-1 I've been testing and this weekend I did a rough and ready test of expected internal temperatures by abusing physics wildly and using my home freezer.

Here's the physics abuse: suppose the capsule is launched at ground level at 20C, rises at a constant rate to an altitude where the temperature is -55C, and the temperature gradient between ground and sky is constant. Then the average temperature encountered is (20 + -55) / 2 = -17.5C. Conveniently, my freezer is at -18C, so I decided to shove the box in the freezer for two hours to see what happened. Inside the box was a digital thermometer transmitting the temperature to an external monitor. I wrote down the temperature every 15 minutes.

Here's a chart of the actual temperatures (green) and the predicted temperature trend using Newton's Law of Cooling (red). X-axis is elapsed time in minutes, Y-axis is internal temperature in C.

Given a maximum flight time of 3 hours the prediction would be for the internal temperature to drop to somewhere close to -15C. That's acceptable.
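The red prediction curve is Newton's Law of Cooling: T(t) = T_ambient + (T_start - T_ambient)·e^(-kt). Here's a minimal sketch of that calculation; note the cooling constant k is fitted from a hypothetical two-hour freezer reading of -11C, not from my actual logged data:

```python
import math

T_AMBIENT = -18.0  # freezer temperature in C
T_START = 20.0     # internal temperature at the start in C

# Fit the cooling constant k from one (hypothetical) measurement:
# an internal temperature of -11C after 120 minutes in the freezer.
t_meas, temp_meas = 120.0, -11.0
k = -math.log((temp_meas - T_AMBIENT) / (T_START - T_AMBIENT)) / t_meas

def predict(t):
    """Internal temperature in C after t minutes, per Newton's Law of Cooling."""
    return T_AMBIENT + (T_START - T_AMBIENT) * math.exp(-k * t)

# With this k, a 3-hour (180 minute) run predicts roughly -15C inside.
print(round(predict(180), 1))
```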

As well as abusing physics this calculation ignores two other pertinent facts:

1. There's actually a source of heat inside the capsule: the electronics. The camera, GPS and radio transmitter all heat up while operating so there will be some internal warming.

2. The box won't be completely sealed as holes will be pierced for the four antennae and the camera lens.

Will be interesting to see how the temperature really drops on the day.

Kindle Kovers

Here's an idea that I don't have time to work on, but perhaps you do: plastic covers for the Kindle that have the cover of a book you are reading printed on them.

I love the Kindle, but I lose the whole display aspect of a physical book: people seeing what I'm reading. So here's a way to solve that: Kindle Kovers. Imagine a slim plastic jacket for the Kindle. On one side there's a space for the screen and keyboard; on the other there's a laser-printed image of a book cover that everyone can see while I'm reading.

Such a service should be quite doable. Kindle users could forward their Kindle receipts to Kindle Kovers and get sent a cover in return with the appropriate image on it. Since the Kindle receipt contains the book name and the buyer's name and address, the service would be simple to use.

To get the images it's just a matter of parsing the book name and grabbing and printing the large-size cover image from Amazon itself. All you'd need are a supply of plastic jackets the right size for the Kindle and a color printer.

To make money you use affiliate links. When people send in a request you send them back a link to confirm the dispatch of the cover (correct address, Kindle size, etc.). On that page you show related books (since you know the book they are buying) with affiliate links.


If you implement this idea, slip me 10% :-)

Sunday, September 12, 2010

GAGA-1: 2,766 tedious photographs and a log file

I couldn't take the uBASIC code for controlling the GAGA-1 camera any longer and I rewrote the entire thing in the other language that CHDK supports: Lua. This has resulted in a much better program with additional functionality: a proper log file stored on the camera's SD card, reading of battery levels and internal temperatures and a simple routine to check that the camera has enough memory space for the projected flight time.

Last night I ran a test taking 10 pictures per minute with a 16GB memory card. The result is a pair of drained Energizer Ultimate Lithium batteries, 2,766 tedious photographs out of the back window of the house, and a log file indicating that the camera ran from 1704 to 0005 before failing. The camera actually took fewer than 10 pictures per minute: each image took a few seconds longer to capture as the night wore on, while the gap between cycles remained at 6 seconds.

So the camera ran for 7h01m on new batteries and took a picture every 9.1s on average.
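Those averages are easy to check from the run times reported above (a sketch; I've taken the start and end as exactly 17:04:00 and 00:05:00):

```python
from datetime import datetime

# The log reports the camera running from 1704 to 0005 the next day.
start = datetime.strptime("20100911170400", "%Y%m%d%H%M%S")
end = datetime.strptime("20100912000500", "%Y%m%d%H%M%S")

duration = (end - start).total_seconds()
pictures = 2766

hours, rem = divmod(int(duration), 3600)
print(f"Ran {hours}h{rem // 60:02d}m, one picture every "
      f"{duration / pictures:.1f}s on average")
# Ran 7h01m, one picture every 9.1s on average
```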

The camera's log file begins:

20100911170411,GAGA Camera Control
20100911170411,Self-check started
20100911170411,Assert property (49) -32764 == -32764 (Not in manual mode)
20100911170411,Assert property (5) 0 == 0 (AF Assist Beam should be Off)
20100911170412,Assert property (6) 0 == 0 (Focus Mode should be Normal)
20100911170412,Assert property (8) 0 == 0 (AiAF Mode should be On)
20100911170412,Assert property (21) 0 == 0 (Auto Rotate should be Off)
20100911170412,Assert property (29) 0 == 0 (Bracket Mode should be None)
20100911170412,Assert property (57) 0 == 0 (Picture Mode should be Superfine)
20100911170412,Assert property (66) 0 == 0 (Date Stamp should be Off)
20100911170412,Assert property (95) 0 == 0 (Digital Zoom should be None)
20100911170412,Assert property (102) 0 == 0 (Drive Mode should be Single)
20100911170412,Assert property (133) 0 == 0 (Manual Focus Mode should be Off)
20100911170413,Assert property (143) 2 == 2 (Flash Mode should be Off)
20100911170413,Assert property (149) 100 == 100 (ISO Mode should be 100)
20100911170413,Assert property (218) 0 == 0 (Picture Size should be L)
20100911170413,Assert property (268) 0 == 0 (White Balance Mode should be Auto)
20100911170413,Assert 2010 > 2009 (Unexpected year)
20100911170413,Assert 17 > 6 (Hour appears too early)
20100911170413,Assert 17 < 20 (Hour appears too late)
20100911170413,Assert 3138 > 2700 (Batteries seem low)
20100911170413,Assert 5078 > 1800 (Insufficient card space)
20100911170413,Self-check complete
20100911170424,Starting picture capture
20100911170426,Picture 1 taken
20100911170426,Temperatures: 34, 34, 30
20100911170426,Battery level 3017
20100911170434,Picture 2 taken
20100911170434,Temperatures: 34, 34, 30
20100911170434,Battery level 2983
20100911170442,Picture 3 taken

The time stamp is YYYYMMDDhhmmss. The three temperatures are the optical components, the CCD and the battery compartment. The camera was warm when I started because I'd been playing with it. The battery level is in mV.
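That log format is simple enough to post-process. Here's a sketch of pulling the temperature readings out of lines like those above (the function name and return shape are my own, not part of the camera code):

```python
import re
from datetime import datetime

# Matches lines like: 20100911170426,Temperatures: 34, 34, 30
TEMP_RE = re.compile(r"^(\d{14}),Temperatures: (\d+), (\d+), (\d+)$")

def parse_temperatures(line):
    """Return (timestamp, optics, ccd, battery) for a temperature log line,
    or None if the line is some other kind of log entry."""
    m = TEMP_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group(1), "%Y%m%d%H%M%S")
    return when, int(m.group(2)), int(m.group(3)), int(m.group(4))

print(parse_temperatures("20100911170426,Temperatures: 34, 34, 30"))
```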

The temperature peaked early on and then settled down to a steady warm glow: the camera was warm to the touch while running. This'll be a little different on the flight since the external temperature of the capsule will drop to -55C. Here's a chart showing the CCD and Battery temperatures:

The battery declined steadily throughout the run:

And here's the code for anyone who wants to try the same thing:

-- GAGA Camera Control Code
-- Copyright (c) 2010 John Graham-Cumming
-- Performs the following steps:
-- Performs a self-check
-- Waits for a predetermined amount of time
-- Enters loop doing the following:
-- Take a number of photographs in succession
-- Wait a predetermined amount of time

@title GAGA Camera Control

@param s Start-up delay (secs)
@default s 10

@param c Pictures per iteration
@default c 1

@param i Iteration delay (secs)
@default i 6

@param f Flight time (hrs)
@default f 3


function stamp()
  return string.format("%4d%02d%02d%02d%02d%02d",
    get_time("Y"), get_time("M"), get_time("D"),
    get_time("h"), get_time("m"), get_time("s"))
end
ok = 1

function log(m)
  l = io.open("A/gaga.log", "ab")
  if ( l ~= nil ) then
    l:write(string.format("%s,%s\n", stamp(), m))
    l:close()
  end
end

function assert_error(e)
  er = string.format("ERROR: %s", e)
  log( er )
  ok = 0
end

function assert_prop(p, v, m)
  pp = get_prop(p)
  log( string.format("Assert property (%i) %i == %i (%s)", p, pp, v, m))
  if ( pp ~= v ) then
    assert_error( m )
  end
end

function assert_eq(a, b, m)
  log( string.format("Assert %i == %i (%s)", a, b, m))
  if ( a ~= b ) then
    assert_error( m )
  end
end

function assert_gt(a, b, m)
  log( string.format("Assert %i > %i (%s)", a, b, m))
  if ( a <= b ) then
    assert_error( m )
  end
end

function assert_lt(a, b, m)
  log( string.format("Assert %i < %i (%s)", a, b, m))
  if ( a >= b ) then
    assert_error( m )
  end
end

-- The sleep function uses milliseconds so the s and i values need
-- to be converted from seconds

ns = (f * 60 * 60 * c) / i
s = s * 1000
i = i * 1000

log( "GAGA Camera Control" )

-- Now enter a self-check of the manual mode settings

log( "Self-check started" )

assert_prop( 49, -32764, "Not in manual mode" )
assert_prop( 5, 0, "AF Assist Beam should be Off" )
assert_prop( 6, 0, "Focus Mode should be Normal" )
assert_prop( 8, 0, "AiAF Mode should be On" )
assert_prop( 21, 0, "Auto Rotate should be Off" )
assert_prop( 29, 0, "Bracket Mode should be None" )
assert_prop( 57, 0, "Picture Mode should be Superfine" )
assert_prop( 66, 0, "Date Stamp should be Off" )
assert_prop( 95, 0, "Digital Zoom should be None" )
assert_prop( 102, 0, "Drive Mode should be Single" )
assert_prop( 133, 0, "Manual Focus Mode should be Off" )
assert_prop( 143, 2, "Flash Mode should be Off" )
assert_prop( 149, 100, "ISO Mode should be 100" )
assert_prop( 218, 0, "Picture Size should be L" )
assert_prop( 268, 0, "White Balance Mode should be Auto" )
assert_gt( get_time("Y"), 2009, "Unexpected year" )
assert_gt( get_time("h"), 6, "Hour appears too early" )
assert_lt( get_time("h"), 20, "Hour appears too late" )
assert_gt( get_vbatt(), 3000, "Batteries seem low" )
assert_gt( get_jpg_count(), ns, "Insufficient card space" )

log( "Self-check complete" )

if ( ok == 1 ) then
  sleep(s)
  log( "Starting picture capture" )

  n = 0

  while ( 1 ) do
    tc = c
    while ( tc > 0 ) do
      shoot()
      n = n + 1
      log( string.format("Picture %i taken", n ))
      tc = tc - 1
      log( string.format("Temperatures: %i, %i, %i",
        get_temperature(0), get_temperature(1), get_temperature(2) ))
      log( string.format("Battery level %i", get_vbatt()))
    end
    sleep(i)
  end
end

log( "Done" )

The only remaining worry was that the camera was showing the 'shake' symbol while taking pictures. This turned out to be simply because night was falling and I have the camera locked into manual mode at ISO 100 with no flash. The shake symbol was a warning of a low f-stop and long shutter open period. Will be a bit different in the ever sunny stratosphere.