Friday, May 14, 2010

If you're going to search the web, make an intelligent guess first

Today on Hacker News a user posed an irresistible question, "What format is this?", along with a snippet of some sort of computer code:

{D1531,1000,1501|} {C|} {U2;0130|} {D1531,1000,1501|}
{AX;+000,+000,+00|} {AY;+05,0|}
{PC000;0922,0555,15,15,H,11,B|} {RC00;LABELTITLE|}
{PC001;0865,0555,15,15,H,11,B|} {RC01;VOLTAGE|}

I looked at it and couldn't resist the challenge of figuring it out. It reminded me of the sort of code used to drive printers and plotting devices. There appeared to be references to labels and to x and y axes. It made me think of the old Epson MX-80 command sequences and PostScript, of Hayes modem commands and NMEA 0183, and of other, much more obscure things I've seen for talking to embedded microprocessors.

It looked like the structure was a sequence of commands enclosed in { } with a command letter or letters at the beginning.
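To make that guess concrete, here is a minimal Python sketch that splits the snippet into commands. The grammar is only my inference from the sample (uppercase command letters, then arguments, then a closing |}); it is not taken from any manual, and parse_commands is a made-up helper name.

```python
import re

# Assumed grammar, inferred from the sample only:
# "{" + uppercase command letters + everything up to "|}"
COMMAND_RE = re.compile(r"\{([A-Z]+)([^|{}]*)\|\}")

def parse_commands(text):
    """Return (command letters, raw argument string) pairs."""
    return [(m.group(1), m.group(2).lstrip(";")) for m in COMMAND_RE.finditer(text)]

snippet = "{D1531,1000,1501|} {C|} {U2;0130|} {AX;+000,+000,+00|} {RC00;LABELTITLE|}"
for name, args in parse_commands(snippet):
    print(name, args)
```

Whether the digits that directly follow the letters (as in RC00) belong to the command name or to its arguments is something the sample alone can't settle; the sketch simply treats them as arguments.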

But how exactly should you go searching for this sort of thing? Try Googling the first command {D1531,1000,1501|} and you'll find nothing of any use. So, this is where an intelligent guess comes in.

Of all the commands in the block given, {AX;+000,+000,+00|} looked like the most findable to me. The others seemed to have very specific arguments that are unlikely to turn up elsewhere, but I guessed that {AX;+000,+000,+00|} had something to do with setting an x coordinate or x axis to a default position of all zeroes. That seemed like a very common thing to do and worth a Google search.
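I made that guess by eye, but the intuition can be roughly imitated in code. The following is a toy Python sketch, not a real search technique: it assumes that arguments made up mostly of zeroes look like defaults, and the default_likeness function and its scoring rule are my own invention for illustration.

```python
def default_likeness(command):
    """Fraction of argument digits that are zero (a toy proxy for 'looks like a default')."""
    body = command.strip("{}|")
    # Use the text after the first ';' when present, otherwise the whole body.
    args = body.split(";", 1)[1] if ";" in body else body
    digits = [ch for ch in args if ch.isdigit()]
    if not digits:
        return 0.0  # nothing numeric to judge, so treat it as not default-looking
    return digits.count("0") / len(digits)

commands = [
    "{D1531,1000,1501|}", "{C|}", "{U2;0130|}",
    "{AX;+000,+000,+00|}", "{AY;+05,0|}",
    "{PC000;0922,0555,15,15,H,11,B|}", "{RC00;LABELTITLE|}",
    "{PC001;0865,0555,15,15,H,11,B|}", "{RC01;VOLTAGE|}",
]

# Pick the command whose arguments look most like untouched defaults.
best = max(commands, key=default_likeness)
print(best)
```

On the snippet from the question this picks {AX;+000,+000,+00|}, the same command I chose to search for.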

The very first result is a PDF from Century Systems, Inc.: the Century Eagle 4, Century Eagle 5 Basic Interpreter White Paper, which shows a BASIC program outputting commands to a Century Eagle label printer.

Google "Century Eagle 4" and you get to the product page which contains a link to a PDF of the Eagle 4 Programmer Manual that details all the commands asked about by the original poster.

The key to this search efficiency is making an intelligent guess about what will be findable and what will be distinguishing. In this case, it's the findable that matters, because the command sequences are so obscure that it's unlikely you'll be wading through pages of almost, but not quite, the right results.

In other cases, it's about both finding and distinguishing. But that's the subject of an older blog post of mine.
