Monday, July 10, 2006

A simple code for entering latitude and longitude to GPS devices

This post proposes a coding system for specifying any location on Earth to within about 10m using a 10-character code that includes a check to catch errors made when entering the code.

The idea is that anyone could publish their location by writing something like VUF DDC F8UG. This short code could be entered into a GPS device, giving you any spot on the globe.

I'm calling it the SOC: Simple Orientation Code.

Some example uses:
  1. I could print my company's SOC on my business cards and visitors could punch it into their car navigation system and come visit
  2. A restaurant could publish its SOC along with its phone number (after all it's the same length as a phone number so it's something people can easily grok) making the restaurant easy to find
  3. Geocachers could publish SOC trails for people hunting down caches
  4. SCUBA divers could refer to dive sites by their SOC (10m of accuracy is enough surface accuracy for most people)
Here's how the code works.

First you need the latitude and longitude of the location you are talking about to 4 decimal places, which gives about 10m of accuracy. Treat latitude as ranging from 0 to 180 degrees (shift it from -90 to 90 degrees by adding 90) and longitude as ranging from 0 to 360 degrees (shift it from -180 to 180 degrees by adding 180, which is what the Perl code below does). Then treat the two numbers as integers (i.e. take the 4 decimal place latitude or longitude and multiply by 10000) and you get two numbers: La and Lo.

La varies from 0 to 1,799,999 and Lo from 0 to 3,599,999. These two numbers can be combined to form a single number that I call P (your position) like this:

P = La * 3600000 + Lo

Extracting La and Lo from P is simply a matter of integer division: the quotient of P divided by 3,600,000 is La and the remainder is Lo.

P varies from 0 to 6,479,999,999,999 (that is, 1,799,999 * 3,600,000 + 3,599,999), which can be stored in 43 bits.
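The packing, unpacking and bit count above can be sketched in Perl as follows. This is only an illustration: the helper names pack_position, unpack_position and bits_needed are made up here and do not appear in the scripts further down, and La and Lo are assumed to be integers already scaled by 10000.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for illustration only.

# Combine scaled latitude (La) and longitude (Lo) into one position P.
sub pack_position {
    my ( $la, $lo ) = @_;
    return $la * 3600000 + $lo;
}

# Recover (La, Lo) from P: quotient and remainder on division by 3,600,000.
sub unpack_position {
    my ($p) = @_;
    my $lo = $p % 3600000;
    my $la = ( $p - $lo ) / 3600000;
    return ( $la, $lo );
}

# Count how many bits are needed to store a non-negative integer.
sub bits_needed {
    my ($n) = @_;
    my $bits = 0;
    while ( $n > 0 ) {
        $bits++;
        $n = int( $n / 2 );
    }
    return $bits;
}

# Largest P for the ranges given above: La = 1,799,999 and Lo = 3,599,999.
my $max_p = pack_position( 1799999, 3599999 );
print bits_needed($max_p), "\n";    # prints 43
```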

Now encoding P in some form typeable by a human requires an alphabet. The SOC alphabet consists of the following 32 characters:

ABCDEFGHJKLMNPQRTUVWXY0123456789

This is the standard English alphabet plus Arabic numerals 0 through 9 with the following letters removed: I, O, S, and Z. These are removed because I is easily confused with both 1 and J; O is easily confused with 0; S is easily confused with 5 and Z is easily confused with 2. These characters are removed to ensure that the code is minimally affected by bad handwriting.

Moreover an implementation using the SOC should silently perform the following translations: I becomes 1; O becomes 0; S becomes 5 and Z becomes 2. This way the user will not have to correct a poorly written SOC.
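A minimal sketch of that tidy-up step, assuming the SOC arrives as a single string. The sub name normalize_soc is made up for illustration, and stripping the grouping spaces is my assumption rather than something specified above.

```perl
use strict;
use warnings;

# Hypothetical helper: tidy up a SOC as a person might type or write it.
sub normalize_soc {
    my ($soc) = @_;
    $soc = uc $soc;           # accept lower-case input
    $soc =~ s/\s+//g;         # assumption: drop spaces used for grouping
    $soc =~ tr/IOSZ/1052/;    # I becomes 1, O becomes 0, S becomes 5, Z becomes 2
    return $soc;
}

print normalize_soc('vuf ddc f8ug'), "\n";    # prints VUFDDCF8UG
print normalize_soc('TREASURE'),     "\n";    # prints TREA5URE
```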

Each character in the alphabet represents a number between 0 and 31.

A(0) B(1) C(2) D(3) E(4) F(5) G(6) H(7) J(8) K(9) L(10) M(11) N(12) P(13)
Q(14) R(15) T(16) U(17) V(18) W(19) X(20) Y(21) 0(22) 1(23) 2(24) 3(25)
4(26) 5(27) 6(28) 7(29) 8(30) 9(31)
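The table above falls straight out of the position of each character in the alphabet string. A small sketch (the variable names char_for and value_for are mine, chosen for illustration):

```perl
use strict;
use warnings;

my $alpha = 'ABCDEFGHJKLMNPQRTUVWXY0123456789';

# Map each value to its character, and each character back to its value.
my @char_for  = split //, $alpha;
my %value_for = map { $char_for[$_] => $_ } 0 .. $#char_for;

print "$value_for{J} $value_for{0} $char_for[31]\n";    # prints 8 22 9
```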

P can be encoded using 10 characters from this alphabet. Each character carries 5 bits of information, so 10 characters hold 50 bits. Only 43 bits are needed for the position, which leaves 7 bits for an error checking code. The algorithm used to generate the check digit is a variant of the scheme used for ISBNs.

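The 43-bit P is broken into 11 4-bit numbers (44 bits in all, hence a zero bit padded on the left of P). The 11 numbers are p0 through p10, with p10 the least significant. Note that the Perl code below actually peels off 5-bit groups (it uses % 32 rather than % 16), so SOCs produced by the scripts follow the code rather than this description. A check digit C is calculated as follows: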

C = ( p0 * 37 + p1 * 31 + p2 * 29 + p3 * 23 + p4 * 17 + p5 * 13 + p6 * 11 + p7 * 7 + p8 * 5 + p9 * 3 + p10 * 2 ) mod 127

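C is then appended to P as the low 7 bits (P * 128 + C) and the resulting 50-bit number is written out as the 10-character SOC, most significant character first.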
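Here is a sketch of the check calculation as the Perl scripts below perform it: groups are peeled off with % 32, lowest group first, and weighted with the primes in ascending order. The sub name soc_check and the sample value of P are only for illustration.

```perl
use strict;
use warnings;

# Weights in the order the scripts apply them: lowest group of P first.
my @primes = ( 2, 3, 5, 7, 11, 13, 17, 23, 29, 31, 37 );

# Hypothetical helper mirroring the loop used in the scripts below.
sub soc_check {
    my ($p) = @_;
    my $c = 0;
    foreach my $prime (@primes) {
        $c += ( $p % 32 ) * $prime;    # weight the lowest group
        $p = int( $p / 32 );           # move on to the next group
    }
    return $c % 127;                   # 0..126, so it fits in 7 bits
}

# Append the check to P: multiply by 128 (a 7-bit shift), then add C.
my $p       = 5094268198722;           # arbitrary example position
my $soc_num = $p * 128 + soc_check($p);
print "$soc_num\n";
```

Because the modulus is 127, the check value never reaches 127, so it always fits in the 7 spare bits.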

Now for some Perl code that implements the encoding and decoding of SOCs.

Converting a latitude and longitude to a SOC:

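use strict;

if ( $#ARGV != 1 ) {
    die "Usage: to-soc <latitude> <longitude>\n";
}

my ( $lat, $lon ) = @ARGV;

my $alpha = 'ABCDEFGHJKLMNPQRTUVWXY0123456789';
my @alphabet = split(//,$alpha);

# Shift both values into non-negative ranges
$lat += 90;
$lon += 180;

# Scale to integers; adding 0.5 before int() rounds to the nearest
# integer so floating-point error cannot leave a fractional part
$lat = int($lat * 10000 + 0.5);
$lon = int($lon * 10000 + 0.5);

my $p = $lat * 3600000 + $lon;

# Leave the low 7 bits free for the check value
my $soc_num = $p * 128;

my @primes = ( 2, 3, 5, 7, 11, 13, 17, 23, 29, 31, 37 );

my $c = 0;

# Weighted sum over P, 5 bits at a time, lowest bits first
foreach my $prime (@primes) {
    $c += ($p % 32) * $prime;
    $p = int($p / 32);
}

$c %= 127;

$soc_num += $c;

my $digits = 10;

my $soc = '';

# Emit 10 characters, least significant first, prepending each one
while ( $digits > 0 ) {
    my $d = $soc_num % 32;
    $soc = $alphabet[$d] . $soc;
    $soc_num = int($soc_num/32);
    --$digits;
}

print "$soc\n";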

Converting a SOC back to a latitude and longitude:

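use strict;

if ( $#ARGV != 0 ) {
    die "Usage: from-soc <10-digit-soc>\n";
}

# Normalise the input: upper-case it, drop any grouping spaces and map
# the easily confused letters onto the digits they resemble
my $soc = uc($ARGV[0]);
$soc =~ s/\s+//g;
$soc =~ tr/IOSZ/1052/;

my $alphabet = 'ABCDEFGHJKLMNPQRTUVWXY0123456789';

# Reject anything that is not exactly 10 characters from the alphabet
if ( $soc !~ /^[$alphabet]{10}$/ ) {
    die "Incorrect SOC\n";
}

my $soc_num = 0;

# Rebuild the 50-bit number, 5 bits per character
foreach my $letter (split(//, $soc)) {
    $soc_num *= 32;
    $soc_num += index($alphabet, $letter);
}

# The low 7 bits are the check value, the rest is P
my $p = int($soc_num / 128);
my $check = $soc_num % 128;

my $lon = $p % 3600000;
my $lat = int($p / 3600000);

$lat /= 10000;
$lon /= 10000;

$lat -= 90;
$lon -= 180;

my @primes = ( 2, 3, 5, 7, 11, 13, 17, 23, 29, 31, 37 );

my $c = 0;

# Recompute the check value from P and compare it with the one supplied
foreach my $prime (@primes) {
    $c += ($p % 32) * $prime;
    $p = int($p / 32);
}

$c %= 127;

if ( $check != $c ) {
    die "Incorrect SOC\n";
} else {
    print "$lat $lon\n";
}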

This idea and code are being released by me into the public domain.

Those of you with a twisted mind might like to try to find points on the globe that have human-readable SOCs, for example by picking coordinates that contain a word in the SOC. Challenge: find a location on the globe that's something along the lines of TREASURE or STARTHERE.
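One way to start on the challenge is to fix an 8-character word and try every two-character ending, keeping the endings whose check value matches. The sketch below does that using the same arithmetic as the from-soc script. The sub names decode_soc and find_socs are made up for illustration, the latitude range test is my addition, and the word must already be written in the SOC alphabet (so S is typed as 5).

```perl
use strict;
use warnings;

my $alpha  = 'ABCDEFGHJKLMNPQRTUVWXY0123456789';
my @chars  = split //, $alpha;
my @primes = ( 2, 3, 5, 7, 11, 13, 17, 23, 29, 31, 37 );

# Decode a 10-character SOC. Returns (lat, lon), or an empty list if a
# character, the position or the check value is invalid.
sub decode_soc {
    my ($soc) = @_;
    my $n = 0;
    foreach my $ch ( split //, $soc ) {
        my $v = index( $alpha, $ch );
        return if $v < 0;
        $n = $n * 32 + $v;
    }
    my $check = $n % 128;
    my $p     = int( $n / 128 );
    my $lo    = $p % 3600000;
    my $la    = int( $p / 3600000 );
    return if $la > 1800000;    # assumption: reject latitudes past 90
    my $c = 0;
    foreach my $prime (@primes) {
        $c += ( $p % 32 ) * $prime;
        $p = int( $p / 32 );
    }
    return if $c % 127 != $check;
    return ( $la / 10000 - 90, $lo / 10000 - 180 );
}

# Try every two-character ending for an 8-character word.
sub find_socs {
    my ($word) = @_;
    my @found;
    foreach my $c1 (@chars) {
        foreach my $c2 (@chars) {
            my $soc = $word . $c1 . $c2;
            my @pos = decode_soc($soc);
            push @found, [ $soc, @pos ] if @pos;
        }
    }
    return @found;
}

printf "%s  %.4f %.4f\n", @$_ foreach find_socs('TREA5URE');
```

The two trailing characters carry the low 3 bits of P plus the 7 check bits, so a given 8-character word has at most 8 valid endings, all within a few tens of metres of each other.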

7 comments:

hutsefluts said...

I'd thought you ought to use:
P = La * 3600000 + Lo

If La = 1 and Lo = 0 with your proposed 3599999 would not differ from La=0 and Lo=3599999

Nice idea, but you might want to stress what is so wrong with numericals.

Hope this helps,

JP

John Graham-Cumming said...

I hang my head in shame. Of course it should be 3600000 and not 3599999. Silly, silly me.

The basic problems with numericals are:

1. Too long
2. No error checking

John.

Harley said...

In your examples, #2:
"making the restaurant easy to fine"

How much money would you fine them?
:-)

John Graham-Cumming said...

Thanks I've corrected the typo.

John.

Ray Marron said...

Removal of all vowels from your encoding alphabet would also remove the potential for "embarrassing" SOC codes. The subject of their replacements is another discussion.

Nicholas said...

I just saw this today, and wanted to test it out and show a few people, but running Perl scripts isn't easy for most web users. So I've made a PHP viewer:

http://www.udhaonline.net/scripts/soc2lat.php

And a few pre-made places to view:

JGBEKQN59P - Brisbane Parliament House

JMHWYA5LXY - Johannesburg Soccer Stadium

VND34MHNYL - Nürburgring, Germany

Ericeira Pirates said...

Hi,
Would it be possible to get me the coordinates for this M20CNACRXW ?
Thanks in advance
Best regards
J.