Skip to main content

Animated solution to the "Never gonna give you up" program problem

I mentioned the fun "smallest program to output the lyrics of 'Never gonna give you up'" the other day.  It being a bank holiday weekend I've come up with my own solution which has 589 bytes in Perl.

It works by building a dictionary from which to compress the lyrics and then there's a simple decompressor.  Like many of the solutions this works by having the dictionary be indexed by single bytes and be built into a single string that expands to the result.

Here's the compressor.
use strict;
use warnings;

my $n =<<EOF;
We're no strangers to love
You know the rules and so do I
A full commitment's what I'm thinking of
You wouldn't get this from any other guy
I just wanna tell you how I'm feeling
Gotta make you understand

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you

We've known each other for so long
Your heart's been aching but
You're too shy to say it
Inside we both know what's been going on
We know the game and we're gonna play it
And if you ask me how I'm feeling
Don't tell me you're too blind to see

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you

(Ooh, give you up)
(Ooh, give you up)
(Ooh)
Never gonna give, never gonna give
(Give you up)
(Ooh)
Never gonna give, never gonna give
(Give you up)

We've know each other for so long
Your heart's been aching but
You're too shy to say it
Inside we both know what's been going on
We know the game and we're gonna play it

I just wanna tell you how I'm feeling
Gotta make you understand

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you
EOF

my @b = map(chr, (reverse 129..254));

foreach my $i (reverse 2..173) {
    my $found = 1;
    while($found) {
        my %saving;
        $found = 0;

        foreach my $j (0..length($n)-$i) {
            my $s = substr($n, $j, $i);
            my $t = $n;
            $t =~ s/\Q$s\E/0/g;
            $saving{$s} = length($n) - length($t) - $i - 1;
        }

        my $top = (sort { $saving{$b} <=> $saving{$a} } keys %saving)[0];
        if ($saving{$top} > 0) {
            my $nb = shift(@b);
            $n =~ s/\Q$top\E/$nb/g;
            $n = $top . "$nb$n";
            $found = 1;
        }
    }
}

print $n;
It works by looking for all strings starting at length 173 (which I determined by looping and corresponds to the entire chorus) and working backwards. It computes the space saved by extracting that string and replacing it with a dictionary entry and removes the strings that provide the most space saving. When there are no more left for a specific length it moves onto the next length (one smaller). The result is 514 byte string containing a bunch of high-ASCII characters like this:
 gÐngÑveÒe ÓthÔouÕo Önd× yÕØÖloÙiÑ Út's Û(OohÜe'reÝ
YÕÞ knowß,ÐivàÚoá I'm â tÖsãtell ä a× åØ æeåçay it
è äértØêß ëÐonna ì oÔer íæupîmakeæïÛbeen ðëÔÓñÕ'rÓtoÖòeî)
óeÒrìô
We'Òßõ
NôöóÜ÷ôgiÒø howâfeeliÑ
ù
Nøú÷)ú, nø
(Givû
I just wannaéyÕùGotta ïu×ersta×
þü eachífor sÙÑÞr hearðachÚbut
YòshyãèInsidÓwÓboÔëwhaðgoán
WeñgamçwÝìplèýúîöletædownörun arÕ×ådeseêöïcryösayÐoodbyeöäa liçhuê
þWÝ nÖstraÑers tÙÒÞñrulesåsÖdÖI
A full commitmenÛwhatâÔinkáfÞ wÕldn'tÐet Ôis from anyíguyüõnýA× 
ifæask meùDon'témÓyòbli×ãee
þþ
Üà÷àûûóõýüþþ
To understand the compression look at the start. The first string to be decompressed will be " g", then "ng", then "ve" and so on. The high-ASCII characters are both the dictionary separators (between entries) and the characters to be replaced with the corresponding entry in the dictionary. Thus each occurrence of "Ð" is replaced with " g". A complete program to decompress to the lyrics is as follows:
$b=<<E;
 gÐngÑveÒe >ÓthÔouÕo Önd× yÕØÖloÙiÑ Út's Û(OohÜe'reÝ
YÕÞ knowß,ÐivàÚoá I'm â tÖsãtell ä a× åØ æeåçay it
è äértØêß ëÐonna ì oÔer íæupîmakeæïÛbeen ðëÔÓñÕ'rÓtoÖòeî)
óeÒrìô
We'Òßõ
NôöóÜ÷ôgiÒø howâfeeliÑ
ù
Nøú÷)ú, nø
(Givû
I just wannaéyÕùGotta ïu×ersta×
þü eachífor sÙÑÞr hearðachÚbut
YòshyãèInsidÓwÓboÔëwhaðgoán
WeñgamçwÝìplèýúîöletædownörun arÕ×ådeseêöïcryösayÐoodbyeöäa liçhuê
þWÝ nÖstraÑers tÙÒÞñrulesåsÖdÖI
A full commitmenÛwhatâÔinkáfÞ wÕldn'tÐet Ôis from anyíguyüõnýA× 
ifæask meùDon'témÓyòbli×ãee
þþ
Üà÷àûûóõýüþþ
E
map{($a,$b)=split($c=chr,$b,2);$b=~s/$c/$a/g}(208..254);print$b
The final line does all the work progressively splitting off the dictionary entry and then using a regular expression substitution to transform the string. There's a bunch of Perl abuse going on. Notice there's no ; at the end of the program or in the { } block. The $t=chr works because chr reads the $_ variable set from the map and it both sets $c and returns the set value as an argument to split. Also notice how its valid Perl to do print$b with no space. Since it's fairly likely that some of those characters got messed up in the web version, here's a Base64 encoded version of the program for people to test with.
JGI9PDxFOwogZ9BuZ9F2ZdJlINN0aNRvddVvINZu
ZNcgedXY1mxv2WnRINp0J3Mg2yhPb2jcZSdyZd0K
WdXeIGtub3ffLNBpduDab+EgSSdtIOIgdNZz43Rl
bGwg5CBh1yDl2CDmZeXnYXkgaXQK6CDk6XJ02Orf
IOvQb25uYSDsIG/UZXIg7eZ1cO5tYWtl5u/bYmVl
biDw69TT8dUnctN0b9byZe4pCvNl0nLs9ApXZSfS
3/UKTvT289z39Gdp0vggaG934mZlZWxp0Qr5Ck74
+vcp+iwgbvgKKEdpdvsKSSBqdXN0IHdhbm5h6XnV
+UdvdHRhIO9112Vyc3Rh1wr+/CBlYWNo7WZvciBz
2dHeciBoZWFy8GFjaNpidXQKWfJzaHnj6Eluc2lk
03fTYm/U63doYfBnb+FuCldl8Wdhbed33exwbOj9
+u72bGV05mRvd272cnVuIGFy1dflZGVzZer272Ny
efZzYXnQb29kYnll9uRhIGxp52h16gr+V90gbtZz
dHJh0WVycyB02dLe8XJ1bGVz5XPWZNZJCkEgZnVs
bCBjb21taXRtZW7bd2hhdOLUaW5r4WbeIHfVbGRu
J3TQZXQg1GlzIGZyb20gYW557Wd1efz1bv1B1yBp
ZuZhc2sgbWX5RG9uJ3TpbdN58mJsadfjZWUK/v4K
3OD34Pv78/X9/P7+CkUKbWFweygkYSwkYik9c3Bs
aXQoJGM9Y2hyLCRiLDIpOyRiPX5zLyRjLyRhL2d9
KDIwOC4uMjU0KTtwcmludCRiCg==
And to make it totally clear what's happening, here's a little animation of the decompression process.

Comments

Will said…
Beautiful :)

Byte-Pair Encoding http://en.wikipedia.org/wiki/Byte_pair_encoding
Will said…
Beautiful :)

Byte-Pair Encoding http://en.wikipedia.org/wiki/Byte_pair_encoding
Francis Turner said…
I had a bunch of problems with the compressed sting. Here was my way to resolve them (t.pl was your compressor script):

$ echo '$b=<tt.pl
$ perl t.pl >>tt.pl
$ echo ''>>tt.pl
$ echo 'E'>>tt.pl
$ echo 'map{($a,$b)=split($c=chr,$b,2);$b=~s/$c/$a/g}(208..254);print$b'>>tt.pl
$ perl tt.pl

I suspect that cunning use of bit ranges would allow you to compress further (you have 49? compressed strings and the characters are a subset of [\n a-zA-Z'()] So some cunning tr would allow you to use 7 bits instead of 8 per character in the compressed string.

The only question is whether you could hack perl's pack/unpack functions (and tr) to function work in less than 60 total characters to get a net gain on the compression

Popular posts from this blog

Your last name contains invalid characters

My last name is "Graham-Cumming". But here's a typical form response when I enter it:


Does the web site have any idea how rude it is to claim that my last name contains invalid characters? Clearly not. What they actually meant is: our web site will not accept that hyphen in your last name. But do they say that? No, of course not. They decide to shove in my face the claim that there's something wrong with my name.

There's nothing wrong with my name, just as there's nothing wrong with someone whose first name is Jean-Marie, or someone whose last name is O'Reilly.

What is wrong is that way this is being handled. If the system can't cope with non-letters and spaces it needs to say that. How about the following error message:

Our system is unable to process last names that contain non-letters, please replace them with spaces.

Don't blame me for having a last name that your system doesn't like, whose fault is that? Saying "Your last name …

All the symmetrical watch faces (and code to generate them)

If you ever look at pictures of clocks and watches in advertising they are set to roughly 10:10 which is meant to be the most attractive (smiling!) position for the hands. They are actually set to 10:09.14 if the hands are truly symmetrical. CC BY 2.0image by Shinji
I wanted to know what all the possible symmetrical watch faces are and so I wrote some code using Processing. Here's the output (there's one watch face missing, 00:00 or 12:00, because it's very boring):



The key to writing this is to figure out the relationship between the hour and minute hands when the watch face is symmetrical. In an hour the minute hand moves through 360° and the hour hand moves through 30° (12 hours are shown on the watch face and 360/12 = 30).
The core loop inside the program is this:   for (int h = 0; h <= 12; h++) {
    float m = (360-30*float(h))*2/13;
    int s = round(60*(m-floor(m)));
    int col = h%6;
    int row = floor(h/6);
    draw_clock((r+f)*(2*col+1), (r+f)*(row*2+1), r, h, floor(m…

The Elevator Button Problem

User interface design is hard. It's hard because people perceive apparently simple things very differently. For example, take a look at this interface to an elevator:


From flickr

Now imagine the following situation. You are on the third floor of this building and you wish to go to the tenth. The elevator is on the fifth floor and there's an indicator that tells you where it is. Which button do you press?

Most people probably say: "press up" since they want to go up. Not long ago I watched someone do the opposite and questioned them about their behavior. They said: "well the elevator is on the fifth floor and I am on the third, so I want it to come down to me".

Much can be learnt about the design of user interfaces by considering this, apparently, simple interface. If you think about the elevator button problem you'll find that something so simple has hidden depths. How do people learn about elevator calling? What's the right amount of informati…