## Tuesday, May 03, 2016

### What the "Silicon Valley" Easter Egg code does and how

In the TV series Silicon Valley there was apparently a snippet of code presented as part of a compression algorithm. The code (or at least part of it) can be executed, and the program will output:

DREAM_ON_ASSHOLES

The code itself was published first as a screenshot, then as text, and got some press. But I hadn't seen a good explanation of how it works. It looks pretty complicated, but most of the code (written in C) is unused and the part that is actually executed is pretty simple (if you've spent years coding in C-like languages). Here it is as presented:

#include <stdio.h>
#include <stdlib.h>

typedef unsigned long u64;

/* Start here */
typedef void enc_cfg_t;
typedef int enc_cfg2_t;
typedef __int128_t dcf_t;

enc_cfg_t _ctx_iface(dcf_t s, enc_cfg2_t i){
int c = (((s & ((dcf_t)0x1FULL << i * 5)) >> i * 5) + 65);
printf("%c", c); }
enc_cfg2_t main() {
for (int i=0; i<17; i++){
_ctx_iface(0x79481E6BBCC01223 + ((dcf_t)0x1222DC << 64), i);
}
}
/* End here */

The writers have tried to make this look more complicated by using a bunch of typedefs and deliberately messing with the indentation so that the main() function is kind of hidden. After a bit of cleanup the code looks like this:

#include <stdio.h>
#include <stdlib.h>

/* Print the character stored in the i-th group of 5 bits of s */
void _ctx_iface(__int128_t s, int i){
    int c = (((s & ((__int128_t)0x1FULL << i * 5)) >> i * 5) + 65);
    printf("%c", c);
}

int main() {
    for (int i=0; i<17; i++){
        _ctx_iface(0x79481E6BBCC01223 + ((__int128_t)0x1222DC << 64), i);
    }
}


So, main() calls _ctx_iface() 17 times with i set to 0, 1, ... 16 and passes in a weird-looking number, 0x79481E6BBCC01223 + ((__int128_t)0x1222DC << 64), which seems to have been obscured a bit with a left shift of 64 bits and an addition. Because 0x79481E6BBCC01223 fits exactly in the low 64 bits, the parameter is actually just 0x1222DC79481E6BBCC01223. That still looks odd but it's pretty simple: it's DREAM_ON_ASSHOLES encoded so that each group of 5 bits indicates one upper case character or symbol (A = 0, B = 1, C = 2, ...).

If you express that number in binary you can separate it out into five-bit chunks like this: 10010 00100 01011 01110 00111 10010 10010 00000 11110 01101 01110 11110 01100 00000 00100 10001 00011. You can even manually discover that this represents S E L O H S S A _ N O _ M A E R D (the message backwards, because the first character sits in the lowest bits) if you guess that 11110 (30) stands for _.

The _ctx_iface() function extracts each character of DREAM_ON_ASSHOLES (17 characters) one by one and prints it. It does that with the code (((s & ((dcf_t)0x1FULL << i * 5)) >> i * 5) + 65), which uses i to indicate which 5 bits to extract. It shifts 0x1F (which is 11111 in binary) left to the right position in the number, does a binary AND to extract just those five bits, and then shifts the five bits back down so that they become a number between 0 and 31.

The +65 turns a number between 0 and 31 into one of the characters ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_` because 65 is the ASCII code for A. Note that this encoding also explains why the program outputs DREAM_ON_ASSHOLES and not DREAM ON ASSHOLES. The output codes run from 65 to 96, so there's no way for the encoding to result in a space (which is ASCII 32).

All a bit of a disappointment really.

The rest of the code is more fun. There's a function called HammingCtr which computes the Hamming weight (the number of 1 bits) of a binary string and even references and uses bit twiddling tricks from seander (Sean Eron Anderson's Bit Twiddling Hacks page) to make the code run quickly.

The other function is ConvolutedMagic which I have yet to decipher. Readers?

